Adapting wake word
These are fixed wake word models that continuously adapt to speakers' voices to reduce false-accept rates. They are drop-in replacements for fixed wake words.
Continuously adapting wake word models have task-type==phrasespot and filenames that by convention match ca-*.snsr.
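As a small illustration of the naming convention, the following Python sketch picks out names that look like continuously adapting models from a list of file names. It checks only the ca-*.snsr filename pattern and does not inspect task-type, which would require loading the model with the SDK. The function name and the file names are hypothetical, chosen for this example.

```python
from fnmatch import fnmatchcase

# Filename convention stated above for continuously adapting models.
ADAPTING_PATTERN = "ca-*.snsr"


def adapting_model_names(names):
    """Return the names that match the ca-*.snsr convention, in input order."""
    return [n for n in names if fnmatchcase(n, ADAPTING_PATTERN)]


# Hypothetical file names, for illustration only.
candidates = ["ca-example.snsr", "spot-example.snsr", "ca-notes.txt"]
selected = adapting_model_names(candidates)
```

Because the match is only a convention, a name that passes this filter is a candidate, not a guarantee that the file is an adapting wake word model.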
Adapting wake word models included in this distribution.
Operation
flowchart TD
start((start))
fetch[/samples from ->audio-pcm/]
audio(^sample-count)
process
result(^result)
adaptStarted(^adapt-started)
adapted(^adapted)
newUser(^new-user)
start --> fetch
fetch --> audio
audio --> process
process --> fetch
process -->|recognize| result
process -->|recognize w/ high SNR| adaptStarted
adaptStarted --> result
result --> fetch
fetch -->|adapted| adapted
adapted --> fetch
adapted -->|new user identified| newUser
newUser --> fetch

- Read audio data from ->audio-pcm.
- Invoke ^sample-count.
- Invoke ^adapt-started if processing detects a vocabulary phrase in a low-noise environment. This starts adapting the model to the speaker's voice on a background thread.
- Invoke ^result if processing detects a vocabulary phrase.
- Invoke ^adapted when the background thread has finished adding an enrollment.
- Invoke ^new-user if adaptation detects a user it hasn't seen before.
- Continue processing until STREAM_END occurs on ->audio-pcm, or one of the event handlers returns a code other than OK.
Register callback handlers with setHandler only for those events you're interested in.
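The dispatch behavior described above can be sketched as a standalone Python simulation. It does not use the SDK: the `run` function, the `OK`, `STREAM_END` and `STOP` constants, and the scripted event lists are all hypothetical stand-ins, and the handlers take no arguments, which is a simplification. The sketch shows only two points from the text: events without a registered handler are skipped, and processing ends at the end of the stream or when a handler returns a code other than OK.

```python
OK = "OK"                  # stand-in for the success return code
STREAM_END = "STREAM_END"  # stand-in for end of data on ->audio-pcm
STOP = "STOP"              # stand-in for any code other than OK


def run(chunks, handlers):
    """Dispatch scripted events to registered handlers.

    chunks: list of lists of event names. Each inner list holds the events
        one block of audio would trigger (scripted here, since this sketch
        performs no real recognition or adaptation).
    handlers: dict mapping an event name to a callable that returns a code.
    Returns (code, log): code is STREAM_END, or the first non-OK code a
    handler returned; log lists the events that reached a handler.
    """
    log = []
    for events in chunks:
        # ^sample-count is raised for every block, then the scripted events.
        for event in ["^sample-count"] + events:
            handler = handlers.get(event)
            if handler is None:
                continue  # no handler registered, so the event is ignored
            log.append(event)
            code = handler()
            if code != OK:
                return code, log  # a handler asked processing to stop
    return STREAM_END, log


counts = {}


def counting(name):
    """Build a handler that counts its invocations and returns OK."""
    def handler():
        counts[name] = counts.get(name, 0) + 1
        return OK
    return handler


chunks = [
    [],                             # no phrase detected
    ["^adapt-started", "^result"],  # phrase detected in low noise
    ["^adapted", "^new-user"],      # background adaptation finished
    [],
]

# Register handlers only for the events of interest.
code, log = run(chunks, {n: counting(n) for n in ("^result", "^adapted")})

# A handler that returns a code other than OK ends processing early.
stop_code, stop_log = run(
    chunks,
    {"^sample-count": counting("^sample-count"), "^result": lambda: STOP},
)
```

In the first call only ^result and ^adapted reach a handler and the run ends with the stream. In the second call the ^result handler returns `STOP`, so the later blocks are never processed.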
Settings
^adapt-started, ^adapted, ^new-user, ^result, ^sample-count
operating-point-iterator, user-iterator, vocab-iterator
audio-stream, audio-stream-first, audio-stream-last
->audio-pcm, audio-stream-from, audio-stream-to, delete-user, dsp-acmodel-stream, dsp-header-stream, dsp-search-stream, rename-user
audio-stream-size, cache-file, delay, dsp-target, duration-ms, listen-window, low-fr-operating-point, operating-point, samples-per-second, sv-threshold
live-spot.c, snsr-eval.c, PhraseSpot.java, segmentSpottedAudio.java