API Documentation

This page contains documentation of the API to the THF-Micro™ library.

Error codes that are commonly encountered with the API are listed below in detail. Refer to section 'Error Codes' for the complete list of error codes.

SensoryInfo

errors_t SensoryInfo(infoStruct_T* isp);
Description: Populates isp->version with the THF-Micro™ library version number.
Parameters: Pointer to an existing infoStruct_T.
Returns: Always returns ERR_OK.
Comments: Use of this function is optional in an application.

SensoryInfo Example
infoStruct_T isp;
SensoryInfo(&isp);
u32 major = (isp.version >> 20) & 0x00000fff;
u32 minor = (isp.version >> 12) & 0x000000ff;
u32 point = isp.version & 0x00000fff;
printf("THF-Micro Version = %u.%u.%u\n", (unsigned)major, (unsigned)minor, (unsigned)point);

SensoryLibraryLicenseInfo

BOOL SensoryLibraryLicenseInfo(unsigned* seconds, unsigned* events);
Description: Get license limits for the THF-Micro™ library.
Parameters:

  • seconds: Pointer to a variable that receives the time limit for continuous speech recognition.
  • events: Pointer to a variable that receives the event limit for continuous speech recognition.

Returns: TRUE if the THF-Micro™ library has a valid license and FALSE if it does not.
Comments: If seconds and events are 0, the library has no limits on continuous speech recognition.
Notes: If the THF-Micro™ library does not have a valid license, SensoryProcessInit should fail with error code ERR_LICENSE. Recognition will not run in this case.

SensoryModelLicenseInfo

BOOL SensoryModelLicenseInfo(t2siStruct* t, unsigned* seconds, unsigned* events);
Description: Get license limits for the model.
Parameters:

  • t: Pointer to t2siStruct.
  • seconds: Pointer to store time limit for continuous speech recognition.
  • events: Pointer to store event limit for continuous speech recognition.

Returns: TRUE if the model has a valid license and FALSE if it does not.
Comments: If seconds and events return 0, the model has no limits on continuous speech recognition.
Notes:

  • If the model does not have a valid license, SensoryProcessInit should fail with error code ERR_LICENSE. Recognition will not run in this case.
  • Call after t2siStruct has been initialized with net and grammar.
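
The return value and the two limits from either license query can be folded into a single state before deciding whether to start recognition. The following is a minimal sketch, not part of the THF-Micro™ API: license_state_t, classify_license and describe_license are hypothetical helper names, and the BOOL result is passed in as a plain int.

```c
#include <assert.h>
#include <stdio.h>

/* Possible outcomes of a license query, as described above. */
typedef enum {
    LICENSE_INVALID,    /* query returned FALSE */
    LICENSE_UNLIMITED,  /* valid, seconds == 0 and events == 0 */
    LICENSE_LIMITED     /* valid, at least one limit is nonzero */
} license_state_t;

/* Classifies the values returned by a license query.
 * `valid` is the BOOL return value (nonzero means TRUE). */
license_state_t classify_license(int valid, unsigned seconds, unsigned events)
{
    if (!valid) {
        return LICENSE_INVALID;
    }
    if (seconds == 0 && events == 0) {
        return LICENSE_UNLIMITED;
    }
    return LICENSE_LIMITED;
}

/* Writes a one-line summary into buf and returns the classification. */
license_state_t describe_license(int valid, unsigned seconds, unsigned events,
                                 char* buf, size_t len)
{
    assert(buf != NULL && len > 0);
    license_state_t state = classify_license(valid, seconds, events);
    switch (state) {
    case LICENSE_INVALID:
        snprintf(buf, len, "no valid license");
        break;
    case LICENSE_UNLIMITED:
        snprintf(buf, len, "licensed, no limits");
        break;
    case LICENSE_LIMITED:
        snprintf(buf, len, "licensed, limits: %u s, %u events", seconds, events);
        break;
    }
    return state;
}
```

In an application, the arguments would come from a call such as SensoryModelLicenseInfo(t, &seconds, &events), with the BOOL result passed as valid.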

SensoryAlloc

errors_t SensoryAlloc(t2siStruct* t, unsigned int* size);
Description: Calculates the SPP size needed for speech recognition.
Parameters:

  • t: Pointer to t2siStruct.
  • size: Pointer to return SPP size needed.

Returns: Should return ERR_OK if successful. Other codes may indicate bad net or grammar.
Comments: Unit for size is bytes. SensoryAlloc stores the size in t->size.
Notes:

  • Before the call to SensoryAlloc, t->net and t->gram must point to the net and grammar data, respectively.
  • It may be useful to experiment with t->maxTokens. Application developers can use t->maxTokensUsed and t->tokensPruned during development to determine if a higher or lower number than the default MAX_TOKENS is needed.
  • If t->outOfMemory or t->tokensPruned are TRUE during the recognition process, then the search was limited by the number of search tokens. Increase t->maxTokens in this case.
  • Optionally, any of the other t2siStruct input fields can be customized. If they are zero, then recognizer will use default values. It is a good practice to zero the entire t2siStruct at the start of the application. If not, then all input fields need to be initialized before calling SensoryAlloc.

SensoryAlloc Example
t2siStruct app;
t2siStruct *t = &app;
memset(t, 0, sizeof(t2siStruct));
unsigned int size;

t->net = (intptr_t) NET_ADDR;
t->gram = (intptr_t) GRAM_ADDR;
errors_t error = SensoryAlloc(t, &size);
if (error) {
    printf("SensoryAlloc failed with error 0x%x\n", error);
    panic();
} 
t->spp = (void*)malloc(size);

SensoryAllocMulti

errors_t SensoryAllocMulti(t2siStruct* t, unsigned int* size, int channels, int depth);
Description: Calculates the SPP size needed for speech recognition when the number of channels (C) or the depth (D) is greater than 1.
Parameters:

  • t: Pointer to t2siStruct.
  • size: Pointer to return SPP size needed.
  • channels: Number of channels to be processed at once.
  • depth: Number of frames in one channel to be processed at once.

Returns: Should return ERR_OK if successful. Other returned codes may indicate bad net or grammar.
Comments: After the call to SensoryAllocMulti, (channels * depth) number of frames will be processed in one call to SensoryProcessMultiData.

SensoryAllocMulti Example
t2siStruct app;
t2siStruct *t = &app;
memset(t, 0, sizeof(t2siStruct));
unsigned int size;

t->gram = (intptr_t) GRAM_ADDR;
t->net = (intptr_t) NET_ADDR;
errors_t error = SensoryAllocMulti(t, &size, 1, 2); // One channel, two frames at once
if (error) {
    printf("SensoryAllocMulti failed with error 0x%x\n", error);
    panic();
} 
t->spp = (void*)malloc(size);

SensoryProcessInit

errors_t SensoryProcessInit(t2siStruct* t);
Description: Initializes SPP for speech recognition.
Parameters: Pointer to t2siStruct.
Returns: Should return ERR_OK if successful.
Comments: Calling this function is required whenever the net or grammar changes.
Notes:

  • Before the call to SensoryProcessInit, t->spp must contain a pointer to the SPP. In other words, SensoryAlloc must be called successfully beforehand.
  • In operation, SensoryProcessInit should always return ERR_OK. Below are some other commonly encountered error codes that must be corrected before speech recognition can be performed.
  • ERR_LICENSE means that the THF-Micro™ library does not have a valid license.
  • ERR_T2SI_PSTORE means that t->spp is NULL.
  • ERR_T2SI_NN_BAD_VERSION means that t->net is corrupted or does not point to a valid net file.
  • ERR_T2SI_BAD_VERSION means that t->gram is corrupted or does not point to a valid grammar file.
  • Potential reasons for encountering 'invalid' model files: outdated models that are no longer supported, incompatible target formats, etc.
  • ERR_T2SI_BAD_SETUP means that t->net is NULL and/or t->gram is NULL.
  • ERR_T2SI_NN_MISMATCH means that the net and grammar are not paired. These two files are generated together and they must be used together; it is not appropriate to pair any net with any grammar.

SensoryProcessInit Example
// t->net, t->gram, t->spp already set by user
if (t->spp == NULL)
{
    printf("No memory left for SPP\n");
    panic();
}

errors_t error = SensoryProcessInit(t);
if (error) {
    printf("SensoryProcessInit failed with error 0x%x\n", error);
    panic();
}

SensoryProcessData

RecoResult* SensoryProcessData(t2siStruct *t, SAMPLE *brick);
Description: Processes one brick of audio samples.
Parameters:

  • t: Pointer to t2siStruct.
  • brick: Pointer to brick of audio samples to process.

Returns: Pointer to a RecoResult structure, containing information about recognition results for the processed brick.
Comments: SensoryProcessData is called once every 15 msec, as each new brick of data becomes available; it is called repeatedly until recognition success or failure.
Notes:

  • When a recognition occurs, the error field of the RecoResult structure should be ERR_OK. Below are some other commonly encountered error codes:
  • ERR_NOT_FINISHED means that recognition process is still going.
  • ERR_RECOG_FAIL means that recognition failed. Occurs only with non-spotted vocabularies.
  • ERR_RECOG_LOW_CONF means that the recognizer found a potential recognition, but it is doubtful (low-confidence). Occurs only with non-spotted vocabularies.
  • ERR_RECOG_MID_CONF means that the recognizer found a potential recognition, but it is a 'maybe' (mid-confidence). Occurs only with non-spotted vocabularies.
  • ERR_DATACOL_TIMEOUT means no recognition occurred before timeout. Occurs only when t->timeOut has been specified.
  • ERR_T2SI_TOO_MANY_RESULTS means t->maxResults is too small. Increase the value of t->maxResults.
  • ERR_NULL_POINTER means t is NULL.
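
The codes above fall into a few groups from the application's point of view: keep feeding audio, accept the result, start a new attempt, or fix the setup. The sketch below shows one possible grouping. The enum values are placeholders so that the sketch compiles on its own (only ERR_OK being zero is implied by the examples on this page); in a real build the enum would be removed and the library header used instead. step_t and next_step are hypothetical names, and treating failure, low confidence and timeout as "retry" is an application choice, not a library rule.

```c
#include <assert.h>

/* PLACEHOLDER values for illustration only. The real values come from the
 * THF-Micro header; delete this enum when building against the library. */
typedef enum {
    ERR_OK = 0,
    ERR_NOT_FINISHED,
    ERR_RECOG_FAIL,
    ERR_RECOG_LOW_CONF,
    ERR_DATACOL_TIMEOUT,
    ERR_T2SI_TOO_MANY_RESULTS,
    ERR_NULL_POINTER
} demo_errors_t;

/* What the application does after one call to SensoryProcessData. */
typedef enum {
    STEP_CONTINUE,   /* feed the next brick */
    STEP_ACCEPT,     /* use the recognition result */
    STEP_RETRY,      /* no usable result; start a new attempt */
    STEP_FIX_SETUP   /* configuration or programming error */
} step_t;

/* Maps the error field of a RecoResult to the next application step. */
step_t next_step(demo_errors_t error)
{
    switch (error) {
    case ERR_NOT_FINISHED:
        return STEP_CONTINUE;
    case ERR_OK:
        return STEP_ACCEPT;
    case ERR_RECOG_FAIL:
    case ERR_RECOG_LOW_CONF:
    case ERR_DATACOL_TIMEOUT:
        return STEP_RETRY;
    default:
        /* e.g. ERR_T2SI_TOO_MANY_RESULTS, ERR_NULL_POINTER, unlisted codes */
        return STEP_FIX_SETUP;
    }
}
```

A processing loop would call next_step(r->error) after each brick and leave the loop on anything other than STEP_CONTINUE.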

SensoryProcessData Example
// File-based audio input example
// In an actual application, real-time audio input is captured on-device
const char* audioFile = "audio.wav";
FILE* file = fopen(audioFile, "rb");
if (file == NULL) {
    printf("Cannot open audio file '%s'\n", audioFile);
    panic();
}

// Keep calling SensoryProcessData while full bricks of audio are available
s16 brick[BRICK_SIZE_SAMPLES];
while (fread(brick, sizeof(brick), 1, file) == 1) {
    RecoResult *r = SensoryProcessData(t, &brick[0]);
    if (r->error == ERR_NOT_FINISHED) {
        continue; // THF-Micro processing ongoing, but no recognition yet
    }
    if (r->error == ERR_OK) { // THF-Micro recognized a phrase
        printf("Recognized wordID = %d\n", r->wordID);
    } else {
        printf("SensoryProcessData failed with error 0x%lx\n", (unsigned long)r->error);
        panic();
    }
}
fclose(file);

SensoryProcessDataSamples

RecoResult* SensoryProcessDataSamples(t2siStruct* t, SAMPLE* samples, int count);
Description: Processes smaller bricks of audio than standard, such as 5 or 10 msec.
Parameters:

  • t: Pointer to t2siStruct.
  • samples: Pointer to brick of audio samples to process.
  • count: Number of samples in a brick (must be less than 240).

Returns: Pointer to a RecoResult structure, containing information about recognition results for the processed brick.
Comments: Works just like SensoryProcessData, but takes smaller sized bricks of audio than standard.
Notes:

  • The audio buffer size must be a multiple of both 240 and count.
  • Use the same count on every call; do not use variable-size blocks.
  • Only works if channels and depth are both 1.
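
The sizing rules above reduce to simple arithmetic. The sketch below assumes that a standard brick is 240 samples covering 15 msec, which is inferred from the limits on this page rather than stated as a constant; the macro and function names are hypothetical.

```c
#include <assert.h>

#define BRICK_SAMPLES 240  /* assumed standard brick size, in samples */
#define BRICK_MS      15   /* standard brick duration, in msec */

/* Number of samples in a sub-brick lasting `ms` milliseconds. */
int samples_for_ms(int ms)
{
    assert(ms > 0 && ms < BRICK_MS);
    return ms * BRICK_SAMPLES / BRICK_MS;
}

/* Greatest common divisor (Euclid's algorithm). */
static int gcd(int a, int b)
{
    while (b != 0) {
        int t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Smallest buffer size, in samples, that is a multiple of both
 * BRICK_SAMPLES and count (their least common multiple). */
int min_buffer_samples(int count)
{
    assert(count > 0 && count < BRICK_SAMPLES);
    return BRICK_SAMPLES / gcd(BRICK_SAMPLES, count) * count;
}

/* Returns 1 if buffer_samples satisfies the multiple-of-both rule. */
int buffer_size_ok(int buffer_samples, int count)
{
    assert(count > 0);
    return buffer_samples > 0
        && buffer_samples % BRICK_SAMPLES == 0
        && buffer_samples % count == 0;
}
```

Under these assumptions a 5 msec block is 80 samples and fits a 240-sample buffer, while a 10 msec block is 160 samples and needs a buffer of at least 480 samples.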

SensoryProcessMultiData

errors_t SensoryProcessMultiData(t2siStruct* t, SAMPLE** samples);
Description: Processes multiple frames at once, saving memory access overhead.
Parameters:

  • t: Pointer to t2siStruct.
  • samples: Array of pointers to audio samples in each channel.

Returns: Should return ERR_OK when recognition occurs or ERR_NOT_FINISHED when recognition process is still going. Refer to notes about other error codes in the section about SensoryProcessData.
Comments: See SensoryGetResult(channel, depth) below for getting the recognition results for each frame in each channel.
Notes:

  • This function takes an array of pointers to audio samples; samples[0] is for the first channel, samples[1] for the second channel, and so on.
  • Each pointer must point to depth D frames worth of samples, that is, (D * 240) samples.
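
The buffer layout described in these notes can be sketched as follows. This is one possible way to build the array of per-channel pointers, not code from the library: the helper names are hypothetical, and the SAMPLE typedef (16-bit PCM) is a stand-in that would be replaced by the library's own definition.

```c
#include <assert.h>
#include <stdlib.h>

#define FRAME_SAMPLES 240  /* samples per frame, per the notes above */
typedef short SAMPLE;      /* assumption: 16-bit PCM; use the library's type in a real build */

/* Samples each channel pointer must provide for a given depth. */
size_t samples_per_channel(int depth)
{
    assert(depth > 0);
    return (size_t)depth * FRAME_SAMPLES;
}

/* Releases buffers created by alloc_multi_buffers. */
void free_multi_buffers(SAMPLE** samples, int channels)
{
    if (samples == NULL) {
        return;
    }
    for (int c = 0; c < channels; c++) {
        free(samples[c]);
    }
    free(samples);
}

/* Allocates one zeroed buffer of depth * FRAME_SAMPLES samples per channel
 * and returns the array of channel pointers, or NULL on allocation failure. */
SAMPLE** alloc_multi_buffers(int channels, int depth)
{
    assert(channels > 0 && depth > 0);
    SAMPLE** samples = calloc((size_t)channels, sizeof *samples);
    if (samples == NULL) {
        return NULL;
    }
    for (int c = 0; c < channels; c++) {
        samples[c] = calloc(samples_per_channel(depth), sizeof **samples);
        if (samples[c] == NULL) {
            free_multi_buffers(samples, c);  /* frees channels 0..c-1 */
            return NULL;
        }
    }
    return samples;
}

/* Start of frame `frame` (0 = oldest) within channel `channel`. */
SAMPLE* frame_ptr(SAMPLE** samples, int channel, int frame)
{
    return samples[channel] + (size_t)frame * FRAME_SAMPLES;
}
```

The returned array has the shape expected for the samples argument; the channels and depth values would match those passed earlier to SensoryAllocMulti.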

SensoryGetResult

RecoResult* SensoryGetResult(t2siStruct* t, int channel, int depth);
Description: Get the results for the brick at (channel, depth), produced by SensoryProcessMultiData.
Parameters:

  • t: Pointer to t2siStruct.
  • channel: Zero-based index of the channel.
  • depth: Zero-based index of the frame within the channel.

Returns: Pointer to a RecoResult structure, containing information about recognition results for the brick at (channel, depth).
Comments: Indices are 0-based. The 'older' results come first (depth = 0 is the 'oldest', depth = (D - 1) is the 'newest').

SensoryProcessRestart

errors_t SensoryProcessRestart(t2siStruct* t, int channel);
Description: Re-initializes SPP for recognition on a given channel.
Parameters:

  • t: Pointer to t2siStruct.
  • channel: Index of the channel.

Returns: Always returns ERR_OK.
Comments: Calling this function after a recognition success or error is now optional.
Notes:

  • If the net and grammar have not changed, then SensoryProcessRestart can be used to restart recognition, and is faster than the full initialization done by SensoryProcessInit.
  • One call must have been made to SensoryProcessInit before any call to SensoryProcessRestart.
  • SensoryProcessRestart does not check to see that the net and grammar have not changed; the application must guarantee that.
  • SensoryProcessRestart does no error checking to ensure that the t2siStruct is still properly initialized.

SensoryFeatureCompatible

BOOL SensoryFeatureCompatible(t2siStruct* src, t2siStruct* dst);
Description: Checks two t2siStruct for feature compatibility.
Parameters: Pointers to two initialized t2siStruct.
Returns: TRUE if features from src can be used for dst or FALSE otherwise.
Comments: Used in case of multiple recognizers on the same audio stream.
Notes: The THF-Micro™ library can be built without this API, upon request, for small savings in code size.

SensoryConnectFeatures

errors_t SensoryConnectFeatures(t2siStruct* src, t2siStruct* dst);
Description: Sets up dst to process features from src.
Parameters: Pointers to two initialized t2siStruct.
Returns: Should return ERR_OK if connected successfully or ERR_T2SI_FEATURE_MISMATCH otherwise.
Comments: Used in case of multiple recognizers on the same audio stream.
Notes: The THF-Micro™ library can be built without this API, upon request, for small savings in code size.

SensoryProcessFeatures

RecoResult* SensoryProcessFeatures(t2siStruct* dst);
Description: Processes, in dst, the features produced by the feature source that was connected with SensoryConnectFeatures.
Parameters: Pointer to initialized t2siStruct.
Returns: Pointer to a RecoResult structure, containing information about recognition results for the processed brick.
Comments: Used in case of multiple recognizers on the same audio stream.
Notes:

  • The feature source must have had SensoryProcessData called on it.
  • The THF-Micro™ library can be built without this API, upon request, for small savings in code size.

SensoryAudioRewind

errors_t SensoryAudioRewind(t2siStruct* t, int rewind);
Description: Rewinds the audio input pointer.
Parameters:

  • t: Pointer to t2siStruct.
  • rewind: Number of milliseconds to rewind.

Returns: Should return ERR_OK if rewind is successful. If rewind is not successful, another error code will be returned.
Comments: Used after wakeword recognition for wake-to-command use-cases.

SensoryAudioFastForward

errors_t SensoryAudioFastForward(t2siStruct* t);
Description: Fast forwards a rewound audio input pointer to the current frame.
Parameters: Pointer to t2siStruct.
Returns: Should return ERR_OK if fast forward is successful. If fast forward is not successful, another error code will be returned.
Comments: Used after command recognition for wake-to-command use-cases.
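
The ordering of the two calls in a wake-to-command flow can be pictured as a two-stage state machine: rewind once the wakeword is recognized, fast forward once the command stage has finished. The sketch below models only that ordering; stage_t, action_t and on_recognition are hypothetical names, and which t2siStruct is rewound and by how many milliseconds is left to the application.

```c
#include <assert.h>
#include <stddef.h>

/* Which recognizer stage is currently listening. */
typedef enum { STAGE_WAKEWORD, STAGE_COMMAND } stage_t;

/* Audio pointer operation to perform after a recognition. */
typedef enum { ACTION_REWIND, ACTION_FAST_FORWARD } action_t;

/* Called when the current stage reports a recognition.
 * Advances the stage and returns the operation to perform:
 * - after the wakeword: rewind (SensoryAudioRewind), then listen for a command
 * - after the command: fast forward (SensoryAudioFastForward), then listen
 *   for the wakeword again */
action_t on_recognition(stage_t* stage)
{
    assert(stage != NULL);
    if (*stage == STAGE_WAKEWORD) {
        *stage = STAGE_COMMAND;
        return ACTION_REWIND;
    }
    *stage = STAGE_WAKEWORD;
    return ACTION_FAST_FORWARD;
}
```

An application would check the return value of each audio pointer call against ERR_OK before continuing to the next stage.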