4.6.1 Word Graph Generation

A word graph, or lattice as it is sometimes called, is a node-by-node description of the paths explored through the language model during the search. The generated word graph can be applied in a second decoding pass, usually referred to as lattice rescoring, to limit the search space. It can also be reused to efficiently test new acoustic models or language models in a less resource-demanding manner, thus making the task more computationally feasible.
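As a rough illustration of the idea, a word graph can be modeled as a directed graph whose arcs carry a word and a first-pass score; a second pass then searches only those arcs instead of the full language model. The sketch below is a hypothetical, simplified representation, not the toolkit's internal lattice format:

```python
# Hypothetical word-graph (lattice) sketch: nodes are time points,
# arcs carry a word and a first-pass log score. Names and structure
# are illustrative only, not the ISIP toolkit's representation.
from collections import defaultdict


class WordGraph:
    def __init__(self):
        # start node -> list of (end node, word, score)
        self.arcs = defaultdict(list)

    def add_arc(self, start, end, word, score):
        self.arcs[start].append((end, word, score))

    def best_path(self, start, goal):
        """Return (best score, word sequence) from start to goal
        by exhaustive depth-first search over the lattice."""
        best = (float("-inf"), None)
        stack = [(start, 0.0, [])]
        while stack:
            node, score, words = stack.pop()
            if node == goal:
                if score > best[0]:
                    best = (score, words)
                continue
            for nxt, word, s in self.arcs[node]:
                stack.append((nxt, score + s, words + [word]))
        return best


g = WordGraph()
g.add_arc(0, 1, "NINE", -10.0)
g.add_arc(0, 1, "FIVE", -14.0)  # competing hypothesis kept in the lattice
g.add_arc(1, 2, "ZERO", -8.0)
score, words = g.best_path(0, 2)
print(score, words)  # -18.0 ['NINE', 'ZERO']
```

A rescoring pass would replace the scores on the arcs with values from a stronger model and rerun the same search, which is far cheaper than decoding the audio again.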

Again, we will start by decoding a single utterance. We will use the same utterance from the previous sections, as well as the parameter file from Section 4.2.3. Instead of outputting one hypothesis and score, or an N-best list, we will create a lattice of all the possibilities. First, we have to add the word graph generation option to our parameter file (params_decode.sof). We do this by changing the "algorithm" option from "DECODE" to "GRAPH_GENERATION".
    @ Sof v1.0 @
    @ HiddenMarkovModel 0 @
    algorithm = "GRAPH_GENERATION";
    implementation = "VITERBI";
    num_levels = 3;
    output_mode = "FILE";
    output_file = "hypo.out";
    frontend_file = "frontend.sof";
    audio_database_file = "audio.sof";
    language_model_file = "lm_model_update.sof";
    acoustic_model_file = "ac_model_update.sof";
         
Go to the directory:

    $ISIP_TUTORIAL/sections/s04_06_p01/
and run the following command:

    isip_recognize -parameter_file params_decode.sof -list $ISIP_TUTORIAL/databases/lists/identifiers_test.sof -verbose ALL
This will produce the following output:
    Command: isip_recognize -parameter_file params_decode.sof -list $ISIP_TUTORIAL/databases/lists/identifiers_test.sof -verbose ALL
    Version: 1.16 (not released) 2002/09/25 00:20:53
      
    loading front-end: ../../recipes/frontend.sof
    loading language model: ../../models/lm_model_update.sof
    loading acoustic model: ../../models/ac_model_update.sof
    loading audio database: ./audio_db.sof
    opening the output file: ./hypo.out
      
    processing file 1 (st_9z59362a): ../../features/st_9z59362a.sof
        
    ref:    
    hyp[1]:    NINE ZERO FIVE NINE THREE SIX TWO 
    score[1]:  -18749.466796875   frames: 294
    
    processed 1 file(s) successfully, attempted 1 file(s), 294 frame(s)


Similar to N-best lists, word graphs are another form of output from isip_recognize. Word graphs represent the search paths, node by node, through the language model. They are extremely useful for rescoring experiments efficiently, with less strain on computer resources. Bigram and trigram models belong to the family of n-gram language models: a bigram conditions each word on the one preceding word, while a trigram conditions it on the two preceding words. The longer the history, the higher the likelihood of a correct hypothesis; however, considerably more resources are needed to compute trigrams.
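To make the bigram idea concrete, the sketch below scores a word sequence by summing log conditional probabilities along the chain, P(w1) P(w2|w1) P(w3|w2) and so on. The probabilities here are made up purely for illustration; they are not estimates from any real model:

```python
import math

# Hypothetical bigram log-probabilities (values invented for illustration).
# "<s>" marks the start of the utterance.
bigram_logprob = {
    ("<s>", "NINE"): math.log(0.2),
    ("NINE", "ZERO"): math.log(0.1),
    ("ZERO", "FIVE"): math.log(0.05),
}


def score_bigram(words):
    """Sum log P(w_i | w_{i-1}) over the sequence under the bigram model."""
    total = 0.0
    prev = "<s>"
    for w in words:
        total += bigram_logprob[(prev, w)]
        prev = w
    return total


print(score_bigram(["NINE", "ZERO", "FIVE"]))
```

A trigram model would work the same way, except each lookup would be keyed on the two preceding words instead of one, which is exactly where the extra storage and computation come from.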

For example, say your vocabulary contained 10 words. A bigram model would have to consider 100 word pairs (10 to the power of 2), while a trigram model would have to consider 1000 word triples (10 to the power of 3). That is still computationally feasible, but what if your vocabulary contained 400,000 words? In this case, word graphs are used. By creating a lattice (or graph), you can trim off ("prune") the obviously incorrect paths and limit your search. The lattice can then be rescored using a trigram model. The result will have a better likelihood, and the utterance will take less time to decode.
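The growth described above is just V to the power of n for a vocabulary of V words, which a few lines of arithmetic make vivid:

```python
# Search-space growth for n-gram language models: a vocabulary of
# V words has V**n possible n-grams to consider.
def ngram_count(vocab_size, n):
    return vocab_size ** n


print(ngram_count(10, 2))       # 100 bigrams for a 10-word vocabulary
print(ngram_count(10, 3))       # 1000 trigrams
print(ngram_count(400_000, 3))  # 6.4e16 trigrams: why lattice pruning matters
```

With a 400,000-word vocabulary the trigram space has 6.4 x 10^16 entries, so restricting the second pass to the arcs that survive in the lattice is what keeps trigram rescoring tractable.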
Section 4.6: Word Graph example
   