5.2.1 Word Models: Initialization

The goal of training the acoustic models is to converge on a solution that yields the most likely sequence of vectors for a given acoustic unit. Setting the model parameters to some initial values before training is known as initialization, and a good initialization can help training converge on a solution more quickly. In the figure to the right, the block labeled Seed Model Construction represents the initialization phase.

Several techniques can be used to initialize the models. One simple but effective technique, referred to as flat-start, is to compute the global mean and variance from the training data and set the model parameters to these values. The mean is the average value of the data, and the variance measures how spread out the data is around that average; knowing both helps in determining which solution provides the most likely sequence of vectors. More complex techniques include clustering, which involves gathering statistics from certain regions of the training data, and decision trees, which involve asking hierarchical true-or-false questions to classify the elements of the training data.
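As a rough illustration of the flat-start idea, the Python sketch below computes a per-dimension global mean and variance over all training feature vectors and copies those values into every model state. It is not how isip_recognize is implemented; the function names (global_mean_variance, flat_start), the state names, and the toy feature vectors are all hypothetical and chosen only for this example.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def global_mean_variance(frames: Sequence[Sequence[float]]) -> Tuple[Vector, Vector]:
    """Return the per-dimension mean and population variance of all frames."""
    if not frames:
        raise ValueError("need at least one feature vector")
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    var = [sum((f[d] - mean[d]) ** 2 for f in frames) / n for d in range(dim)]
    return mean, var


def flat_start(state_names: Sequence[str],
               frames: Sequence[Sequence[float]]) -> Dict[str, Dict[str, Vector]]:
    """Give every state the same global statistics (hypothetical model layout)."""
    mean, var = global_mean_variance(frames)
    # Each state gets its own copy so later reestimation can update them separately.
    return {name: {"mean": list(mean), "variance": list(var)} for name in state_names}


# Toy data: three 2-dimensional feature vectors pooled from all training files.
frames = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
models = flat_start(["one_s1", "one_s2", "two_s1"], frames)
print(models["one_s1"])
```

Because every state starts from identical statistics, the models only become distinct from one another during the reestimation passes that follow initialization.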

To flat-start the word models, go to the directory:

$ISIP_TUTORIAL/sections/s05/s05_02_p01/

This directory already contains the parameter file that will be used for this example. The initial language and acoustic models are in the directory $ISIP_TUTORIAL/models/. Run the following command:

isip_recognize -param params_init.sof -verbose brief -list $ISIP_TUTORIAL/databases/lists/identifiers_train.sof

Expected Output:

  Command: isip_recognize -parameter_file params_init.sof -verbose brief -list ../../../databases/lists/identifiers_train.sof
  Version: 1.23 (not released) 2003/05/21 23:10:45
  
  loading audio database: $ISIP_TUTORIAL/databases/db/tidigits_audio_db.sof
  
  *** no symbol graph database file was specified ***
  
  *** no transcription database file was specified ***
  
  loading front-end: $ISIP_TUTORIAL/recipes/frontend.sof
  
  loading language model: $ISIP_TUTORIAL/models/lm_word_digraph.sof
  
  loading statistical model pool: $ISIP_TUTORIAL/models/smp_word.sof
  
  *** no configuration file was specified ***
  
  processing file 1 (ae_12a): $ISIP_TUTORIAL/databases/sof_8k/train/ae_12a.sof
  
  processing file 2 (ae_1a): $ISIP_TUTORIAL/databases/sof_8k/train/ae_1a.sof
  
  processing file 3 (ae_2789385a): $ISIP_TUTORIAL/databases/sof_8k/train/ae_2789385a.sof

  .....

After the flat-start process finishes, the initial acoustic model will contain the global mean and variance computed from the training data. Open the parameter file, params_init.sof. Notice the following parameters:

The INITIALIZE algorithm and GLOBAL implementation are unique to the flat-start process. The INITIALIZE algorithm tells the recognizer that we will be initializing the acoustic models. The GLOBAL implementation tells the recognizer to compute the global means and variances from the training data. The variance floor parameter specifies which values will not contribute to the global means and variances. In this case, any value less than 0.0002 will not contribute. The location and filename of the updated acoustic and language models are also indicated in the parameter file by the parameters update_language_model and update_statistical_model_pool.
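The paragraph above describes the floor in terms of values that do not contribute. For comparison, in many Gaussian-based recognizers a variance floor is applied by raising any variance below the threshold up to the threshold, so that no dimension ends up with a near-zero variance. The Python sketch below shows only that conventional clamping. Whether isip_recognize applies its floor in this way is an assumption that this page does not confirm, and the function name apply_variance_floor is hypothetical.

```python
from typing import List, Sequence


def apply_variance_floor(variances: Sequence[float], floor: float = 0.0002) -> List[float]:
    """Clamp each variance so that it is never smaller than the floor.

    Assumption: this is the conventional clamping form of a variance floor,
    not a confirmed description of what isip_recognize does internally.
    """
    if floor <= 0.0:
        raise ValueError("the variance floor must be positive")
    return [max(v, floor) for v in variances]


# The 0.0002 threshold matches the value quoted from params_init.sof.
floored = apply_variance_floor([0.5, 0.00005, 0.0002, 0.0])
print(floored)
```

In this form, a floor keeps the likelihood computation numerically stable, since a Gaussian with a variance at or near zero would assign extreme scores to small deviations from the mean.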

Once the models are initialized, the reestimation process begins. Continue to the next section to begin reestimation.
