
5.3.1 CI Phone Models: Initialization

Training context-independent (CI) phone models is almost identical to training word models. The phone models trained in this section are later used to produce context-dependent phones, a process covered in Section 5.4 and Section 5.5. In fact, most complex acoustic models are derived from phone models.

The main difference between word models and phone models is the structure of the language model. For word models, the language model had two levels: the word level and the state level. For phone models, a phone level is inserted between the word and state levels, for a total of three levels. This concept is explained in more detail in Section 4.2.4.
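The three-level structure described above can be pictured with a small sketch. This is only an illustration in Python, not the toolkit's language model format: the pronunciations in LEXICON, the state naming scheme, and the choice of three states per phone are all assumptions made for the example.

```python
# Hypothetical pronunciations for two digits; the tutorial's actual
# phone set and lexicon may differ.
LEXICON = {
    "one": ["w", "ah", "n"],
    "two": ["t", "uw"],
}

# Assumed number of states per phone model (not taken from the tutorial).
STATES_PER_PHONE = 3


def expand_word(word):
    """Expand a word through the phone level down to the state level."""
    return [
        f"{phone}_s{index}"
        for phone in LEXICON[word]
        for index in range(1, STATES_PER_PHONE + 1)
    ]


def phone_inventory(words):
    """Return the sorted set of phone models needed to cover the words."""
    return sorted({phone for word in words for phone in LEXICON[word]})


if __name__ == "__main__":
    for word in LEXICON:
        print(word, "->", LEXICON[word], "->", expand_word(word))
    print("phone models:", phone_inventory(LEXICON))
```

In a two-level word model, each word would map directly to its own states. With the phone level in between, words are built from a shared inventory of phone models, so a phone that occurs in several words is trained from all of them.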

To begin, we must again initialize the models. For a review of the initialization process, revisit Section 5.2.1. Go to the directory:

$ISIP_TUTORIAL/sections/s05/s05_03_p01/

From this directory, run the following command:

isip_recognize -param params_init.sof -list $ISIP_TUTORIAL/research/isip/databases/lists/identifiers_train.sof -verbose brief

Expected output:

Command: isip_recognize -parameter_file params_init.sof -list $ISIP_TUTORIAL/research/isip/databases/lists/identifiers_train.sof -verbose brief
Version: 1.23 (not released) 2003/05/21 23:10:45
  
  loading audio database: $ISIP_TUTORIAL/research/isip/databases/db/tidigits_audio_db.sof
  
  *** no symbol graph database file was specified ***
  
  *** no transcription database file was specified ***
  
  loading front-end: $ISIP_TUTORIAL/recipes/frontend.sof
  
  loading language model: $ISIP_TUTORIAL/models/ci_phone_models/lm_phone_digraph.sof
  
  loading statistical model pool: $ISIP_TUTORIAL/models/ci_phone_models/smp_phone.sof
  
  *** no configuration file was specified ***
  
  processing file 1 (ae_12a): $ISIP_TUTORIAL/research/isip/databases/sof_8k/train/ae_12a.sof
  
  processing file 2 (ae_1a): $ISIP_TUTORIAL/research/isip/databases/sof_8k/train/ae_1a.sof
  
  processing file 3 (ae_2789385a): $ISIP_TUTORIAL/research/isip/databases/sof_8k/train/ae_2789385a.sof
  
  processing file 4 (ae_2b): $ISIP_TUTORIAL/research/isip/databases/sof_8k/train/ae_2b.sof
  
  processing file 5 (ae_3a): $ISIP_TUTORIAL/research/isip/databases/sof_8k/train/ae_3a.sof

The parameter file for this initialization, params_init.sof, is very similar to the parameter file used to initialize the word models. Again, the global mean and variance are calculated across all the features in the database, and the acoustic models are seeded with these values. The states in the statistical model pool file now correspond to the phones of the language instead of its words. Once the initialization process is complete, the reestimation process begins.
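The seeding step described above can be sketched in a few lines of Python. This is a minimal illustration of the idea, not the code that isip_recognize runs: the function names, the dictionary layout, and the use of a per-dimension (diagonal) variance are assumptions made for the example.

```python
def global_mean_and_variance(utterances):
    """Compute per-dimension mean and variance over every frame.

    utterances is a list of utterances, each a list of feature
    vectors (lists of floats) of equal dimension.
    """
    frames = [frame for utterance in utterances for frame in utterance]
    count = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / count for d in range(dim)]
    # Population variance per dimension (diagonal covariance is assumed).
    var = [sum((f[d] - mean[d]) ** 2 for f in frames) / count for d in range(dim)]
    return mean, var


def seed_models(phones, states_per_phone, mean, var):
    """Give every state of every phone its own copy of the global statistics."""
    return {
        (phone, state): {"mean": list(mean), "var": list(var)}
        for phone in phones
        for state in range(states_per_phone)
    }


if __name__ == "__main__":
    # Toy two-dimensional features standing in for real front-end output.
    data = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0]]]
    mean, var = global_mean_and_variance(data)
    models = seed_models(["t", "uw"], 3, mean, var)
    print(len(models), "states seeded with mean", mean)
```

Because every state starts from the same statistics, the models are indistinguishable after initialization. The reestimation passes that follow are what pull them apart.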
