5.3.1 CI Phone Models: Initialization

The process of training context-independent (CI) phone models is almost identical to the process of training word models. The phone models we train in this section will later be used to produce context-dependent phones, a process covered in Section 5.4 and Section 5.5. In fact, most complex acoustic models are derived from phone models.

The main difference between word models and phone models is the structure of the language model. For word models, the language model had two levels: the word level and the state level. For phone models, a phone level is inserted between the word and state levels, for a total of three levels. This concept is explained in more detail in Section 4.2.4.

To begin, we must again initialize the models. For a review of the initialization process, revisit Section 5.2.1. Go to the directory:
Command: isip_recognize -parameter_file params_init.sof -list $ISIP_TUTORIAL/research/isip/databases/lists/identifiers_train.sof -verbose brief

Version: 1.23 (not released) 2003/05/21 23:10:45

loading audio database: $ISIP_TUTORIAL/research/isip/databases/db/tidigits_audio_db.sof
*** no symbol graph database file was specified ***
*** no transcription database file was specified ***
loading front-end: $ISIP_TUTORIAL/recipes/frontend.sof
loading language model: $ISIP_TUTORIAL/models/ci_phone_models/lm_phone_digraph.sof
loading statistical model pool: $ISIP_TUTORIAL/models/ci_phone_models/smp_phone.sof
*** no configuration file was specified ***

processing file 1 (ae_12a): $ISIP_TUTORIAL/research/isip/databases/sof_8k/train/ae_12a.sof
processing file 2 (ae_1a): $ISIP_TUTORIAL/research/isip/databases/sof_8k/train/ae_1a.sof
processing file 3 (ae_2789385a): $ISIP_TUTORIAL/research/isip/databases/sof_8k/train/ae_2789385a.sof
processing file 4 (ae_2b): $ISIP_TUTORIAL/research/isip/databases/sof_8k/train/ae_2b.sof
processing file 5 (ae_3a): $ISIP_TUTORIAL/research/isip/databases/sof_8k/train/ae_3a.sof

The parameter file for this initialization, params_init.sof, is very similar to the parameter file used to initialize the word models. Again, the global mean and variance are calculated across all the features in the database, and the acoustic models are seeded with these values. The difference is that the states in the statistical model pool file now correspond to the phones of the language rather than to its words. Once the initialization process is complete, the reestimation process begins.
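The seeding step described above can be illustrated with a short Python sketch. It is not the isip_recognize implementation: the function names, the phone labels ("ah", "t", "sil"), the three states per phone, the toy feature values, and the use of the population variance (dividing by the number of frames) are all assumptions made for illustration. It only shows the idea of pooling every feature frame, computing one global mean and variance, and copying those values into every state of every phone model.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def global_mean_and_variance(
    utterances: Sequence[Sequence[Vector]],
) -> Tuple[Vector, Vector]:
    """Pool all frames of all utterances; return per-dimension mean and variance."""
    frames = [frame for utt in utterances for frame in utt]
    if not frames:
        raise ValueError("no feature frames found")
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Population variance (divide by n); this choice is an assumption of the sketch.
    var = [sum((f[d] - mean[d]) ** 2 for f in frames) / n for d in range(dim)]
    return mean, var


def seed_models(
    phones: Sequence[str],
    states_per_phone: int,
    mean: Vector,
    var: Vector,
) -> Dict[str, List[Dict[str, Vector]]]:
    """Give every state of every phone model its own copy of the global statistics."""
    return {
        phone: [
            {"mean": list(mean), "variance": list(var)}
            for _ in range(states_per_phone)
        ]
        for phone in phones
    }


# Toy data: two "utterances" of 2-dimensional feature frames (made-up values).
utterances = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0]],
]
mean, var = global_mean_and_variance(utterances)

# Hypothetical phone inventory with three states per phone.
models = seed_models(["ah", "t", "sil"], 3, mean, var)
```

Because every state starts from identical statistics, the models are indistinguishable after initialization. It is the reestimation passes that follow which pull the individual phone models apart.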