Section 4.2.5: Network Decoding: Cross-Word Context-Dependent Phones

Another type of phone is the cross-word context-dependent phone. The word-internal context-dependent phones in the last section model the context of a phone within a word, but they ignore the context across word boundaries. In the last section, the first and last phones of each word were therefore diphones, since no context preceded the first phone or followed the last. In this section, we will look at cross-word context-dependent phones, which extend the context across word boundaries within a sentence.
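The difference between the two phone types can be sketched in a few lines of Python. This is only an illustration, not part of the ISIP toolkit: the pronunciations in `LEXICON`, the `l-c+r` label notation, and the `sil` boundary symbol are assumptions chosen for the example and are not taken from the tutorial's models.

```python
# Illustrative pronunciations only; not taken from the tutorial's lexicon.
LEXICON = {
    "one": ["w", "ah", "n"],
    "zero": ["z", "iy", "r", "ow"],
}


def label(left, center, right):
    """Build an l-c+r style label, omitting any context that is missing."""
    name = center
    if left is not None:
        name = f"{left}-{name}"
    if right is not None:
        name = f"{name}+{right}"
    return name


def word_internal(words):
    """Context stops at word boundaries, so edge phones see one neighbor."""
    units = []
    for word in words:
        phones = LEXICON[word]
        for i, phone in enumerate(phones):
            left = phones[i - 1] if i > 0 else None
            right = phones[i + 1] if i < len(phones) - 1 else None
            units.append(label(left, phone, right))
    return units


def cross_word(words, boundary="sil"):
    """Context continues across word boundaries; the utterance is padded
    with an assumed silence symbol on both sides."""
    phones = [p for word in words for p in LEXICON[word]]
    padded = [boundary] + phones + [boundary]
    return [
        label(padded[i - 1], padded[i], padded[i + 1])
        for i in range(1, len(padded) - 1)
    ]


print(word_internal(["one", "zero"]))
print(cross_word(["one", "zero"]))
```

In the word-internal expansion, the last phone of "one" and the first phone of "zero" each see only one neighbor. In the cross-word expansion, each of them also sees the adjacent phone of the neighboring word.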

The experiment below decodes a single utterance using cross-word context-dependent phones. It might take several minutes to load the acoustic models, so please be patient. Go to the directory $ISIP_TUTORIAL/sections/s04/s04_02_p05/

    cd $ISIP_TUTORIAL/sections/s04/s04_02_p05/
and run the following command:

    isip_recognize -parameter_file params_decode.sof -list $ISIP_TUTORIAL/sections/s04/s04_02_p05 -verbose ALL
This will produce the following output:
    Command: isip_recognize -parameter_file params_decode.sof -list /ftp/pu./projects/speech/software/tutorials/production/
    fundamentals/current/example./databases/lists/identifiers_test.sof -verbose ALL
    Version: 1.23 (not released) 2003/05/21 23:10:45

    loading audio database: $ISIP_TUTORIA./databases/db/tidigits_audio_db_test.sof
    *** no symbol graph database file was specified ***
    *** no transcription database file was specified ***
    loading front-end: $ISIP_TUTORIAL/recipes/frontend.sof
    loading language model: $ISIP_TUTORIAL/models/xword_phone_models/compare/lm_xword_jsgf_8mix.sof
    loading statistical model pool: $ISIP_TUTORIAL/models/xword_phone_models/compare/smp_xword_8mix.sof
    *** no configuration file was specified ***
    opening the output file: $ISIP_TUTORIAL/sections/s04/s04_02_p05/results.out

    processing file 1 (ah_111a): $ISIP_TUTORIA./databases/sof_8k/test/ah_111a.sof
    hyp: ONE ONE ONE
    score: -9122.6484375 frames: 138

    processing file 2 (ah_1a): $ISIP_TUTORIA./databases/sof_8k/test/ah_1a.sof
    hyp: ONE
    score: -5187.28173828125 frames: 79
    .....

As you probably noticed, this experiment took much longer than some of the previous experiments. Most of the extra time was spent loading the acoustic models: the acoustic model file for this experiment is quite large because the number of context-dependent phone models has increased. In practical experiments, a large list of utterances is decoded in a single run. The acoustic models are loaded once and shared by every utterance, so the loading cost is not paid again for each utterance.
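When a longer list is decoded, the `hyp`, `score`, and `frames` fields reported for each utterance can be collected with a small script. The sketch below assumes the console output has been captured as text. The `parse_results` helper and the shortened sample are hypothetical and are not part of the ISIP toolkit, and the per-frame score is simply the reported score divided by the frame count.

```python
import re

# Sample modeled on the console output above; file paths are shortened
# here for illustration only.
SAMPLE = """
processing file 1 (ah_111a): ah_111a.sof
hyp: ONE ONE ONE
score: -9122.6484375 frames: 138
processing file 2 (ah_1a): ah_1a.sof
hyp: ONE
score: -5187.28173828125 frames: 79
"""

# The pattern does not rely on line breaks, so it also accepts output
# that has been flattened onto a single line.
PATTERN = re.compile(
    r"processing file\s+(\d+)\s+\((\S+)\):.*?"
    r"hyp:\s*(.*?)\s*score:\s*(-?\d+(?:\.\d+)?)\s+frames:\s*(\d+)",
    re.DOTALL,
)


def parse_results(text):
    """Return one record per decoded utterance found in the text."""
    results = []
    for match in PATTERN.finditer(text):
        _, utt_id, hyp, score, frames = match.groups()
        score, frames = float(score), int(frames)
        results.append({
            "id": utt_id,
            "hyp": hyp.split(),
            "score": score,
            "frames": frames,
            # Dividing by the frame count makes utterances of different
            # lengths easier to compare.
            "score_per_frame": score / frames if frames else float("nan"),
        })
    return results


for record in parse_results(SAMPLE):
    print(record["id"], " ".join(record["hyp"]), record["score_per_frame"])
```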

Cross-word context-dependent phones differ from word-internal phones because their context extends past the word boundary to the phones of the surrounding words. The image to the right illustrates a cross-word context-dependent language model. This language model has only one word, zero, with two pronunciations. As you can see at the phone level, all of the phones are triphones. Since there is just one word in this model, the only possible sequences are a zero followed by another zero, or a zero followed by silence. The first and last triphones of the word account for both possibilities. For a language model consisting of many words, the phone level becomes extremely complex.
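The one-word model described above can be enumerated with a short script. This is a rough sketch under stated assumptions, not the toolkit's own expansion: the two pronunciations of zero, the `sil` symbol, and the `l-c+r` label notation are hypothetical, any word is allowed to follow any other word or silence, and silence itself is treated as context-independent.

```python
def cross_word_triphones(lexicon, boundary="sil"):
    """Enumerate the distinct triphones needed when any word in the
    lexicon may be preceded or followed by silence or by any word."""
    prons = [p for variants in lexicon.values() for p in variants]
    # Possible contexts at a word edge: silence, or the adjacent phone
    # of any pronunciation of a neighboring word.
    left_contexts = {boundary} | {p[-1] for p in prons}
    right_contexts = {boundary} | {p[0] for p in prons}

    triphones = set()
    for phones in prons:
        last = len(phones) - 1
        for i, center in enumerate(phones):
            lefts = left_contexts if i == 0 else {phones[i - 1]}
            rights = right_contexts if i == last else {phones[i + 1]}
            for left in lefts:
                for right in rights:
                    triphones.add(f"{left}-{center}+{right}")
    return triphones


# Hypothetical pronunciations for the single word in the model.
ZERO_ONLY = {"zero": [["z", "iy", "r", "ow"], ["z", "ih", "r", "ow"]]}

for name in sorted(cross_word_triphones(ZERO_ONLY)):
    print(name)
```

Only the first and last phones of each pronunciation fan out into several triphones, one per possible neighbor. Every word added to the lexicon contributes new boundary contexts to every other word, which is why the phone level grows so quickly for larger vocabularies.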


   