- 41.8% word error rate (WER) at 60x real time (60xRT) on the JHU WS'97
dev-test set: competitive with comparable commercial systems
- Front-End: 12 mel-cepstral coefficients + energy, plus delta and
acceleration coefficients; 60+ hours of training data, cepstral mean
normalization
- Models: 3-state left-to-right HMMs with dedicated silence and
inter-word silence models, 40 phones, cross-word context-dependent
triphones
- Training schedule: Baum-Welch training
  - Flat-start
  - Monophone training
  - Triphone creation
  - State-tying
  - Mixture generation
- Decoding: time-synchronous decoding
  - trigram, word-internal lattice generation
  - trigram, cross-word lattice rescoring
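
As a rough illustration of the front-end bullet above, the sketch below builds a per-frame feature vector from 13 static values (12 cepstra + energy), applies cepstral mean normalization, and appends delta and acceleration terms. It assumes the delta and acceleration are taken over all 13 static values (giving 39 values per frame), that normalization is per utterance and also covers energy, and that deltas are a simple +/-1 frame difference; none of these details are stated in the summary, and the function names are hypothetical.

```python
# Sketch only: the frame layout, the +/-1 frame difference, the edge handling,
# and normalizing the energy term are assumptions, not details of the system
# summarized above.

def cepstral_mean_normalize(frames):
    """Subtract the per-utterance mean from every coefficient."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - means[d] for d in range(dim)] for f in frames]


def simple_delta(frames):
    """Central difference over +/-1 frame, repeating the edge frames."""
    n = len(frames)
    out = []
    for t in range(n):
        prev = frames[max(t - 1, 0)]
        nxt = frames[min(t + 1, n - 1)]
        out.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return out


def build_features(static_frames):
    """static_frames: list of 13-value frames (12 cepstra + energy)."""
    static = cepstral_mean_normalize(static_frames)
    delta = simple_delta(static)   # first-order differences
    accel = simple_delta(delta)    # differences of the deltas
    return [s + d + a for s, d, a in zip(static, delta, accel)]


# Toy input: 5 frames of 13 made-up values each (not real speech data).
toy = [[float(t * (d + 1)) for d in range(13)] for t in range(5)]
features = build_features(toy)
print(len(features), len(features[0]))
```

Each output frame is the concatenation of static, delta, and acceleration values, so a 13-value input frame becomes a 39-value feature vector under these assumptions.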