6.1.1 Overview
You have now seen that building a speech recognizer requires knowledge
from many disciplines, including signal processing to perform feature
extraction, and pattern recognition and linguistics to build acoustic
models. The field of natural language processing (NLP) provides another
source of knowledge needed by the recognizer: language models. While
acoustic models built from the extracted features enable the recognizer
to decode the phonemes that make up words, language models specify the
order in which words are likely to occur. For example, a typical greeting
might be an interjection such as "Hello," followed by a noun, "World."
Other words could be substituted for this interjection and noun. The image
below illustrates the speech recognition process, incorporating a language
model to represent a greeting.
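To make the idea of word order concrete, the following is a minimal Python sketch of a bigram (2-gram) language model for greetings. It is not ISIP code. The toy corpus, the function names train_bigram and sequence_probability, and the "<s>" and "</s>" boundary markers are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical toy corpus of greetings (an assumption, not tutorial data).
corpus = [
    ["hello", "world"],
    ["hello", "world"],
    ["hello", "everyone"],
    ["goodbye", "world"],
]


def train_bigram(sentences):
    """Estimate P(next word | previous word) by relative frequency."""
    counts = defaultdict(lambda: defaultdict(int))
    for words in sentences:
        # Mark sentence boundaries so the model also learns which
        # words tend to start and end a greeting.
        tokens = ["<s>"] + words + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1

    model = {}
    for prev, followers in counts.items():
        total = sum(followers.values())
        model[prev] = {word: n / total for word, n in followers.items()}
    return model


def sequence_probability(model, words):
    """Multiply the bigram probabilities along the word sequence."""
    tokens = ["<s>"] + words + ["</s>"]
    prob = 1.0
    for prev, nxt in zip(tokens, tokens[1:]):
        # Unseen word pairs get probability 0 (no smoothing in this sketch).
        prob *= model.get(prev, {}).get(nxt, 0.0)
    return prob


model = train_bigram(corpus)
p_hello_world = sequence_probability(model, ["hello", "world"])
p_world_hello = sequence_probability(model, ["world", "hello"])
print("P(hello world) =", p_hello_world)
print("P(world hello) =", p_world_hello)
```

With this toy corpus, "hello world" receives a probability of about 0.5, while the reversed order "world hello" receives 0 because no greeting in the corpus begins with "world." A practical model would be trained on far more data and would smooth the estimates so that unseen word pairs are not ruled out entirely.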
Language models have been studied extensively in the field of NLP.
Section 6.1.2 briefly summarizes the aspects of NLP that are relevant
to speech recognition.
The remainder of this tutorial focuses on two popular language models
for speech recognition, N-grams and Networks, describing how to
implement them using ISIP software.