4.3.2 NIST Scoring: Scoring Reports

NIST provides standard scoring software that automatically computes the WER metrics described in the previous section. The software counts the number and type of errors in each sentence and provides a detailed listing for each error category. An example of a NIST scoring report appears on page 2 of lecture 43; the scoring process is explained in detail in the lecture on evaluation metrics from our on-line speech recognition course notes.
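The core computation the scoring software automates is a minimum-edit-distance alignment of each reference sentence against its hypothesis, from which substitutions, deletions, and insertions are counted. The sketch below (a simplified illustration, not the NIST implementation) shows one way this tally can be done:

```python
# Sketch of WER error counting: align reference and hypothesis word
# lists by minimum edit distance, then trace back to classify each
# error as a substitution, deletion, or insertion.
def wer_counts(ref, hyp):
    """Return (substitutions, deletions, insertions) for two word lists."""
    # d[i][j] = minimum edit cost aligning ref[:i] with hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        d[i][0] = i
    for j in range(1, len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    # Trace back through the table to classify each error
    i, j, subs, dels, ins = len(ref), len(hyp), 0, 0, 0
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            subs += ref[i - 1] != hyp[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            dels += 1
            i -= 1
        else:
            ins += 1
            j -= 1
    return subs, dels, ins

# Example: one deletion ("three") and one insertion ("two")
ref = "five three oh nine".split()
hyp = "five oh nine two".split()
s, d_, n = wer_counts(ref, hyp)
wer = 100.0 * (s + d_ + n) / len(ref)  # WER as a percentage
```

The word error rate is then (substitutions + deletions + insertions) divided by the number of reference words; in the example above, two errors over four reference words give a WER of 50%.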

NIST scoring also computes simple statistics, such as overall percentages for each error category, speaker-specific error rates, and significance measures that indicate whether an experimental result is meaningful. One useful output is a list of confusion pairs, which shows, for a given pair of words, how many times one word was mistakenly recognized as, or "confused" with, the other. For example:

    13 -> five ==> oh
indicates that the word "five" was mistaken for the word "oh" 13 times.
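A confusion-pair table of this kind can be tallied from the substitution errors found during alignment. The sketch below is a hypothetical illustration; the substitution pairs shown as input are made-up examples, and the real table is produced by the NIST aligner itself:

```python
# Sketch: tally substitution errors into a confusion-pair table.
from collections import Counter

# Assumed input: one (reference word, hypothesis word) tuple per
# substitution error found during alignment (illustrative data).
substitutions = [("five", "oh")] * 3 + [("nine", "five")] * 2

pairs = Counter(substitutions)
# Format each pair in the style of the NIST report shown above
lines = [f"{count:5d} -> {ref} ==> {hyp}"
         for (ref, hyp), count in pairs.most_common()]
print("\n".join(lines))
```

Sorting by count puts the most frequent confusions first, which is usually where error analysis starts.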

The NIST scoring software requires the reference texts of the sentences actually spoken as well as the corresponding hypotheses produced by the decoder. Both the reference texts and the hypotheses must be properly formatted; we provide tools to obtain this information and convert it to the required format.
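One common input format for NIST scoring is the "trn" transcript format, in which each line holds the words of one utterance followed by a parenthesized utterance identifier. The sketch below illustrates the idea; the speaker and utterance identifiers are invented for the example:

```python
# Sketch: format (utterance id, text) pairs as NIST "trn" transcript
# lines, i.e. "word word ... (utterance-id)". The ids below are
# illustrative assumptions, not a prescribed naming scheme.
def to_trn(utterances):
    """utterances: list of (utterance_id, text) pairs -> trn-format text."""
    lines = []
    for utt_id, text in utterances:
        words = " ".join(text.lower().split())  # normalize case and spacing
        lines.append(f"{words} ({utt_id})")
    return "\n".join(lines)

refs = [("spkr01-001", "FIVE THREE OH NINE"),
        ("spkr01-002", "SEVEN TWO")]
trn_text = to_trn(refs)
```

The reference file and the hypothesis file are formatted the same way, with matching utterance identifiers so the scorer can pair each hypothesis with its reference.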

Continue to Generating the NIST Scoring Report in the next section to learn how to convert reference texts and hypotheses to the NIST format, generate scoring reports automatically, and interpret the results.
   