
4.3.3 Scoring: Generating Scoring Reports

Before describing the conversion of data to the NIST format, let's briefly discuss the format of the hypotheses produced by our recognizer. The most fundamental and useful output format from a speech recognition system is a tree-like data structure, known as an annotation graph, that contains a time-aligned transcription of the utterance. All levels of information in the hierarchical system that were used to model this utterance are encapsulated in this graph. An example of such an output is shown to the right. The particular annotation graph format used in our tools is based on a toolkit developed by the Linguistic Data Consortium (LDC). Within the IFCs, you will find a class called AnnotationGraph that represents our implementation of this data structure.

Annotation graphs (AGs) can be used to store either hypotheses or reference transcriptions. In addition to storing labels of the words spoken or hypothesized, annotation graphs allow storing multiple layers of knowledge about each word, such as parts of speech or acoustic labels. Any of the information can be time-aligned, but this is not required. For more details about annotation graphs, see our monthly tutorial archive.
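To make the idea of layered, optionally time-aligned labels concrete, here is a minimal Python sketch. It is not the IFC AnnotationGraph class or the LDC toolkit; the names Anchor, Annotation, ToyAnnotationGraph and build_example, as well as the words, tags and times, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Anchor:
    """A point in the utterance; offset is None when not time-aligned."""
    id: str
    offset: Optional[float] = None  # seconds from the start of the utterance


@dataclass(frozen=True)
class Annotation:
    """A labeled arc between two anchors on a named layer."""
    start: Anchor
    end: Anchor
    layer: str  # e.g. "word" or "pos" (hypothetical layer names)
    label: str


@dataclass
class ToyAnnotationGraph:
    """Hypothetical stand-in for an annotation graph; not the IFC class."""
    annotations: List[Annotation] = field(default_factory=list)

    def add(self, start: Anchor, end: Anchor, layer: str, label: str) -> None:
        self.annotations.append(Annotation(start, end, layer, label))

    def labels(self, layer: str) -> List[str]:
        """Return the labels stored on one layer, in insertion order."""
        return [a.label for a in self.annotations if a.layer == layer]


def build_example() -> ToyAnnotationGraph:
    # Illustrative anchors and times only; not taken from a real experiment.
    a0, a1, a2 = Anchor("a0", 0.00), Anchor("a1", 0.42), Anchor("a2", 0.90)
    graph = ToyAnnotationGraph()
    graph.add(a0, a1, "word", "hello")
    graph.add(a1, a2, "word", "world")
    # A second layer of knowledge attached to the same spans.
    graph.add(a0, a1, "pos", "UH")
    graph.add(a1, a2, "pos", "NN")
    return graph
```

Calling build_example().labels("word") returns the word sequence, while labels("pos") returns the part-of-speech layer over the same anchors. An anchor created without an offset, such as Anchor("a3"), models a label that is not time-aligned.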

To continue with this tutorial, you will need access to a results file from a previous recognition experiment. For this example, let's use the results from the experiment in Section 4.2.5, results.out.

There are three steps required to score these results:

Conversion of recognition results: use the tool isip_extract_hypo to convert the recognition results, typically stored as an annotation graph in a text Sof file, to the NIST scoring format.
Conversion of reference transcriptions: once again, use the tool isip_extract_hypo to convert the reference transcriptions, typically stored as a transcription database, to the NIST scoring format.
Score: run our standard scoring script, isip_eval, which accepts the output of the previous two tools and runs the NIST scoring software to generate the scoring report.
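The scoring report itself comes from the NIST software invoked by isip_eval. As a rough illustration of the kind of comparison involved in scoring a hypothesis against a reference, the sketch below counts word-level errors with a standard edit-distance alignment. It is not the NIST implementation, it does not read the NIST scoring format, and the function names count_word_errors and word_error_rate are hypothetical.

```python
def count_word_errors(reference: str, hypothesis: str) -> int:
    """Minimum number of word substitutions, deletions and insertions
    needed to turn the reference transcription into the hypothesis."""
    ref = reference.split()
    hyp = hypothesis.split()

    # dist[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        dist[0][j] = j  # insert all j hypothesis words

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution_cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + substitution_cost,  # match or substitution
                dist[i - 1][j] + 1,                      # deletion
                dist[i][j - 1] + 1,                      # insertion
            )
    return dist[len(ref)][len(hyp)]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word errors divided by the number of reference words."""
    num_ref_words = len(reference.split())
    if num_ref_words == 0:
        raise ValueError("reference transcription is empty")
    return count_word_errors(reference, hypothesis) / num_ref_words
```

For example, scoring the hypothesis "the cat sit on mat" against the reference "the cat sat on the mat" yields two errors (one substitution and one deletion) over six reference words.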

We now describe these steps in detail and provide example output. Proceed to Section 4.3.4 to continue this scoring tutorial.
