4.3.5 Scoring: Evaluating the Scoring Report
Section 4.3.4: Converting Recognition Output

The scoring report produced in the previous section contains a detailed analysis of the comparison of the recognition results and the actual transcriptions of the speech data.

Open the file results.report. The first section is labeled SENTENCE RECOGNITION PERFORMANCE and lists statistics based on the individual utterances. The first statistic is the percentage of sentences that contain errors. That figure is then broken down: the percentages of sentences with substitutions, deletions, and insertions are each listed separately. Substitutions, deletions, and insertions are explained in Section 4.3.1.
SENTENCE RECOGNITION PERFORMANCE

 sentences                                         336
 with errors                             10.1%   (  34)

   with substitions                       0.6%   (   2)
   with deletions                         0.0%   (   0)
   with insertions                        9.5%   (  32)
The next section is labeled WORD RECOGNITION PERFORMANCE and provides a detailed analysis of the results at the word level. The first percentage in this section is the total number of word errors relative to the number of words in the reference transcription. In this report, the Percent Total Error is 3.6% and the Percent Correct is 99.8%. At first, these results seem to contradict each other, since they sum to more than 100%. However, several lines down, the report lists the percentage of each specific type of error. Substitutions and deletions count toward the total error and also lower the percent correct. Insertions count toward the total error but leave the percent correct unchanged, because an inserted word is an extra word in the hypothesis rather than a reference word that was missed. In other words, 99.8% of the words in the reference transcription were recognized correctly (1082 of 1084), but the hypothesis transcription also contains 37 extra words, which together with the 2 substitutions make up the 39 errors (3.6% of 1084). This is confirmed by the two lines beginning with Ref. words and Hyp. words, which show that there are 1121 words in the hypothesized transcription, but only 1084 words in the reference transcription, a difference of 37.
WORD RECOGNITION PERFORMANCE

Percent Total Error       =    3.6%   (  39)

Percent Correct           =   99.8%   (1082)

Percent Substitution      =    0.2%   (   2)
Percent Deletions         =    0.0%   (   0)
Percent Insertions        =    3.4%   (  37)
Percent Word Accuracy     =   96.4%


Ref. words                =           (1084)
Hyp. words                =           (1121)
Aligned words             =           (1121)
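The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration, not the scoring tool's own code: the function name word_metrics and its parameters are hypothetical, and the formulas are an assumption that happens to reproduce the numbers in the report above (every percentage is taken relative to the number of reference words).

```python
def word_metrics(ref_words, correct, substitutions, deletions, insertions):
    """Recompute word-level scores from raw counts.

    Assumption: every percentage is relative to the number of words in the
    reference transcription, which matches the figures in this report.
    """
    def pct(count):
        return 100.0 * count / ref_words

    total_errors = substitutions + deletions + insertions
    return {
        "total_error": pct(total_errors),
        "correct": pct(correct),
        "substitution": pct(substitutions),
        "deletion": pct(deletions),
        "insertion": pct(insertions),
        # Accuracy is what remains after all three error types are charged.
        "word_accuracy": 100.0 - pct(total_errors),
        # Each hypothesis word is a correct word, a substitution, or an insertion.
        "hyp_words": correct + substitutions + insertions,
    }


# Counts taken from the report above.
metrics = word_metrics(ref_words=1084, correct=1082,
                       substitutions=2, deletions=0, insertions=37)
for name, value in metrics.items():
    print(f"{name:14s} {value:8.1f}")
```

Because insertions add to the error total without removing anything from the count of correct words, Percent Total Error and Percent Correct can sum to more than 100%.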
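Next, the occurrences of each type of error are enumerated and the words associated with each error are listed. For example, for more detail on the insertion errors in this experiment, the report gives the following breakdown, which shows that five distinct words were inserted a total of 37 times and that 32 of those insertions were the word oh: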
INSERTIONS                       Total                 (5)
                                 With >=  1 occurances (5)

   1:   32  ->  oh
   2:    2  ->  eight
   3:    1  ->  one
   4:    1  ->  six
   5:    1  ->  two
     -------
        37
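If the per-word counts are needed in a script, a listing like the one above can be read back with a short parser. The sketch below is hypothetical: the function name parse_error_listing and the regular expression are assumptions based only on the layout of the excerpt shown here, so other reports may need a different pattern.

```python
import re

# Matches rows such as "   1:   32  ->  oh" (rank, count, arrow, word).
ROW = re.compile(r"^\s*\d+:\s+(\d+)\s+->\s+(\S+)\s*$")


def parse_error_listing(text):
    """Return a {word: count} dictionary for the rows of an error listing."""
    counts = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts


# The insertion listing from the report above.
listing = """
   1:   32  ->  oh
   2:    2  ->  eight
   3:    1  ->  one
   4:    1  ->  six
   5:    1  ->  two
     -------
        37
"""

insertions = parse_error_listing(listing)
print(insertions)
print("total insertions:", sum(insertions.values()))
```

The separator and total lines do not match the row pattern, so they are skipped, and the sum of the parsed counts can be checked against the total printed in the report.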
The next section is labeled DUMP OF SYSTEM ALIGNMENT STRUCTURE. First, the speakers are listed. The speaker name is derived from the utterance's filename: the characters preceding the underscore are used as the speaker name. For instance, the first file used in this experiment is called ah_111a, so its speaker name is ah. After the list of speakers, each speaker is listed again, followed by the sentences spoken by that speaker. The reference and hypothesized sentences are listed on consecutive lines for easy comparison. The line labeled "Scores:" lists the recognition results for the corresponding sentence. The symbol #C stands for the number of words correct, #S for the number of substitutions, #D for the number of deletions, and #I for the number of insertions.
		DUMP OF SYSTEM ALIGNMENT STRUCTURE

System name:   results.score

Speakers: 
    0:  ah
    1:  ar
    2:  at
    3:  bc
...

Speaker sentences   0:  ah   #utts: 18
id: (ah_111a)
Scores: (#C #S #D #I) 3 0 0 0
REF:  one one one 
HYP:  one one one 
Eval:             

id: (ah_1a)
Scores: (#C #S #D #I) 1 0 0 0
REF:  one 
HYP:  one 
Eval:     

...
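The naming convention and the "Scores:" line described above can be illustrated with a small Python sketch. The helper names speaker_from_id and parse_scores are hypothetical, and the pattern assumes the line layout shown in this excerpt; it is not taken from the scoring software.

```python
import re

# Matches lines such as "Scores: (#C #S #D #I) 3 0 0 0".
SCORES = re.compile(r"^Scores:\s*\(#C #S #D #I\)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)")


def speaker_from_id(utterance_id):
    """Return the characters before the first underscore, e.g. ah_111a -> ah."""
    return utterance_id.strip("()").split("_", 1)[0]


def parse_scores(line):
    """Return the four per-sentence counts from a "Scores:" line."""
    match = SCORES.match(line.strip())
    if match is None:
        raise ValueError(f"not a Scores line: {line!r}")
    correct, substitutions, deletions, insertions = map(int, match.groups())
    return {
        "correct": correct,
        "substitutions": substitutions,
        "deletions": deletions,
        "insertions": insertions,
    }


print(speaker_from_id("(ah_111a)"))
print(parse_scores("Scores: (#C #S #D #I) 3 0 0 0"))
```

Summing these per-sentence counts over all utterances of a speaker would give per-speaker totals of the kind summarized at the end of the report.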
The final section, labeled SYSTEM SUMMARY PERCENTAGES by SPEAKER, contains two tables that summarize the recognition results for each speaker. The first table gives the results as percentages, while the second gives them as raw counts. Overall, the scoring report provides a very detailed analysis of the recognition performance on a set of data.
   