Zeng Yu
Institute for Signal and Information Processing
Mississippi State University
Phone/Fax: 601-325-8335 Email: zeng@isip.msstate.edu
URL: /research/isip/resources/seminars/isip_weekly/1998/switchboard_review
Abstract:
The SWITCHBOARD (SWB) Corpus has become one of the most important benchmarks
for assessing improvements in large vocabulary conversational speech
recognition (LVCSR). The high error rates on SWB are largely attributable to an
acoustic model mismatch, the high frequency of poorly articulated
monosyllabic words, and large variations in pronunciations. An
improved quality of segmentations and transcriptions translates well
to improved acoustic modeling. The goal of our SWB resegmentation
project is to (1) resegment the data into utterances of approximately
10 seconds in duration using boundaries based on naturally occurring
silence and linguistic phrase boundaries, and to (2) correct the
transcriptions. A system trained on a subset of this data achieved
a 1.9% absolute reduction in word error rate. Equally exciting is the
fact that recognition error rates on monosyllabic words dropped from
49.6% to 47.7%, an absolute decrease of 1.9%. Since monosyllabic words
dominate the SWB corpus, this is a particularly significant result.
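The resegmentation goal described above (utterances of roughly 10 seconds, cut at naturally occurring silences) can be illustrated with a minimal sketch. This is not the project's actual procedure: the function name `resegment`, the greedy strategy, and the input format (a list of silence intervals in seconds) are assumptions made for illustration, and linguistic phrase boundaries, which the project also uses, are not modeled here.

```python
def resegment(silences, total_duration, target=10.0):
    """Greedy sketch: repeatedly cut at the silence midpoint that lies
    closest to `target` seconds after the previous cut.

    silences       -- hypothetical input: list of (start, end) pauses in seconds
    total_duration -- length of the conversation side in seconds
    """
    midpoints = sorted((s + e) / 2.0 for s, e in silences)
    cuts = []
    start = 0.0
    # Keep cutting while more than `target` seconds remain unsegmented.
    while total_duration - start > target:
        candidates = [m for m in midpoints if m > start]
        if not candidates:
            break  # no silence left to cut at
        cut = min(candidates, key=lambda m: abs(m - (start + target)))
        cuts.append(cut)
        start = cut
    bounds = [0.0] + cuts + [total_duration]
    return list(zip(bounds[:-1], bounds[1:]))


# Made-up silence intervals for a 31-second stretch of speech.
pauses = [(4.0, 4.5), (9.5, 10.0), (14.0, 14.5), (19.75, 20.25), (27.0, 27.5)]
segments = resegment(pauses, total_duration=31.0)
```

With these made-up pauses the sketch produces four utterances, the last one shorter than the target because it simply takes whatever speech remains after the final cut.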
In this seminar, we will focus on changes made to the transcription and segmentation process over the past three months to improve the quality and consistency of the data. Numerous challenging transcription examples will be presented and discussed.