SWITCHBOARD is a corpus of spontaneous conversations that addresses the growing need for large multi-speaker databases of telephone-bandwidth speech. The corpus contains 2430 conversations averaging 6 minutes in length: in total, over 240 hours of recorded speech and about 3 million words of text, spoken by over 500 speakers of both sexes from every major dialect of American English.
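
The totals quoted above can be checked with a little arithmetic. The sketch below is only a rough sanity check: the conversation count, average length, and word count are the figures from the description, which are themselves approximate ("averaging", "about"). The function name `corpus_estimates` and the derived per-conversation and per-minute rates are illustrative and are not part of the corpus documentation.

```python
# Figures quoted in the corpus description (approximate by the text's own wording).
NUM_CONVERSATIONS = 2430      # conversations in the corpus
AVG_MINUTES = 6               # average conversation length, in minutes
APPROX_WORDS = 3_000_000      # "about 3 million words"


def corpus_estimates(num_conversations, avg_minutes, approx_words):
    """Return (total hours, words per conversation, words per minute).

    Hypothetical helper for illustration; it only rearranges the
    quoted figures and adds no new data about the corpus.
    """
    total_minutes = num_conversations * avg_minutes
    total_hours = total_minutes / 60
    words_per_conversation = approx_words / num_conversations
    words_per_minute = approx_words / total_minutes
    return total_hours, words_per_conversation, words_per_minute


hours, words_per_conv, words_per_min = corpus_estimates(
    NUM_CONVERSATIONS, AVG_MINUTES, APPROX_WORDS
)
print(f"total speech: {hours:.0f} hours")
print(f"words per conversation: {words_per_conv:.0f}")
print(f"words per minute: {words_per_min:.0f}")
```

With these inputs the total comes to 243 hours, consistent with the "over 240 hours" stated above. The per-minute rate counts both sides of each conversation together, so it should not be read as a single speaker's speaking rate.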

Project Goals

Our plan for resegmenting the database consists of a six-step process:

We are using the following CDs in this project:

An overview of the transcription process is given below:

[Figure: work flow diagram]