JEIDA COMMON SPEECH DATA CORPUS

The Japan Electronic Industry Development Association's Common Speech Data (JCSD) Corpus is an isolated phrase corpus consisting of 150 speakers (75 male, 75 female) and almost 200,000 utterances. It represents an important milestone in Japanese speech recognition technology development. The JCSD Corpus was originally collected in Japan in 1986 in a nationwide project managed by Professor Shuichi Itahashi in coordination with the Japan Electronic Industry Development Association (JEIDA). Its importance to Japanese speech recognition is, to some extent, comparable to that of Texas Instruments' famous 46-word speaker-dependent corpus. The JCSD Corpus was one of the first industry-standard, freely available corpora for the study of Japanese speech recognition. Most of the competitive Japanese speech recognition systems developed in Japan have been benchmarked on various subsets of this corpus. Hence, it is one of the most important standards of comparison for Japanese language systems.
