Downloads:

(06/10/02) DARPA Communicator Recognition Server v2.0: This release adds n-best hypothesis list support, the ability to receive audio data in "GAL_BINARY" format, and the ability to compile on both SPARC and Pentium machines. To run this server, you must first install v5.11 or later of our prototype system. Detailed installation and verification instructions are available on-line.
(11/16/01) Multiple-CPU Eval Package (v1.0): This package allows users to easily run complete experiments on multiple CPUs. In addition to the parameters available in the single-CPU scripts, the user can specify which computers are to be used for training and testing. Please make sure that the ISIP prototype system (v5.11) is already installed before running this application. To install this package, follow the steps below. Detailed instructions are included in the release's AAREADME.text file.
- tar xzvf asr_tutorial_v1.0.tar.gz
- cd asr_wsj_tutorial_v1.0
- source <install directory for v5.11>ISIP_ENV.sh
- ./configure --prefix=.
- make
- make install
- source ISIP_WSJ_ENV.sh
- wsj_run -help
A typical command line for decoding will look like this:
Or you can easily run decoding using a specified list of raw files with our embedded Switchboard models (12 mixture word-internal triphone models, bigram language model):
The option "-cpus_test" tells the utility which computers to use for decoding. If a computer has multiple processors, repeat its name N times for N processors. To score the hypotheses successfully, the reference file data/decode/reference.score needs to be modified. Whether or not scoring succeeds, a hypothesis list is written to decode/baum_welch/wint_tri/bigram_decoding/hypothesis.score. The entries in this hypothesis list correspond to the entries in the raw file list.
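The rule of repeating a computer's name once per processor can be sketched in a few lines of Python. This is only an illustration: the host names and processor counts are hypothetical, and how the names are separated on the wsj_run command line is not shown on this page, so check "wsj_run -help" before using the result.

```python
def expand_cpu_list(processors_per_host):
    """Repeat each host name once per processor, keeping the given order.

    processors_per_host maps a host name to its number of processors.
    """
    names = []
    for host, count in processors_per_host.items():
        if count < 1:
            raise ValueError(f"{host}: processor count must be at least 1")
        names.extend([host] * count)
    return names


# Hypothetical machines: one dual-processor host and one single-processor host.
hosts = {"node01": 2, "node02": 1}
cpu_names = expand_cpu_list(hosts)

# Joining with spaces is an assumption about the expected argument format.
print(" ".join(cpu_names))
```

A dual-processor machine therefore appears twice in the list, so two decoding jobs can be assigned to it.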
(7/09/01) DARPA Communicator Recognition Server: This server provides speech recognition functionality in the DARPA Communicator client/server architecture. A simple demo is included that allows you to decode audio data and display the resulting hypothesis in a text window. The core recognition technology features a real-time Resource Management system. To run this server, you must first install v5.9 or later of our prototype system. Detailed installation and verification instructions are available on-line.
- (7/09/01) DARPA Communicator Recognition Server v1.0 (x86 version)
- (7/24/01) DARPA Communicator Recognition Server v1.0 (SPARC version)
- (7/25/01) DARPA Communicator Recognition Server v1.1 (SPARC version): We added a printout server to the demo. The recognizer sends the hypothesis text to the hub, which forwards it to the printout server.
(05/15/01) Real-Time Resource Management System: This system is derived from our baseline system, but uses a 15 ms frame duration and achieves a WER of 5.0%. It runs at 1.1 xRT on a 600 MHz Pentium processor. It is described in more detail in a technical report summarizing our progress in the first semester of this project.
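For readers unfamiliar with the xRT figure: it is conventionally read as "times real time", i.e. processing time divided by audio duration, so 1.0 xRT means decoding takes exactly as long as the audio lasts. The sketch below assumes that conventional definition; the function names and the 10-second utterance are illustrative and not part of the release.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """Return processing time divided by audio duration (the xRT value)."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


def processing_time(xrt, audio_seconds):
    """Estimate how long decoding takes for a given xRT and audio duration."""
    return xrt * audio_seconds


# Illustrative only: a 10-second utterance decoded at the quoted 1.1 xRT
# takes roughly 11 seconds of processing time.
estimate = processing_time(1.1, 10.0)
print(f"{estimate:.1f} seconds")
```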
(05/10/01) State-Of-The-Art Baseline Resource Management System: The DARPA Resource Management (RM) task has a vocabulary on the order of 1000 words and a perplexity less than 60. This system uses our standard MFCC acoustic front end with 10 ms frames. The final models are cross-word triphone models with six Gaussian mixture components per state. This system achieves a WER of 3.4% running at 8.3 xRT on a 600 MHz Pentium processor. More details on our baseline experiments can be found in a technical report describing our baseline system.
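The WER (word error rate) figures quoted above are conventionally computed as the number of word substitutions, deletions, and insertions needed to turn the hypothesis into the reference, divided by the number of reference words. The following is a minimal sketch of that standard definition using a word-level edit distance. It is not the scoring tool used for these experiments, which may normalize text differently, and the example sentences are made up.

```python
def word_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one deleted word ("all") and one substitution
# ("gulf" -> "golf") against a six-word reference.
wer = word_error_rate("show all ships in the gulf", "show ships in the golf")
print(f"WER = {100 * wer:.1f}%")
```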