Downloads:
  • (06/10/02) DARPA Communicator Recognition Server v2.0: This release adds several enhancements: n-best hypothesis list support, the ability to receive audio data in "GAL_BINARY" format, and the ability to compile on both Sparc and Pentium machines. To run this server, you must first install v5.11 or later of our prototype system. Detailed installation and verification instructions are available on-line.

  • (11/16/01) Multiple-CPU Eval Package (v1.0): This package allows users to easily run complete experiments using multiple CPUs. In addition to the parameters available in the single-CPU scripts, the user can specify which computers are to be used for training and testing. Be sure that you have already installed the ISIP prototype system (v5.11) before running this application. To install this package, follow the steps below. More detailed instructions are included in the release's AAREADME.text file.

    • tar xzvf asr_tutorial_v1.0.tar.gz
    • cd asr_wsj_tutorial_v1.0
    • source <install directory for v5.11>/ISIP_ENV.sh
    • ./configure --prefix=.
    • make
    • make install
    • source ISIP_WSJ_ENV.sh
    • wsj_run -help

    A typical command line for decoding will look like this:

      wsj_run -num_features 39 -feature_format isip_proto -test_state_beam_pruning 250 -test_model_beam_pruning 200 -test_word_beam_pruning 200 -test_raw_list ./swb_raw.list -models_path data/final_models -cpus_test isip210 isip211 isip212 isip212

    Alternatively, you can decode a specified list of raw files with our embedded Switchboard models (12-mixture word-internal triphone models and a bigram language model):

      wsj_run -test_raw_list swb_raw.list -cpus_test isip210 isip211 isip212 isip212

    The "-cpus_test" option tells the utility which computers to use for decoding. If a computer has multiple processors, repeat its name N times to use N processors, as is done with isip212 in the examples above. To score the hypotheses successfully, you must modify the reference file data/decode/reference.score. Whether or not scoring succeeds, a hypothesis list is written to decode/baum_welch/wint_tri/bigram_decoding/hypothesis.score. The entries in this hypothesis list correspond to the entries in the raw file list.
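
    The repeat-the-name convention for "-cpus_test" can be tedious to type by hand for machines with many processors. The sketch below shows one possible way to generate the argument list. The helper expand_cpus and its "host:count" notation are hypothetical and are not part of the release; the script only prints the resulting command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper (NOT part of the ISIP release): expand "host:count"
# pairs into the repeated host names that -cpus_test expects.
expand_cpus() {
    out=""
    for spec in "$@"; do
        host=${spec%%:*}               # text before the first colon
        count=1                        # default: one processor per host
        case $spec in
            *:*) count=${spec#*:} ;;   # text after the first colon
        esac
        i=0
        while [ "$i" -lt "$count" ]; do
            out="$out${out:+ }$host"   # append, space-separated
            i=$((i + 1))
        done
    done
    printf '%s\n' "$out"
}

# Two single-processor machines and one dual-processor machine,
# using the host names from the examples above.
cpus=$(expand_cpus isip210 isip211 isip212:2)
echo "wsj_run -test_raw_list swb_raw.list -cpus_test $cpus"
```

    A host given without a count is listed once, so plain host names can be mixed freely with "host:count" entries.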

  • (7/09/01) DARPA Communicator Recognition Server: This server provides speech recognition functionality in the DARPA Communicator client/server architecture. A simple demo is included that allows you to decode audio data and display the resulting hypothesis in a text window. The core recognition technology features a real-time Resource Management system. To run this server, you must first install v5.9 or later of our prototype system. Detailed installation and verification instructions are available on-line.


  • (05/15/01) Real-Time Resource Management System: This system is derived from our baseline system, but uses a 15 ms frame duration and achieves a WER of 5.0%. It runs at 1.1 xRT on a 600 MHz Pentium processor. It is described in more detail in a technical report summarizing our progress in the first semester of this project.

  • (05/10/01) State-Of-The-Art Baseline Resource Management System: The DARPA Resource Management (RM) task has a vocabulary on the order of 1000 words and a perplexity of less than 60. This system uses our standard MFCC acoustic front end with 10 ms frames. The final models are Gaussian crossword triphone models that use six mixtures per state. This system achieves a WER of 3.4% running at 8.3 xRT on a 600 MHz Pentium processor. More details on our baseline experiments can be found in a technical report describing our baseline system.
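
    To put the xRT figures for the two Resource Management systems in perspective: a real-time factor is processing time divided by audio duration, so the expected decoding time is the audio duration multiplied by the xRT value. The sketch below applies that arithmetic to the two reported figures. The helper decode_minutes and the 60-minute audio duration are illustrative assumptions, and the estimates only apply to the 600 MHz Pentium processor quoted above.

```shell
#!/bin/sh
# Estimate wall-clock decoding time from a real-time factor (xRT).
# xRT = processing time / audio duration, so time = duration * xRT.
# decode_minutes is a hypothetical helper, not part of any ISIP release.
decode_minutes() {
    # $1: audio duration in minutes, $2: real-time factor (xRT)
    awk -v dur="$1" -v xrt="$2" 'BEGIN { printf "%.1f\n", dur * xrt }'
}

# The 60-minute audio duration is an assumption chosen for illustration.
echo "baseline system  (8.3 xRT, WER 3.4%): $(decode_minutes 60 8.3) min"
echo "real-time system (1.1 xRT, WER 5.0%): $(decode_minutes 60 1.1) min"
```

    The comparison shows the trade-off between the two releases: the real-time system gives up some accuracy in exchange for a much shorter decoding time.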