ASR_VA_TUTORIAL_V1.0: SYSTEM OVERVIEW
-------------------------------------

This file contains a brief synopsis of each step required to train and
evaluate a single-mixture context-independent (monophone) system.

This system was trained on 8 speakers, defined in:

  train_953_v1.0.list

and evaluated on:

  devtest_255_v1.0.list

using a sentence-pattern grammar. It achieved a word error rate of
3.4%, documented in:

  hypothesis.report

which is generated at the end of this recipe.

Below is a description of each step required to develop this system
starting from unseeded models.

===============================================================================

A. FILE SYSTEM OVERVIEW

The first step is to unpack the appropriate tar files into a
workspace. Let's assume these have been installed into a directory
called asr_va_tutorial_v1.0. Set your current working directory to
this place:

  cd /home/users/.../asr_va_tutorial_v1.0

All pathnames described below are relative to this location. At this
location, once you build the software, you will find these files and
directories:

  AAREADME.text        general release information
  GNUmakefile          make file
  GNUmakefile.in       source make file (ignore)
  ISIP_WSJ_ENV.sh      environment variables
  ISIP_WSJ_ENV.sh.in   source env variables (ignore)
  class                C++ object code
  config.cache         configure files (ignore)
  config.guess         configure files (ignore)
  config.log           configure files (ignore)
  config.status        configure files (ignore)
  config.sub           configure files (ignore)
  configure            configure files (ignore)
  data                 recognition-related configuration files
  install-sh           configure files (ignore)
  scripts              general purpose scripts (see the appendix)
  util                 source code for recognition driver scripts

In the current shell, you must source ISIP_WSJ_ENV.sh using the bash
shell:

  source ISIP_WSJ_ENV.sh

You can now proceed to training. For the remainder of this tutorial,
we will also assume you are running on one CPU with the name
"isip105". At several points in this tutorial you will need to supply
this machine name as part of the command line arguments.

===============================================================================

B. TRAINING OVERVIEW

We need to create a workspace to run experiments. For this tutorial,
let's assume this space exists in a directory called exp:

  mkdir exp; mkdir exp/exp_001; cd exp/exp_001;

All the paths and directories for training will be created
automatically relative to the "exp_001" directory. Training can be
run using this command:

  wsj_run -train_mfc_list ./train_953_v1.0_mfc.list \
          -cpus_train isip105 | tee train.log

This will run training to its completion and create the necessary
output models. Note that the filename lists must correctly identify
the location of the data ON YOUR MACHINE, and the pathname to the
file list ("./" in this case) will need to vary depending on where
the list is located on your machine.

The result of the above command is a set of acoustic models. These
are located at:

  train/baum_welch/monophone/mixture/final_models

Below, we will describe all steps leading to this result. The log
file, train.log, will contain a step-by-step status report of
training as it progresses.
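Before launching wsj_run, it can save time to confirm that every path
in the feature list resolves on your machine. Below is a minimal
Python sketch of such a check (the script name is hypothetical;
wsj_run performs an equivalent check during initialization, described
in step 2 of B.1 below):

  # check_list.py: verify that every feature file in a list exists.
  import os
  import sys

  def check_list(list_file):
      missing = [path for path in (line.strip() for line in open(list_file))
                 if path and not os.path.exists(path)]
      for path in missing:
          print("missing:", path)
      return len(missing)

  if __name__ == "__main__":
      # e.g., python check_list.py ./train_953_v1.0_mfc.list
      sys.exit(1 if check_list(sys.argv[1]) else 0)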
B.1 INITIALIZATION

 1) Checking for input data directory and files: checks whether all
    needed data exists
    (/home/users/.../asr_va_tutorial_v1.0/data).

 2) All training mfc files exist: checks whether all the input
    feature files specified on the command line exist.

 3) Creating local data directories:

      - data_generation
      - train
      - isip105

 4) Creating transcriptions:

      - Create monophone transcriptions without "sp" between words
        corresponding to the input feature file list:
          data_generation/transcriptions/mono_trans_no_sp.text
      - Create monophone transcriptions with "sp" between words
        corresponding to the input feature file list:
          data_generation/transcriptions/mono_trans_with_sp.text
      - Create word transcriptions with "sp" between words
        corresponding to the input feature file list:
          data_generation/transcriptions/word_transcription.text

 5) Creating training lists:

      data_generation/lists/train_mfc.list
      data_generation/lists/aligned_output.list

 6) Creating training lists for each CPU by dividing the list N ways
    and moving the pieces to the corresponding CPU directories:

      isip105/data_generation/lists/train_mfc.list
      isip105/data_generation/lists/aligned_output.list

 7) Creating transcriptions for each CPU by dividing them N ways and
    moving the pieces to the corresponding CPU directories:

      - Create monophone transcriptions without "sp" between words:
          isip105/data_generation/transcriptions/mono_trans_no_sp.text
      - Create monophone transcriptions with "sp" between words:
          isip105/data_generation/transcriptions/mono_trans_with_sp.text

 8) Checking the endianness of your system: checks the architecture
    of the machine and prints it on stdout. This is important for
    debugging big-endian versus little-endian feature formats.

B.2 FLAT-START

 1) Building initial base monophone models:

    a) Create the monophone models:

       COMMAND: /isip/tools/proto/bin/scripts/create_models -states \
                train/baum_welch/monophone/base/r00/hmm0/model_lengths.text \
                -output train/baum_welch/monophone/base/r00/hmm0/ \
                models.text

       For additional information type "create_models -help".

    b) Create the phone mapping file:

       COMMAND: /isip/tools/proto/bin/scripts/create_triphone_map -mono \
                train/baum_welch/monophone/base/r00/hmm0/monophones.text \
                -clist train/baum_welch/monophone/base/r00/hmm0/ \
                monophones.text -context ci -models train/baum_welch/ \
                monophone/base/r00/hmm0/models.text -output train/ \
                baum_welch/monophone/base/r00/hmm0/phones.text

       For additional information type "create_triphone_map -help".

    c) Initialize the monophone models:

       COMMAND: rsh isip105 /isip/tools/proto/bin/i386-pc-solaris2.7/ \
                init_hmm \
                -input train/data_generation/lists/train_mfc.list \
                -models train/baum_welch/monophone/base/r00/hmm0/ \
                model_lengths.text -trans train/baum_welch/monophone/base/ \
                r00/hmm0/transitions.text -state train/baum_welch/monophone/ \
                base/r00/hmm0/states.text -mode binary -vfloor_file train/ \
                baum_welch/monophone/base/r00/hmm0/vfloor.text \
                -var_floor 0.0002 -num_features 39

       For additional information type "init_hmm -help".

    d) Convert the states from text to binary format:

       COMMAND: rsh isip105 /isip/tools/proto/bin/i386-pc-solaris2.7/ \
                convert_mmf -input_mode ascii -output_mode binary train/ \
                baum_welch/monophone/base/r00/hmm0/states.text train/ \
                baum_welch/monophone/base/r00/hmm0/states.bin

       For additional information type "convert_mmf -help".
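The "flat start" performed by init_hmm above seeds every state with
the global statistics of the training features and floors the
variances (the -var_floor 0.0002 argument) so that no dimension can
collapse during re-estimation. A minimal Python sketch of the idea,
assuming 39-dimensional features loaded into a numpy array (the
function and data below are illustrative, not the tool's actual
implementation):

  import numpy as np

  def flat_start(frames, var_floor=0.0002):
      # frames: (num_frames, 39) array of feature vectors
      mean = frames.mean(axis=0)
      var = np.maximum(frames.var(axis=0), var_floor)  # apply the floor
      return mean, var

  # random data standing in for real features
  mean, var = flat_start(np.random.randn(1000, 39))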
B.3 LONG SILENCE MODEL TRAINING (SIL TRAINING)

 1) Four passes of single-mixture training without the sp model:

    a) Pass 1:

       - Generating accumulators on each of the CPUs:

         COMMAND: rsh isip105 isip105_0/train/baum_welch/ \
                  monophone/base/r00/hmm0/run_bw_train.sh &

         Look into the shell script "run_bw_train.sh" to observe the
         inputs and the outputs.

       - Combining the accumulators generated in the previous step:

         COMMAND: rsh isip105 /isip/tools/proto/bin/i386-pc-solaris2.7/ \
                  bw_train -p train/baum_welch/monophone/base/r00/hmm0/ \
                  params.text

         Look into "params.text" to observe the inputs and the
         outputs. For additional information type "bw_train -help".

       ...
       ...

    d) Pass 4: (same steps as above)

B.4 SHORT SILENCE TRAINING (SP TRAINING)

 1) Initializations for sp model training:

    a) Tying the sp model to the central state of the silence model
       and adding skip states to the silence model. This is done
       through the subroutine "add_sp_and_sil_trans".

       Inputs:
         train/baum_welch/monophone/base/r00/hmm0/models.text
         train/baum_welch/monophone/base/r00/hmm4/transitions.text

       Outputs:
         train/baum_welch/monophone/base/r01/hmm0/models.text
         train/baum_welch/monophone/base/r01/hmm0/transitions.text

    b) Create the phone mapping file for the new models:

       COMMAND: /isip/tools/proto/bin/scripts/create_triphone_map -mono \
                train/baum_welch/monophone/base/r00/hmm0/monophones.text \
                -clist train/baum_welch/monophone/base/r00/hmm0/ \
                monophones.text -context ci -models train/baum_welch/ \
                monophone/base/r01/hmm0/models.text -output train/ \
                baum_welch/monophone/base/r01/hmm0/phones.text

       For additional information type "create_triphone_map -help".

 2) Four passes of single-mixture training with the sp model:

    a) Pass 1:

       - Generating accumulators on each of the CPUs:

         COMMAND: rsh isip105 isip105_0/train/baum_welch/monophone/ \
                  base/r01/hmm0/run_bw_train.sh &

         Look into the shell script "run_bw_train.sh" to observe the
         inputs and the outputs.

       - Combining the accumulators generated in the previous step:

         COMMAND: rsh isip105 /isip/tools/proto/bin/i386-pc-solaris2.7/ \
                  bw_train -p train/baum_welch/monophone/base/r01/hmm0/ \
                  params.text

         Look into "params.text" to observe the inputs and the
         outputs. For additional information type "bw_train -help".

       ...
       ...

    d) Pass 4: (same steps as above)

B.5 FORCED ALIGNMENT

 1) Create the phonetic transcriptions from the word transcriptions,
    the lexicon, and the trained monophone models:

       COMMAND: rsh isip105 /isip/tools/proto/bin/i386-pc-solaris2.7/ \
                trace_projector -p train/baum_welch/monophone/base/ \
                alignments/params.text

       Look into "params.text" to observe the inputs and the outputs.
       For additional information type "trace_projector -help".

 2) Divide these transcriptions N ways according to the number of
    CPUs and move them to the corresponding CPU directories. Also
    move the corresponding feature file lists (see the sketch after
    this section):

      isip105_0/train/baum_welch/monophone/base/transcriptions/
        aligned_trans.text
      isip105_0/train/baum_welch/monophone/base/transcriptions/
        aligned_mfcc.text
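The "divide N ways" steps above (and in B.1 and C.1) all follow the
same pattern: deal the lines of a list out round-robin, one piece per
CPU directory. A minimal Python sketch of that pattern (the driver
scripts do this in perl; the file and directory names here are
illustrative):

  import os

  def split_n_ways(list_file, cpu_dirs):
      lines = [line for line in open(list_file) if line.strip()]
      for i, cpu in enumerate(cpu_dirs):
          piece = lines[i::len(cpu_dirs)]          # round-robin split
          out_path = os.path.join(cpu, list_file)
          os.makedirs(os.path.dirname(out_path), exist_ok=True)
          with open(out_path, "w") as f:
              f.writelines(piece)

  # with one CPU, the "split" is simply a copy into isip105_0/
  split_n_ways("data_generation/lists/train_mfc.list", ["isip105_0"])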
B.6 MONOPHONE TRAINING

 1) Five passes of single-mixture monophone training:

    a) Pass 1:

       - Generating accumulators on each of the CPUs:

         COMMAND: rsh isip105 isip105_0/train/baum_welch/monophone/ \
                  base/r01/hmm4/run_bw_train.sh &

         Look into the shell script "run_bw_train.sh" to observe the
         inputs and the outputs.

       - Combining the accumulators generated in the previous step:

         COMMAND: rsh isip105 /isip/tools/proto/bin/i386-pc-solaris2.7/ \
                  bw_train -p train/baum_welch/monophone/base/r01/hmm4/ \
                  params.text

         Look into "params.text" to observe the inputs and the
         outputs. For additional information type "bw_train -help".

       ...
       ...

    e) Pass 5: (same steps as above)

 2) Copy the final models to
    train/baum_welch/monophone/mixture/final_models:

      train/baum_welch/monophone/mixture/final_models/lexicon.text
      train/baum_welch/monophone/mixture/final_models/models.text
      train/baum_welch/monophone/mixture/final_models/monophones.text
      train/baum_welch/monophone/mixture/final_models/phones.text
      train/baum_welch/monophone/mixture/final_models/states.bin
      train/baum_welch/monophone/mixture/final_models/transitions.text
      train/baum_welch/monophone/mixture/final_models/vfloor.text

===============================================================================

C. EVALUATION OVERVIEW

Evaluation can be run from the same workspace as training. Let's
assume we are still working from exp_001. All the paths and
directories for decoding will be created automatically relative to
the "exp_001" directory. Decoding can be run using this command:

  wsj_run -test_mfc_list ./devtest_255_v1.0_mfc.list \
          -cpus_test isip105 -models_path ./ | tee decode.log

This will run decoding to its completion and create the necessary
output hypotheses. Note that the filename lists must correctly
identify the location of the data ON YOUR MACHINE, and the pathname
to the file list might vary depending on your installation.

The result of the above command is a set of hypotheses and the report
file. These are located at:

  decode/baum_welch/monophone/mixture/grammar_decoding/output/
  decode/baum_welch/monophone/mixture/grammar_decoding/hypothesis.report

Below, we will describe all steps leading to this result. The log
file, decode.log, will contain a step-by-step status report of
decoding as it progresses.

C.1 INITIALIZATION

 1) Checking for input data directory and files: checks whether all
    needed data exists
    (/home/users/.../asr_va_tutorial_v1.0/data).

 2) All testing mfc files exist: checks whether all the input feature
    files specified on the command line exist.

 3) Creating local data directories:

      - data_generation
      - decode
      - isip105

 4) Creating testing lists:

      data_generation/lists/test_mfc.list

 5) Creating testing lists for each CPU by dividing the list N ways
    and moving the pieces to the corresponding CPU directories:

      isip105/data_generation/lists/test_mfc.list

 6) Creating transcriptions for each CPU by dividing them N ways and
    moving the pieces to the corresponding CPU directories:

      - Create monophone transcriptions without "sp" between words:
          isip105/data_generation/transcriptions/mono_trans_no_sp.text
      - Create monophone transcriptions with "sp" between words:
          isip105/data_generation/transcriptions/mono_trans_with_sp.text

 7) Checking the endianness of your system: checks the architecture
    of the machine and prints it on stdout. This is important for
    debugging big-endian versus little-endian feature formats.
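The endianness check in step 7 matters because the binary feature and
model files are byte-order dependent. The tutorial's own check is the
C++ program util/check_endian/check_endian.cc; an equivalent check
can be sketched in a few lines of Python:

  import struct
  import sys

  print("native byte order:", sys.byteorder)
  # the integer 1, packed natively: 00000001 on big-endian machines,
  # 01000000 on little-endian machines
  print("int 1, native packing:", struct.pack("=i", 1).hex())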
C.2 GRAMMAR INITIALIZATION

 1) Building the lattice from the grammar file:

       COMMAND: rsh isip105 /isip/tools/releases/proto/isip_proto_v5_12_t00/ \
                bin/i386-pc-solaris2.7/grammar_compiler \
                -input asr_va_tutorial_v1.0/data/decode/grammar.text \
                -output decode/baum_welch/monophone/ \
                mixture/grammar_decoding/grammar.lat

       For additional information type "grammar_compiler -help".

 2) Building the input lattice list:

      decode/baum_welch/monophone/mixture/grammar_decoding/lists/
        input_lattice.list

 3) Dividing the lattice list N ways according to the number of CPUs
    and moving the pieces to the corresponding directories:

      isip105_0/decode/baum_welch/monophone/mixture/
        grammar_decoding/lists/input_lattice.list

 4) Building the output hypotheses list:

      decode/baum_welch/monophone/mixture/grammar_decoding/lists/
        output.list

 5) Dividing the output list N ways according to the number of CPUs
    and moving the pieces to the corresponding directories:

      isip105_0/decode/baum_welch/monophone/mixture/
        grammar_decoding/lists/output.list

C.3 NETWORK DECODING

Network decoding is accomplished by running the recognizer in a mode
known as "lattice rescoring":

       COMMAND: rsh isip105 isip105_0/decode/baum_welch/monophone/ \
                mixture/grammar_decoding/run_trace_projector.sh &

Look into the shell script "run_trace_projector.sh" to observe the
inputs and the outputs.

C.4 SCORING

       COMMAND: /isip/tools/releases/proto/isip_proto_v5_12_t00/bin/scripts/ \
                isip_eval isip_model decode/baum_welch/monophone/mixture/ \
                grammar_decoding/lists/output.list \
                asr_va_tutorial_v1.0/data/decode/reference.score \
                decode/baum_welch/monophone/mixture/ \
                grammar_decoding/hypothesis

The output report is at:

  decode/baum_welch/monophone/mixture/grammar_decoding/hypothesis.report

===============================================================================

D. APPENDIX

Set your current working directory to this place:

  cd /home/users/.../asr_va_tutorial_v1.0

This appendix provides an overview of the source code and the various
files needed for training the monophone models and for network
grammar decoding. All these files correspond to the Creare Phase 1
data.

D.1 COMMAND LINE PARSING: scripts/perl/command_line/command_line.pm

All utilities use the same command line interface, written in perl.
The perl code is located in the module command_line.pm.

D.2 SUBROUTINES: scripts/perl/wsj_subroutines/wsj_subs.pm.in

All the utilities use the subroutines from this file, written in
perl. The perl code is located in the module wsj_subs.pm.in.

D.3 UTILITIES: util/wsj_scripts/wsj_*.pm.in; util/check_endian/check_endian.cc

All the driver utilities are written in perl. The C++ code to check
the endianness of the native architecture is located in the module
check_endian.cc.

D.4 CONFIGURATION: data

An overview of the files needed to train the monophone models from
scratch and to decode is provided in this section.

A) Training setup: data/train

   i) Monophone training setup: data/train/monophone

      a) Single mixture base: data/train/monophone/base

         - Monophones listing:
             data/train/monophone/base/monophones.text
         - Monophones topology:
             data/train/monophone/base/model_lengths.text
         - Special models:
             data/train/monophone/base/special_models.text
         - Monophone transcriptions without "sp" between words
           corresponding to the entire database:
             data/train/monophone/base/all_mono_trans_no_sp.text
         - Monophone transcriptions with "sp" between words
           corresponding to the entire database:
             data/train/monophone/base/all_mono_trans_with_sp.text
         - Word transcriptions corresponding to the entire database:
             data/train/monophone/base/all_word_transcription.text
         - Lexicon:
             data/train/monophone/base/lexicon.text

      b) Multiple mixture: data/train/monophone/mixture

         - Special models:
             data/train/monophone/mixture/special_models.text

B) Decoding setup: data/decode

   - Lexicon:
       data/decode/decode_lexicon.text
   - Grammar:
       data/decode/grammar.text
   - Reference transcriptions corresponding to the decoding database:
       data/decode/reference.text

===============================================================================
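A closing note on the reported numbers: the word error rate quoted in
the overview (3.4%) is the standard edit-distance measure, i.e.
(substitutions + deletions + insertions) divided by the number of
reference words. A minimal Python sketch of the computation
(illustrative only; isip_eval is the actual scorer):

  def wer(ref, hyp):
      r, h = ref.split(), hyp.split()
      d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
      for i in range(len(r) + 1):
          d[i][0] = i                              # deletions
      for j in range(len(h) + 1):
          d[0][j] = j                              # insertions
      for i in range(1, len(r) + 1):
          for j in range(1, len(h) + 1):
              sub = 0 if r[i - 1] == h[j - 1] else 1
              d[i][j] = min(d[i - 1][j] + 1,       # deletion
                            d[i][j - 1] + 1,       # insertion
                            d[i - 1][j - 1] + sub) # substitution
      return d[len(r)][len(h)] / float(len(r))

  print(wer("call the main office", "call a main office"))  # 0.25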