Front-End Presentation Outline

* fe_main_00: Abstract - Title 

* Introduction (no foil)

  Speech Recognition is just a pattern matching problem. 

** fe_intro_00: Draw a picture of two waveforms for the same word, 
   caption "is this the same waveform?" ... "BAD QUESTION"

  The front end translates acoustic data into observation vectors that
  can be scored in a probability space.

** fe_intro_01: Draw another picture of two observation vectors -> 
   distance equation -> probability -> a hard number. ... "BETTER QUESTION"
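
  A minimal sketch of the "distance equation -> probability" step,
  assuming a diagonal-covariance Gaussian model for each class. The
  function names and the choice of model are illustrative assumptions,
  not the project's actual scoring code.

```python
import math

def squared_distance(x, y, variances):
    # Variance-weighted squared distance between two observation vectors.
    return sum((a - b) ** 2 / v for a, b, v in zip(x, y, variances))

def gaussian_log_likelihood(x, mean, variances):
    # Log of a diagonal-covariance Gaussian density evaluated at x.
    # The distance term is the only part that depends on x.
    d2 = squared_distance(x, mean, variances)
    log_norm = sum(math.log(2.0 * math.pi * v) for v in variances)
    return -0.5 * (d2 + log_norm)
```

  A vector closer to the model mean gets a higher log likelihood, which
  is the "hard number" the foil refers to.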

** fe_intro_02: Project objectives

  touch on the public domain implementation of all state-of-the-art
  front-end algorithms

  interface to the ISIP decoder (maybe not)

  expandable structure, tutorial-style code, good documentation, demo
  as a teaching tool

* fe_algo_00: System Overview

  nifty block diagram

** fe_algo_01: Filter Banks

  average spectral magnitude within the filter channel    

  draw the spectrum -> filter bank histogram looking thing.
  
  have filter banks spaced on the mel scale in the plot; say that this
  more closely models human auditory perception (roughly logarithmic
  above 800 Hz)
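
  A rough sketch of mel-spaced filter bank amplitudes, computed as the
  average spectral magnitude within each channel. The mel formula below
  (2595 * log10(1 + f/700)) is one common variant and may not be the one
  used in this project; the rectangular channels and all function names
  are assumptions for illustration.

```python
import math

def hz_to_mel(f_hz):
    # One widely used mel-scale approximation (an assumption here).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_edges(n_channels, f_low, f_high):
    # Channel edges in Hz, equally spaced on the mel scale.
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / n_channels
    return [mel_to_hz(lo + i * step) for i in range(n_channels + 1)]

def filter_bank_amplitudes(magnitudes, sample_rate, n_channels):
    # magnitudes: magnitude spectrum for bins 0..N/2 (at least 2 bins).
    nyquist = sample_rate / 2.0
    bin_hz = nyquist / (len(magnitudes) - 1)
    edges = mel_band_edges(n_channels, 0.0, nyquist)
    amplitudes = []
    for ch in range(n_channels):
        lo, hi = edges[ch], edges[ch + 1]
        # Half-open channel [lo, hi): average the bins that fall inside.
        vals = [m for k, m in enumerate(magnitudes) if lo <= k * bin_hz < hi]
        amplitudes.append(sum(vals) / len(vals) if vals else 0.0)
    return amplitudes
```

  Channels get wider in Hz as frequency increases, which gives the
  "histogram looking thing" its uneven bar widths.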

** fe_algo_02: cepstral
  
  minimum phase representation, model more robust to noise.

  Liftering (maybe)

  show equation. Say these are the state-of-the-art in
  most modern systems.

  Try to find two utterances with the same content but varying noise
  and show cepstrum vs. filter bank amplitudes (fba) for both.
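
  One common way to get cepstral coefficients is a cosine transform of
  the log filter bank amplitudes. The sketch below assumes that form,
  without any scaling factor or liftering; the equation on the actual
  foil may differ, and the function name and the 1e-10 floor are
  illustrative choices.

```python
import math

def cepstrum_from_filter_bank(fba, n_ceps):
    # c[n] = sum_k log(fba[k]) * cos(pi * n * (k + 0.5) / K), k = 0..K-1
    n_channels = len(fba)
    # Floor the amplitudes so log() never sees zero (assumed floor value).
    log_amp = [math.log(max(a, 1e-10)) for a in fba]
    return [
        sum(log_amp[k] * math.cos(math.pi * n * (k + 0.5) / n_channels)
            for k in range(n_channels))
        for n in range(n_ceps)
    ]
```

  For a flat filter bank output, all coefficients above c[0] come out
  at (numerically) zero, since a flat log spectrum has no shape to
  encode.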

** fe_algo_03: LP vs. FFT

  LP spectrum. Show spectrum with different LP orders vs. FFT
  spectrum. LP is faster to compute than the FFT, but speed is less of
  an issue now.

  problem with LP model is that it approximates all frequencies
  equally, inconsistent with human perception
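
  A sketch of how an LP spectrum could be computed for the plot,
  assuming the autocorrelation method with the Levinson-Durbin
  recursion. Whether the project uses this method is an assumption, and
  the function names are illustrative. The sign convention is
  A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p.

```python
import math

def autocorrelation(x, max_lag):
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    # Returns (a, err): predictor polynomial with a[0] = 1 and the
    # residual prediction error energy. Assumes r[0] > 0.
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err

def lp_power_spectrum(a, err, n_points):
    # Evaluates err / |A(e^{jw})|^2 at n_points (>= 2) from 0 to pi.
    spectrum = []
    for m in range(n_points):
        w = math.pi * m / (n_points - 1)
        re = sum(a[k] * math.cos(w * k) for k in range(len(a)))
        im = -sum(a[k] * math.sin(w * k) for k in range(len(a)))
        spectrum.append(err / (re * re + im * im))
    return spectrum
```

  Raising the order adds poles, so the LP spectrum follows more of the
  peaks in the FFT spectrum; that is the comparison the foil should
  show.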

** fe_algo_04: PLP
   
   picture of the human vocal tract & ear (better resolution, of
   course).

   Newer method that attempts to solve LP's biggest problem by warping
   the spectrum non-linearly before LP analysis, to more closely match
   human perception
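
   As PLP is usually described, the non-linear step is a warp of the
   frequency axis onto the Bark scale, followed by cube-root amplitude
   compression, before the LP model is fit. The sketch below shows only
   those two pieces; the function names are illustrative, and the
   equal-loudness weighting and critical-band smoothing are left out.

```python
import math

def hz_to_bark(f_hz):
    # Bark warping in the form commonly used for PLP: 6 * asinh(f / 600).
    return 6.0 * math.asinh(f_hz / 600.0)

def bark_to_hz(bark):
    return 600.0 * math.sinh(bark / 6.0)

def cube_root_compression(power_spectrum):
    # Intensity-to-loudness power law; expects non-negative values.
    return [p ** (1.0 / 3.0) for p in power_spectrum]
```

   Equal steps in Bark cover more Hz at high frequencies, so the LP
   model spends less of its resolution there.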

** fe_algo_05: Delta features

   first and second time derivatives of the feature vectors, computed
   with the regression method; increases accuracy by 5% on SWB
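
   A sketch of the regression method for delta features, assuming the
   usual form d[t] = sum(th * (c[t+th] - c[t-th])) / (2 * sum(th^2))
   with edge frames replicated. The window size of 2 and the function
   name are assumptions, not values taken from this project.

```python
def delta(frames, window=2):
    # frames: list of feature vectors (lists of floats), one per frame.
    n_frames = len(frames)
    denom = 2.0 * sum(th * th for th in range(1, window + 1))
    deltas = []
    for t in range(n_frames):
        d = [0.0] * len(frames[t])
        for th in range(1, window + 1):
            nxt = frames[min(t + th, n_frames - 1)]  # replicate last frame
            prv = frames[max(t - th, 0)]             # replicate first frame
            for i in range(len(d)):
                d[i] += th * (nxt[i] - prv[i])
        deltas.append([v / denom for v in d])
    return deltas
```

   Applying delta() to its own output gives the second derivative
   (delta-delta) features.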

* fe_eval_00: Evaluation Design

   frame level classification experiments on subset of SWB and
   Alpha-Digit corpora.

   Need some nice picture of frame comparison (somehow) 

** fe_eval_01: support vector machines. 

   Maybe a nice classification problem plot
   and how SVMs can solve the problem when other methods (PCA, LDA,
   etc) can't. I can borrow Suresh's PCA vs. LDA slide :) 

   Aravind may have a slide I can steal for this. If not, I can
   collaborate with him for a foil he might want for ICSLP.

   This slide may be outside the scope of this presentation. I just
   like the idea of talking about SVMs because they are new.

** fe_eval_02: Corpus Description (Alphadigits)

** fe_eval_03: Results 0
   
   We will likely have more than one results foil

** fe_eval_04: Results 1

   I'd like to do some sort of ROVER analysis to see if the different
   algorithms extract different information. This could make a nice
   auxiliary slide.

* fe_eval_05: Analysis

* fe_conc_00: Conclusions

* fe_ref_00: References

* Auxiliary information

  These slides are not part of the presentation, but will be available
  in case of the most probable questions.

** fe_aux_00: differences between us and HTK cepstral

** fe_aux_01: Why we didn't just evaluate by running recognition