name: FilterBank : public AlgorithmBase

synopsis:

g++ [flags ...] file ... -l /isip/tools/lib/$ISIP_BINARY/lib_algo.a

#include <FilterBank.h>

FilterBank();
FilterBank(const FilterBank& arg);
boolean eq(const FilterBank& arg);
boolean setAlgorithm(ALGORITHM algorithm);
boolean setImplementation(IMPLEMENTATION implementation);
boolean set(long order, float sample_freq, ALGORITHM algo, IMPLEMENTATION impl, SCALE scale);
boolean set(float width, float sample_freq, ALGORITHM algo, IMPLEMENTATION impl, SCALE scale);

quick start:

VectorFloat input;
VectorFloat output;
FilterBank fb;
fb.set(24, 8000, FilterBank::FREQUENCY_SAMPLED, FilterBank::TRIANGULAR, Warp::MEL);
fb.compute(output, input);

description:

A filter bank is an important part of the feature extraction process for speech recognition because it decomposes the signal into its frequency components in a manner similar to the way the human ear operates. In terms of basic digital signal processing theory, a filter bank performs a time-frequency analysis of the signal, much like a wavelet analysis. A filter bank is also a useful tool because it can be used to simulate a nonlinear warping of the frequency axis. A good reference for the analysis and implementation of such filter banks is:

This class currently supports two algorithm choices: TIME and FREQUENCY. The TIME algorithm is essentially a bank of digital filters that operate in the time domain to generate a multi-channel output corresponding to each digital filter. The Filter class is used to implement this bank of filters, using the constant coefficient difference equation (CCDE) implementation. Please note that the coefficients of these digital filters are set using a standard Sof-formatted parameter file. The parameter file can be set using the setFiltersParamFile method in this class.

The second algorithm type supported in this class is FREQUENCY. This mode is used to implement a popular approach in speech recognition in which the frequency scale is nonlinearly warped and oversampled to produce cepstral coefficients. The input spectrum is separated into a series of bins, and the frequency samples within these bins are averaged together using an appropriate weighting function. The input spectrum must be a log magnitude spectrum, and can be supplied as either a symmetric spectrum (SYMMETRIC - used for real signals) or an asymmetric spectrum (FULL - used for complex signals).

An overview of the options available for this class is given below:


A filter bank is best understood using the following diagram:

which depicts how the spectrum is sampled in the case of the triangular weighting option (e.g., algorithm = FREQUENCY, implementation = TRIANGULAR). For most of the frequency sampling methods, overlapping filters are used to produce a smoother estimate of the spectrum. Also, the frequency scale is typically nonlinearly warped (e.g., scale = MEL) to better match human perception. More details on these techniques can be found in our Fundamentals of Speech Recognition course, or our speech recognition workshop notes.
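
The placement of overlapping filters on a warped frequency axis can be sketched as below. The mel formula used here, mel(f) = 2595 log10(1 + f/700), is a commonly cited variant and is an assumption; this class may use a different warping function. The names `hzToMel`, `melToHz`, and `melBandEdges` are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Assumed mel warping and its inverse (a commonly cited form).
double hzToMel(double hz) { return 2595.0 * std::log10(1.0 + hz / 700.0); }
double melToHz(double mel) { return 700.0 * (std::pow(10.0, mel / 2595.0) - 1.0); }

// Returns numFilters + 2 edge frequencies in Hz, spaced uniformly on the mel
// scale between 0 Hz and the Nyquist frequency (sampleFreq / 2).
// Filter i (0-based) spans edges[i] .. edges[i + 2] and peaks at edges[i + 1],
// so each filter overlaps half of each neighbor.
std::vector<double> melBandEdges(std::size_t numFilters, double sampleFreq) {
    const double maxMel = hzToMel(sampleFreq / 2.0);
    std::vector<double> edges(numFilters + 2);
    for (std::size_t i = 0; i < edges.size(); ++i) {
        const double mel = maxMel * static_cast<double>(i)
                           / static_cast<double>(numFilters + 1);
        edges[i] = melToHz(mel);
    }
    return edges;
}
```

With 24 filters and an 8000 Hz sample frequency, as in the quick start, this produces 26 edges between 0 Hz and 4000 Hz. The spacing in Hz grows with frequency, so low frequencies are sampled more densely than high frequencies.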

dependencies:

public constants:

error codes:

protected data:

required public methods:

class-specific public methods:

private methods:

examples:

notes: