
3.1.3 Overview: Simple Transformations

One way of viewing the process of feature extraction is through simple block diagrams. The diagram below illustrates the process of extracting a single feature, energy, from speech data stored on a computer.

The block on the far left, labeled Inp (Input), represents speech data stored in digital form on a computer. The center block, labeled Engy (Energy), represents a computer program or algorithm specifically designed to measure energy values in the speech data. This algorithm is applied to the speech data. The measurements are then stored in a computer file of feature measurements, represented by the block on the far right labeled Out (Output).

Note that the above diagram does not illustrate the use of windows. As previously discussed, windowing is always applied in practice. The diagram below includes a block labeled Wind (Window). This represents a windowing algorithm that determines the number of samples used to calculate each energy measurement.

Note that the blocks indicate special algorithms applied to extract features from the speech data. Special algorithms are also used to input the speech to the feature extraction algorithms and to output the extracted features to a computer file. Typically, the frame duration is set in the input algorithm.

Energy is considered a time domain feature since it can be computed directly from the sampled speech data, as the sum of the squared sample values.

We can also view the frequency spectrum of a speech signal by converting it using mathematical techniques. As mentioned, the Fourier Transform is a commonly used technique for converting signals from the time domain to the frequency domain. The block diagram below illustrates the process of computing the frequency spectrum for a speech signal using a window of samples. The blocks labeled Inp, Out, and Wind are described above.
The block labeled Spec (Spectrum) represents the Fourier Transform, which converts the speech signal to a frequency spectrum. This technique is commonly used to compute spectrograms. See Section 3.1.1 for an example spectrogram.
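
The block diagrams described in this section can be sketched in a few lines of Python. This is only an illustration, not the tutorial's own software: the function names (frames, energy, magnitude_spectrum, extract) are hypothetical, the window is assumed to be a plain rectangular, non-overlapping frame, and the Spec block is written as a direct discrete Fourier transform rather than the fast algorithm a real system would likely use.

```python
import cmath
import math


def frames(samples, frame_len):
    """Inp/Wind: split the samples into consecutive, non-overlapping frames.

    Assumption: a rectangular window with no overlap; trailing samples that
    do not fill a whole frame are dropped.
    """
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def energy(frame):
    """Engy: sum of the squared sample values in one frame."""
    return sum(x * x for x in frame)


def magnitude_spectrum(frame):
    """Spec: magnitude of a direct discrete Fourier transform, bins 0..N/2."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc))
    return spectrum


def extract(samples, frame_len, feature):
    """Apply one feature block to every frame; the returned list stands in for Out."""
    return [feature(f) for f in frames(samples, frame_len)]


# Example input: a sine wave with a period of 8 samples, 16 samples long.
samples = [math.sin(2 * math.pi * t / 8) for t in range(16)]

print(extract(samples, 8, energy))              # one energy value per frame
print(extract(samples, 8, magnitude_spectrum))  # one spectrum per frame
```

Passing the feature function as an argument mirrors the diagrams: the Inp, Wind, and Out stages stay the same, and only the center block (Engy or Spec) is swapped. For the sine wave above, each frame's spectrum has its largest value in bin 1, since exactly one period fits in a frame of 8 samples.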
