A speech recognizer is a program that attempts to decode a digital signal. Certain attributes of the signal, e.g., energy, are needed in the decoding process. These features are measured numerically and stored in a form that the recognizer can understand and process, called a feature vector. The act of computing these features is known as feature extraction or front-end processing. A speech recognizer simply processes sequences of feature vectors and is capable of learning the statistical dependencies between these vectors.
Transform Builder is a program that allows users to extract these features in order to complete the speech recognition process. It uses a Graphical User Interface (GUI) to create a signal flow graph that configures the format of the speech input, the algorithms (e.g., energy) used to extract the features, and the format of the output in which the feature data will be stored. It provides a palette of algorithms, known as components, which you can use as building blocks to generate more complicated procedures. The tool allows you to draw the graph and then saves it to a parameter file. This file is supplied as input to a second tool, known as isip_transform, that processes data and performs the operations defined in the parameter file.
The paragraphs below briefly describe how to build a simple signal flow graph that measures energy, and how to use that graph to extract energy. From your working directory, type the command isip_transform_builder. A window should appear as shown in Figure 1. You will build the signal flow graph by adding and configuring the Input, Output, and Algorithm components. To add a component, go to the Components menu, select the component you need, and then click in the white area of the main panel. Repeat this process for each component. This example uses four components: Input, Window, Energy, and Output. (Refer to Figure 2 for where to place the components.)
After every component is in place, right-click each component and select Configuration from the pop-up menu. Once the Input, Output, and Algorithm components are in place, connect them using arcs. From the Edit menu, select Insert Arc. Then connect two components by selecting one component and then the other. Continue this process until all of the necessary components are connected in the correct order.
Your completed graph should look similar to Figure 2. The results of creating and configuring the signal flow graph are contained in a recipe that is stored in a file. Now save the graph, as shown, as an Sof file and run isip_transform to perform feature extraction. For more detailed instructions on how to build a signal flow graph using the ISIP Transform Builder, see Section 3 of our on-line tutorial for our speech recognition system.
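
To make the Input, Window, Energy, Output chain described above more concrete, here is a minimal Python sketch of what such a graph computes conceptually. It is not the isip_transform implementation: the function names, the Hamming window, the frame length and hop parameters, and the definition of energy as a plain sum of squared windowed samples are all assumptions chosen for illustration. The actual components may use different windows, normalization, or a log scale.

```python
import math


def hamming(n):
    """Return an n-point Hamming window (assumed window type, for illustration)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def split_frames(samples, frame_len, hop):
    """Split a list of samples into overlapping frames; incomplete tails are dropped."""
    last_start = len(samples) - frame_len
    return [samples[s:s + frame_len] for s in range(0, last_start + 1, hop)]


def frame_energy(frame, window):
    """Energy of one frame: sum of squared windowed samples (assumed definition)."""
    return sum((x * w) ** 2 for x, w in zip(frame, window))


def extract_energy(samples, frame_len, hop):
    """Hypothetical Input -> Window -> Energy -> Output chain: one value per frame."""
    window = hamming(frame_len)
    return [frame_energy(f, window) for f in split_frames(samples, frame_len, hop)]
```

Each returned value plays the role of a one-dimensional feature vector for its frame. For example, `extract_energy([0.0] * 10, 4, 2)` yields four frames, each with energy zero.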
Transform Builder can be used to explore basic DSP concepts. The inventory of building blocks includes general math and statistics functions that allow users to build arbitrarily complex transformations of the input data. The underlying C++ classes are also designed to make it very easy to add new algorithms and features.
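
The idea of chaining building blocks with arcs can be illustrated with a small Python sketch. This is a hypothetical model, not the Transform Builder's C++ classes or its parameter file format: `run_graph`, the component names, and the restriction to at most one incoming arc per component are assumptions made to keep the example short. It relies on `graphlib` from the Python standard library (Python 3.9 or later) to order the components so that each runs after the component feeding it.

```python
from graphlib import TopologicalSorter


def run_graph(components, arcs, source_data):
    """Evaluate a simple signal flow graph.

    components: dict mapping a component name to a one-argument function.
    arcs: list of (source_name, destination_name) pairs.
    source_data: data handed to any component that has no incoming arc.
    Returns a dict mapping each component name to its output.
    """
    # Record, for every component, which components feed into it.
    predecessors = {name: set() for name in components}
    for src, dst in arcs:
        predecessors[dst].add(src)

    results = {}
    # static_order() yields each component only after all of its predecessors.
    for name in TopologicalSorter(predecessors).static_order():
        feeds = predecessors[name]
        if len(feeds) > 1:
            raise ValueError(f"{name} has more than one input; not supported in this sketch")
        arg = results[next(iter(feeds))] if feeds else source_data
        results[name] = components[name](arg)
    return results


# Hypothetical building blocks: square every sample, then sum the squares.
components = {
    "Input": lambda samples: list(samples),
    "Square": lambda samples: [x * x for x in samples],
    "Sum": sum,
    "Output": lambda value: value,
}
arcs = [("Input", "Square"), ("Square", "Sum"), ("Sum", "Output")]
```

Calling `run_graph(components, arcs, [1, 2, 3])["Output"]` returns 14, the sum of the squared samples. Adding a new building block in this model only requires registering another function and connecting it with an arc.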