
A speech recognizer is a program that attempts to decode a digitized speech signal. Certain attributes of the signal, such as energy, are needed in the decoding process. These attributes, called features, are measured numerically and stored in a form, called a feature vector, that the recognizer can understand and process. The act of measuring these features is known as feature extraction or front-end processing. The recognizer itself simply processes sequences of feature vectors, and is capable of learning the statistical dependencies between these vectors.
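To make the idea of a feature vector concrete, here is a minimal Python sketch of energy extraction. It is not the ISIP implementation: the function names, the frame length and shift, and the choice of log energy with a small floor are all assumptions made for illustration.

```python
import math

def split_into_frames(samples, frame_len, frame_shift):
    """Cut the signal into (possibly overlapping) frames of equal length."""
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += frame_shift
    return frames

def log_energy(frame, floor=1e-10):
    """Log of the sum of squared samples; the floor avoids log(0) on silence."""
    energy = sum(x * x for x in frame)
    return math.log(max(energy, floor))

def extract_features(samples, frame_len=4, frame_shift=2):
    """Return one feature vector per frame (here each vector has one element)."""
    frames = split_into_frames(samples, frame_len, frame_shift)
    return [[log_energy(frame)] for frame in frames]

# Hypothetical input: eight samples of constant amplitude.
features = extract_features([1.0] * 8)
```

With these assumed settings the eight samples yield three frames, so `features` is a sequence of three one-element feature vectors. A real front end would typically add more features per vector, but the recognizer still sees the same thing: a sequence of vectors.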

Transform Builder is a program that lets users specify how these features are extracted, a necessary step in the speech recognition process. It uses a graphical user interface (GUI) to create a signal flow graph that configures the format of the speech input, the algorithms (e.g., energy) used to extract the features, and the format of the output in which the feature data will be stored. It provides a palette of algorithms, known as components, which you can use as building blocks to generate more complicated procedures. The tool lets you draw the graph and then saves it to a parameter file.

This file is supplied as input to a second tool, known as isip_transform, that processes data by performing the operations defined in the parameter file. The paragraphs below briefly describe an example: how to build a simple signal flow graph that measures energy, and how to use that graph to extract energy from data.

From your working directory, type the command isip_transform_builder. A window should appear as shown in Figure 1.

Figure 1. Transform Builder

You will build the signal flow graph by adding and configuring the Input, Output, and Algorithm components. To add a component, select it from the Components menu, then click in the white area of the main panel to place it. Repeat this process for each component. In this example I added four components: Input, Window, Energy, and Output. (Refer to Figure 2 for where to place the components.) After every component is in place, right-click each one and select Configuration from the pop-up menu to configure it. Then connect the components using arcs. From the Edit menu select Insert Arc, then connect two components by selecting one component and then selecting the other. Continue this process until all of the components are connected in the correct order.
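The same Input, Window, Energy, Output chain can be sketched in a few lines of Python to show what a signal flow graph represents. This is only an illustration: the `FlowGraph` class, its method names, and the simple rectangular window are hypothetical and say nothing about how Transform Builder or isip_transform store or execute a graph. The sketch also handles only chains and fan-out, not components with several inputs.

```python
from collections import deque

class FlowGraph:
    """A tiny signal flow graph: named components connected by directed arcs."""

    def __init__(self):
        self.components = {}  # component name -> function applied to its input
        self.arcs = {}        # component name -> names of downstream components

    def add_component(self, name, func):
        self.components[name] = func
        self.arcs.setdefault(name, [])

    def insert_arc(self, src, dst):
        self.arcs[src].append(dst)

    def run(self, source, data):
        """Push data from the source along the arcs; collect terminal outputs."""
        results = {}
        queue = deque([(source, data)])
        while queue:
            name, value = queue.popleft()
            out = self.components[name](value)
            if not self.arcs[name]:          # no outgoing arc: a graph output
                results[name] = out
            for downstream in self.arcs[name]:
                queue.append((downstream, out))
        return results

def window(samples, size=4):
    """Rectangular, non-overlapping frames of `size` samples (an assumption)."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]

def energy(frames):
    """Sum of squared samples for each frame."""
    return [sum(x * x for x in frame) for frame in frames]

graph = FlowGraph()
graph.add_component("Input", lambda samples: list(samples))
graph.add_component("Window", window)
graph.add_component("Energy", energy)
graph.add_component("Output", lambda values: values)

# Arcs fix the processing order, just as they do in the drawn graph.
graph.insert_arc("Input", "Window")
graph.insert_arc("Window", "Energy")
graph.insert_arc("Energy", "Output")

result = graph.run("Input", [1, 2, 3, 4, 0, 0, 1, 1])
```

Keeping the components and the arcs separate mirrors the workflow in the GUI: you place and configure each block first, then define the order of processing by connecting them.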

Your completed graph should look similar to Figure 2. The result of creating and configuring the signal flow graph is a recipe that is stored in a file. Now save the graph as an Sof file, as shown, and run isip_transform to perform feature extraction. For more detailed instructions on how to build a signal flow graph using the ISIP Transform Builder, see Section 3 of the on-line tutorial for our speech recognition system.

Save Graph

Transform Builder is a powerful tool with some unique attributes. It is implemented in Java, and is therefore highly portable across operating systems. The parameter options are defined in a resources file that can be easily edited and is interpreted at run-time by a parser, which makes it easy to reconfigure the tool.

Transform Builder can be used to explore basic DSP concepts. The inventory of building blocks includes general math and statistics functions that allow users to build arbitrarily complex transformations of the input data. The underlying C++ classes are also designed to make it very easy to add new algorithms and features.