
Hi, I'm Bert Hubert. I'm currently building a Pluggable Authentication Module (PAM) speaker verification system based on the IFC library by ISIP. This month I will introduce my extensions to the IFCs that add support for Microsoft WAV files. Adding WAV support was easy because I could build on an existing open source library, the Audio File Library, developed by Silicon Graphics. That library also supports AIFF and NeXT/Sun .au files through the same interface.

File support in the IFC environment is localized to one class named AudioFile in the Multimedia library. This class hides the details of a specific file format from the user. Many tools, such as isip_make_sof and isip_transform, use AudioFile to process data. Such tools can be configured using our GUI-oriented signal flow graph generator called isip_transform_builder. A good tutorial on such tools can be found in our on-line speech recognition tutorial.

The AudioFile class describes the supported file formats, file types, and compression types with enumerations:
      enum FILE_FORMAT { SOF = 0, RAW, WAV, SPHERE, DEF_FILE_FORMAT = SOF };
      enum FILE_TYPE { TEXT = 0, BINARY, DEF_FILE_TYPE = TEXT };
      enum COMP_TYPE { LINEAR = 0, ULAW, ALAW, DEF_COMP_TYPE = LINEAR };
These values select the file format, whether the file is text or binary (a key part of the Sof file format), and the type of compression used in the file (some file formats support multiple compression options).
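To make the role of the enumeration concrete, here is a minimal sketch that maps a filename extension to a FILE_FORMAT value. The enumeration is copied from above, but guessFormat and the extensions it recognizes are hypothetical and are not part of the IFCs. Note that DEF_FILE_FORMAT is simply an alias for SOF, so unknown extensions fall back to Sof.

```cpp
#include <string>

// Enumeration copied from the article; in the IFCs it is declared
// inside the AudioFile class.
enum FILE_FORMAT { SOF = 0, RAW, WAV, SPHERE, DEF_FILE_FORMAT = SOF };

// Hypothetical helper (not an IFC function): guess the format from the
// filename extension, falling back to the default format.
FILE_FORMAT guessFormat(const std::string& filename) {
  std::string::size_type dot = filename.rfind('.');
  if (dot == std::string::npos) {
    return DEF_FILE_FORMAT;  // no extension at all
  }
  std::string ext = filename.substr(dot + 1);
  if (ext == "wav") return WAV;
  if (ext == "raw") return RAW;
  if (ext == "sph") return SPHERE;  // assumed extension for SPHERE files
  if (ext == "sof") return SOF;
  return DEF_FILE_FORMAT;  // unknown extension
}
```

With this helper, guessFormat("speech.wav") yields WAV, while a name without a recognized extension yields the default, SOF.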

There are also some generic functions that provide a uniform interface to all file types. Examples of these include:
      boolean open(const Filename& filename, MODE mode = READ_ONLY);
      long getData(Vector& data, long start_samp, long num_samp);
      boolean close();
These functions provide the necessary capabilities, such as reading a chunk of data (multichannel signals are supported), writing data, skipping past the header information, and separating header data from the actual audio data.
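The calling pattern for these generic functions can be sketched as follows. Since the IFC headers are not reproduced here, the example uses a stand-in class, MockAudioFile, that serves samples from memory, with std::string and std::vector<float> in place of the IFC Filename and Vector types. It also assumes that getData returns the number of samples actually delivered, which the declarations above do not state. The readAll helper is hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Stand-in for AudioFile (NOT the IFC class): same open/getData/close
// call shape, but backed by an in-memory buffer.
class MockAudioFile {
 public:
  explicit MockAudioFile(std::vector<float> samples)
      : samples_(std::move(samples)), is_open_(false) {}

  bool open(const std::string& /*filename*/) {
    is_open_ = true;
    return true;
  }

  // Assumption: returns the number of samples copied into data.
  long getData(std::vector<float>& data, long start_samp, long num_samp) {
    data.clear();
    long total = static_cast<long>(samples_.size());
    if (!is_open_ || start_samp < 0 || num_samp <= 0 || start_samp >= total) {
      return 0;
    }
    long n = std::min(num_samp, total - start_samp);
    data.assign(samples_.begin() + start_samp,
                samples_.begin() + start_samp + n);
    return n;
  }

  bool close() {
    is_open_ = false;
    return true;
  }

 private:
  std::vector<float> samples_;
  bool is_open_;
};

// Hypothetical helper: read an entire file in fixed-size chunks.
std::vector<float> readAll(MockAudioFile& file, const std::string& name,
                           long chunk) {
  std::vector<float> all;
  std::vector<float> buf;
  if (chunk <= 0 || !file.open(name)) {
    return all;
  }
  long start = 0;
  long got = 0;
  while ((got = file.getData(buf, start, chunk)) > 0) {
    all.insert(all.end(), buf.begin(), buf.end());
    start += got;  // advance past the samples just read
  }
  file.close();
  return all;
}
```

The point of the sketch is that the caller only ever touches open, getData, and close; nothing in readAll depends on which file format sits behind the class.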

These generic functions in turn call type-specific functions such as:
      long readRawData(Vector& data, long channel_tag = DEF_CHANNEL_TAG, long start_samp = DEF_START_SAMP, long num_samp = DEF_NUM_SAMP);
      long readSofData(Vector& data, long channel_tag = DEF_CHANNEL_TAG, long start_samp = DEF_START_SAMP, long num_samp = DEF_NUM_SAMP);
      long readWavData(Vector& data, long channel_tag = DEF_CHANNEL_TAG, long start_samp = DEF_START_SAMP, long num_samp = DEF_NUM_SAMP);
Only a small number of private functions in the class must be implemented for each file format (e.g., read, write, open, close).

Adding support for a new format amounts to supplying implementations of these private functions and extending the enumerations above. Because all utilities use AudioFile, recompiling them after AudioFile has been modified adds the new support to every utility.
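The dispatch from a generic function to the type-specific readers might look roughly like the toy class below. This is only an illustration of the pattern, not the IFC source: ToyAudioFile is a made-up name, the reader bodies are placeholders that fill the buffer with a marker value, the channel_tag argument is omitted for brevity, and returning -1 for a format with no reader is my own convention.

```cpp
#include <cstddef>
#include <vector>

// Toy model of the dispatch pattern (NOT the IFC implementation).
class ToyAudioFile {
 public:
  enum FILE_FORMAT { SOF = 0, RAW, WAV, SPHERE, DEF_FILE_FORMAT = SOF };

  explicit ToyAudioFile(FILE_FORMAT fmt = DEF_FILE_FORMAT) : format_(fmt) {}

  // Generic entry point: callers never see the format-specific readers.
  long getData(std::vector<float>& data, long start_samp, long num_samp) {
    switch (format_) {
      case RAW:
        return readRawData(data, start_samp, num_samp);
      case SOF:
        return readSofData(data, start_samp, num_samp);
      case WAV:  // the case added when WAV support is introduced
        return readWavData(data, start_samp, num_samp);
      default:   // format listed in the enum but no reader wired in
        return -1;
    }
  }

 private:
  // Placeholder: a real reader would parse the file; this one just
  // fills the buffer with a marker so the dispatch is observable.
  static long fill(std::vector<float>& data, long num_samp, float marker) {
    if (num_samp < 0) {
      return -1;
    }
    data.assign(static_cast<std::size_t>(num_samp), marker);
    return num_samp;
  }

  long readRawData(std::vector<float>& data, long /*start_samp*/,
                   long num_samp) {
    return fill(data, num_samp, 1.0f);
  }
  long readSofData(std::vector<float>& data, long /*start_samp*/,
                   long num_samp) {
    return fill(data, num_samp, 2.0f);
  }
  long readWavData(std::vector<float>& data, long /*start_samp*/,
                   long num_samp) {
    return fill(data, num_samp, 3.0f);
  }

  FILE_FORMAT format_;
};
```

In this sketch, adding a format touches three places: the enumeration, one new private reader, and one new case in the switch. Code that calls getData is unchanged, which is why recompiling the utilities is enough to give them the new format.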

The WAV support will be included in ISIP's next release, scheduled for October 1.