2.1.5 Speech Files: Microsoft's WAV Format

The WAV audio format was developed by Microsoft and has become one of the primary formats for uncompressed audio. At CD quality (a 44.1 kHz sample rate with stereo 16-bit samples), it requires about 10 MB per minute. Because the samples are stored uncompressed, WAV preserves the full quality of the 16-bit audio, but it is also one of the largest formats. Quality can be traded for file size by reducing the sampling rate, the data width (e.g., 8 bits), or the number of channels (one for mono, two for stereo).
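The 10 MB per minute figure follows directly from the sampling parameters. The short Python sketch below shows the arithmetic; the function name `pcm_bytes_per_minute` is made up for this illustration and is not part of our environment.

```python
def pcm_bytes_per_minute(sample_rate_hz, channels, bits_per_sample):
    """Return the number of bytes in one minute of uncompressed PCM audio."""
    bytes_per_sample = bits_per_sample // 8
    bytes_per_second = sample_rate_hz * channels * bytes_per_sample
    return bytes_per_second * 60


# CD quality: 44.1 kHz, stereo, 16-bit samples.
print(pcm_bytes_per_minute(44100, 2, 16))  # 10584000 bytes, about 10 MB

# Reduced quality: 8 kHz, mono, 8-bit samples.
print(pcm_bytes_per_minute(8000, 1, 8))    # 480000 bytes, under 0.5 MB
```

The second call shows how lowering all three parameters at once shrinks the file by a factor of about 22.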

For a comparison of audio formats, see Internet audio. Click here to download an example of a WAV file for our demonstration utterance.

A detailed description of the WAV format can be found here, as well as in many other places on the web (e.g., quick and dirty WAV layout). The header is organized as a sequence of fixed-format values followed by data. A file can contain multiple combinations of header and data, though many files have just one header and one data section.
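To make the idea of multiple header and data sections concrete, here is a minimal Python sketch that walks the sections (called chunks in RIFF terminology) of a WAV stream and reports each identifier and size. It assumes the usual RIFF conventions: little-endian sizes and chunks padded to an even number of bytes. The helper name `iter_chunks` and the tiny in-memory file are made up for the illustration.

```python
import io
import struct


def iter_chunks(stream):
    """Yield (chunk_id, size) for each chunk inside a RIFF/WAVE stream."""
    riff_id, _riff_size, wave_id = struct.unpack("<4sI4s", stream.read(12))
    if riff_id != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    while True:
        chunk_header = stream.read(8)
        if len(chunk_header) < 8:
            break  # end of stream
        chunk_id, size = struct.unpack("<4sI", chunk_header)
        yield chunk_id.decode("ascii"), size
        # Skip the chunk body; odd-sized chunks carry one pad byte.
        stream.seek(size + (size & 1), io.SEEK_CUR)


# Build a tiny 8 kHz, mono, 8-bit file in memory with four samples.
fmt_chunk = struct.pack("<4sIHHIIHH", b"fmt ", 16, 1, 1, 8000, 8000, 1, 8)
data_chunk = struct.pack("<4sI", b"data", 4) + b"\x80\x80\x80\x80"
body = b"WAVE" + fmt_chunk + data_chunk
demo = struct.pack("<4sI", b"RIFF", len(body)) + body

print(list(iter_chunks(io.BytesIO(demo))))  # [('fmt ', 16), ('data', 4)]
```

Walking the chunks this way, rather than assuming fixed offsets, lets a reader skip over sections it does not recognize.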

WAV file support in our environment is provided through a library developed by Silicon Graphics called Audio File Library, which provides a uniform programming interface to many standard digital audio file formats. The actual interface to this library is hidden from the user (it is located in the class named AudioFile).

WAV Header Synopsis:

| Description | Size (bytes) | Usual Contents |
| --- | --- | --- |
| RIFF file description header | 4 | The ASCII text string "RIFF". |
| size of file | 4 | The file size less 8 bytes: the size of the "RIFF" description header (4 bytes) and the size of this file size field (4 bytes). |
| WAV description header | 4 | The ASCII text string "WAVE". |
| fmt description header | 4 | The ASCII text string "fmt " (note the trailing space). |
| size of WAV section chunk | 4 | The combined size of the WAV type format (2 bytes), mono/stereo flag (2 bytes), sample rate (4 bytes), bytes/sec (4 bytes), block alignment (2 bytes), and bits/sample (2 bytes). This is usually 16. |
| WAV type format | 2 | The type of WAV format. For PCM data this is 0x01. |
| mono/stereo flag | 2 | Mono (0x01) or stereo (0x02). |
| sample frequency | 4 | The sample frequency in Hz. |
| bytes/sec | 4 | The audio data rate in bytes/sec (sample frequency times block alignment). |
| block alignment | 2 | The block alignment: the number of bytes for one sample across all channels. |
| bits per sample | 2 | The number of bits per sample. |
| data description header | 4 | The ASCII text string "data". |
| size of the data chunk | 4 | The number of bytes of audio data in the data section. |
| data | variable | The audio data. |
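The header layout above can be decoded with a few lines of Python. The sketch below is only an illustration, not the interface used by our AudioFile class. It assumes the common 44-byte case in which the data section immediately follows a 16-byte fmt section, and that all numeric fields are little-endian. The field names and the function `parse_wav_header` are made up for the example.

```python
import struct

# One struct format character per table row, in order (little-endian).
HEADER_FORMAT = "<4sI4s4sIHHIIHH4sI"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 44 bytes

FIELD_NAMES = (
    "riff_id", "riff_size", "wave_id", "fmt_id", "fmt_size",
    "format_tag", "channels", "sample_rate", "byte_rate",
    "block_align", "bits_per_sample", "data_id", "data_size",
)


def parse_wav_header(raw):
    """Decode the first 44 bytes of a simple PCM WAV file into a dict."""
    if len(raw) < HEADER_SIZE:
        raise ValueError("too short to hold a WAV header")
    values = struct.unpack(HEADER_FORMAT, raw[:HEADER_SIZE])
    fields = dict(zip(FIELD_NAMES, values))
    expected = (b"RIFF", b"WAVE", b"fmt ", b"data")
    found = (fields["riff_id"], fields["wave_id"],
             fields["fmt_id"], fields["data_id"])
    if found != expected:
        raise ValueError("not the simple header layout assumed here")
    return fields


# Build a CD-quality header followed by 8 bytes (two stereo frames) of silence.
header = struct.pack(HEADER_FORMAT, b"RIFF", 36 + 8, b"WAVE", b"fmt ", 16,
                     1, 2, 44100, 176400, 4, 16, b"data", 8)
info = parse_wav_header(header + bytes(8))
print(info["sample_rate"], info["channels"], info["bits_per_sample"])  # 44100 2 16
```

Files that carry additional sections between the fmt and data sections will be rejected by this sketch; a more general reader would scan section by section instead of relying on fixed offsets.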
