The sound support for the Doom engine was programmed by Paul Radek. Because id Software did not own this sound library, the original DOS source code could not be distributed freely until the sound system had been removed.
Sound lumps in the WAD file are stored in the DMX format, which consists of a short header followed by raw unsigned 8-bit monaural PCM data, one byte per sample. The sampling rate is typically 11025 Hz, although some sounds use 22050 Hz.
|Offset||Type||Description|
|0x00||Unsigned 16-bit LE int||Format number (must be 3)|
|0x02||Unsigned 16-bit LE int||Sample rate (usually, but not necessarily, 11025)|
|0x04||Unsigned 32-bit LE int||Number of samples + 32 pad bytes|
|0x08||Unsigned 8-bit array||16 pad bytes|
|0x18||Unsigned 8-bit array||Samples|
|0x??||Unsigned 8-bit array||16 pad bytes, immediately following samples|
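As a rough illustration of the layout in the table above, the following Python sketch reads a digital sound lump and strips the pad bytes. The names `DmxSound` and `parse_dmx_sound` are hypothetical, chosen for this example only. The sketch assumes the lump is correctly padded and that the length field counts the samples plus all 32 pad bytes.

```python
import struct
from typing import NamedTuple


class DmxSound(NamedTuple):
    """Hypothetical container for a decoded digital sound lump."""
    sample_rate: int
    samples: bytes  # unsigned 8-bit mono PCM, pad bytes removed


def parse_dmx_sound(lump: bytes) -> DmxSound:
    """Decode a format-3 sound lump, assuming it is correctly padded."""
    if len(lump) < 8:
        raise ValueError("lump too short to hold the 8-byte header")

    # Header: format number, sample rate, length (all little-endian).
    fmt, rate, length = struct.unpack_from("<HHI", lump, 0)
    if fmt != 3:
        raise ValueError(f"unexpected format number {fmt}, expected 3")

    # The length field counts the samples plus 32 pad bytes.
    if length < 32 or 8 + length > len(lump):
        raise ValueError("length field is inconsistent with the lump size")

    start = 0x18             # skip the header and the 16 leading pad bytes
    end = 8 + length - 16    # stop before the 16 trailing pad bytes
    return DmxSound(rate, lump[start:end])
```

A lump written by a tool that omits the padding would be rejected or truncated by this strict reader, so a more tolerant parser may be preferable in practice.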
The padding of the samples was largely unknown before work undertaken during the Chocolate Strife project, and was discovered only by reverse engineering portions of the DMX library. As a result, many user-created sound tools for Doom engine games do not pad the samples properly when converting to this format. The result is sometimes audible as a click or pop at the beginning or end of the sample, or both. In the official IWADs, the leading pad bytes are always filled with the value of the first actual sample, and the trailing pad bytes with the value of the last actual sample.
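The padding convention described above can be sketched in Python as follows. This is a minimal example, not the code of any existing tool. The function name `build_dmx_sound` and its parameters are hypothetical, and the input is assumed to be unsigned 8-bit mono PCM already.

```python
import struct


def build_dmx_sound(samples: bytes, sample_rate: int = 11025) -> bytes:
    """Wrap raw unsigned 8-bit mono PCM in a padded format-3 sound lump."""
    if not samples:
        raise ValueError("at least one sample is required")
    if not 1 <= sample_rate <= 0xFFFF:
        raise ValueError("sample rate must fit in an unsigned 16-bit field")

    # The length field counts the samples plus the 32 pad bytes.
    header = struct.pack("<HHI", 3, sample_rate, len(samples) + 32)

    # Follow the official IWADs: repeat the first sample 16 times before
    # the data and the last sample 16 times after it.
    lead_pad = samples[:1] * 16
    tail_pad = samples[-1:] * 16
    return header + lead_pad + samples + tail_pad
```

For example, `build_dmx_sound(pcm, 22050)` would produce a lump of `len(pcm) + 40` bytes: 8 header bytes, 32 pad bytes, and the samples themselves.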
The Doom engine also provides for PC speaker sound effects, which consist of various tones played through the PC speaker. Other than playing without sound, this was the only option for those who did not own a sound card when the game was released. The sound effect format for PC speaker sound is roughly similar, but does not contain the sample rate; instead a constant sample rate of 140 Hz is assumed.
|Offset||Type||Description|
|0x00||Unsigned 16-bit LE int||Format number (must be 0)|
|0x02||Unsigned 16-bit LE int||Number of samples|
|0x04||Unsigned 8-bit array||Samples|
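A PC speaker lump can be read in much the same way as a digital one. The Python sketch below follows the table above and uses the constant 140 Hz rate to estimate the playback duration. The names `parse_pc_speaker_sound`, `duration_seconds` and `PC_SPEAKER_RATE_HZ` are hypothetical and exist only for this example.

```python
import struct

PC_SPEAKER_RATE_HZ = 140  # constant rate assumed by the format


def parse_pc_speaker_sound(lump: bytes) -> bytes:
    """Return the tone values stored in a format-0 PC speaker lump."""
    if len(lump) < 4:
        raise ValueError("lump too short to hold the 4-byte header")

    # Header: format number and sample count (both little-endian).
    fmt, count = struct.unpack_from("<HH", lump, 0)
    if fmt != 0:
        raise ValueError(f"unexpected format number {fmt}, expected 0")
    if 4 + count > len(lump):
        raise ValueError("sample count exceeds the lump size")

    return lump[4:4 + count]


def duration_seconds(samples: bytes) -> float:
    """Playback time of the tone sequence at 140 samples per second."""
    return len(samples) / PC_SPEAKER_RATE_HZ
```

At this rate each sample lasts about 7 milliseconds, so a one-second effect needs 140 samples.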
A sample has a value between 0 and 127 inclusive. 0 corresponds to silence; 1 is the note F-3 (175 Hz), 33 is A-4 (440 Hz), and 127 is (roughly) the note G#-8 (6666 Hz). There is one value per quartertone. A more precise table of each sample value's note and frequency can be found in the format description written by Simon Howard and Andrew Apted.
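The quartertone spacing allows a rough estimate of the frequency for any sample value. The Python sketch below assumes an equal-tempered scale anchored on value 33 at 440 Hz, with 24 quartertones per octave. This is only an approximation derived from the figures quoted above. It is not the table used by the engine, and the function name `approx_pc_speaker_frequency` is hypothetical.

```python
def approx_pc_speaker_frequency(value: int) -> float:
    """Approximate frequency in Hz for a PC speaker sample value.

    Assumes equal temperament with value 33 at exactly 440 Hz and one
    step per quartertone; the real frequencies differ slightly.
    """
    if not 0 <= value <= 127:
        raise ValueError("sample value must be between 0 and 127")
    if value == 0:
        return 0.0  # 0 corresponds to silence
    # 24 quartertones per octave, so each step is a factor of 2**(1/24).
    return 440.0 * 2.0 ** ((value - 33) / 24)
```

Under this assumption value 1 comes out near 175 Hz and value 127 near 6645 Hz, slightly below the 6666 Hz quoted above, so the precise table should be preferred wherever exact pitch matters.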
In theory, a Doom-format sound can have any sample rate between 1 and 65535 Hz. In practice, nearly all of them are sampled at 11025 Hz. A few sounds, however, are sampled at 22050 Hz. They are listed below:
- DSITMBK (item respawn sound; normally only heard in multiplayer)
- DSDBOPN (super shotgun opening sound)
- DSDBCLS (super shotgun closing sound)
- DSDBLOAD (super shotgun loading sound)
There are no such exceptions in Chex Quest, Hacx, Heretic, or Hexen: all sounds in these games are sampled at 11025 Hz. Interestingly, the four 22050 Hz sounds from Doom II were downsampled to 11025 Hz in Final Doom, so there are no exceptions in TNT.WAD or PLUTONIA.WAD either.