Inverse Frequency Sound format
The Inverse Frequency Sound format is the format used by Apogee/id Software in many of their games for PC speaker sounds. It is so named because the bulk of the data is stored as 'inverse frequency' values (the higher the value, the lower the tone it produces). Most commonly the data is stored in a SOUNDS.xxx file, though it may also be embedded in the game executable.
Later games, such as Wolfenstein 3-D, use a modified form of this format for their (less important) PC speaker and AdLib sounds.
The file is divided into three sections: a 16-byte header, a list of sound names, and the actual sound data. The sound data should therefore start (count + 1) * 16 bytes from the start of the file.
|char[4]||signature||"SND" + terminating null marks the start of a sound file, whereas "SPK" + $00 marks the start of the older 'speaker' format, which differs very little. Space Pizza uses "SSE" + terminating null.|
|UINT16LE||size||Size of file|
|UINT16LE||unknown||Usually 0x003C, but does not appear to have any effect.|
|UINT16LE||count||Number of sounds. For SPK files this is blank, as the number of sounds is always 63.|
|BYTE[6]||pad||Nulls to pad the structure up to 16 bytes.|
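As a minimal sketch, the header fields above could be unpacked with Python's `struct` module. The field widths (a 4-byte signature and 6 bytes of padding) are assumptions inferred from the 16-byte total, and `parse_header` is a hypothetical helper name, not part of any game's code.

```python
import struct

# Assumed layout: 4-byte signature, three UINT16LE fields, 6 pad bytes = 16 bytes.
HEADER = struct.Struct("<4sHHH6x")

def parse_header(data: bytes) -> dict:
    """Unpack the 16-byte file header (hypothetical helper)."""
    signature, size, unknown, count = HEADER.unpack_from(data, 0)
    if signature not in (b"SND\0", b"SPK\0", b"SSE\0"):
        raise ValueError(f"unexpected signature {signature!r}")
    if signature == b"SPK\0":
        count = 63  # SPK files leave the count blank; there are always 63 sounds
    return {
        "signature": signature,
        "size": size,
        "unknown": unknown,
        "count": count,
        # Header plus one 16-byte entry per sound precede the sound data.
        "data_start": (count + 1) * 16,
    }
```

Since most games ignore the header entirely, a tool reading these files may want to treat a mismatch between `size` and the real file length as a warning rather than an error.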
The following structure is repeated once per sound:
|UINT16LE||offset||Offset of the sound from the beginning of the file.|
|UINT8||priority||Determines whether the sound can be interrupted by another sound that starts while it is still playing. A sound can only be interrupted by a sound with an equal or higher priority value. 255 is the maximum; 0 is inadvisable.|
|UINT8||rate||Defines the update rate of the timer generating the sound interrupt. Usually set to 8.|
|char[12]||name||Null-padded sound name|
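The per-sound entries can be read in the same way. This sketch assumes each entry is 16 bytes (so the name field is 12 bytes) and that the entries immediately follow the 16-byte header; `parse_entries` is a hypothetical helper name.

```python
import struct

# Assumed layout: UINT16LE offset, UINT8 priority, UINT8 rate, 12-byte name.
ENTRY = struct.Struct("<HBB12s")

def parse_entries(data: bytes, count: int) -> list:
    """Read `count` sound entries that follow the 16-byte header."""
    entries = []
    for i in range(count):
        # Entry i (0-based) sits right after the header and earlier entries.
        offset, priority, rate, raw_name = ENTRY.unpack_from(data, 16 * (i + 1))
        # Names are null-padded, so keep only the part before the first null.
        name = raw_name.split(b"\0", 1)[0].decode("ascii", errors="replace")
        entries.append(
            {"offset": offset, "priority": priority, "rate": rate, "name": name}
        )
    return entries
```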
Sound data is divided into words (UINT16LE), with the word value being inversely proportional to the sound frequency. The sound frequency in Hz can be calculated as follows:
frequency = 1193181 / value
These word values are written directly to PIT Channel 2. The PC Speaker is usually updated at 140 Hz, so each word represents about 1/140th of a second of tone. Most files contain a few seconds of sound. The value $FFFF signals the end of a sound, and most values fall in the range $0100-$5000. $0000 is silence and causes the PC Speaker to be turned off.
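The rules above (little-endian words, $FFFF as terminator, $0000 as silence) can be sketched as a small decoder. `decode_sound` and its `update_rate_hz` parameter are hypothetical names; the 140 Hz default is only the usual rate and varies by game.

```python
import struct

PIT_CLOCK_HZ = 1193181   # constant from the frequency formula
END_OF_SOUND = 0xFFFF    # terminator word

def decode_sound(data: bytes, offset: int, update_rate_hz: float = 140.0):
    """Return (frequencies_in_hz, duration_in_seconds) for one sound.

    A frequency of 0.0 stands for silence (word value $0000).
    """
    freqs = []
    pos = offset
    while pos + 2 <= len(data):
        (value,) = struct.unpack_from("<H", data, pos)
        if value == END_OF_SOUND:
            break
        freqs.append(PIT_CLOCK_HZ / value if value else 0.0)
        pos += 2
    # Each word is held for one tick of the update timer.
    return freqs, len(freqs) / update_rate_hz
```

By this formula, the typical value range of $0100-$5000 corresponds to tones from about 4661 Hz down to about 58 Hz.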
The update rate differs between implementations. Catacomb sets the speed for each sound according to the sound's rate value (rates 0 and 1 are about 18.18 Hz, rate 8 is 140 Hz). Hovertank 3-D is hard-coded to use a fixed rate of 140 Hz and ignores each sound's rate value. Duke Nukem uses a fixed rate of 144 Hz.
Later games use a similar format for PC sounds, where the sound data is stored as bytes with values of 0-255. Multiplying those byte values by 60 converts them to word values that can be written directly to PIT Channel 2. The loss of fine-tuning can probably be attributed to the lesser role of PC sounds in the AudioT Format, which has a totally different way of reading the sounds and also contains AdLib sound effects and music.
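A short sketch of the byte-to-word conversion used by the later format follows. The function names are hypothetical, and treating a byte value of 0 as silence is an assumption carried over from the word format rather than something stated above.

```python
PIT_CLOCK_HZ = 1193181  # constant from the frequency formula

def byte_to_pit_word(value: int) -> int:
    """Scale a later-format sound byte (0-255) to a PIT Channel 2 word."""
    if not 0 <= value <= 255:
        raise ValueError("sound byte must be in the range 0-255")
    return value * 60

def byte_to_hz(value: int) -> float:
    """Frequency for a sound byte; 0 is assumed to mean silence."""
    word = byte_to_pit_word(value)
    return PIT_CLOCK_HZ / word if word else 0.0
```

Because every word is a multiple of 60, only 255 distinct non-silent tones are available, which is the loss of fine-tuning mentioned above.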
Most games load the entire file into memory and ignore all values in the header. The sound names are ignored, too. Since the games use hard-coded sound numbers (usually a 1-based index), reading at offset 16 * soundnumber yields the sound's data offset, priority and rate.
Due to a bug in the implementation, Duke Nukem uses the low byte of the sound's offset value as the priority for the sounds read from DUKE1-B.DN?.
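The direct lookup by sound number, together with the Duke Nukem priority quirk, might be emulated as follows. This is a sketch only: `lookup_sound` and the `duke_priority_bug` flag are hypothetical names, and the entry layout (offset, priority, rate in the first 4 bytes) follows the table above.

```python
import struct

def lookup_sound(data: bytes, sound_number: int, duke_priority_bug: bool = False):
    """Return (offset, priority, rate) for a 1-based sound number.

    The header occupies bytes 0-15, so sound 1 lives at byte 16.
    """
    offset, priority, rate = struct.unpack_from("<HBB", data, 16 * sound_number)
    if duke_priority_bug:
        # Mimic the bug: the low byte of the offset is used as the priority.
        priority = offset & 0xFF
    return offset, priority, rate
```

With the flag set, a sound stored at offset $0234 would be treated as priority $34 regardless of the priority byte in its entry.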
This format has been reverse engineered many times, most often in the Commander Keen 1-3 fan community, and probably first by Anders Gavare. If you find this information helpful in a project you're working on, please give credit where credit is due. (A link back to this wiki would be nice too!)