EGAGraph, short for 'EGA graphics library', is the format used to store graphics, demos, fonts, game texts and more in many early id Software games.
The basic format is a number of Huffman-compressed sub-files (chunks) stored in a similar manner to the AudioT Format. Like that format, it has three main files: EGAGRAPH, EGAHEAD and EGADICT.
It is notable that while the EGAHEAD and EGADICT are stored in the game executable, they need to be extracted for TED5 to edit levels.
This article may contain errors as some formats have not been worked out completely.
EGA Head (EGAHEAD.xxx)
There are two versions of this format. The first version, used in Commander Keen Dreams, Dangerous Dave 3 and Dangerous Dave 4, is an array of 4-byte little-endian values. The second version, used in all later EGA games, is an array of 3-byte unsigned little-endian values.
This header file may be present as an external file (e.g. Bio Menace), or it may be included in the EXE file (e.g. Catacomb 3-D). The last entry in the header is always the size of the EGAGRAPH.xxx file, and the first entry in the header is always 00 00 00. To find the end, you can search for filesize(EGAGRAPH.xxx) as a 3-byte (little-endian) value in the EXE. Then, to find the start, go back 3 bytes at a time until you get to 00 00 00. Alternatively, use the fact that entries always get smaller as you go back, unless the entry is $FFFFFF.
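The search described above can be sketched as follows. This is a minimal illustration, not a tested extraction tool: the function name `find_egahead` and its parameters are hypothetical, and it returns only the first candidate that satisfies the rules in the text (last entry equals the EGAGRAPH size, entries shrink going backwards, $FFFFFF entries are skipped, first entry is zero).

```python
def find_egahead(exe: bytes, graph_size: int, entry_size: int = 3):
    """Return (start, end) offsets of a candidate EGAHEAD inside exe, or None.

    exe        -- contents of the decompressed executable
    graph_size -- size in bytes of the EGAGRAPH file
    entry_size -- 3 for most games, 4 for the older engines
    """
    empty = (1 << (8 * entry_size)) - 1           # all-ones value marks an unused slot
    needle = graph_size.to_bytes(entry_size, "little")
    pos = exe.find(needle)
    while pos != -1:
        start, limit = pos, graph_size
        # Walk backwards one entry at a time while the offsets keep shrinking.
        while start >= entry_size:
            value = int.from_bytes(exe[start - entry_size:start], "little")
            if value != empty and value > limit:
                break                             # not a header, try the next match
            start -= entry_size
            if value == 0:
                return start, pos + entry_size    # reached the 00 00 00 first entry
            if value != empty:
                limit = value
        pos = exe.find(needle, pos + 1)
    return None
```

A candidate found this way should still be checked by hand, for example by confirming that the entry count matches the number of chunks the game is expected to have.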
In Catacomb 3-D v1.00 EGAHEAD.C3D is at $1BFD0 from the start of the decompressed (unlzexe) EXE file, and is 1437 bytes long, or 479 3-byte entries (for 478 chunks). It is then followed by 3 zero bytes (padding to multiple of 16?) 00 00 00, then the MAPHEAD.C3D.
This header file stores the offsets (relative to the start of the EGAGRAPH file) for each sub-file. The format is trivial, in that the header file is simply an array of 3- or 4-byte variables. (Each variable is a "slot", as the game refers to files by index.) Most games use the more compact 3-byte variables, since the EGAGRAPH file is never larger than $FFFFFF. However, games based on older engines, notably Commander Keen Dreams, Dangerous Dave 3 and Dangerous Dave 4, use 4-byte values as these are easier to deal with.
It is important to note that not every 'slot' will be in use. Tiles in particular, if blank (all black), are considered 'empty' and not worth adding to the file. (A blank 16x16 tile takes up 12 bytes of space if compressed, and there can be hundreds.) There are two ways of dealing with this: by default the games set all empty slot headers to -1, but many programs compress the graphics anyway and use a normal header value for them.
The last offset in the file will be an offset to the end of the main EGAGRAPH file. (So this should be ignored when reading the file to prevent a zero-byte file appearing last.) The total size of the EGAHEAD divided by 3 (Or 4) will also give the number of 'slots' the game uses. This varies, but is usually about 10,000.
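As a rough sketch of the header layout described above, the following reads an EGAHEAD into a list of slot offsets. The name `parse_egahead` is hypothetical, and the all-ones value of the chosen entry width is assumed to be the -1 marker for an empty slot.

```python
def parse_egahead(head: bytes, entry_size: int = 3):
    """Split an EGAHEAD into (slot_offsets, egagraph_size).

    Empty slots (all bits set, i.e. -1) are returned as None. The final
    entry is the size of the EGAGRAPH file and is not a chunk itself.
    """
    if entry_size not in (3, 4):
        raise ValueError("entries are 3 or 4 bytes wide")
    if len(head) < entry_size or len(head) % entry_size:
        raise ValueError("header length is not a multiple of the entry size")
    empty = (1 << (8 * entry_size)) - 1
    values = [
        int.from_bytes(head[i:i + entry_size], "little")
        for i in range(0, len(head), entry_size)
    ]
    *slots, graph_size = values
    return [None if v == empty else v for v in slots], graph_size
```

The number of slots is then simply the length of the returned list, which matches the header size divided by 3 (or 4), minus the final end-of-file entry.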
The same Huffman Compression scheme used elsewhere in the Commander Keen series is used. When reading compressed chunks out of the EGAGRAPH file, the first four bytes located at the EGAHEAD offset are a UINT32LE specifying the chunk's decompressed size (which is required by the decompression algorithm). The compressed data then follows, EXCEPT for 16x16 masked and unmasked tiles. Since these all have a fixed decompressed size (128 bytes for unmasked, i.e. four planes of 32 bytes, and 160 bytes for masked, i.e. five planes of 32 bytes), storing the size would be a waste of space, as tiles make up the bulk of graphics slots. For these chunks the decompressed size is hard-coded, as are the start and finish of the unmasked and masked tile slots.
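The size prefix rule can be illustrated with a small helper. This is a sketch only: `split_chunk` and its `fixed_size` parameter are hypothetical names, and the caller is assumed to know (from the game's hard-coded slot ranges) when a chunk has no size prefix.

```python
import struct

def split_chunk(raw: bytes, fixed_size=None):
    """Return (decompressed_size, compressed_payload) for one raw chunk.

    raw        -- the bytes stored at the chunk's EGAHEAD offset
    fixed_size -- pass the hard-coded size for tile chunks, which carry
                  no UINT32LE prefix; leave as None for ordinary chunks
    """
    if fixed_size is not None:
        return fixed_size, raw            # tile chunk: all bytes are compressed data
    if len(raw) < 4:
        raise ValueError("chunk too short to hold a size prefix")
    (size,) = struct.unpack_from("<I", raw, 0)
    return size, raw[4:]
```

For example, an unmasked 16x16 tile would be handled as `split_chunk(raw, fixed_size=128)`.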
Note that the EGADICT file is often not obvious, usually being embedded in the main EXE file. It is possible to locate it by looking for the byte sequence $FD $01 $00 $00 $00 $00, which appears at the end of nearly all Huffman dictionaries. These are the last 6 bytes out of 1024 (256*2*2), so add 6 to the offset of the match and subtract 1024 to get the start (pointer to the head node).
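The signature search above can be sketched as follows. The function name `find_dictionaries` is hypothetical; it simply lists every place where a 1024-byte block ends in the six-byte sequence, leaving the choice of which dictionary is the EGADICT to the caller.

```python
DICT_SIZE = 1024                                   # 256 nodes * 2 branches * 2 bytes
SIGNATURE = bytes([0xFD, 0x01, 0x00, 0x00, 0x00, 0x00])

def find_dictionaries(exe: bytes):
    """Return the start offsets of all candidate Huffman dictionaries in exe."""
    starts = []
    pos = exe.find(SIGNATURE)
    while pos != -1:
        # The signature is the last 6 bytes, so add 6 and subtract 1024.
        start = pos + len(SIGNATURE) - DICT_SIZE
        if start >= 0:
            starts.append(start)
        pos = exe.find(SIGNATURE, pos + 1)
    return starts
```

Since the signature only appears at the end of "nearly all" dictionaries, an empty result does not prove that the executable contains none.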
Executables usually contain two or more dictionaries, but the EGADICT is usually the second one, except in the case of early games like Keen Dreams, where it is the first of THREE. (A simple check of whether it decompresses the data sensibly works.)
In Catacomb 3-D EGADICT.C3D is at offset $24464 from the start of the decompressed CAT3D.EXE, which is the second Huffman dictionary and comes immediately after the first Huffman Dictionary.
In Catacomb Abyss v1.13 EGADICT.ABS is at offset $2734C from the start of the decompressed CATABYSS.EXE, which is the second Huffman dictionary.
In Catacomb Apocalypse v1.00b EGADICT.APC is at offset $26B24 from the start of the decompressed CATAPOC.EXE, which is the second Huffman dictionary.
Main file (EGAGRAPH.xxx)
This file is simply an array of data files. Each file starts at the offset specified in the EGAHEAD file, and as there are no filenames each file is referred to by its index/slot number. Slots containing dummy values in the header are treated as if they don't exist in the EGAGRAPH file. Each file within the EGAGRAPH file is individually compressed. A file can be read by opening the EGAGRAPH file, seeking to the offset specified in the EGAHEAD file, reading a UINT32LE for the decompressed file size, and then decompressing the data from that point onwards using standard Huffman decompression techniques.
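A minimal sketch of the decompression step is given below. The article does not spell out the dictionary layout, so the following details are assumptions based on the commonly documented id Software Huffman scheme: each of the 256 dictionary nodes holds two UINT16LE values (the branch for a 0 bit, then the branch for a 1 bit), values below 256 are literal output bytes, values of 256 and above refer to node (value - 256), decoding starts at node 254, and bits are consumed from the least significant bit of each input byte. The name `huff_expand` is hypothetical.

```python
import struct

HEAD_NODE = 254          # assumed index of the root node

def huff_expand(compressed: bytes, dictionary: bytes, out_size: int) -> bytes:
    """Expand one chunk using a 1024-byte dictionary (layout as assumed above)."""
    if len(dictionary) < 1024:
        raise ValueError("dictionary must be 1024 bytes")
    nodes = struct.unpack_from("<512H", dictionary, 0)
    out = bytearray()
    if out_size == 0:
        return bytes(out)
    node = HEAD_NODE
    for byte in compressed:
        for bit in range(8):                       # least significant bit first
            value = nodes[2 * node + ((byte >> bit) & 1)]
            if value < 256:                        # leaf: emit a byte, restart at root
                out.append(value)
                if len(out) == out_size:
                    return bytes(out)
                node = HEAD_NODE
            else:                                  # branch: follow to another node
                node = value - 256
                if node > 255:
                    raise ValueError("dictionary refers to a node that does not exist")
    raise ValueError("compressed data ended before out_size bytes were produced")
```

The decompressed size is what stops the loop, which is why it has to be known in advance, either from the UINT32LE prefix or from the game's hard-coded tile sizes.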
Note that in early games such as Keen Dreams, the EGAGRAPH file may have another name (KDREAMS.EGA in that case), in which case it is always the largest non-executable file.
Each sub file or chunk is a separate graphic, ordered in the EGAGRAPH file in a specific way, much again like the AudioT Format. Chunks are labeled starting at 0. The game executable is hard coded for the start and finish of various chunk types. For example, Keen 4 has chunks 124-520 as sprites and will treat any chunk between these two numbers as a sprite. (Note that the game refers to chunk number, not sprite number.) The usual order of chunks is:
Picture table
Masked picture table
Sprite table
Fonts
Pictures (Unmasked bitmaps)
Masked pictures
Sprites
8x8 unmasked tiles (Single chunk)
8x8 masked tiles (Single chunk)
16x16 unmasked tiles
16x16 masked tiles
32x32 unmasked tiles (Optional)
32x32 masked tiles (Optional)
Misc graphics (Optional)
Game texts
Demo files (Optional)
Misc data (Optional)
When compressed each chunk consists of a dword giving the decompressed size followed by the compressed data.
Always present and always the first chunk, the picture table is 4 * numpics bytes long, consisting of two words for each picture (width / 8, then height), used when displaying picture chunks.
Divide the decompressed size by 4 to get numpics.
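As an illustration, the decompressed picture table can be read like this. The name `parse_pic_table` is hypothetical, and the words are assumed to be unsigned little-endian values.

```python
import struct

def parse_pic_table(table: bytes):
    """Return a list of (width_in_pixels, height) pairs, one per picture."""
    if len(table) % 4:
        raise ValueError("picture table length must be a multiple of 4")
    # Each entry is two words: width / 8 (i.e. bytes per row), then height.
    return [(w8 * 8, h) for w8, h in struct.iter_unpack("<HH", table)]
```

The length of the returned list is numpics, matching the decompressed size divided by 4.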
Masked picture table
Always present and always the second chunk, the masked picture table is 4 * numMaskedPics bytes long, consisting of two words for each masked picture (width / 8, then height), used when displaying masked picture chunks. This table is often quite small.
Always present and always the third chunk, the sprite table is 18 * numspr bytes long, consisting of nine words for each sprite image (width / 8, height, x offset, y offset, clipping rectangle left, top, right and bottom, and shifts, in that order), used when displaying sprite chunks. This is what Modkeen exports as its notoriously hard-to-use xSPRITES.txt file.
Fonts are always present and always start at the fourth chunk (chunk 3, since chunks are numbered from 0). Each chunk is a single monochrome font, containing all the entries from 0-255 for that font. Each 'letter' is stored separately and consecutively. A game will have at least one font, and usually three. The format is version two of the EGA Font format.
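The nine-word sprite table entry can be sketched as a small record type. The field names below are hypothetical labels for the values listed in the text, and the words are assumed to be signed little-endian, since offsets could plausibly be negative.

```python
import struct
from collections import namedtuple

# Field order follows the text; the names themselves are illustrative only.
SpriteEntry = namedtuple(
    "SpriteEntry",
    "width_bytes height x_offset y_offset "
    "clip_left clip_top clip_right clip_bottom shifts",
)

def parse_sprite_table(table: bytes):
    """Return one SpriteEntry per 18-byte record in the sprite table."""
    if len(table) % 18:
        raise ValueError("sprite table length must be a multiple of 18")
    return [SpriteEntry(*fields) for fields in struct.iter_unpack("<9h", table)]
```

The length of the result is the number of sprite chunks (numspr).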
Fonts are used in the game for game texts. They can be stretched or colored and are always transparent, meaning they look strange when not over a single-color background.
Pictures follow the game fonts. You need to know how many fonts come before them to know the chunk number of the first picture (3+NumFonts).
Bitmaps are stored as standard Raw EGA data, the size of each plane being the decompressed chunk size / 4. In order to be displayed properly they need information from chunk 0, the picture table. Note that unlike fonts, data does not wrap: each row is padded to a whole number of bytes, so a 4x4 image will consist of four planes of four bytes each, the same as an 8x4 image.
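A rough sketch of turning a four-plane picture chunk into per-pixel colour indices follows. The article only says "standard Raw EGA data", so two details here are assumptions: plane 0 supplies the lowest bit of the colour index (the usual blue, green, red, intensity order), and the most significant bit of each byte is the leftmost pixel. The name `planar_to_pixels` is hypothetical.

```python
def planar_to_pixels(data: bytes, width_bytes: int, height: int):
    """Convert 4-plane EGA data into rows of colour indices (0-15).

    width_bytes is the first word of the picture table entry (width / 8).
    """
    plane_size = width_bytes * height          # equals chunk size / 4
    if len(data) < plane_size * 4:
        raise ValueError("not enough data for four planes")
    rows = []
    for y in range(height):
        row = []
        for xb in range(width_bytes):
            for bit in range(7, -1, -1):       # assumed: MSB is the leftmost pixel
                pixel = 0
                for plane in range(4):
                    byte = data[plane * plane_size + y * width_bytes + xb]
                    pixel |= ((byte >> bit) & 1) << plane
                row.append(pixel)
        rows.append(row)
    return rows
```

Each row always holds width_bytes * 8 entries, which reflects the byte padding mentioned above.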
Pictures are used for things such as title screens or pictures in game texts. Most of the main menu will consist of pictures. They are also used for the wall textures and sprites in EGA FPS games like Catacomb 3-D, in which case the wall textures usually come near the end of the picture list.
The number of picture chunks is given in chunk 0, the picture table.
Masked pictures (Masked bitmaps)
These follow Pictures. You need to know how many fonts and pictures there are to know the chunk number of the first masked picture (3+NumFonts+numpics).
These are identical to the picture chunks in every respect, except being masked, they consist of five EGA planes, not four. Most games have only one or two of them. They need information from chunk 1, the masked picture table, to be displayed correctly.
The number of Masked pictures is given in Chunk 1.
The first Sprite chunk is at 3+numfonts+numpics+numMaskedPics.
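The running totals above can be collected in one place. This is only a sketch: `chunk_layout` is a hypothetical helper, and the counts in the usage note are illustrative values chosen so that sprites start at chunk 124, as in the Keen 4 example earlier in the article.

```python
def chunk_layout(num_fonts: int, num_pics: int, num_masked_pics: int, num_sprites: int):
    """Return the first chunk number of each graphic type.

    Chunks 0-2 are the picture, masked picture and sprite tables.
    """
    first_font = 3
    first_pic = first_font + num_fonts
    first_masked_pic = first_pic + num_pics
    first_sprite = first_masked_pic + num_masked_pics
    return {
        "fonts": first_font,
        "pictures": first_pic,
        "masked_pictures": first_masked_pic,
        "sprites": first_sprite,
        "after_sprites": first_sprite + num_sprites,   # first 8x8 tile chunk
    }
```

For instance, `chunk_layout(3, 115, 3, 397)` places sprites at chunks 124 to 520 inclusive; num_pics, num_masked_pics and num_sprites would normally come from the sizes of chunks 0, 1 and 2.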
Sprites are identical to masked pictures in format, except that the game cannot stretch or warp them. Most games use these for enemy, player or item graphics during gameplay. They need information from chunk 2, the sprite table, both to display correctly and to interact correctly, which makes them more complex to handle than masked pictures.
There are usually several hundred sprite chunks, although that's not always the case, as Catacomb 3-D only has 3 sprite chunks (used for the PaddleWar game), and Catacomb Abyss has only one (used for the in-game radar). The number of sprite chunks is given in chunk 2.
8x8 tiles are used by most games in status windows or for the borders of message windows in-game. (And occasionally by TED5 to display levels.) There are thus not many of them. All the unmasked tiles are stored in one single chunk and all the masked tiles in another, as four- and five-plane Raw EGA data respectively. The number of 8x8 unmasked and masked tiles is hard-coded into the executable, but can be worked out by dividing the decompressed chunk size by 32 and 40 (the size of one tile's data) respectively.
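The tile count arithmetic can be sketched as below. The name `split_tiles8` is hypothetical, and the code assumes that the tiles are stored one after another inside the chunk, each as a complete 32- or 40-byte block, which the text implies but does not state outright.

```python
def split_tiles8(chunk: bytes, masked: bool = False):
    """Split a decompressed 8x8 tile chunk into per-tile byte strings."""
    tile_size = 40 if masked else 32       # 5 or 4 planes of 8 bytes each
    if len(chunk) % tile_size:
        raise ValueError("chunk size is not a whole number of tiles")
    return [chunk[i:i + tile_size] for i in range(0, len(chunk), tile_size)]
```

The number of tiles is then `len(split_tiles8(chunk))`, i.e. the decompressed size divided by 32 or 40.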
16x16 and 32x32 tiles
32x32 tiles are optional and rarely used. 16x16 tiles are only occasionally absent and for most games make up the bulk of graphics. There are usually several thousand entries. Both come in masked and unmasked kinds, just like 8x8 tiles.
These chunks have notable differences. Each tile is an individual chunk and stored as four- or five-plane Raw EGA data with a plane size of 32 bytes (16x16) or 128 bytes (32x32), giving 128 or 512 bytes per unmasked tile and 160 or 640 bytes per masked tile. These chunks do NOT have a dword specifying their decompressed size, as this is hard-coded into the executable as a space-saving measure. (As is the start and finish of unmasked and masked tile chunks.) This can cause problems for editing programs.
Tiles make up the bulk of 2D game levels but are seldom seen elsewhere.
Sometimes following tile graphics are miscellaneous graphics, that cannot be handled by the other chunk types. For example, Keen 4-6 has two misc graphics files used for the 'COMMANDER KEEN terminator text' intro. These graphics usually have unique formats that fit their function and few have been investigated.
Keen 4-6 Intro Bitmaps
This is the text that displays after the 'Ready, press any key' screen at Keen startup. It is composed of a monochrome bitmap that is scrolled across the screen and distorted for special effects. When displayed it is transparent, with a special palette so it is not black-and-white. It consists of a header followed by RLE compressed monochrome data.
HEADER:
Offset  Size  Name         Description
0       2     Img height   Height of the image in pixels
2       2     Img width    Width of the image in pixels
4       2x    Line point   Pointers to line 1,2,3..etc of data. There will be [Img height] of these, each 2 bytes long. The first pointer will have the value (2 * [Img height] + 4)
+2      ?     RLE data     RLE-WM compressed data

DATA:
Offset  Size  Name         Description
0       2     Black run    Number of black pixels to write
2       2     Not-blk run  Number of not-black pixels to write
4       2     Black run    ....
...     ..    .
?       2     End          $FFFF; end of row.
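The header and run layout above can be decoded roughly as follows. This is a sketch under stated assumptions: all values are taken to be UINT16LE, line pointers are taken to be relative to the start of the decompressed chunk, and rows shorter than the image width are padded with black. The name `decode_intro_bitmap` is hypothetical.

```python
import struct

def decode_intro_bitmap(chunk: bytes):
    """Return (width, height, rows); each row is a list of 0 (black) / 1 (not black)."""
    height, width = struct.unpack_from("<HH", chunk, 0)
    pointers = struct.unpack_from(f"<{height}H", chunk, 4)
    rows = []
    for pos in pointers:
        row = []
        colour = 0                              # each row starts with a black run
        while True:
            (run,) = struct.unpack_from("<H", chunk, pos)
            pos += 2
            if run == 0xFFFF:                   # end-of-row marker
                break
            row.extend([colour] * run)
            colour ^= 1                         # runs alternate black / not black
        rows.append((row + [0] * width)[:width])
    return width, height, rows
```

Because the rows are located through the pointer table rather than by scanning, the sketch does not depend on the exact value of the first pointer.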
Game texts, also known as ANSI chunks, are used by the game for things like help screens, and have their own format. In the main they are simply text documents, interspersed with various commands to make things happen. Each file is divided into a number of 'pages', which are moved between by pressing the up/down arrows. Pages can also have action sequences, which are repeated each time the page is moved to; you cannot leave a page until the sequence finishes.
^P            First command in every file. Defines a page start
^E            Ends the file
^Cx           Change font color to $x until next page or ^C command
^Gx,y,z       Display (unmasked) picture chunk z at location x,y (in pixels)
^Tx,y,z,t     Display picture chunk z at x,y for t clicks of time
^Bx,y,z,t,b   Fill a width-by-height-pixel rectangle at pixel location x,y (for z clicks of time?) with color $x
^Lx,y         Start text alignment at pixel location x,y.
Demo files, which can be recorded in several games using the 'demo cheat', consist of a series of commands that are played back when a demo is run in-game. (The game is also 'derandomized' during recording so that all demo playbacks are predictable.)
The format is quite simple, consisting of two words and a number of byte pairs. The first two bytes store the number of the level to load for the demo; the next two store the length of data to load/play (after this limit is reached the demo will end, whether or not there is any more data to read). The remaining byte pairs each encode a time and a keypress. The following values are known:
Value  Key
01     Left
04     Up
05     Nothing
06     Down
09     Right
15     Ctrl
25     Alt
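A minimal sketch of reading a decompressed demo chunk is shown below. The name `parse_demo` is hypothetical, the two words are assumed to be UINT16LE, and the length is assumed to count the bytes that follow the four-byte header. The text does not say which byte of each pair is the time and which is the keypress, so the pairs are returned raw in file order.

```python
import struct

def parse_demo(chunk: bytes):
    """Return (level, length, pairs) for one decompressed demo chunk."""
    level, length = struct.unpack_from("<HH", chunk, 0)
    body = chunk[4:4 + length]              # data past the length is ignored
    pairs = [(body[i], body[i + 1]) for i in range(0, len(body) - 1, 2)]
    return level, length, pairs
```

Interpreting each pair against the key table above is left to the caller once the byte order within a pair has been confirmed for the game in question.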
The last few chunks are demo files and occasionally miscellaneous data. The demo file format is not completely understood, but demos can be recorded using the demo cheat. They contain the level number and difficulty followed by a series of key commands.