Westwood SHP Format (TD)
The sprite format used in Command & Conquer (also known as "Tiberian Dawn"), Red Alert and Sole Survivor is a collection of compressed 8-bit frames that all have the same dimensions. It can use two different compression algorithms, LCW and XOR Delta, and normally picks whichever gives the smaller result for each frame, though the use of XOR Delta is optional. The format technically supports an internal colour palette and an X and Y offset for the frames, but these options are ignored by the games that use it.
|Offset||Data type||Name||Description|
|0x00||UINT16LE||Frames||Number of frames in the file.|
|0x02||UINT16LE||XPos||X-offset. Should be ignored.|
|0x04||UINT16LE||YPos||Y-offset. Should be ignored.|
|0x06||UINT16LE||Width||Width of the frames.|
|0x08||UINT16LE||Height||Height of the frames.|
|0x0A||UINT16LE||DeltaSize||Largest buffer size needed to decompress the frames.|
|0x0C||BYTE||Flags||Extra options. Bit 1 of this technically enables an embedded colour palette, but it is unused in the games.|
The XPos and YPos are often set in the headers of original game files as a side effect of Westwood's conversion process, but they are not applied by the games, and should be ignored. An embedded colour palette will most likely also be ignored by the games, since SHP files are typically small sprites drawn on a scene that already has a palette set.
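As an illustration, the header fields listed above could be read as follows. This is a minimal sketch, not a reference implementation: the names `ShpHeader`, `HEADER_LAYOUT` and `parse_shp_header` are made up for this example, and the code only covers the 13 bytes described in the table.

```python
import struct
from typing import NamedTuple

# Six UINT16LE fields followed by one BYTE, exactly as listed in the table.
HEADER_LAYOUT = struct.Struct("<6HB")


class ShpHeader(NamedTuple):
    """Hypothetical container for the documented header fields."""
    frames: int
    x_pos: int       # should be ignored
    y_pos: int       # should be ignored
    width: int
    height: int
    delta_size: int
    flags: int


def parse_shp_header(data: bytes) -> ShpHeader:
    """Read the documented header fields from the start of `data`."""
    if len(data) < HEADER_LAYOUT.size:
        raise ValueError("not enough data for an SHP header")
    return ShpHeader(*HEADER_LAYOUT.unpack_from(data, 0))
```

A reader following the advice above would use only `frames`, `width`, `height` and `delta_size`, and leave `x_pos` and `y_pos` alone.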
Frames info table
After the header comes an array of size Frames + 2, with 8-byte entries that each have the following structure:
|Offset||Data type||Name||Description|
|0x00||UINT24LE||DataOffset||A three-byte integer value giving the offset to the frame's compressed data. Since both the LCW and XOR Delta compressions end their data with specific end markers, no end offset is needed.|
|0x03||BYTE||DataFormat||The compression format in which the data at DataOffset is stored. See below.|
|0x04||UINT24LE||ReferenceOffset||Contains the referenced DataOffset or frame number in case XOR chaining is used. See below.|
|0x07||BYTE||ReferenceFormat||Reference format in case XOR chaining is used. See below.|
Technically, these are two UINT32LE values with bit flags enabled in their highest byte. This might also be the simplest way to read them: read a UINT32LE, mask the value with 0xFFFFFF to get the offset, and shift it right by 24 bits to get the format.
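The UINT32LE approach described above might look like this in practice. This is a sketch only; `FrameEntry` and `parse_frame_entry` are names invented for the example.

```python
import struct
from typing import NamedTuple


class FrameEntry(NamedTuple):
    """Hypothetical container for one 8-byte table entry."""
    data_offset: int
    data_format: int
    reference_offset: int
    reference_format: int


def parse_frame_entry(raw: bytes) -> FrameEntry:
    """Split one 8-byte table entry into its four fields."""
    if len(raw) < 8:
        raise ValueError("a frame table entry is 8 bytes long")
    # Read both halves as little-endian 32-bit integers.
    data_word, ref_word = struct.unpack_from("<II", raw, 0)
    return FrameEntry(
        data_offset=data_word & 0xFFFFFF,      # low three bytes
        data_format=data_word >> 24,           # highest byte
        reference_offset=ref_word & 0xFFFFFF,
        reference_format=ref_word >> 24,
    )
```

For example, the bytes `56 34 12 80 00 00 00 00` yield a DataOffset of 0x123456 with DataFormat 0x80.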
It is unknown if the ReferenceFormat is used in any way; the way to handle the data seems to be completely defined by the DataFormat. It can of course be used as a consistency check to confirm that the format is valid, and when writing an SHP file, it's better to conform to the known standards to ensure the game handles the file correctly.
If the palette flag is enabled, this table is followed by a 768-byte array containing a 256-colour 6-bit RGB VGA palette. After that follows the actual data referenced at the DataOffset addresses in the table. Unlike in the WSA format, the palette is taken into account in the offsets, so no adjustments are needed on them.
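If a tool does want to display such an embedded palette, the 6-bit values need to be expanded to 8 bits per channel. The sketch below shows one common way to do that, replicating the top bits into the low bits so that 63 maps to 255. That scaling choice is an assumption of this example, not something the format prescribes, and the function names are hypothetical.

```python
from typing import List, Tuple


def expand6(value: int) -> int:
    """Expand a 6-bit channel value (0-63) to 8 bits (0-255)."""
    value &= 0x3F
    # Shift up by two bits and refill the low bits from the top bits.
    return (value << 2) | (value >> 4)


def vga_palette_to_rgb8(palette: bytes) -> List[Tuple[int, int, int]]:
    """Convert a 768-byte 6-bit VGA palette into 256 8-bit RGB triplets."""
    if len(palette) != 768:
        raise ValueError("expected 256 RGB triplets (768 bytes)")
    colours = []
    for i in range(0, 768, 3):
        r, g, b = (expand6(v) for v in palette[i:i + 3])
        colours.append((r, g, b))
    return colours
```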
There are three possible ways in which a frame can be stored, depending on the DataFormat. This is how the table entries look for each of these ways:
LCW Frame
The most straightforward frame type is an LCW-compressed frame.
- DataOffset: points to the LCW data to uncompress to get the frame graphics.
- DataFormat: set to 0x80, indicating LCW.
- ReferenceOffset: irrelevant, and left empty.
- ReferenceFormat: irrelevant, and left empty.
XOR Base Frame
When a frame differs only slightly from a previously saved LCW frame, XOR Delta compression is used to store instructions for transforming that LCW frame's data into the new frame.
- DataOffset: points to the XOR data to apply.
- DataFormat: set to 0x40, indicating XOR Base.
- ReferenceOffset: contains the DataOffset of the referenced LCW frame.
- ReferenceFormat: set to 0x80 (LCW). It is unknown if (and unlikely that) any of the other formats are supported as reference.
XOR Chain Frame
The third case, chained XOR, is a bit peculiar: it is an XOR with the immediately preceding XOR frame. It can only chain from either an XOR Base frame, or another XOR Chain frame.
- DataOffset: points to the XOR data to apply.
- DataFormat: set to 0x20, indicating XOR Chain.
- ReferenceOffset: refers to the frame number of the XOR Base frame at the start of the chain.
- ReferenceFormat: set to 0x48, indicating XOR Chain Reference.
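To make the relationship between the three frame types concrete, the sketch below works out which frames have to be processed, and in what order, before a given frame can be shown. It does not decompress anything. The function name `decode_order` and the plain-tuple representation of table entries are assumptions of this example; entries are `(DataOffset, DataFormat, ReferenceOffset, ReferenceFormat)` tuples.

```python
FORMAT_LCW = 0x80
FORMAT_XOR_BASE = 0x40
FORMAT_XOR_CHAIN = 0x20


def decode_order(entries, index):
    """Return the frame indices to process, in order, to build frame `index`.

    The first index is always an LCW frame to decompress; every following
    index is an XOR Delta frame to apply on top of the result so far.
    """
    data_offset, data_format, ref_offset, _ref_format = entries[index]
    if data_format == FORMAT_LCW:
        return [index]
    if data_format == FORMAT_XOR_BASE:
        # ReferenceOffset holds the DataOffset of an earlier LCW frame.
        for key in range(index - 1, -1, -1):
            if entries[key][1] == FORMAT_LCW and entries[key][0] == ref_offset:
                return [key, index]
        raise ValueError("XOR Base frame refers to an unknown LCW frame")
    if data_format == FORMAT_XOR_CHAIN:
        # ReferenceOffset holds the frame number of the chain's XOR Base.
        base = ref_offset
        if not 0 <= base < index or entries[base][1] != FORMAT_XOR_BASE:
            raise ValueError("XOR Chain frame does not start at an XOR Base")
        for link in range(base + 1, index):
            if entries[link][1] != FORMAT_XOR_CHAIN:
                raise ValueError("XOR chain is interrupted")
        return decode_order(entries, base) + list(range(base + 1, index + 1))
    raise ValueError("unknown DataFormat 0x%02X" % data_format)
```

With an LCW frame 0, an XOR Base frame 1 referring to it, and XOR Chain frames 2 and 3, showing frame 3 means processing frames 0, 1, 2 and 3 in that order.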
Two final entries
As mentioned, the table has two more entries than the number of frames. The first of these two extra entries normally serves as end point; its DataOffset contains the file length, and all its other values are set to 0. The final entry is normally completely zeroed out.
However, some games contain SHP files where the last entry contains the file size. In that case, the entry before that contains the information for a 'loop frame'. Loop frames are normally not used in SHP format; they were conceived for the WSA format, which, being purely based on XOR Delta compression, needed an extra XOR data entry as a smooth way to transform the final frame into the first one without needing to clear the graphics buffer. If such a loop frame exists in the SHP file, it will most likely contain a duplicate of the first frame; it should be ignored, and the file size should simply be taken from the last entry.
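A reader could tell the two layouts apart along these lines. This is a sketch under one assumption that should be stated clearly: a non-zero DataOffset in the last entry is taken as the sign that the loop-frame layout is in use. The function name and the tuple representation of entries, `(DataOffset, DataFormat, ReferenceOffset, ReferenceFormat)`, are hypothetical.

```python
def split_trailing_entries(entries, frame_count):
    """Separate real frame entries from the two extra table entries.

    Returns (frame_entries, file_size, has_loop_frame).
    """
    if len(entries) != frame_count + 2:
        raise ValueError("the table should hold Frames + 2 entries")
    frame_entries = entries[:frame_count]
    first_extra = entries[frame_count]
    last_extra = entries[frame_count + 1]
    # Assumption: the last entry is zeroed out in the normal layout, so a
    # non-zero DataOffset there means it holds the file size instead.
    if last_extra[0] != 0:
        # first_extra describes a loop frame, which is ignored.
        return frame_entries, last_extra[0], True
    return frame_entries, first_extra[0], False
```

A stricter reader might additionally compare the returned size against the actual length of the file.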
The basic approach to saving a file in this format is simple: compress the first frame as LCW, and for every following frame, try LCW, XOR Delta against a previous LCW frame, and (if the previous frame is itself an XOR Delta frame) XOR chaining, then store whichever results in the smallest compressed size.
The existing files seem to work with a system of LCW "key frames"; XOR Base frames never refer to LCW frames before the previously-saved LCW frame. Note that after an XOR chain, it is perfectly possible to have another XOR Base frame referring to this last key frame, and multiple XOR chains can occur after a single key frame.
There also seems to be a rule applied to limit the length of XOR chains. Since the games do not sequentially preload SHP files, but instead interpret the data at the moment it is drawn to the screen, long chains can slow down the drawing process, since the game needs to read, interpret and alter all chained frames in sequence. For example, the SAM Site SHP file in C&C1, when saved without chain limiting, would result in chains longer than 50 frames.
To limit this, the strategy used appears to be to only allow chaining if the cumulative size of all chained frames after the original XOR frame does not exceed the size of that original XOR frame.
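Expressed as code, the apparent rule could look like the check below. This is only an interpretation of the observed behaviour: it assumes that the frame being considered counts towards the cumulative size, and that a total exactly equal to the base frame's size is still allowed. The function name is hypothetical, and all sizes are compressed sizes in bytes.

```python
def may_extend_chain(base_size, chained_sizes, candidate_size):
    """Decide whether one more XOR Chain frame may be appended.

    base_size      -- compressed size of the XOR Base frame starting the chain
    chained_sizes  -- compressed sizes of the XOR Chain frames already stored
    candidate_size -- compressed size of the frame if saved as XOR Chain
    """
    # Assumption: the candidate is included in the cumulative total.
    return sum(chained_sizes) + candidate_size <= base_size
```

When the check fails, a writer following this strategy would fall back to whichever of LCW or XOR Base is smaller for that frame.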
This is just the saving strategy used in the original files, though; technically, the XOR Base frames can refer to any previous LCW frame in the file, and there is no length limit to XOR chains. The strategy seems to be a compromise between compression time, saved size and decompression time.
The following tools are able to work with files in this format.
|Name||Platform||View images in this format?||Convert/export to another file/format?||Import from another file/format?||Access hidden data?||Edit metadata?||Notes|
|Engie File Converter||Windows||Yes||Yes||Yes||No||N/A||Uses the original algorithms and storage principles used by Westwood Studios, with further optimisations to avoid saving data of duplicate frames.|
|XCC Mixer||Windows||Yes||Yes||Yes||No||N/A||The de facto standard modding tool in the C&C community, but like most older tools, its LCW compression is not very good, and it does not use XOR Delta when saving this type.|
|Mix Manager||DOS||Yes||Yes||Yes||No||N/A||The most commonly used modding suite back in the DOS days.|
|RAMIX||DOS||Yes||Yes||Yes||No||N/A||The spiritual successor of Mix Manager, created when people started digging into Command & Conquer Red Alert.|
|Red Horizon Utilities||Java (command line)||No||Yes||Yes||No||N/A||Original site is defunct. Backups of the tools can be found here.|
|OpenRA Utility||Windows, Linux, Mac||No||Yes||Yes||No||N/A||Command line tool that can export to and import from PNG. Does not use XOR Delta, and its LCW encoder only saves using basic copy/repeat commands.|