Westwood Unicode BitFont Format

From ModdingWiki
Format type: Font
Max glyph count: 65,535
Minimum glyph size (pixels): 0×0
Maximum glyph size (pixels): 255×?
Access mode: Indexed
Metadata? None
Bitmap glyphs? Yes
Vector glyphs? No
Compressed glyphs? No
Hidden data? Yes
Games

Command & Conquer: Red Alert 2 uses the Westwood Unicode BitFont Format, making it the first Command & Conquer game to support the basic Unicode range of up to 65,536 symbols. This font type is rather different from the earlier WWFont: it goes back to 1 bit per pixel and uses a completely different header, which seems to be based on that of the earlier non-Unicode BitFont format.

The game has only a single font file in this format, game.fnt. While the font's symbols technically represent the Unicode range, its 0 to 255 range actually corresponds to the Windows-1252 encoding, whereas in standard Unicode this range matches ISO-8859-1. This means the range 128 (0x80) to 159 (0x9F) is used for actual symbols, while in ISO-8859-1 this range only contains control codes. However, the Euro symbol (€) in game.fnt is only present at its Unicode position, index 8364 (0x20AC), and not at position 128 (0x80), where the Windows-1252 encoding would place it.
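To make the difference between the two encodings concrete, the following minimal sketch uses Python's built-in codecs to show what Windows-1252 and ISO-8859-1 assign to a given byte value. The helper name cp1252_symbol is hypothetical, and the sketch only illustrates the encodings themselves; it says nothing about how the game performs its lookups.

```python
from typing import Optional


def cp1252_symbol(index: int) -> Optional[str]:
    """Return the character Windows-1252 assigns to byte `index`.

    Returns None for the few byte values that Python's cp1252 codec
    leaves undefined (for example 0x81).
    """
    try:
        return bytes([index]).decode("cp1252")
    except UnicodeDecodeError:
        return None


def latin1_symbol(index: int) -> str:
    """Return the character ISO-8859-1 assigns to byte `index`."""
    return bytes([index]).decode("latin-1")


if __name__ == "__main__":
    for index in (0x80, 0x99, 0xA0):
        print(hex(index), repr(cp1252_symbol(index)), repr(latin1_symbol(index)))
```

In Windows-1252, byte 0x80 is the Euro symbol and 0x99 is the trademark sign, while ISO-8859-1 maps both to control codes; from 0xA0 upwards the two encodings agree.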

File format

Header

The font format starts with the following header.

Offset Data type Name Description
0x00 Char[4] Format Literal string "fonT".
0x04 UINT32LE IdeographWidth Width override for the standard ideographic space (symbol 0x3000). The logic that uses this is rather convoluted, though: it only activates for the Japanese language (id '6' internally), and if the space character (0x20) is not empty, the game uses double the width of that character instead.
0x08 UINT32LE Stride Stride of all font image data.
0x0C UINT32LE Lines Amount of pixel lines stored for each symbol.
0x10 UINT32LE FontHeight Actual font height, in pixels. This can add a vertical padding to the lines.
0x14 UINT32LE Count Amount of saved symbol images in the file. Images can be used for any unicode symbol, and can even be used multiple times.
0x18 UINT32LE SymbolDataSize Size of one symbol data block. Should always match 1 + Stride*Lines.
0x1C UINT16LE[0x10000] UnicodeTable Array covering the full 65,536-symbol Unicode range, indicating which stored symbol block to use for each symbol. Subtract 1 from a value in this table to get the block index; an index of -1 (after subtraction) means 'empty'.
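As an illustration of the header layout above, here is a minimal Python sketch that reads the fixed fields and the UnicodeTable with the standard struct module. The names FontHeader and parse_header are hypothetical, and rejecting a file whose SymbolDataSize does not equal 1 + Stride*Lines is a choice made for this sketch, not documented game behaviour.

```python
import struct
from typing import NamedTuple, Tuple

HEADER_FORMAT = "<4s6I"                        # "fonT" + six UINT32LE fields
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)   # 0x1C bytes
TABLE_ENTRIES = 0x10000
IMAGE_DATA_OFFSET = HEADER_SIZE + TABLE_ENTRIES * 2   # 0x2001C


class FontHeader(NamedTuple):
    ideograph_width: int
    stride: int
    lines: int
    font_height: int
    count: int
    symbol_data_size: int
    unicode_table: Tuple[int, ...]


def parse_header(data: bytes) -> FontHeader:
    """Parse the header and UnicodeTable from the raw bytes of a font file."""
    if len(data) < IMAGE_DATA_OFFSET:
        raise ValueError("data is too short to hold the header and table")
    magic, ideo, stride, lines, height, count, size = struct.unpack_from(
        HEADER_FORMAT, data, 0)
    if magic != b"fonT":
        raise ValueError("missing 'fonT' signature")
    # Sketch-level sanity check: the format says these should always match.
    if size != 1 + stride * lines:
        raise ValueError("SymbolDataSize does not equal 1 + Stride*Lines")
    table = struct.unpack_from("<%dH" % TABLE_ENTRIES, data, HEADER_SIZE)
    return FontHeader(ideo, stride, lines, height, count, size, table)
```

A caller would typically read the whole file into memory and pass the bytes to parse_header; the symbol blocks then start at IMAGE_DATA_OFFSET.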

Image data

The actual font data is straightforward 1-bit-per-pixel image data, using the stride and height indicated in the header. Each symbol's data is prefixed by a byte that gives the symbol width.

Since all symbol image blocks are the same size, the block of data for a specific Unicode code point CodePoint can be found by getting the image index from index = UnicodeTable[CodePoint] - 1. If index is -1, the symbol is empty. Otherwise, the data block for that symbol can be found at the address 0x2001C + SymbolDataSize*index, where 0x2001C is the end of the UnicodeTable (0x1C + 0x10000*2).

Offset Data type Name Description
0x00 UINT8 SymbolWidth Width of this symbol. This does not change the stride of the following data, but only limits the final image width to show.
0x01 BYTE[Stride*Lines] SymbolData 1-bpp image block with the stride and height given in the header.
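The lookup and block layout described above can be sketched in Python as follows. The function names are hypothetical, and the bit order within each byte is an assumption: the sketch treats the most significant bit as the leftmost pixel, which is common for 1-bpp data but is not stated in this article.

```python
from typing import List, Optional, Tuple

IMAGE_DATA_OFFSET = 0x2001C  # 0x1C header bytes + 0x10000 two-byte table entries


def symbol_block_offset(table_value: int, symbol_data_size: int) -> Optional[int]:
    """Return the file offset of a symbol block, or None for an empty symbol.

    `table_value` is the raw UnicodeTable entry for the code point.
    """
    index = table_value - 1
    if index < 0:
        return None
    return IMAGE_DATA_OFFSET + symbol_data_size * index


def decode_symbol(block: bytes, stride: int, lines: int) -> Tuple[int, List[str]]:
    """Decode one symbol block into (width, rows of '#'/'.' characters)."""
    width = block[0]
    visible = min(width, stride * 8)  # never read past the stored stride
    rows = []
    for y in range(lines):
        row = block[1 + y * stride: 1 + (y + 1) * stride]
        # ASSUMPTION: most significant bit first (leftmost pixel = bit 7).
        rows.append("".join(
            "#" if row[x // 8] & (0x80 >> (x % 8)) else "."
            for x in range(visible)))
    return width, rows
```

For example, with a stride of 2 and 2 lines, the block bytes([3, 0xA0, 0x00, 0x40, 0x00]) decodes to a width of 3 and the rows "#.#" and ".#." under the bit order assumed here.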

The font does not contain any padding between symbols. Padding is applied automatically by the game, and seems to be one pixel between the symbols.

Technically, data could be hidden in the SymbolData, in the portion of the full stride that is cut off by the local width value. In fact, the original font contains a symbol with its SymbolWidth set to 0, showing as empty despite containing data. However, its data is completely empty.
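A tool that wants to detect such hidden data could check each block for bits set beyond the symbol's width. The following is a minimal sketch; the name has_hidden_bits is hypothetical, and it assumes the most significant bit of each byte is the leftmost pixel, which this article does not confirm.

```python
def has_hidden_bits(block: bytes, stride: int, lines: int) -> bool:
    """Return True if any pixel at or beyond SymbolWidth is set in the block.

    `block` is one full symbol block: the width byte followed by
    stride * lines bytes of 1-bpp data.
    """
    width = min(block[0], stride * 8)
    for y in range(lines):
        row = block[1 + y * stride: 1 + (y + 1) * stride]
        for x in range(width, stride * 8):
            # ASSUMPTION: most significant bit first (leftmost pixel = bit 7).
            if row[x // 8] & (0x80 >> (x % 8)):
                return True
    return False
```

A block with SymbolWidth 0 and all-zero data, like the one mentioned above, would return False.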

Ideograph ranges

When looking into the data of game.fnt, it should be noted that the ideographic space symbol at index 0x3000 is not saved as an empty entry, but as an actual symbol block containing data, with its width set to 0. This is because, when running in Japanese, the game modifies the width in the image data of this symbol after reading it, meaning the data must be wide enough to accommodate the ideographic space width override (either twice the width of the space, or the IdeographWidth from the header). Because of this quirk, all data in the original font is saved with a stride of three bytes, even though the widest existing symbol fits in just two bytes.
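The width override described above can be summarised in a short sketch. This is only an interpretation of the behaviour as documented here, not the game's actual code: the function name is hypothetical, and how the game decides that the space character is "not empty" is not specified, so the sketch models an empty space as None.

```python
from typing import Optional

JAPANESE_LANGUAGE_ID = 6  # internal language id given in the header description


def ideographic_space_width(language_id: int,
                            ideograph_width: int,
                            space_width: Optional[int],
                            stored_width: int) -> int:
    """Return the width used for symbol 0x3000 under the described override.

    `space_width` is the SymbolWidth of the space character (0x20), or None
    if the font has no block for it (ASSUMPTION: that is what "empty" means).
    `stored_width` is the SymbolWidth saved in the 0x3000 block itself.
    """
    if language_id != JAPANESE_LANGUAGE_ID:
        return stored_width       # override only activates for Japanese
    if space_width is not None:
        return 2 * space_width    # twice the width of the normal space
    return ideograph_width        # fall back to the header value
```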

Among the symbols following the ideographic space, many refer to the same image data as the ideographic space. This makes them appear empty when viewed in the font data, but makes their width default to that same modified width in the Japanese game. This is most likely intentional, to give a fixed default width to any "empty" symbols in that range. Editors should be aware of this quirk in order to save these symbols correctly. Symbol 0x3000 itself is probably best always forced to be an existing but empty symbol block in the data.

The exact ranges in which this happens are 0x3000 to 0x301F, and 0x3303 to 0x33CD. While the first of these is a very distinct set of symbols, it is unclear why the second range excludes certain parts at the start and end. Looking at the 0x3300 to 0x33FF symbols, nothing about the 0x3303 to 0x33CD sub-range seems to stand out.
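An editor could locate these shared entries by scanning the UnicodeTable for code points that reference the same block as 0x3000. The sketch below is one hypothetical way to do so; the function name is an assumption, and it groups matches into contiguous runs, so code points inside a range that have their own glyph will split that range.

```python
from typing import List, Sequence, Tuple

IDEOGRAPHIC_SPACE = 0x3000


def shared_with_ideographic_space(table: Sequence[int]) -> List[Tuple[int, int]]:
    """Return inclusive (first, last) runs of code points that use the same
    symbol block as the ideographic space (0x3000).

    `table` holds the raw UnicodeTable values (0 means 'empty').
    """
    target = table[IDEOGRAPHIC_SPACE]
    if target == 0:
        return []  # 0x3000 has no block, so nothing can share it
    runs: List[Tuple[int, int]] = []
    start = None
    for code_point, value in enumerate(table):
        if value == target:
            if start is None:
                start = code_point
        elif start is not None:
            runs.append((start, code_point - 1))
            start = None
    if start is not None:
        runs.append((start, len(table) - 1))
    return runs
```

When saving, an editor could use such a list to write the same table value for all of these code points, rather than storing them as empty or as separate blocks.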

Tools

The following tools are able to work with files in this format.

Name | Platform | View images in this format? | Convert/export to another file/format? | Import from another file/format? | Access hidden data? | Edit metadata? | Notes
OS Font Editor | Windows | Yes | Yes | Yes | No | N/A | Can import TrueType fonts. Has issues saving more than 32767 distinct symbols into the font due to handling the data as signed 16 bit. Can't change the font dimensions.
Westwood Font Editor | Windows | Yes | Yes | Yes | No | N/A | Re-optimises the entire font on save, resulting in slightly longer (±5 second) save times.