PCX Format

Format type: Image
Hardware: CGA, EGA, VGA
Colour depth: Multiple
Minimum size (pixels): 0×0
Maximum size (pixels): 65536×65536
Palette: Internal (optional)
Plane count: 1-255
Transparent pixels: No
Hitmap pixels: No

The PCX Format is an image format used by many games, usually to store full-screen (320x200) 16-colour EGA and, later, 256-colour VGA (mode 13h) graphics. For a time it was also a general-purpose picture format like .bmp or .png, and it was the primary format used by PC Paintbrush. It declined in popularity because support for 24-bit true colour images was added too late, by which time many people had switched to competing formats like .png and JPEG (the latter offering far better compression for photos). It also lacks support for transparency, which cost it some ground to the GIF format, which otherwise provided a similar feature set.

Header

The PCX file is composed of two parts, the header and the image data, which is usually compressed. The header is as follows:

Data type Name Description
UINT8 Manufacturer Always 0x0A
UINT8 Version PC Paintbrush version. Acts as file format version.
0 = v2.5
2 = v2.8 with palette
3 = v2.8 without palette
4 = Paintbrush for Windows
5 = v3.0 or higher
UINT8 Encoding Should be 0x01
0 = uncompressed image (not officially allowed, but some software supports it)
1 = PCX run length encoding
UINT8 BitsPerPlane Number of bits per pixel in each entry of the colour planes (1, 2, 4, 8, 24)
UINT16LE WindowXmin Window (image dimensions):
ImageWidth = Xmax - Xmin + 1
ImageHeight = Ymax - Ymin + 1
Normally Xmin and Ymin should be set to zero. Note that these field values are valid rows and columns, which is why you have to add one to get the actual dimension (so a 200 pixel high image would have Ymin=0 and Ymax=199, or Ymin=100 and Ymax=299, etc.)
UINT16LE WindowYmin
UINT16LE WindowXmax
UINT16LE WindowYmax
UINT16LE HorzDPI This is supposed to specify the image's horizontal and vertical resolution in DPI (dots per inch), but it is rarely reliable. It often contains the image dimensions, or nothing at all.
UINT16LE VertDPI
UINT8 Palette[48] Palette for 16 colors or less, in three-byte RGB entries. Padded with 0x00 to 48 bytes in total length. See below for more details on palette handling.
UINT8 Reserved Should be set to 0, but can sometimes contain junk.
UINT8 ColorPlanes Number of colour planes. Multiply by BitsPerPlane to get the actual colour depth.
UINT16LE BytesPerPlaneLine Number of bytes to read for a single plane's scanline, i.e. at least ImageWidth × BitsPerPlane ÷ 8 bits per byte, rounded up. Must be an even number. Do not calculate it from Xmax-Xmin; always use the stored value. Normally a multiple of the machine's native word length (2 or 4).
UINT16LE PaletteInfo How to interpret palette:
1 = Color/BW
2 = Grayscale (ignored in PC Paintbrush IV and later)
UINT16LE HorScrSize Only supported by PC Paintbrush IV or higher; deals with scrolling. Best to just ignore it.
UINT16LE VerScrSize
UINT8 Padding[54] Filler to bring header up to 128 bytes total. Can contain junk.
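
As a rough illustration of the layout above, the following Python sketch unpacks the fields a decoder usually needs and derives the image dimensions. The names PcxHeader, parse_pcx_header and image_size are made up for this example; the DPI, scroll-size and padding fields are simply skipped, and only the Manufacturer byte is validated.

```python
import struct
from collections import namedtuple

# Field names follow the header table; the snake_case spellings are this sketch's own.
PcxHeader = namedtuple("PcxHeader", [
    "manufacturer", "version", "encoding", "bits_per_plane",
    "xmin", "ymin", "xmax", "ymax",
    "palette", "color_planes", "bytes_per_plane_line", "palette_info",
])

# Little-endian: 4 bytes, 4 words, 4 skipped DPI bytes, 48-byte palette,
# 1 skipped reserved byte, 1 byte, 2 words. The remaining 58 bytes are not read.
_HEADER_FORMAT = "<4B4H4x48sxBHH"


def parse_pcx_header(data):
    """Unpack the first 128 bytes of a PCX file into a PcxHeader."""
    if len(data) < 128:
        raise ValueError("a PCX header is 128 bytes long")
    header = PcxHeader(*struct.unpack_from(_HEADER_FORMAT, data, 0))
    if header.manufacturer != 0x0A:
        raise ValueError("not a PCX file (first byte is not 0x0A)")
    return header


def image_size(header):
    """Return (width, height); the window fields are inclusive coordinates."""
    return (header.xmax - header.xmin + 1, header.ymax - header.ymin + 1)
```

For a typical full-screen mode 13h image, image_size would return (320, 200) from Xmax=319 and Ymax=199.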

Image Data

Image data comes after the header (starting at offset 0x80 in a PCX file), and will be RLE compressed if the header indicated so. The way the data is stored depends on how many colour planes are specified. Each row has its color planes stored sequentially, similar to raw EGA data.

For one plane of eight bits (256-colour), each byte will represent one pixel. For one plane of four bits (16-colour), each byte will represent two pixels. The bits within the byte are in big-endian order, so the most significant bit belongs to the left-most pixel. In other words, in two bits per pixel mode, a byte of value 0xE4 (binary 11 10 01 00) will have left-to-right pixel values of 3, 2, 1, 0.
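
The bit order described above can be sketched in Python as follows. The function name unpack_pixels and its parameters are hypothetical; the sketch assumes the row holds a single plane of 1, 2, 4 or 8 bits per pixel.

```python
def unpack_pixels(row, bits_per_pixel, width):
    """Split packed bytes into pixel values, leftmost pixel in the highest bits."""
    if bits_per_pixel not in (1, 2, 4, 8):
        raise ValueError("unsupported bits per pixel")
    mask = (1 << bits_per_pixel) - 1
    pixels = []
    for byte in row:
        # Walk from the most significant group of bits down to the least.
        for shift in range(8 - bits_per_pixel, -1, -bits_per_pixel):
            pixels.append((byte >> shift) & mask)
    # Drop the values that come from padding at the end of the line.
    return pixels[:width]
```

With the example from the text, unpack_pixels(bytes([0xE4]), 2, 4) gives [3, 2, 1, 0].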

EGA 16-colour images are often stored with four colour planes instead of one, with each plane being one-bit-per-pixel (think of four black and white images, one each for red, green, blue and intensity.) The planes are stored sequentially for each line (see Row-planar EGA data for the exact details), thus a 320x200 EGA image will store at least 40 bytes for each scanline's colour plane (320 pixels ÷ 8 bits per byte × 1 bit per pixel), with each scanline being at least 160 bytes long (320 pixels ÷ 8 bits per byte × 1 bit per pixel × 4 planes). Note that the scanline length can be larger than expected (40 bytes in this example), especially for images whose width is not a multiple of four. This is because each scanline in a plane is padded to a multiple of two or four bytes, depending on the architecture of the machine used to create the file. The actual size is stored in the BytesPerPlaneLine field in the header, which should always be used instead of calculating the value from the other image attributes.
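
A minimal Python sketch of turning one decompressed row-planar scanline into colour indices is shown below. The function name planar_row_to_pixels is invented for this example, and the sketch assumes one bit per pixel per plane, with plane 0 supplying the lowest bit of each index (the usual convention, but an assumption here).

```python
def planar_row_to_pixels(row, bytes_per_plane_line, planes, width):
    """Merge the 1-bit planes of one decompressed scanline into colour indices."""
    pixels = [0] * width
    for plane in range(planes):
        # Each plane occupies BytesPerPlaneLine bytes, padding included.
        start = plane * bytes_per_plane_line
        line = row[start:start + bytes_per_plane_line]
        for x in range(width):
            bit = (line[x // 8] >> (7 - x % 8)) & 1
            # Assumption: plane 0 is the least significant bit of the index.
            pixels[x] |= bit << plane
    return pixels
```

Note that the stride between planes is taken from BytesPerPlaneLine rather than computed from the width, so padded lines are handled without special cases.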

True colour PCX files are not common, and could be either three planes (R, G and B) of eight bits each (24-bit RGB) or one plane of 24 bits. Technically, the same applies to alpha-capable 32-bit images (with four planes in planar mode), though no official format spec ever included such a thing.

The split into planes is generally governed by what is most convenient for the game at the time, which in turn depends on which video mode is being used to display the image. Since EGA video memory is split into planes, 16-colour PCX files are frequently split into matching planes so that no processing is required when loading an image directly into video memory.

RLE Compression

The PCX format uses a form of RLE Compression that is rather unique. It compresses on the byte level and uses a flag system, where the flag is the two highest bits of a byte. If this flag is set (i.e. the two upper bits are set, or in other words the value is >= 192) then the lower six bits are the number of times to repeat the following byte.

Thus, for a byte pair C7 28, the 0xC7 can be decomposed into the flag value 192 (128 + 64) plus 7, so the pair means '7 bytes of 0x28'. To get the repeat count, subtract 192 from the length byte or, using a faster logic operation, compute byte & 0x3F to keep only the lowest six bits.

This means that the six-bit length values have a maximum of 63. It also means that any value larger than 191 must be stored as a length/value pair, which can actually increase the size of the file in some cases. For instance, if you have a single byte of color 192, then it must be represented by two bytes - one of 193 (C1, a repeat of one time) followed by one of 192 (C0, color byte 192).
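
The decoding rule can be sketched in Python as below. The function name rle_decode and its arguments are hypothetical; the caller is assumed to know how many decompressed bytes to expect (for example one row's worth), and no error handling is attempted beyond what Python raises for truncated input.

```python
def rle_decode(data, expected_length, offset=0):
    """Expand PCX RLE data from `offset` until `expected_length` bytes exist.

    Returns the decoded bytes and the offset just past the consumed input.
    """
    out = bytearray()
    while len(out) < expected_length:
        byte = data[offset]
        offset += 1
        if byte >= 0xC0:
            # Top two bits set: the low six bits are the repeat count.
            count = byte & 0x3F
            out.extend(data[offset:offset + 1] * count)
            offset += 1
        else:
            # Any other byte is a literal value.
            out.append(byte)
    return bytes(out), offset
```

For the pair from the text, rle_decode(bytes([0xC7, 0x28]), 7) yields seven bytes of 0x28 and an end offset of 2.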

It is also worth noting that the byte value C0 does not have a clearly defined effect. Depending on the implementation in the decoding program, this could do any of the following:

  • treat C0 as a literal byte.
  • ignore C0 and continue with the following byte.
  • 'repeat the following byte zero times', effectively ignoring any byte following C0. This could conceivably be used to embed non-image data in the PCX file which would be ignored by any program displaying the image.
  • 'repeat the following byte 65536 times', which is basically a bugged implementation using a "while (--count != 0)" style loop with a 16 bit variable/CPU register.

At any rate, the best way to handle a C0 when encoding (compressing) is to write the sequence C1 C0. When decoding (decompressing), a value of C0 almost always indicates an error in the file.

Note that each scanline is compressed independently: an RLE sequence may span multiple planes, but it will never span more than one row of pixels. Thus when decompressing an image, the RLE algorithm will produce at most BytesPerPlaneLine × ColorPlanes bytes (one full row) at a time. Even where the RLE coding could have continued over to the next scanline, it will stop and start again fresh for each line.[1] For example, if the input image is 8×4 pixels EGA 16-colour with four bytes per row (one byte per plane, ignoring the even-length padding for brevity), and the first two lines of pixels are black (color 0) and the last two are white (color 15), they must be compressed as C4 00 | C4 00 | C4 FF | C4 FF and not as C8 00 | C8 FF.
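
A possible encoder that respects the per-row rule is sketched below in Python. The names rle_encode_row and rle_encode_image are made up for this example, and each element of rows is assumed to hold all planes of one scanline, already padded. Runs are capped at 63, and any single byte of 0xC0 or above is written as a run of one.

```python
def rle_encode_row(row):
    """Compress one row; runs never exceed 63 and never leave the row."""
    out = bytearray()
    i = 0
    while i < len(row):
        value = row[i]
        run = 1
        while run < 63 and i + run < len(row) and row[i + run] == value:
            run += 1
        if run > 1 or value >= 0xC0:
            # Length byte (flag bits + count) followed by the value.
            out.append(0xC0 | run)
            out.append(value)
        else:
            # A lone byte below 0xC0 can be stored as-is.
            out.append(value)
        i += run
    return bytes(out)


def rle_encode_image(rows):
    """Compress every row separately and concatenate the results."""
    return b"".join(rle_encode_row(row) for row in rows)
```

Feeding it two rows of four 0x00 bytes followed by two rows of four 0xFF bytes produces C4 00 C4 00 C4 FF C4 FF, matching the example above.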

Palettes

Most of the colour depths the format supports are indexed, and thus require a colour palette. There are, however, three distinct ways of handling those palettes: CGA, EGA and VGA. For images with 16 colours or fewer, the palette is stored in the header. For images with more colours (i.e. 256-colour images) the header palette is ignored, and the 768-byte palette is stored after the image data. Some sources seem to indicate that there are formats where even the smaller palettes are added behind the image, so it is best to always check for it there. To determine if this is the case, check whether the byte following the end of the compressed image data is 0C, and whether enough data remains in the file after it to hold the palette.

As a general rule, the palettes are lists of 3-byte blocks. These blocks normally contain 8-bit RGB data, but even if they are used differently (as is the case for CGA palettes), they will still retain the 3-byte block structure.

It is difficult to correctly determine the palette for 1-bit and 2-bit images, since they can be stored either as CGA or as EGA, and there is no real way to identify which method to use. Some sources suggest checking for 640x200 dimensions on 1-bit images, or 320x200 on 2-bit, since those are standard CGA sizes for respectively its monochrome and 4-colour mode, but 320x200 is a very common size for non-CGA images as well, and there has most likely never been anything preventing people on CGA hardware from saving PCX images in different sizes than a full screen. It is also suggested, for two bit per pixel images, that those with a single 2-bit plane would use a CGA palette, while those with two 1-bit planes would use EGA[2]. All of this makes fully automatic identification for these image types very difficult.

CGA palette

CGA palette handling is a bit peculiar. As with other colour palettes, the data is seen as blocks of three bytes. However, in CGA mode, these blocks do not contain normal RGB colour data.

The first triplet contains a specifically-defined colour. Only the highest four bits of the first byte are used, meaning the actual colour value can be found by taking that byte and shifting it four bits to the right. The result is a value from 0..15, which matches one of the standard CGA/EGA text-mode colours.

For monochrome CGA images, this colour is put on index 1 on the palette, and index 0 is filled in with black (EGA palette entry #0).

For actual 4-colour palettes, this colour becomes the background colour at index 0. The other colours are determined by three status bits fed into the CGA hardware; ColorBurst, Palette and Intensity. For the full explanation on which colours this produces, see the CGA Palette article.

However, the way these three bits are determined changed in PC Paintbrush version IV, and this change is not reflected in a change in the Version value in the header. The only decent way to distinguish the files from before and after version IV is to check the PaletteInfo field in the header, which is 0 in files using the old method, and filled in (with 1 or 2) in files using the new method.

Older method

In the original way the status bits were saved, the second palette entry's first byte (byte 3) contains the three status bits.[3] They can be extracted by taking the highest three bits of that byte.

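  • Bit 7 (mask 0x80, binary 10000000) is ColorBurst.
  • Bit 6 (mask 0x40, binary 01000000) is the Palette.
  • Bit 5 (mask 0x20, binary 00100000) is the Intensity.

Bits are numbered here with the least significant bit as bit 0.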
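
The older method can be sketched in Python as below. The function name cga_bits_old and the dictionary keys are invented for this example; header_palette is assumed to be the 48-byte Palette field from the header.

```python
def cga_bits_old(header_palette):
    """Read the CGA colour and status bits the way pre-version-IV files store them."""
    status = header_palette[3]  # first byte of the second palette entry
    return {
        # Upper four bits of the very first palette byte: a colour from 0 to 15.
        "background": header_palette[0] >> 4,
        "color_burst": bool(status & 0x80),
        "palette": bool(status & 0x40),
        "intensity": bool(status & 0x20),
    }
```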

Newer method

For the newer files, with the PaletteInfo field filled in, the following[4][5] should be done to determine the palette:

  • Take the green and blue values of the second palette entry. These are the bytes at index 4 and 5 in the palette data.
  • If the green value is strictly-greater than the blue value, take Palette 0, otherwise take Palette 1.
  • If the largest of these two values is greater than 200, the Intensity bit is enabled.
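
The steps above can be sketched in Python as follows. The function name cga_bits_new and the dictionary keys are hypothetical; header_palette is assumed to be the 48-byte Palette field from the header.

```python
def cga_bits_new(header_palette):
    """Derive the CGA palette number and intensity bit the version IV way."""
    green = header_palette[4]  # green of the second palette entry
    blue = header_palette[5]   # blue of the second palette entry
    return {
        # Strictly greater green selects palette 0; a tie falls to palette 1.
        "palette": 0 if green > blue else 1,
        "intensity": max(green, blue) > 200,
    }
```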

The new method seems to have been intended to convert a fully saved colour palette back to CGA status bits. Sadly, it seems people figured out exactly what it checked, and started writing PCX files that contain only just enough data to derive those bits instead of saving an actual full palette.

As a side effect of this, there is no real check for the color burst, meaning it is always considered to be enabled, and there is no support for palette 2. If an actual palette is saved in the file, such support could be added. A viable check for detecting palette 2 (meaning, a disabled colour burst bit) would be that the original logic matches palette 1, and the third entry's red value is greater than its blue value.

EGA palette

EGA images are those with eight or sixteen colours. Typically, they have BitsPerPlane set to 1 and ColorPlanes set to 3 or 4, though variations like two 2-bit planes or one 4-bit plane are possible as well. For these images, there are two ways of handling the colours:

  • If the version in the header is 0 or 3, then the standard EGA palette is used, since version 0 does not support a modified palette, and version 3 specifically indicates that no palette information is present in the file.[3]
  • In any other version, the palette is read from the header. The data is structured the same way as 8-bit VGA palettes, except that it is shorter.
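
A small Python sketch of this choice is given below. The names DEFAULT_EGA_PALETTE and ega_palette are made up for this example, and the table holds the commonly used default 16-colour EGA values, which are an assumption of this sketch rather than something listed in this article.

```python
# Commonly used default EGA colours as (red, green, blue); assumed, not from the file.
DEFAULT_EGA_PALETTE = [
    (0, 0, 0), (0, 0, 170), (0, 170, 0), (0, 170, 170),
    (170, 0, 0), (170, 0, 170), (170, 85, 0), (170, 170, 170),
    (85, 85, 85), (85, 85, 255), (85, 255, 85), (85, 255, 255),
    (255, 85, 85), (255, 85, 255), (255, 255, 85), (255, 255, 255),
]


def ega_palette(version, header_palette, colours=16):
    """Pick the default palette for versions 0 and 3, else read the header."""
    if version in (0, 3):
        return DEFAULT_EGA_PALETTE[:colours]
    # Three bytes (R, G, B) per entry, taken from the 48-byte header field.
    return [tuple(header_palette[i:i + 3]) for i in range(0, colours * 3, 3)]
```

Passing colours=8 covers the three-plane case, where only the first eight entries are used.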

Most EGA images will be 4 bits per pixel, meaning the palette will be 16 colours long, but 3 bit per pixel planar format is supported as well. In that case, only 8 entries are read, or if it uses the default EGA palette, only the darker first eight colours will be available. 2 bit per pixel images could use EGA palette handling as well, but these are hard to distinguish from the CGA ones.

Note that technically, EGA images are limited to 2 bit colour components, meaning every component would need to be rounded to the nearest multiple of 0x55, though, practically, there is little reason to not let the images retain the more accurate colours as they are specified in the palette.
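
Should strict EGA colours be wanted, the rounding could be done as in this Python sketch. The names snap_to_ega and snap_palette are hypothetical; the palette is assumed to be a list of (red, green, blue) tuples with 8-bit components.

```python
def snap_to_ega(component):
    """Round an 8-bit colour component to the nearest multiple of 0x55."""
    # Adding 0x2A (just under half of 0x55) before dividing rounds to nearest.
    return ((component + 0x2A) // 0x55) * 0x55


def snap_palette(palette):
    """Apply the rounding to every component of every palette entry."""
    return [tuple(snap_to_ega(c) for c in entry) for entry in palette]
```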

VGA palette

For 8-bit images, the 256-colour palette can be found behind the image data. It is a standard VGA palette in 8-bit RGB format. A single signature byte of 0C is included before the palette data begins.

Some PCX readers simplify the start offset of the 256-colour palette to EndOfFile - 768, though technically, the correct way to find it is to take the offset you end up at after decompressing the image, checking for the 0C byte after that, and then reading the 768-byte palette.
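
The stricter lookup can be sketched in Python as below. The function name read_vga_palette is invented for this example, and image_end is assumed to be the file offset reached after decompressing the last scanline.

```python
def read_vga_palette(data, image_end):
    """Return 256 (r, g, b) tuples, or None when no palette follows the image."""
    # Need the 0C signature byte plus 768 palette bytes after the image data.
    if len(data) - image_end < 769 or data[image_end] != 0x0C:
        return None
    raw = data[image_end + 1:image_end + 769]
    return [tuple(raw[i:i + 3]) for i in range(0, 768, 3)]
```

Returning None lets the caller fall back to the header palette or to a default when the signature byte is missing.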

Source Code

ASM

EXTERN  kbdin, dosxit           ; LIB291 functions

SEGMENT ScratchSeg
ScratchPad      resb 65535

SEGMENT stkseg STACK
        resb    64*8
stacktop:
        resb    0

SEGMENT code

PCX1    db      'my_pcx1.pcx', 0        ; Filenames
PCX2    db      'my_pcx2.pcx', 0        ; (Must end with 0 byte)

..start:
        mov     ax, cs          ; Set up data and stack segments
        mov     ds, ax
        mov     ax, stkseg
        mov     ss, ax
        mov     sp, stacktop

MAIN:
        ; Sets up mode 13h and clears screen
        mov     ax, 0013h
        int     10h

        mov     dx, PCX1        ; Filename to display (NASM labels are case-sensitive)
        call    ShowPCX         ; Display PCX file to screen

        ; Wait for keypress
        call    kbdin

        ; Go back to text mode
        mov     ax, 0003h
        int     10h

        ; Return to DOS
        call    dosxit

;-----------------------------------------------------------------------------
; ShowPCX procedure by Brandon Long,
;   modified by Eric Meidel and Nathan Jachimiec,
;   converted to NASM, cleaned up, and better commented by Peter Johnson
; Inputs: DX has the offset of PCX filename to show.
; Output: PCX file displayed (all registers unchanged)
; Notes:  Assumes PCX file is 320x200x256.
;         Uses ScratchSeg for temporary storage.
;         The PCX file must be in the same directory as this executable.
;-----------------------------------------------------------------------------
ShowPCX:
        push    ax              ; Save registers
        push    bx
        push    cx
        push    dx
        push    si
        push    di
        push    ds              ; DS is repointed at ScratchSeg below
        push    es

        mov     ax, 3D00h
        int     21h             ; Open file read-only (DS:DX -> filename)
        jc      .error          ; Exit if open failed

        mov     bx, ax          ; File handle
        mov     cx, 65535       ; Number of bytes to read
        mov     ax, ScratchSeg  ; DS:DX -> buffer for data
        mov     ds, ax
        mov     dx, ScratchPad
        mov     si, dx
        mov     ah, 3Fh
        int     21h             ; Read from file
        jc      .close          ; Give up if the read failed

        mov     ax, 0A000h      ; Start writing to upper-left corner
        mov     es, ax          ; of graphics display
        xor     di, di

        add     si, 128         ; Skip header information

        xor     ch, ch          ; Clear high part of CX for string copies

.nextbyte:
        mov     cl, [si]        ; Get next byte
        cmp     cl, 0C0h        ; Is it a length byte?
        jb      .normal         ;  No, just copy it
        and     cl, 3Fh         ; Strip upper two bits from length byte
        inc     si              ; Advance to next byte - color byte
        lodsb                   ; Get color byte into AL from [SI]
        rep stosb               ; Store to [ES:DI] and inc DI, CX times
        jmp     short .tst

.normal:
        movsb                   ; Copy color value from [SI] to [ES:DI]

.tst:
        cmp     di, 320*200     ; End of image? (written 320x200 bytes)
        jb      .nextbyte

        mov     cl, [si]
        cmp     cl, 0Ch         ; Palette signature byte present?
        jne     .close

        ; Set palette using port I/O
        mov     dx, 3C8h
        mov     al, 0
        out     dx, al
        inc     dx              ; Port 3C9h
        mov     cx, 256*3       ; Copy 256 entries, 3 bytes (RGB) apiece
        inc     si              ; Skip past the 0Ch signature byte

.palette:
        lodsb
        shr     al, 1           ; PCX stores color values as 0-255
        shr     al, 1           ;  but VGA DAC is only 0-63
        out     dx, al
        dec     cx
        jnz     .palette

.close:
        mov     ah, 3Eh
        int     21h             ; Close file (handle still in BX)

.error:
        pop     es              ; Restore registers
        pop     ds
        pop     di
        pop     si
        pop     dx
        pop     cx
        pop     bx
        pop     ax
        ret

Tools

PCX files can be read, and occasionally converted, by several programs; notably, Microsoft Photo Editor (shipped with Microsoft Office up to Office XP) can do so.

The following tools are able to work with files in this format.

Name | Platform | View images in this format? | Convert/export to another file/format? | Import from another file/format? | Access hidden data? | Edit metadata? | Notes
XnView | Windows/MacOSX/Linux | Yes | Yes | Yes | N/A | N/A | Freeware for private non-commercial or educational use.
ImageMagick | Cross-platform | Yes | Yes | Yes | N/A | N/A | Is unable to correctly write 16-colour PCX files.
GNU Image Manipulation Program | Cross-platform | Yes | Yes | Yes | N/A | N/A | Does not handle CGA or default-palette EGA images correctly.
Engie File Converter | Windows | Yes | Yes | No | N/A | N/A | Relies on common image dimensions for CGA detection. No save functionality so far.

References

  1. PCX Graphics | Dr Dobb's
  2. PCX graphics files explained - section "Interpretation of the PCX data"
  3. PCX specs on fileformat.info
  4. PCX image encoder and decoder for Go - PC Paintbrush 4.0 encodes the CGA palettes differently than 3.0.
  5. Mark Tyler's Painting Program - CGA palette is evil: what the PCX spec describes is the way it was handled by PC Paintbrush 3.0, while 4.0 was using an entirely different, undocumented encoding for palette selection.