SkyRoads compression
Jump to navigation
Jump to search
SkyRoads compression
Format type | Compression algorithm |
---|---|
Type | Stream |
I/O unit size | 1-bit |
Games |
SkyRoads compresses some of its data files with this algorithm, based on LZSS.
Header
The compressed data starts with a short header.
Data type | Name | Description |
---|---|---|
UINT8 | width1 | Number of bits in the LZSS "count" fields |
UINT8 | width2 | Number of bits in the short LZSS "dist" fields |
UINT8 | width3 | Number of extra bits in the long LZSS "dist" fields |
The compressed data follows. It is in big-endian order, meaning the first bit read comes from a byte's MSB (most significant bit), while each byte's LSB is the last bit read before moving on to the next byte.
Algorithm
- Read the three header bytes
-
Read one bit
-
If it's 0:
- Read width2 bits, add 2 to the value, and use it as the LZSS dist value
- Read width1 bits, add 2 to the value, and use it as the LZSS count value
- Look back dist bytes in the output data, and copy count bytes to the end of the output data
-
Otherwise: (it's a 1)
- Read another single bit
-
If this bit is 0:
- Read width3 bits, add (1 << width2) + 2 to the value, and use it as the LZSS dist value
- Read width1 bits, add 2 to the value, and use it as the LZSS count value
- Look back dist bytes in the output data, and copy count bytes to the end of the output data
-
Otherwise: (it's a 1 again)
- Read 8 bits, and write them as a byte on the end of the output data
-
If it's 0:
- If there is more input data, go back to step 2
Pseudocode
width1 = getbyte()
width2 = getbyte()
width3 = getbyte()
index = 0
while (index < length) {
if (getbits(1) == 0) {
dist = 2 + getbits(width2)
count = 2 + getbits(width1)
copy count bytes from index - dist to index, add count to index
}
else if(getbits(1) == 0) {
dist = 2 + (1 << width2) getbits(width3)
count = 2 + getbits(width1)
copy count bytes from index - dist to index, add count to index
}
else copy 8 bits from input to index, add 1 to index
}
MATLAB Source Code
Compression
%LZSS compressor
%compresses array of bytes using modified LZSS scheme
function cbytes = compressLZSS(bytes)
%define compression parameters
crossover = 2;
width1 = 5;
width2 = 8;
width3 = 10;
%calculate ranges of dist and count values
maxsmdist = bitshift(1, width2) + crossover - 1;
maxdist = bitshift(1, width3) + maxsmdist;
minlen = crossover;
maxlen = bitshift(1, width1) + crossover - 1;
%open temp file for writing
fid = fopen('temp.lzs', 'w+');
%write compression parameters
fwrite(fid, width1, 'ubit8');
fwrite(fid, width2, 'ubit8');
fwrite(fid, width3, 'ubit8');
%loop through input bytes
i = 1;
while i <= length(bytes)
match = 0;
if i > crossover
%generate look-ahead buffer
lookahead = bytes(i:min(i+maxlen-1, end));
%generate scan buffer
scan = bytes(max(i-maxdist+1, 1):i-1);
%loop through all possible lengths, match longest first
for len = min(maxlen, min(length(scan), length(lookahead))):-1:minlen
%loop through scan buffer looking for a match
complook = lookahead(1:min(len, end));
for dist = len:min(maxdist, length(scan))
%compare bytes "dist" back, length "len" to lookahead
compscan = scan(end-dist+1:end-dist+len);
if complook == compscan
%check if the distance can be coded with the small bit
if dist <= maxsmdist
%write signal 0 bit, followed by width2 bits of dist and width1 bits of len
fwrite(fid, 0, 'ubit1');
fwrite(fid, bitsbe(dist-crossover, width2), 'ubit1');
fwrite(fid, bitsbe(len-crossover, width1), 'ubit1');
% fprintf('Copy from small -%d, %d bytes\n', dist, len);
else
%write signal 10 bits, followed by width3 bits of dist and width1 bits of len
fwrite(fid, [1 0], 'ubit1');
fwrite(fid, bitsbe(dist-bitshift(1, width2)-crossover, width3), 'ubit1');
fwrite(fid, bitsbe(len-crossover, width1), 'ubit1');
% fprintf('Copy from large -%d, %d bytes\n', dist, len);
end
match = 1;
break
end
end
if match
break
end
end
end
%write the byte to the output if no match was made
if ~match
%write signal 11 bits, followed by byte of data
fwrite(fid, [1 1], 'ubit1');
fwrite(fid, bitsbe(bytes(i), 8), 'ubit1');
% fprintf('Write byte %d: %d\n', i, bytes(i));
i = i+1;
else
%don't write the next len # of bytes by incrementing loop variable
i = i+len;
end
end
%read back the file contents and store in cbytes
% frewind(fid);
% bits = fread(fid, inf, 'ubit1');
% fprintf('%d', (bits(1:end).'))
% fprintf('\n')
frewind(fid);
cbytes = fread(fid, inf, 'ubit8');
%flip the bits in the compressed section
cbytes(4:end) = bin2dec(fliplr(dec2bin(cbytes(4:end))));
%close temp file and delete
fclose(fid);
delete('temp.lzs');
end
%convert unsigned decimal value to BE bits
function bits = bitsbe(byte, n)
bits = bitget(byte, n:-1:1);
end
Decompression
%decompress LZSS bytes knowing resulting bytes should be length N
function dbytes = decompressLZSS(bytes, N)
%flip the bit order within the compressed section
bytes = [bytes(1:3); bin2dec(fliplr(dec2bin(bytes(4:end))))];
%save bits to temp file
ftemp = fopen('temp.lzs', 'w+');
fwrite(ftemp, bytes, 'ubit8');
frewind(ftemp)
fout = fopen('out.lzs', 'w+');
% bits = fread(ftemp, inf, 'ubit1');
% fprintf('%d', (bits(1:end).'))
% fprintf('\n')
% frewind(ftemp);
%read compression header bytes
width1 = fread(ftemp, 1, 'uint8');
width2 = fread(ftemp, 1, 'uint8');
width3 = fread(ftemp, 1, 'uint8');
%step through compressed bits and decompress
%stop once N bits have been written to output
while ftell(fout) < N
b = fread(ftemp, 1, 'ubit1');
if ~b
%read width2 bits, add 2 -> dist
dist = uintbe(fread(ftemp, width2, 'ubit1'))+2;
%read width1 bits, add 2 -> count
count = uintbe(fread(ftemp, width1, 'ubit1'))+2;
%rewind dist bytes in output
fseek(fout, -dist, 'eof');
%copy count bytes from output to end
copy = fread(fout, count, 'ubit8');
fseek(fout, 0, 'eof');
fwrite(fout, copy, 'ubit8');
% fprintf('Copy from small -%d, %d bytes\n', dist, count);
else
b = fread(ftemp, 1, 'ubit1');
if ~b
%read width3 bits, add 2+1<<width2 -> dist
dist = uintbe(fread(ftemp, width3, 'ubit1'))+bitshift(1, width2)+2;
%read width1 bits, add 2 -> count
count = uintbe(fread(ftemp, width1, 'ubit1'))+2;
%rewind dist bytes in output
fseek(fout, -dist, 'eof');
%copy count bytes from output to end
copy = fread(fout, count, 'ubit8');
fseek(fout, 0, 'eof');
fwrite(fout, copy, 'ubit8');
% fprintf('Copy from large -%d, %d bytes\n', dist, count);
else
%read a byte and copy to end of output
copy = uintbe(fread(ftemp, 8, 'ubit1'));
fseek(fout, 0, 'eof');
fwrite(fout, copy, 'ubit8');
% fprintf('Read byte %d: %d\n', ftell(fout)-1, copy);
end
end
end
%rewind output and read it to get all bytes
frewind(fout);
dbytes = fread(fout, inf, 'uint8');
%close files and delete temps
fclose(ftemp);
fclose(fout);
delete('temp.lzs');
delete('out.lzs');
end
%convert BE bits to unsigned int
function i = uintbe(bits)
i = (2.^(length(bits)-1:-1:0))*bits(:);
end
Credits
This compression algorithm was reverse engineered by Tahg on the Xentax forum. The MATLAB code was submitted by skaterzero807. If you find this information helpful in a project you're working on, please give credit where credit is due. (A link back to this wiki would be nice too!)