Mohawk Bitmaps

From A look inside The Link @ wiki

All observed Mohawk games except Myst (which uses WDIB and PICT images) use a common bitmap format. Mohawk bitmaps are always stored with a tBMP tag. All values are in big-endian order. A tBMP resource is divided into three chunks: the header, the palette, and the image. Note that the palette is not always present: non-Riven 8bpp images and Riven 24bpp images, for example, have none.

Bitmap Header

unsigned short bitmap width
unsigned short bitmap height
unsigned short bytes per row
unsigned short compression details
  • bitmap width is the width of the bitmap in pixels.
  • bitmap height is the height of the bitmap in pixels.
  • bytes per row is the pitch of the image (bytes in a scan line).
  • For compression details, see the next section.

Note: Only the bottom 10 bits of bitmap width and bitmap height are valid (val & 0x3ff, in C). The same applies to bytes per row, which must also remain even (val & 0x3fe, in C).
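
As a rough illustration of the header layout and the masking from the note, here is a minimal Python sketch. The function name parse_tbmp_header and the dictionary keys are made up for this example; only the field order, the big-endian byte order, and the two masks come from the text above.

```python
import struct

def parse_tbmp_header(data):
    """Read the four big-endian unsigned shorts at the start of a tBMP resource."""
    width, height, bytes_per_row, compression = struct.unpack_from(">HHHH", data, 0)
    return {
        "width": width & 0x3FF,                  # only the bottom 10 bits are valid
        "height": height & 0x3FF,                # same masking as the width
        "bytes_per_row": bytes_per_row & 0x3FE,  # bottom 10 bits, forced even
        "compression": compression,              # bit field, described separately
    }
```

The header occupies the first 8 bytes of the resource, so whatever follows (palette or image) starts at offset 8.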

Compression Details

compression details is a bit field split into several sections, each with its own purpose.

Bits 0-2 (Bits Per Pixel)

The lower three bits represent the image's bit depth, chosen from this table.

Value Bits Per Pixel
0 1
1 4
2 8
3 16
4 24

Other values are undefined.

Bit 3 (Palette)

Bit 3 indicates whether a palette is present. However, it is very unreliable, as Riven never sets this bit. If the game is Riven and the bit depth is 8, you should assume there is a palette.

Bits 4-7 (Secondary Compression)

Bits 4-7 represent the secondary compression on the image.

Value Secondary Compression
0 None
1 RLE8
3 Another (still unknown) RLE variant

For none, the image is raw pixels. However, each row occupies bytes per row bytes, so you have to skip the remaining bytes at the end of each row. Riven never uses secondary compression. By contrast, almost all images in other games use RLE8.

Bits 8-11 (Primary Compression)

Bits 8-11 represent the primary compression on the image.

Value Primary Compression
0 None
1 LZ
2 Another (still unknown) LZ variant
4 Riven

For none, there is nothing to do at this stage. See details later for the LZ and Riven compression schemes.
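
The bit layout described in this section can be unpacked with a few shifts and masks. The sketch below is only illustrative: the function name, the dictionary keys, and the "undocumented" fallback string are invented here, while the bit positions and table values are the ones listed above.

```python
BITS_PER_PIXEL = {0: 1, 1: 4, 2: 8, 3: 16, 4: 24}
SECONDARY = {0: "none", 1: "RLE8", 3: "unknown RLE variant"}
PRIMARY = {0: "none", 1: "LZ", 2: "unknown LZ variant", 4: "Riven"}

def decode_compression_details(value):
    """Split the 16-bit compression details field into its sections."""
    return {
        # Bits 0-2: index into the bit depth table (None for undefined values)
        "bits_per_pixel": BITS_PER_PIXEL.get(value & 0x7),
        # Bit 3: palette flag (unreliable, Riven never sets it)
        "palette_flag": bool(value & 0x8),
        # Bits 4-7: secondary compression
        "secondary": SECONDARY.get((value >> 4) & 0xF, "undocumented"),
        # Bits 8-11: primary compression
        "primary": PRIMARY.get((value >> 8) & 0xF, "undocumented"),
    }
```

For example, a value of 0x0402 decodes to an 8bpp image with Riven primary compression, no secondary compression, and the palette flag clear.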

Palette

While bit 3 should represent whether a palette exists or not, all 8bpp Riven images have a palette regardless. The palette starts with a short header:

unsigned short table_size
byte bits_per_color
byte color_count
  • table_size is the size of the table in bytes, including the four byte header.
  • bits_per_color is the number of bits in each color (seems to be always 24)
  • color_count is the number of colors that the table contains (seems to be always 0xff)

Following the header are color_count color entries, each laid out as:

byte blue_component
byte green_component
byte red_component
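
A palette reader might look like the following Python sketch. The name parse_palette and its return shape are made up for illustration. It also makes one assumption that the text does not confirm: the number of entries is derived from table_size (minus the four header bytes, three bytes per entry) rather than from color_count, since color_count is only described as "seems to be" 0xff.

```python
import struct

def parse_palette(data, offset=0):
    """Return ([(red, green, blue), ...], offset just past the table)."""
    table_size, bits_per_color, color_count = struct.unpack_from(">HBB", data, offset)
    # Assumption: entry count comes from table_size, not from color_count.
    entries = (table_size - 4) // 3
    colors = []
    pos = offset + 4
    for _ in range(entries):
        # Entries are stored blue, green, red; reorder to RGB for convenience.
        blue, green, red = data[pos], data[pos + 1], data[pos + 2]
        colors.append((red, green, blue))
        pos += 3
    return colors, offset + table_size
```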

Image

The rest of the tBMP resource is made up of the image. To decode the image, you must first apply the primary decompression and then the secondary decompression to get the correct output image.

Primary Decompression

If the image has no primary compression you can skip this step.

LZ Decompression

The stream consists of two parts, the header and the compressed data.

LZ Header
unsigned long uncompressed_size
unsigned long compressed_size
unsigned short dictionary_size
  • uncompressed_size is the size of the data after decompressing.
  • compressed_size is the size of the data before decompressing.
  • dictionary_size is the size of the ring buffer to use in the decompressor. However, it is always 0x400; if it is not 0x400, the data should be rejected.
LZ Compression

Thanks to Petroff Heroj and Ron Hayter for working out this compression.

Until the end of the resource, each run begins with a byte. Each bit of this byte defines what to do next, starting from the least significant one.

  • A 1 means an absolute byte follows. Read a byte from the compressed data and store it directly into the uncompressed buffer.
  • A 0 means a length/offset pair follows. Read two bytes b1 and b2 from the compressed data. The most significant 6 bits of b1 hold the length of the run minus 3. The 2 least significant bits of b1 together with the whole of b2 form a 10-bit value equal to the ring buffer offset minus 0x42, so add 0x42 to recover the offset. If the result is 0x400 or more, subtract 0x400, i.e. loop around to the beginning of the ring buffer. Then copy length bytes from the ring buffer, starting at offset, to the uncompressed buffer.

The ring-buffer should be initialized to all zeroes. Remember to store the uncompressed bytes in the ring buffer as well, looping to the beginning after 0x400 bytes.

For example:

0xf7                     // decoder byte (11110111b)
0x87                     // absolute byte
0x73                     // absolute byte
0x27                     // absolute byte
0x0b                     // byte 1 of the run data: length = 2 + 3 = 5 (first 6 bits + 3)
0xa9                     // byte 2 of the run data: offset = 0x3a9 + 0x42 = 0x3eb
0x27                     // absolute byte
0x32                     // absolute byte
0x00                     // absolute byte
0x4e                     // absolute byte
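
The LZ scheme described above can be sketched in Python as follows. This is a minimal, unoptimized illustration rather than a reference decoder. The function name lz_decompress is invented, the ring buffer write position is assumed to start at 0, and the loop is assumed to stop once uncompressed_size bytes have been produced or the input runs out; compressed_size is read but not used.

```python
import struct

def lz_decompress(data):
    """Decode an LZ stream that starts with the 10-byte LZ header."""
    uncompressed_size, compressed_size, dictionary_size = struct.unpack_from(">IIH", data, 0)
    if dictionary_size != 0x400:
        raise ValueError("unexpected dictionary size")
    ring = bytearray(0x400)  # ring buffer, initialized to all zeroes
    ring_pos = 0             # assumption: writing starts at index 0
    out = bytearray()
    pos = 10

    def emit(byte):
        # Every output byte is also stored in the ring buffer.
        nonlocal ring_pos
        out.append(byte)
        ring[ring_pos] = byte
        ring_pos = (ring_pos + 1) & 0x3FF

    while pos < len(data) and len(out) < uncompressed_size:
        flags = data[pos]
        pos += 1
        for bit in range(8):  # least significant bit first
            if pos >= len(data) or len(out) >= uncompressed_size:
                break
            if flags & (1 << bit):
                emit(data[pos])  # absolute byte
                pos += 1
            else:
                if pos + 1 >= len(data):
                    raise ValueError("truncated length/offset pair")
                b1, b2 = data[pos], data[pos + 1]
                pos += 2
                length = (b1 >> 2) + 3
                offset = ((((b1 & 0x3) << 8) | b2) + 0x42) & 0x3FF
                for i in range(length):
                    emit(ring[(offset + i) & 0x3FF])
    return bytes(out[:uncompressed_size])
```

Fed the ten example bytes above (behind a header declaring 12 uncompressed bytes), this sketch yields 87 73 27, then five zero bytes copied from the still-empty part of the ring buffer, then 27 32 00 4e.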

Riven Decompression

In compressed tBMP bitmaps, pixels are encoded as a data stream made of variable length commands. Pixels are always decoded in duplets: each command generates at least 2 pixels. The encoding is heavily based on what comes before each command, so even a little decoding bug can cripple the whole image. The commands can appear in any order inside the data stream. The first 4 bytes of the data stream are unknown and can be ignored. Many thanks to Arthur Muller for his precious help in decoding this format.

As with the uncompressed format, duplets are sometimes generated beyond the edge of the image. Use the bytes per row value to work out how many duplets each row holds.

Main Commands

They are all 1-byte commands, followed by a variable number of arguments.

Command Action
0x00 End of stream: when reaching it, the decoding is complete. No additional bytes follow. I think some bitmaps don't have this, so just stop when you have decoded enough pixels to fill the image.
0x01-0x3f Output n pixel duplets, where n is the command value itself. Pixel data comes immediately after the command as 2*n bytes representing direct indices in the 8-bit color table.
0x40-0x7f Repeat last 2 pixels n times, where n = command_value & 0x3F. No additional bytes follow.
0x80-0xbf Repeat last 4 pixels n times, where n = command_value & 0x3F. No additional bytes follow.
0xc0-0xff Beginning of a subcommand stream. This is like the main command stream, but contains another set of commands which are somewhat more specific and a bit more complex. This command says that command_value & 0x3F subcommands will follow. It doesn't generate pixels itself.
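
To make the main command table concrete, here is a minimal Python sketch of the outer decoding loop. The names decode_riven_main, pixel_count and run_subcommand are hypothetical. Subcommands are not decoded here: they are delegated to a caller-supplied function that takes the stream, the current position and the output buffer, appends its pixels, and returns the new position. pixel_count is assumed to be bytes per row times bitmap height.

```python
def decode_riven_main(stream, pixel_count, run_subcommand=None):
    """Decode the main command stream into a flat buffer of palette indices."""
    out = bytearray()
    pos = 4  # the first 4 bytes of the data stream are unknown and skipped
    while pos < len(stream) and len(out) < pixel_count:
        cmd = stream[pos]
        pos += 1
        n = cmd & 0x3F
        if cmd == 0x00:
            break  # end of stream
        elif cmd < 0x40:
            # n duplets follow as 2*n direct color table indices
            out += stream[pos:pos + 2 * n]
            pos += 2 * n
        elif cmd < 0x80:
            out += out[-2:] * n  # repeat last 2 pixels n times
        elif cmd < 0xC0:
            out += out[-4:] * n  # repeat last 4 pixels n times
        else:
            # n subcommands follow; each one is handled by the hook
            if run_subcommand is None:
                raise NotImplementedError("subcommand stream encountered")
            for _ in range(n):
                pos = run_subcommand(stream, pos, out)
    return bytes(out[:pixel_count])
```

The loop also stops once enough pixels have been produced, which covers bitmaps that lack the 0x00 terminator.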
Subcommands, part 1: arithmetic operations

Subcommands are not simply 1-byte values, but are somewhat mixed with their arguments, so the full byte pattern is reported.

Command Byte pattern Action
0x01-0x0f 0000mmmm Repeat duplet at relative position -m, where m is given in duplets. So if m=1, repeat the last duplet.
0x10 0x10 p Repeat last duplet, but change second pixel to p.
0x11-0x1f 0001mmmm Output the first pixel of last duplet, then pixel at relative position -m. m is given in pixels.
0x20-0x2f 0010xxxx Repeat last duplet, but add x to second pixel.
0x30-0x3f 0011xxxx Repeat last duplet, but subtract x from second pixel.
0x40 0x40 p Repeat last duplet, but change first pixel to p.
0x41-0x4f 0100mmmm Output pixel at relative position -m, then second pixel of last duplet.
0x50 0x50 p1 p2 Output two absolute pixel values, p1 and p2.
0x51-0x57 01010mmm p Output pixel at relative position -m, then absolute pixel value p.
0x59-0x5f 01011mmm p Output absolute pixel value p, then pixel at relative position -m.
0x60-0x6f 0110xxxx p Output absolute pixel value p, then (second pixel of last duplet) + x.
0x70-0x7f 0111xxxx p Output absolute pixel value p, then (second pixel of last duplet) - x.
0x80-0x8f 1000xxxx Repeat last duplet adding x to the first pixel.
0x90-0x9f 1001xxxx p Output (first pixel of last duplet) + x, then absolute pixel value p.
0xa0 0xa0 xxxxyyyy Repeat last duplet, adding x to the first pixel and y to the second.
0xb0 0xb0 xxxxyyyy Repeat last duplet, adding x to the first pixel and subtracting y from the second.
0xc0-0xcf 1100xxxx Repeat last duplet subtracting x from first pixel.
0xd0-0xdf 1101xxxx p Output (first pixel of last duplet) - x, then absolute pixel value p.
0xe0 0xe0 xxxxyyyy Repeat last duplet, subtracting x from first pixel and adding y to second.
0xf0 and 0xff 0xfx xxxxyyyy Repeat last duplet, subtracting x from first pixel and y from second.
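
The following Python sketch covers the arithmetic subcommands that operate on the last duplet or on absolute values. It is deliberately partial: the subcommands that fetch a single pixel at relative position -m (0x11-0x1f, 0x41-0x4f, 0x51-0x57, 0x59-0x5f) are left out, because the table does not say whether -m is counted before or after the first pixel of the new duplet is written. The function name is invented, and two further assumptions are made: pixel arithmetic wraps around at 8 bits, and a missing last duplet is treated as (0, 0).

```python
def run_arith_subcommand(stream, pos, out):
    """Decode one arithmetic subcommand, append one duplet, return new pos."""
    cmd = stream[pos]
    pos += 1
    hi, lo = cmd >> 4, cmd & 0x0F
    # Last duplet; assumption: (0, 0) if nothing has been decoded yet.
    a, b = (out[-2], out[-1]) if len(out) >= 2 else (0, 0)

    def next_byte():
        nonlocal pos
        value = stream[pos]
        pos += 1
        return value

    def nibbles():
        xy = next_byte()
        return xy >> 4, xy & 0x0F

    if hi == 0x0 and lo:                  # 0x01-0x0f: duplet at -m duplets
        first, second = out[-2 * lo], out[-2 * lo + 1]
    elif cmd == 0x10:
        first, second = a, next_byte()
    elif hi == 0x2:
        first, second = a, b + lo
    elif hi == 0x3:
        first, second = a, b - lo
    elif cmd == 0x40:
        first, second = next_byte(), b
    elif cmd == 0x50:
        first = next_byte()
        second = next_byte()
    elif hi == 0x6:
        first, second = next_byte(), b + lo
    elif hi == 0x7:
        first, second = next_byte(), b - lo
    elif hi == 0x8:
        first, second = a + lo, b
    elif hi == 0x9:
        first, second = a + lo, next_byte()
    elif cmd == 0xA0:
        x, y = nibbles()
        first, second = a + x, b + y
    elif cmd == 0xB0:
        x, y = nibbles()
        first, second = a + x, b - y
    elif hi == 0xC:
        first, second = a - lo, b
    elif hi == 0xD:
        first, second = a - lo, next_byte()
    elif cmd == 0xE0:
        x, y = nibbles()
        first, second = a - x, b + y
    elif cmd in (0xF0, 0xFF):
        x, y = nibbles()
        first, second = a - x, b - y
    else:
        raise NotImplementedError("subcommand 0x%02x not covered by this sketch" % cmd)
    out.append(first & 0xFF)   # assumption: 8-bit wraparound
    out.append(second & 0xFF)
    return pos
```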
Subcommands, part 2: repeat operations

Sometimes these repeat commands will try to copy more than what is available when the command is read. In those cases, the command repeats the available segment of data until the number of duplets needed is copied. Or, equivalently, the command starts copying data that it wrote earlier.

Command Byte pattern Action
various 1x1xxxmm mmmmmmmm Repeat n duplets from relative position -m (given in pixels, not duplets). If r is 0, another byte follows and the last pixel is set to that value. n and r depend on the command and come from the table below.
0xfc 0xfc nnnnnrmm mmmmmmmm (p) Repeat n+2 duplets from relative position -m (given in pixels, not duplets). If r is 0, another byte p follows and the last pixel is set to absolute value p.

Values of n and r for the "various" commands:

Command n r
0xa4 - 0xa7 2 0
0xa8 - 0xab 2 1
0xac - 0xaf 3 0
0xb4 - 0xb7 3 1
0xb8 - 0xbb 4 0
0xbc - 0xbf 4 1
0xe4 - 0xe7 5 0
0xe8 - 0xeb 5 1
0xec - 0xef 6 0
0xf4 - 0xf7 6 1
0xf8 - 0xfb 7 0
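
A possible Python sketch of the repeat subcommands follows. The names REPEAT_TABLE and run_repeat_subcommand are invented for illustration. It assumes the low two bits of the command byte (or of the byte after 0xfc) are the high bits of m, and that when r is 0 the final pixel is read from the stream instead of being copied. Copying one pixel at a time from position -m gives the self-overlapping behaviour described above for free.

```python
# (n, r) per command group, keyed by the command with its two m bits cleared.
REPEAT_TABLE = {
    0xA4: (2, 0), 0xA8: (2, 1), 0xAC: (3, 0), 0xB4: (3, 1),
    0xB8: (4, 0), 0xBC: (4, 1), 0xE4: (5, 0), 0xE8: (5, 1),
    0xEC: (6, 0), 0xF4: (6, 1), 0xF8: (7, 0),
}

def run_repeat_subcommand(stream, pos, out):
    """Decode one repeat subcommand, append n duplets, return new pos."""
    cmd = stream[pos]
    if cmd == 0xFC:
        b1, b2 = stream[pos + 1], stream[pos + 2]
        pos += 3
        n = (b1 >> 3) + 2          # nnnnn, plus 2
        r = (b1 >> 2) & 1
        m = ((b1 & 0x3) << 8) | b2
    elif (cmd & 0xFC) in REPEAT_TABLE:
        n, r = REPEAT_TABLE[cmd & 0xFC]
        m = ((cmd & 0x3) << 8) | stream[pos + 1]
        pos += 2
    else:
        raise ValueError("0x%02x is not a repeat subcommand" % cmd)
    if m == 0 or m > len(out):
        raise ValueError("relative position outside decoded data")
    # With r == 0 the last pixel comes from the stream, so copy one fewer.
    copies = 2 * n - (1 if r == 0 else 0)
    for _ in range(copies):
        out.append(out[-m])  # source and destination advance together
    if r == 0:
        out.append(stream[pos])
        pos += 1
    return pos
```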

Secondary Decompression

In all Riven images, and some other images, there is no secondary compression. Instead, the remaining data is just the pixels. For 24bpp images, the pixels are in BGR order. For 8bpp images, each row holds bytes per row pixels, so you will have to cut off the extra bytes at the end of each row (bytes per row - bitmap width of them).
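
Trimming the row padding can be sketched in Python as below. The function name crop_rows and the bytes_per_pixel parameter are made up for this example; for 24bpp data the sketch assumes that bytes per row is still measured in bytes, so bytes_per_pixel would be 3.

```python
def crop_rows(raw, width, height, bytes_per_row, bytes_per_pixel=1):
    """Split raw pixel data into rows, dropping the padding after each row."""
    rows = []
    for y in range(height):
        start = y * bytes_per_row
        # Keep only the visible part of the row; the rest is padding.
        rows.append(bytes(raw[start:start + width * bytes_per_pixel]))
    return rows
```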

However, many non-Riven images use the RLE8 compression.

RLE8 Compression

The RLE8 compression is a rather simple form of RLE. The decoder works by decoding one row at a time, so there will be bitmap height chunks of data. Each chunk always decodes one row of data.

Per Row:

unsigned short byte_count
  • byte_count is the number of bytes that will be used to decode the current row.

The byte_count should be ignored until later. Until you have completed a row's length of pixels, you must continue decompressing the RLE data.

Each RLE command starts with a byte. The high bit of this byte selects between repeating a pixel and outputting direct pixels. The bottom 7 bits are the run_length minus one.

If the high bit is set, read in another byte which represents the pixel to repeat and then output the pixel run_length times. If the high bit is not set, output run_length absolute pixels from the input stream.

Once you have output a full row of pixels, you should seek to byte_count bytes past the start of the row in the compressed stream before decoding the next row.
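
The row-by-row procedure above might be sketched in Python like this. The function name rle8_decode is invented. Two points are assumptions rather than facts from the text: a row is treated as complete after bitmap width pixels, and byte_count is measured from the byte just after the byte_count field itself.

```python
import struct

def rle8_decode(data, width, height):
    """Decode RLE8 data into a list of rows of palette indices."""
    rows = []
    pos = 0
    for _ in range(height):
        (byte_count,) = struct.unpack_from(">H", data, pos)
        row_start = pos + 2  # assumption: byte_count excludes its own 2 bytes
        pos = row_start
        row = bytearray()
        while len(row) < width:
            cmd = data[pos]
            pos += 1
            run_length = (cmd & 0x7F) + 1
            if cmd & 0x80:
                # High bit set: repeat the next byte run_length times
                row += bytes([data[pos]]) * run_length
                pos += 1
            else:
                # High bit clear: copy run_length absolute pixels
                row += data[pos:pos + run_length]
                pos += run_length
        rows.append(bytes(row[:width]))
        # Jump to the next row using byte_count, ignoring leftover bytes
        pos = row_start + byte_count
    return rows
```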