All observed Mohawk games use a common bitmap format, except Myst (which uses WDIB and PICT images). Mohawk bitmaps are always stored with a tBMP tag. All values are in big-endian order. tBMP resources are divided into three chunks: the header, the palette, and the image. Note that the palette is not always present (for example, in non-Riven 8bpp images and Riven 24bpp images).
|unsigned short||bitmap width|
|unsigned short||bitmap height|
|unsigned short||bytes per row|
|unsigned short||compression details|
- bitmap width is the width of the bitmap in pixels.
- bitmap height is the height of the bitmap in pixels.
- bytes per row is the pitch of the image (bytes in a scan line).
- For compression details, see the next section.
Note: Only the bottom 10 bits of bitmap width and bitmap height are valid (val & 0x3ff, in C). The same applies to bytes per row, which must also remain even (val & 0x3fe, in C).
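The masking described above can be sketched in Python as follows. This is a minimal sketch, assuming the header consists of exactly the four fields listed, stored back to back at the start of the resource; the function name `parse_tbmp_header` and the dictionary keys are illustrative and not taken from any game code.

```python
import struct


def parse_tbmp_header(data: bytes) -> dict:
    """Read four big-endian unsigned shorts and mask them as described."""
    # Assumption: the header starts at offset 0 of the tBMP resource.
    width, height, bytes_per_row, compression = struct.unpack_from(">4H", data, 0)
    return {
        "width": width & 0x3FF,                   # only the bottom 10 bits are valid
        "height": height & 0x3FF,                 # only the bottom 10 bits are valid
        "bytes_per_row": bytes_per_row & 0x3FE,   # bottom 10 bits, kept even
        "compression": compression,               # decoded separately
    }
```

The compression details value is returned unmasked so that its bit fields can be examined separately.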
compression details is split into several bit fields, each with its own purpose.
Bits 0-2 (Bits Per Pixel)
The lower three bits represent the image's bit depth, chosen from this table.
|Value||Bits Per Pixel|
Other values are undefined.
Bit 3 (Palette)
Bit 3 indicates whether a palette is present. However, this is very unreliable, as Riven never sets this bit. If the game is Riven and the bit depth is 8, you should assume there is a palette.
Bits 4-7 (Secondary Compression)
Bits 4-7 represent the secondary compression on the image.
|3||Another (still unknown) RLE variant|
For none, the image is raw pixels. However, each row occupies bytes per row bytes, so you have to skip the remaining bytes at the end of each row. Riven images never use secondary compression. By contrast, almost all images in other games use RLE8.
Bits 8-11 (Primary Compression)
Bits 8-11 represent the primary compression on the image.
|2||Another (still unknown) LZ variant|
For none, there is nothing to do at this stage. See details later for the LZ and Riven compression schemes.
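As an illustration, the bit fields of compression details described in this section can be separated with shifts and masks. This is only a sketch: the function name and key names are hypothetical, and the numeric codes are returned as-is rather than interpreted.

```python
def split_compression_details(value: int) -> dict:
    """Split the 16-bit compression details value into its bit fields."""
    return {
        "bpp_code": value & 0x7,             # bits 0-2: bits-per-pixel code
        "palette_flag": bool(value & 0x8),   # bit 3: unreliable, never set in Riven
        "secondary": (value >> 4) & 0xF,     # bits 4-7: secondary compression
        "primary": (value >> 8) & 0xF,       # bits 8-11: primary compression
    }
```

Because bit 3 is unreliable, a decoder would still need game-specific logic to decide whether a palette follows the header.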
While bit 3 should represent whether a palette exists or not, all 8bpp Riven images have a palette regardless. The palette starts with a short header:
- table_size is the size of the table, including the four-byte header.
- bits_per_color is the number of bits in each color (seems to always be 24)
- color_count is the number of colors that the table contains (seems to always be 0xff)
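One possible way to read this header in Python is sketched below. The field widths are an assumption: the header is four bytes long and table_size has to be able to exceed 255, so the sketch reads a 16-bit big-endian table_size followed by two single-byte fields. The function name `parse_palette_header` is hypothetical.

```python
import struct


def parse_palette_header(data: bytes, offset: int = 0) -> dict:
    """Read the four-byte palette header found at `offset`."""
    # Assumed layout: unsigned short, then two unsigned bytes.
    table_size, bits_per_color, color_count = struct.unpack_from(">HBB", data, offset)
    return {
        "table_size": table_size,          # includes the four-byte header
        "bits_per_color": bits_per_color,  # seems to always be 24
        "color_count": color_count,        # seems to always be 0xff
    }
```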
Following the header are color_count blocks:
The rest of the tBMP resource is made up of the image. To decode the image, you must first use the primary decompression and then the secondary decompression to have the correct output image.
If the image has no primary compression, you can skip this step.
The stream consists of two parts, the header and the compressed data.
- uncompressed_size is the size of the data after decompressing.
- compressed_size is the size of the data before decompressing.
- dictionary_size is the size of the ring buffer to use in the decompressor. It is always 0x400; a stream with any other value should be rejected.
Thanks to Petroff Heroj and Ron Hayter for working out this compression.
Until the end of the resource, each run begins with a byte. Each bit of this byte defines what to do next, starting from the least significant one.
- A 1 means an absolute byte follows. Read a byte from the compressed data and store it directly into the uncompressed buffer.
- A 0 means a length/offset pair follows. Read two bytes b1 and b2 from the compressed data. The most significant 6 bits of b1 hold the length of the run minus 3. The 2 least significant bits of b1 and the whole of b2 together form a 10-bit value equal to the offset into the ring buffer minus 0x42. Copy length bytes from the ring buffer, starting at offset, to the uncompressed buffer. If the offset is 0x400 or more after adding the 0x42, subtract 0x400, i.e. loop around to the beginning of the ring buffer.
The ring buffer should be initialized to all zeroes. Remember to store the uncompressed bytes in the ring buffer as well, looping to the beginning after 0x400 bytes.
0xf7 // decoder byte (11110111b)
0x87 // absolute byte
0x73 // absolute byte
0x27 // absolute byte
0x0b // byte 1 of the run data: length = 2 + 3 = 5 (first 6 bits + 3)
0xa9 // byte 2 of the run data: offset = 0x3a9 + 0x42 = 0x3eb
0x27 // absolute byte
0x32 // absolute byte
0x00 // absolute byte
0x4e // absolute byte
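The decoding loop described above can be sketched in Python as follows. This is not a reference implementation: it takes only the compressed data (the bytes after the stream header), assumes that writes into the ring buffer start at offset 0, and stops early if an optional `uncompressed_size` is reached. The name `lz_decompress` is illustrative.

```python
RING_SIZE = 0x400


def lz_decompress(data: bytes, uncompressed_size=None) -> bytes:
    """Decode the LZ stream body (the bytes after the stream header)."""
    ring = bytearray(RING_SIZE)  # initialized to all zeroes
    ring_pos = 0                 # assumption: writing starts at ring offset 0
    out = bytearray()
    pos = 0

    def done() -> bool:
        return uncompressed_size is not None and len(out) >= uncompressed_size

    while pos < len(data) and not done():
        flags = data[pos]        # decoder byte, consumed from the LSB upwards
        pos += 1
        for bit in range(8):
            if pos >= len(data) or done():
                break
            if flags & (1 << bit):
                # 1: an absolute byte follows
                byte = data[pos]
                pos += 1
                out.append(byte)
                ring[ring_pos] = byte
                ring_pos = (ring_pos + 1) % RING_SIZE
            else:
                # 0: a length/offset pair follows
                if pos + 1 >= len(data):
                    pos = len(data)  # truncated pair: stop decoding
                    break
                b1, b2 = data[pos], data[pos + 1]
                pos += 2
                length = (b1 >> 2) + 3
                offset = ((((b1 & 0x03) << 8) | b2) + 0x42) % RING_SIZE
                for i in range(length):
                    byte = ring[(offset + i) % RING_SIZE]
                    out.append(byte)
                    ring[ring_pos] = byte  # copied bytes enter the ring too
                    ring_pos = (ring_pos + 1) % RING_SIZE

    if uncompressed_size is not None:
        del out[uncompressed_size:]
    return bytes(out)
```

Under these assumptions, feeding the ten example bytes above to the sketch yields three absolute bytes, five zero bytes copied from the still-empty part of the ring buffer, and then four more absolute bytes.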
In Riven-compressed tBMP bitmaps, pixels are encoded as a data stream made of variable-length commands. Pixels are always decoded in duplets: each command generates at least 2 pixels. The encoding depends heavily on what comes before each command, so even a small decoding bug can corrupt the whole image. The commands can appear in any order inside the data stream. The first 4 bytes of the data stream are unknown and can be ignored. Many thanks to Arthur Muller for his precious help in decoding this format.
As in the uncompressed format, duplets are sometimes generated beyond the edge of the image. Use the bytes per row value to see how many duplets are in each row.
The main commands are all 1 byte long and are followed by a variable number of arguments.
|0x00||End of stream: when reaching it, the decoding is complete. No additional bytes follow. I think some bitmaps don't have this, so just stop when you have decoded enough pixels to fill the image.|
|0x01-0x3f||Output n pixel duplets, where n is the command value itself. Pixel data comes immediately after the command as 2*n bytes representing direct indices in the 8-bit color table.|
|0x40-0x7f||Repeat last 2 pixels n times, where n = command_value & 0x3F. No additional bytes follow.|
|0x80-0xbf||Repeat last 4 pixels n times, where n = command_value & 0x3F. No additional bytes follow.|
|0xc0-0xff||Beginning of a subcommand stream. This is like the main command stream, but contains another set of commands which are somewhat more specific and a bit more complex. This command says that command_value & 0x3F subcommands will follow. It doesn't generate pixels itself.|
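The main command table above can be sketched as a small Python loop. This is a deliberately partial sketch: it handles only commands 0x00-0xbf and raises an error on a subcommand stream (0xc0-0xff). The function name and the `pixel_count` parameter (the number of pixels to produce, for example bytes per row times bitmap height) are assumptions made for illustration.

```python
def decode_main_commands(stream: bytes, pixel_count: int) -> bytes:
    """Decode main commands 0x00-0xbf; subcommand streams are not handled."""
    out = bytearray()
    pos = 4  # the first 4 bytes of the data stream are unknown and skipped
    while pos < len(stream) and len(out) < pixel_count:
        cmd = stream[pos]
        pos += 1
        if cmd == 0x00:
            break                                   # end of stream
        elif cmd <= 0x3F:
            out += stream[pos:pos + 2 * cmd]        # n duplets of direct indices
            pos += 2 * cmd
        elif cmd <= 0x7F:
            out += bytes(out[-2:]) * (cmd & 0x3F)   # repeat last 2 pixels n times
        elif cmd <= 0xBF:
            out += bytes(out[-4:]) * (cmd & 0x3F)   # repeat last 4 pixels n times
        else:
            raise NotImplementedError("subcommand streams are outside this sketch")
    return bytes(out[:pixel_count])
```

Stopping once `pixel_count` pixels exist covers bitmaps that lack the 0x00 terminator.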
Subcommands, part 1: arithmetic operations
Subcommands are not simply 1-byte values, but are somewhat mixed with their arguments, so the full byte pattern is reported.
|0x01-0x0f||0000mmmm||Repeat duplet at relative position -m, where m is given in duplets. So if m=1, repeat the last duplet.|
|0x10||0x10 p||Repeat last duplet, but change second pixel to p.|
|0x11-0x1f||0001mmmm||Output the first pixel of last duplet, then pixel at relative position -m. m is given in pixels.|
|0x20-0x2f||0010xxxx||Repeat last duplet, but add x to second pixel.|
|0x30-0x3f||0011xxxx||Repeat last duplet, but subtract x from second pixel.|
|0x40||0x40 p||Repeat last duplet, but change first pixel to p.|
|0x41-0x4f||0100mmmm||Output pixel at relative position -m, then second pixel of last duplet.|
|0x50||0x50 p1 p2||Output two absolute pixel values, p1 and p2.|
|0x51-0x57||01010mmm p||Output pixel at relative position -m, then absolute pixel value p.|
|0x59-0x5f||01011mmm p||Output absolute pixel value p, then pixel at relative position -m.|
|0x60-0x6f||0110xxxx p||Output absolute pixel value p, then (second pixel of last duplet) + x.|
|0x70-0x7f||0111xxxx p||Output absolute pixel value p, then (second pixel of last duplet) - x.|
|0x80-0x8f||1000xxxx||Repeat last duplet adding x to the first pixel.|
|0x90-0x9f||1001xxxx p||Output (first pixel of last duplet) + x, then absolute pixel value p.|
|0xa0||0xa0 xxxxyyyy||Repeat last duplet, adding x to the first pixel and y to the second.|
|0xb0||0xb0 xxxxyyyy||Repeat last duplet, adding x to the first pixel and subtracting y from the second.|
|0xc0-0xcf||1100xxxx||Repeat last duplet subtracting x from first pixel.|
|0xd0-0xdf||1101xxxx p||Output (first pixel of last duplet) - x, then absolute pixel value p.|
|0xe0||0xe0 xxxxyyyy||Repeat last duplet, subtracting x from first pixel and adding y to second.|
|0xf0 and 0xff||0xfx xxxxyyyy||Repeat last duplet, subtracting x from first pixel and y from second.|
Subcommands, part 2: repeat operations
Sometimes these repeat commands will try to copy more than what is available when the command is read. In those cases, the command repeats the available segment of data until the number of duplets needed is copied. Or, equivalently, the command starts copying data that it wrote earlier.
Repeat n duplets from relative position -m (given in pixels, not duplets). If r is 0, another byte follows and the last pixel is set to that value. n and r come from the table on the right.
|0xfc||0xfc nnnnnrmm mmmmmmmm (p)||Repeat n+2 duplets from relative position -m (given in pixels, not duplets). If r is 0, another byte p follows and the last pixel is set to absolute value p.|
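The overlapping copy and the 0xfc bit layout can be illustrated with the following Python sketch. Copying one pixel at a time means that data written earlier by the same command is picked up automatically. The function names are hypothetical, `args` is assumed to hold the bytes that follow the 0xfc command byte, and overwriting the last pixel after the copy is this sketch's own way of applying p.

```python
def repeat_from_history(out: bytearray, m: int, pixel_total: int) -> None:
    """Append pixel_total pixels, each copied from m pixels back."""
    if not 1 <= m <= len(out):
        raise ValueError("relative position is outside the decoded data")
    for _ in range(pixel_total):
        out.append(out[-m])  # may re-read pixels written by this same loop


def apply_0xfc(out: bytearray, args: bytes) -> int:
    """Apply subcommand 0xfc; return how many argument bytes were used."""
    b1, b2 = args[0], args[1]
    n = b1 >> 3                    # nnnnn
    r = (b1 >> 2) & 1              # r
    m = ((b1 & 0x03) << 8) | b2    # mm mmmmmmmm, given in pixels
    repeat_from_history(out, m, 2 * (n + 2))
    if r == 0:
        out[-1] = args[2]          # last pixel becomes absolute value p
        return 3
    return 2
```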
In all Riven images, and some other images, there is no secondary compression; the remaining data is just the pixels. For 24bpp images, the pixels are in BGR order. For 8bpp images, each row contains bytes per row pixels, so you have to cut off the extra bytes (bytes per row - bitmap width) at the end of each row.
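Dropping the extra bytes from raw 8bpp data might look like the Python sketch below. The function name `crop_rows_8bpp` is hypothetical, and the sketch assumes the rows are stored one after another, each exactly bytes per row bytes long.

```python
def crop_rows_8bpp(raw: bytes, width: int, height: int, bytes_per_row: int) -> bytes:
    """Keep the first `width` pixels of every row and drop the rest."""
    rows = []
    for y in range(height):
        start = y * bytes_per_row
        rows.append(raw[start:start + width])
    return b"".join(rows)
```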
However, many non-Riven images use the RLE8 compression.
The RLE8 compression is a rather simple form of RLE. The decoder works by decoding one row at a time, so there will be bitmap height chunks of data. Each chunk always decodes one row of data.
- byte_count is the number of bytes that will be used to decode the current row.
The byte_count should be ignored until later. Until you have completed a row's length of pixels, you must continue decompressing the RLE data.
Each RLE command starts with a byte. The high bit of this byte selects whether to repeat a pixel or output direct pixels. The bottom 7 bits are the run_length minus one.
If the high bit is set, read in another byte which represents the pixel to repeat and then output the pixel run_length times. If the high bit is not set, output run_length absolute pixels from the input stream.
Once you have output the correct number of pixels, seek to byte_count bytes past the start of the row in the compressed stream.
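The row-by-row procedure above can be sketched in Python as follows. Several details are assumptions rather than facts from this page: byte_count is read as a 16-bit big-endian value, it is measured from the byte just after that field, and a row is considered complete after bitmap width pixels. The function name `decode_rle8` is illustrative, and truncated input is not handled.

```python
import struct


def decode_rle8(data: bytes, width: int, height: int) -> bytes:
    """Decode `height` RLE8 rows of `width` pixels each."""
    out = bytearray()
    pos = 0
    for _ in range(height):
        # Assumption: byte_count is a 16-bit big-endian value.
        (byte_count,) = struct.unpack_from(">H", data, pos)
        pos += 2
        row_start = pos  # assumption: byte_count counts from here
        row = bytearray()
        while len(row) < width:
            code = data[pos]
            pos += 1
            run_length = (code & 0x7F) + 1
            if code & 0x80:
                # high bit set: repeat the next byte run_length times
                row += bytes([data[pos]]) * run_length
                pos += 1
            else:
                # high bit clear: copy run_length absolute pixels
                row += data[pos:pos + run_length]
                pos += run_length
        out += row[:width]
        pos = row_start + byte_count  # jump to the next row's chunk
    return bytes(out)
```

Seeking by byte_count instead of trusting the decoded position keeps the rows aligned even when a chunk carries trailing bytes.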