https://insidethelink.ortiche.net/wiki/index.php?title=Special:NewPages&feed=atom&hideredirs=1&limit=50&offset=&namespace=0&username=&tagfilter=
A look inside The Link @ wiki - New pages [en]
2024-03-29T08:46:01Z
From A look inside The Link @ wiki
MediaWiki 1.23.15
https://insidethelink.ortiche.net/wiki/index.php/Old_Mohawk_archive_format
Old Mohawk archive format
2009-09-10T17:53:48Z
<p>Clone2727: minor cleanup</p>
<hr />
<div>Some of the very old Mohawk games do not use exactly the [[Mohawk archive format|same format]] as the rest of the Mohawk games. However, it's really the same thing minus the MHWK headers and the resource table/file table merging. In fact, many of the games using the older format were just ported directly into Mohawk files later with only minor changes.<br />
<br />
The files have the extension <tt>.ibm</tt> or <tt>.mac</tt>, being PC and Mac versions, respectively. The biggest difference between the two is endianness, but there are some other minor differences as well. Throughout this, I will use the same wording as the [[Mohawk archive format|regular archive format]] to show where the differences are. The Name Table is always missing here, but it looks like they left room for it, hence the u0 in each type table entry.<br />
<br />
= Windows Format (.ibm) =<br />
All values are in little-endian format, including the FourCCs. Read each resource type as a little-endian 32-bit value so that it matches the expected tag.<br />
<br />
== Header ==<br />
{| class="structure"<br />
|unsigned long||absolute offset of the Resource Dir<br />
|-<br />
|unsigned short||size of Resource Dir (analogous to the size of the File table)<br />
|}<br />
<br />
The offset of the resource directory seems to always be 6, i.e. the directory immediately follows this 6-byte header.<br />
<br />
== Type Table ==<br />
At the beginning of the Resource Dir is the Type table.<br />
<br />
{| class="structure"<br />
|unsigned short||number of resource types in this file<br />
|}<br />
<br />
Entry (one for each type):<br />
{| class="structure"<br />
|4 bytes||resource type ([[Other Games#BMAP|BMAP]], [[Other Games#WAV|WAV ]] (with a space), [[Other Games#VRSN|VRSN]] etc.)<br />
|-<br />
|unsigned short||offset in Resource Dir of the Resource Table for this type<br />
|-<br />
|unsigned short||u0<br />
|}<br />
<br />
The offset is the offset within the Resource Dir, not the absolute offset. u0 appears to always be 0.<br />
<br />
== Resource Table ==<br />
Header:<br />
{| class="structure"<br />
|unsigned short||number of resources for this type (number of table entries)<br />
|}<br />
<br />
Entry:<br />
{| class="structure"<br />
|unsigned short||resource ID<br />
|-<br />
|unsigned long||absolute offset of resource data block<br />
|-<br />
|unsigned short||resource data size<br />
|-<br />
|unsigned long||u0<br />
|}<br />
<br />
Because each entry stores the offset and size of its data directly, no separate file table is needed. u0 appears to always be 0.<br />
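<br />
The tables above could be walked roughly as follows. This is only a sketch based on the layout described here: the function name <tt>parse_ibm_archive</tt> and the returned structure are made up for illustration, and turning the little-endian FourCC value back into a readable tag is an assumption about how "matching up" is meant.<br />

```python
import struct


def parse_ibm_archive(data: bytes) -> dict:
    """Return {type_tag: [(resource_id, offset, size), ...]} for a .ibm file.

    Hypothetical helper; field layout follows the tables above.
    """
    # Header: unsigned long directory offset, unsigned short directory size.
    dir_offset, _dir_size = struct.unpack_from("<IH", data, 0)
    (type_count,) = struct.unpack_from("<H", data, dir_offset)

    resources = {}
    for i in range(type_count):
        # Each type entry is 8 bytes: tag, table offset, u0.
        entry_pos = dir_offset + 2 + i * 8
        tag_value, table_offset, _u0 = struct.unpack_from("<IHH", data, entry_pos)
        # Assumption: the tag is a little-endian 32-bit value, so render it
        # most significant byte first to get the readable FourCC.
        tag = tag_value.to_bytes(4, "big")

        # The table offset is relative to the start of the Resource Dir.
        table_pos = dir_offset + table_offset
        (res_count,) = struct.unpack_from("<H", data, table_pos)
        entries = []
        for j in range(res_count):
            # Each resource entry is 12 bytes: id, offset, size, u0.
            res_id, offset, size, _u0 = struct.unpack_from(
                "<HIHI", data, table_pos + 2 + j * 12
            )
            entries.append((res_id, offset, size))
        resources[tag] = entries
    return resources
```

The resource data for an entry is then simply <tt>data[offset:offset + size]</tt>, since the offsets are absolute.<br />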
<br />
= Macintosh Format (.mac) =<br />
All values are in Big Endian format.<br />
<br />
== Header ==<br />
{| class="structure"<br />
|unsigned long||absolute offset of the Resource Dir<br />
|-<br />
|unsigned short||size of Resource Dir (analogous to the size of the File table)<br />
|}<br />
<br />
The offset of the resource directory seems to always be 6, i.e. the directory immediately follows this 6-byte header.<br />
<br />
== Type Table ==<br />
At the beginning of the Resource Dir is the Type table.<br />
<br />
{| class="structure"<br />
|unsigned short||number of resource types in this file<br />
|}<br />
<br />
Entry (one for each type):<br />
{| class="structure"<br />
|4 bytes||resource type ([[Other Games#BMAP|BMAP]], [[Other Games#WAV|WAV ]] (with a space), [[Other Games#VRSN|VRSN]] etc.)<br />
|-<br />
|unsigned long||offset in Resource Dir of the Resource Table for this type<br />
|-<br />
|unsigned long||u0<br />
|}<br />
<br />
The offset is the offset within the Resource Dir, not the absolute offset. u0 appears to always be 0.<br />
<br />
== Resource Table ==<br />
Header:<br />
{| class="structure"<br />
|unsigned short||number of resources for this type (number of table entries)<br />
|}<br />
<br />
Entry:<br />
{| class="structure"<br />
|unsigned short||resource ID<br />
|-<br />
|unsigned long||absolute offset of resource data block<br />
|-<br />
|byte||resource data size, bits 23-16<br />
|-<br />
|unsigned short||resource data size, bits 15-0<br />
|-<br />
|unsigned long||u0<br />
|-<br />
|byte||u1<br />
|}<br />
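<br />
The 24-bit size split across a byte and a short is the main oddity of this entry. A minimal sketch of unpacking one entry, assuming the fields are packed back to back with no padding (the name <tt>parse_mac_resource_entry</tt> is hypothetical):<br />

```python
import struct

# 2 (id) + 4 (offset) + 1 (size high) + 2 (size low) + 4 (u0) + 1 (u1) bytes
MAC_ENTRY_SIZE = 14


def parse_mac_resource_entry(data: bytes, pos: int):
    """Return (resource_id, offset, size) for the entry starting at pos."""
    res_id, offset, size_hi, size_lo, _u0, _u1 = struct.unpack_from(
        ">HIBHIB", data, pos
    )
    # Bits 23-16 come from the single byte, bits 15-0 from the short.
    size = (size_hi << 16) | size_lo
    return res_id, offset, size
```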
<br />
Because each entry stores the offset and size of its data directly, no separate file table is needed. u0 and u1 appear to always be 0. u1 is probably for alignment.</div>Clone2727
https://insidethelink.ortiche.net/wiki/index.php/Mohawk_Bitmaps
Mohawk Bitmaps
2009-08-17T20:41:38Z
<p>Tahg: /* RLE8 Compression */ edited for clarity</p>
<hr />
<div>{{Myst}}<br />
{{Riven}}<br />
All observed Mohawk games use a common bitmap format, except Myst (which uses [[WDIB]] and [[Myst PICT resources|PICT]] images). Mohawk bitmaps are always stored with a tBMP tag. All values are in big endian order. tBMP resources are divided into three chunks: the header, the palette, and the image. Note that the palette is not always present (for example, it is absent in non-Riven 8bpp images and Riven 24bpp images).<br />
<br />
== Bitmap Header ==<br />
{| class="structure"<br />
|unsigned short||bitmap width<br />
|-<br />
|unsigned short||bitmap height<br />
|-<br />
|unsigned short||bytes per row<br />
|-<br />
|unsigned short||compression details<br />
|}<br />
<br />
* ''bitmap width'' is the width of the bitmap in pixels.<br />
* ''bitmap height'' is the height of the bitmap in pixels.<br />
* ''bytes per row'' is the pitch of the image (bytes in a scan line).<br />
* For ''compression details'', see the next section.<br />
<br />
'''Note''': Only the bottom 10 bits of ''bitmap width'' and ''bitmap height'' are valid (val & 0x3ff, in C). The same applies to ''bytes per row'', which must also remain even (val & 0x3fe, in C).<br />
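<br />
As a small illustration of the masks in the note, the header could be read like this. The function name <tt>parse_tbmp_header</tt> is made up; the field order and masks are the ones given above.<br />

```python
import struct


def parse_tbmp_header(data: bytes):
    """Return (width, height, bytes_per_row, compression_details)."""
    # Four big-endian unsigned shorts, in the order of the table above.
    width, height, bytes_per_row, compression = struct.unpack_from(">HHHH", data, 0)
    # Only the bottom 10 bits are valid; bytes per row must also stay even.
    return width & 0x3FF, height & 0x3FF, bytes_per_row & 0x3FE, compression
```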
<br />
=== Compression Details ===<br />
''compression details'' is split into several bit fields, each with its own purpose.<br />
<br />
==== Bits 0-2 (Bits Per Pixel) ====<br />
The lower three bits represent the image's bit depth. It is chosen from this table.<br />
<br />
{| border="1"<br />
|- style="background:silver"<br />
|Value||Bits Per Pixel<br />
|-<br />
|0||1<br />
|-<br />
|1||4<br />
|-<br />
|2||8<br />
|-<br />
|3||16<br />
|-<br />
|4||24<br />
|}<br />
<br />
Other values are undefined.<br />
<br />
==== Bit 3 (Palette) ====<br />
Bit 3 indicates whether a palette is present. However, this is very unreliable, as Riven never has this bit set. If the game is Riven, and the bit depth is 8, you should assume there is a palette.<br />
<br />
==== Bits 4-7 (Secondary Compression) ====<br />
Bits 4-7 represent the secondary compression on the image. <br />
<br />
{| border="1"<br />
|- style="background:silver"<br />
|Value||Secondary Compression<br />
|-<br />
|0||None<br />
|-<br />
|1||RLE8<br />
|-<br />
|3||Another (still unknown) RLE variant<br />
|}<br />
<br />
For none, the image is raw pixels. However, each row occupies ''bytes per row'' bytes, so you have to skip any remaining bytes at the end of each row. In Riven, there is never any secondary compression. In contrast, almost all other games' images use RLE8.<br />
<br />
==== Bits 8-11 (Primary Compression) ====<br />
Bits 8-11 represent the primary compression on the image.<br />
<br />
{| border="1"<br />
|- style="background:silver"<br />
|Value||Primary Compression<br />
|-<br />
|0||None<br />
|-<br />
|1||LZ<br />
|-<br />
|2||Another (still unknown) LZ variant<br />
|-<br />
|4||Riven<br />
|}<br />
<br />
For none, there is nothing to do at this stage. See details later for the LZ and Riven compression schemes.<br />
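<br />
Putting the four bit fields together, a decoder for the ''compression details'' value might look like the sketch below. The dictionary and function names are invented for illustration; values not listed in the tables above come back as <tt>None</tt>.<br />

```python
BITS_PER_PIXEL = {0: 1, 1: 4, 2: 8, 3: 16, 4: 24}
SECONDARY = {0: "none", 1: "RLE8", 3: "unknown RLE variant"}
PRIMARY = {0: "none", 1: "LZ", 2: "unknown LZ variant", 4: "Riven"}


def decode_compression_details(value: int) -> dict:
    """Split the 16-bit compression details field into its documented parts."""
    return {
        "bits_per_pixel": BITS_PER_PIXEL.get(value & 0x7),  # bits 0-2
        "palette_flag": bool(value & 0x8),                  # bit 3 (unreliable)
        "secondary": SECONDARY.get((value >> 4) & 0xF),     # bits 4-7
        "primary": PRIMARY.get((value >> 8) & 0xF),         # bits 8-11
    }
```

Because the palette flag is unreliable, a caller would still need to apply the Riven rule described above rather than trust <tt>palette_flag</tt> alone.<br />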
<br />
== Palette ==<br />
While bit 3 should represent whether a palette exists or not, all 8bpp Riven images have a palette regardless. The palette starts with a short header:<br />
<br />
{| class="structure"<br />
|unsigned short||table_size<br />
|-<br />
|byte||bits_per_color<br />
|-<br />
|byte||color_count<br />
|}<br />
<br />
* ''table_size'' is the size of the table in bytes, including the four byte header.<br />
* ''bits_per_color'' is the number of bits per color (seems to be always 24)<br />
* ''color_count'' is the number of colors that the table contains (seems to be always 0xff)<br />
<br />
Following the header is ''color_count'' blocks:<br />
<br />
{| class="structure"<br />
|byte||blue_component<br />
|-<br />
|byte||green_component<br />
|-<br />
|byte||red_component<br />
|}<br />
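<br />
A sketch of reading the palette exactly as laid out above: a four byte header followed by ''color_count'' blue/green/red triplets. The name <tt>parse_palette</tt> and the choice to return RGB tuples are assumptions made for the example, as is using ''table_size'' to find where the image data begins.<br />

```python
import struct


def parse_palette(data: bytes, pos: int = 0):
    """Return (colors, end_pos); colors is a list of (red, green, blue)."""
    table_size, _bits_per_color, color_count = struct.unpack_from(">HBB", data, pos)
    colors = []
    entry_pos = pos + 4
    for _ in range(color_count):
        # Components are stored in blue, green, red order.
        blue, green, red = data[entry_pos:entry_pos + 3]
        colors.append((red, green, blue))
        entry_pos += 3
    # Assumption: table_size (which includes the header) marks the end of the
    # palette, so the image data starts at pos + table_size.
    return colors, pos + table_size
```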
<br />
== Image ==<br />
<br />
The rest of the tBMP resource is made up of the image. To decode the image, you must first use the primary decompression and then the secondary decompression to have the correct output image.<br />
<br />
=== Primary Decompression ===<br />
<br />
If the image has no primary compression you can skip this step.<br />
<br />
==== LZ Decompression ====<br />
<br />
The stream consists of two parts, the header and the compressed data.<br />
<br />
===== LZ Header =====<br />
<br />
{| class="structure"<br />
|unsigned long||uncompressed_size<br />
|-<br />
|unsigned long||compressed_size<br />
|-<br />
|unsigned short||dictionary_size<br />
|}<br />
<br />
* ''uncompressed_size'' is the size of the data after decompressing.<br />
* ''compressed_size'' is the size of the data before decompressing.<br />
* ''dictionary_size'' is the size of the ring buffer to use in the decompressor. However, it is '''''always''''' 0x400, and if it is not 0x400, the data should be rejected.<br />
<br />
===== LZ Compression =====<br />
Thanks to Petroff Heroj and Ron Hayter for working out this compression.<br />
<br />
Until the end of the resource, each run begins with a byte. Each bit of this byte defines what to do next, starting from the least significant one.<br />
* A 1 means an absolute byte follows. Read a byte from the compressed data and store it directly into the uncompressed buffer.<br />
* A 0 means a length/offset pair follows. Read two bytes ''b1'' and ''b2'' from the compressed data. The most significant 6 bits of ''b1'' hold the length of the run minus 3. The 2 least significant bits of ''b1'' (as the high bits) and the whole of ''b2'' together form a 10-bit value, which is the offset into the ring buffer minus 0x42, so add 0x42 to get ''offset''. If the result is 0x400 or more, subtract 0x400, i.e. loop around to the beginning of the ring buffer. Then copy ''length'' bytes from the ring buffer, starting at ''offset'', to the uncompressed buffer.<br />
The ring-buffer should be initialized to all zeroes. Remember to store the uncompressed bytes in the ring buffer as well, looping to the beginning after 0x400 bytes.<br />
<br />
For example:<br />
<pre><br />
0xf7 // decoder byte (11110111b)<br />
0x87 // absolute byte<br />
0x73 // absolute byte<br />
0x27 // absolute byte<br />
0x0b // byte 1 of the run data: length = 2 + 3 = 5 (top 6 bits + 3)<br />
0xa9 // byte 2 of the run data: offset = 0x3a9 + 0x42 = 0x3eb (low 2 bits of byte 1, then byte 2)<br />
0x27 // absolute byte<br />
0x32 // absolute byte<br />
0x00 // absolute byte<br />
0x4e // absolute byte<br />
</pre><br />
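<br />
The scheme above can be sketched as a short decoder. This is not taken from any game code: the name <tt>lz_decompress</tt> is made up, the input is assumed to start after the LZ header, and the ring buffer write position is assumed to start at 0 (the text only says it loops after 0x400 bytes).<br />

```python
def lz_decompress(compressed: bytes, uncompressed_size: int) -> bytes:
    """Decode the LZ stream that follows the LZ header (sketch)."""
    ring = bytearray(0x400)  # ring buffer, initialized to all zeroes
    write_pos = 0            # assumption: writing starts at index 0
    out = bytearray()
    pos = 0

    def emit(byte: int) -> None:
        # Every output byte is also stored in the ring buffer.
        nonlocal write_pos
        out.append(byte)
        ring[write_pos] = byte
        write_pos = (write_pos + 1) & 0x3FF

    while pos < len(compressed) and len(out) < uncompressed_size:
        flags = compressed[pos]
        pos += 1
        for bit in range(8):  # least significant bit first
            if pos >= len(compressed) or len(out) >= uncompressed_size:
                break
            if flags & (1 << bit):
                emit(compressed[pos])  # absolute byte
                pos += 1
            else:
                if pos + 1 >= len(compressed):
                    break  # truncated pair, stop quietly
                b1, b2 = compressed[pos], compressed[pos + 1]
                pos += 2
                length = (b1 >> 2) + 3
                offset = ((((b1 & 0x03) << 8) | b2) + 0x42) & 0x3FF
                for i in range(length):
                    emit(ring[(offset + i) & 0x3FF])
    return bytes(out[:uncompressed_size])
```

Run on the ten example bytes above, this yields the three absolute bytes, five zero bytes copied from the still-empty ring buffer at 0x3eb, and then the last four absolute bytes.<br />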
<br />
==== Riven Decompression ====<br />
In compressed tBMP bitmaps, pixels are encoded as a data stream made of variable length commands. Pixels are always decoded in duplets: each command generates at least 2 pixels. The encoding is heavily based on what comes before each command, so even a little decoding bug can cripple the whole image. The commands can appear in any order inside the data stream. The first 4 bytes of the data stream are unknown and can be ignored. Many thanks to Arthur Muller for his precious help in decoding this format.<br />
<br />
Like the uncompressed format, sometimes duplets are generated beyond the edge of the image. Use the ''bytes per row'' value to see how many duplets are in each row.<br />
<br />
===== Main Commands =====<br />
<br />
They are all 1-byte commands, followed by a variable number of arguments.<br />
<br />
{| border=1 cellpadding=4 cellspacing=0 style="border:1px #000 solid;border-collapse:collapse;"<br />
|- style="background:#CCC"<br />
! Command !! Action<br />
|-<br />
|style="font-family:monospace"|0x00||End of stream: when reaching it, the decoding is complete. No additional bytes follow. I think some bitmaps don't have this, so just stop when you have decoded enough pixels to fill the image.<br />
|-<br />
|style="font-family:monospace"|0x01-0x3f||Output ''n'' pixel duplets, where ''n'' is the command value itself. Pixel data comes immediately after the command as 2*''n'' bytes representing direct indices in the 8-bit color table.<br />
|-<br />
|style="font-family:monospace"|0x40-0x7f||Repeat last 2 pixels ''n'' times, where ''n'' = ''command_value'' & 0x3F. No additional bytes follow.<br />
|-<br />
|style="font-family:monospace"|0x80-0xbf||Repeat last 4 pixels ''n'' times, where ''n'' = ''command_value'' & 0x3F. No additional bytes follow.<br />
|-<br />
|style="font-family:monospace"|0xc0-0xff||Beginning of a subcommand stream. This is like the main command stream, but contains another set of commands which are somewhat more specific and a bit more complex. This command says that ''command_value'' & 0x3F subcommands will follow. It doesn't generate pixels itself.<br />
|}<br />
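<br />
The main command loop in the table above might be sketched as follows. Only the main commands are handled; subcommand streams are left out and raise an error. The name <tt>decode_riven_main</tt> and the <tt>pixel_count</tt> parameter (the number of pixels needed to fill the image) are assumptions for the example.<br />

```python
def decode_riven_main(stream: bytes, pixel_count: int) -> bytearray:
    """Decode Riven main commands only (sketch, subcommands not handled)."""
    out = bytearray()
    pos = 4  # the first 4 bytes of the data stream are unknown and skipped
    while pos < len(stream) and len(out) < pixel_count:
        cmd = stream[pos]
        pos += 1
        n = cmd & 0x3F
        if cmd == 0x00:
            break  # end of stream
        elif cmd <= 0x3F:
            # n duplets of direct color indices follow (2*n bytes).
            out += stream[pos:pos + 2 * n]
            pos += 2 * n
        elif cmd <= 0x7F:
            out += out[-2:] * n  # repeat last 2 pixels n times
        elif cmd <= 0xBF:
            out += out[-4:] * n  # repeat last 4 pixels n times
        else:
            raise NotImplementedError("%d subcommands follow" % n)
    return out
```

The loop also stops once enough pixels have been produced, since some bitmaps may lack the 0x00 terminator.<br />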
<br />
===== Subcommands, part 1: arithmetic operations =====<br />
Subcommands are not simply 1-byte values, but are somewhat mixed with their arguments, so the full byte pattern is reported.<br />
{| border=1 cellpadding=4 cellspacing=0 style="border:1px #000 solid;border-collapse:collapse;"<br />
|- style="background:#CCC"<br />
! Command !! Byte pattern !! Action<br />
|-<br />
|style="font-family:monospace"|0x01-0x0f<br />
|style="font-family:monospace"|0000mmmm<br />
|Repeat duplet at relative position -''m'', where ''m'' is given in duplets. So if ''m''=1, repeat the last duplet.<br />
|-<br />
|style="font-family:monospace"|0x10<br />
|style="font-family:monospace"|0x10 p<br />
|Repeat last duplet, but change second pixel to ''p''.<br />
|-<br />
|style="font-family:monospace"|0x11-0x1f<br />
|style="font-family:monospace"|0001mmmm<br />
|Output the first pixel of last duplet, then pixel at relative position -''m''. ''m'' is given in pixels.<br />
|-<br />
|style="font-family:monospace"|0x20-0x2f<br />
|style="font-family:monospace"|0010xxxx<br />
|Repeat last duplet, but add ''x'' to second pixel.<br />
|-<br />
|style="font-family:monospace"|0x30-0x3f<br />
|style="font-family:monospace"|0011xxxx<br />
|Repeat last duplet, but subtract ''x'' from second pixel.<br />
|-<br />
|style="font-family:monospace"|0x40<br />
|style="font-family:monospace"|0x40 p<br />
|Repeat last duplet, but change first pixel to ''p''.<br />
|-<br />
|style="font-family:monospace"|0x41-0x4f<br />
|style="font-family:monospace"|0100mmmm<br />
|Output pixel at relative position -''m'', then second pixel of last duplet.<br />
|-<br />
|style="font-family:monospace"|0x50<br />
|style="font-family:monospace"|0x50 p1 p2<br />
|Output two absolute pixel values, ''p1'' and ''p2''.<br />
|-<br />
|style="font-family:monospace"|0x51-0x57<br />
|style="font-family:monospace"|01010mmm p<br />
|Output pixel at relative position -''m'', then absolute pixel value ''p''.<br />
|-<br />
|style="font-family:monospace"|0x59-0x5f<br />
|style="font-family:monospace"|01011mmm p<br />
|Output absolute pixel value ''p'', then pixel at relative position -''m''.<br />
|-<br />
|style="font-family:monospace"|0x60-0x6f<br />
|style="font-family:monospace"|0110xxxx p<br />
|Output absolute pixel value ''p'', then (second pixel of last duplet) + ''x''.<br />
|-<br />
|style="font-family:monospace"|0x70-0x7f<br />
|style="font-family:monospace"|0111xxxx p<br />
|Output absolute pixel value ''p'', then (second pixel of last duplet) - ''x''.<br />
|-<br />
|style="font-family:monospace"|0x80-0x8f<br />
|style="font-family:monospace"|1000xxxx<br />
|Repeat last duplet adding ''x'' to the first pixel.<br />
|-<br />
|style="font-family:monospace"|0x90-0x9f<br />
|style="font-family:monospace"|1001xxxx p<br />
|Output (first pixel of last duplet) + ''x'', then absolute pixel value ''p''.<br />
|-<br />
|style="font-family:monospace"|0xa0<br />
|style="font-family:monospace"|0xa0 xxxxyyyy<br />
|Repeat last duplet, adding ''x'' to the first pixel and ''y'' to the second.<br />
|-<br />
|style="font-family:monospace"|0xb0<br />
|style="font-family:monospace"|0xb0 xxxxyyyy<br />
|Repeat last duplet, adding ''x'' to the first pixel and subtracting ''y'' from the second.<br />
|-<br />
|style="font-family:monospace"|0xc0-0xcf<br />
|style="font-family:monospace"|1100xxxx<br />
|Repeat last duplet subtracting ''x'' from first pixel.<br />
|-<br />
|style="font-family:monospace"|0xd0-0xdf<br />
|style="font-family:monospace"|1101xxxx p<br />
|Output (first pixel of last duplet) - ''x'', then absolute pixel value ''p''.<br />
|-<br />
|style="font-family:monospace"|0xe0<br />
|style="font-family:monospace"|0xe0 xxxxyyyy<br />
|Repeat last duplet, subtracting ''x'' from first pixel and adding ''y'' to second.<br />
|-<br />
|style="font-family:monospace"|0xf0 and 0xff<br />
|style="font-family:monospace"|0xfx xxxxyyyy<br />
|Repeat last duplet, subtracting ''x'' from first pixel and ''y'' from second.<br />
|}<br />
<br />
===== Subcommands, part 2: repeat operations =====<br />
<br />
Sometimes these repeat commands will try to copy more than what is available when the command is read. In those cases, the command repeats the available segment of data until the number of duplets needed is copied. Or, equivalently, the command starts copying data that it wrote earlier.<br />
<br />
{| border=1 cellpadding=4 cellspacing=0 style="border:1px #000 solid;border-collapse:collapse;"<br />
|- style="background:#CCC"<br />
! Command !! Byte pattern !! Action<br />
|-<br />
|various<br />
|style="font-family:monospace"|1x1xxxmm mmmmmmmm<br />
|<br />
{| border=1 cellspacing=0 style="float:right;border:none;border-collapse:collapse;padding:0px 3px;"<br />
|- style="background:#CCC"<br />
! Command !! n !! r<br />
|-<br />
|style="font-family:monospace"|0xa4&nbsp;-&nbsp;0xa7||2||0<br />
|-<br />
|style="font-family:monospace"|0xa8 - 0xab||2||1<br />
|-<br />
|style="font-family:monospace"|0xac - 0xaf||3||0<br />
|-<br />
|style="font-family:monospace"|0xb4 - 0xb7||3||1<br />
|-<br />
|style="font-family:monospace"|0xb8 - 0xbb||4||0<br />
|-<br />
|style="font-family:monospace"|0xbc - 0xbf||4||1<br />
|-<br />
|style="font-family:monospace"|0xe4 - 0xe7||5||0<br />
|-<br />
|style="font-family:monospace"|0xe8 - 0xeb||5||1<br />
|-<br />
|style="font-family:monospace"|0xec - 0xef||6||0<br />
|-<br />
|style="font-family:monospace"|0xf4 - 0xf7||6||1<br />
|-<br />
|style="font-family:monospace"|0xf8 - 0xfb||7||0<br />
|}<br />
Repeat n duplets from relative position -''m'' (given in pixels, not duplets). If ''r'' is 0, another byte follows and the last pixel is set to that value. ''n'' and ''r'' come from the table on the right.<br />
|-<br />
|style="font-family:monospace"|0xfc<br />
|style="font-family:monospace"|0xfc nnnnnrmm mmmmmmmm (p)<br />
|Repeat n+2 duplets from relative position -''m'' (given in pixels, not duplets). If ''r'' is 0, another byte ''p'' follows and the last pixel is set to absolute value ''p''.<br />
|}<br />
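<br />
The self-overlapping copy is easiest to get right by copying one pixel at a time, so that pixels written earlier in the same command can be read back. The sketch below shows that copy plus the 0xfc form; both function names are hypothetical, and <tt>out</tt> is assumed to be the buffer of pixels decoded so far.<br />

```python
def repeat_duplets(out: bytearray, n: int, m: int, last_pixel=None) -> None:
    """Append n duplets copied from m pixels back in out."""
    start = len(out) - m
    for i in range(2 * n):
        # Reading index start + i may hit pixels appended by this same loop,
        # which is what makes short segments repeat.
        out.append(out[start + i])
    if last_pixel is not None:
        out[-1] = last_pixel  # r == 0: final pixel is overridden


def decode_fc(stream: bytes, pos: int, out: bytearray) -> int:
    """Handle 0xfc nnnnnrmm mmmmmmmm (p); pos points just past the 0xfc byte.

    Returns the position of the next subcommand.
    """
    b1, b2 = stream[pos], stream[pos + 1]
    pos += 2
    n = (b1 >> 3) + 2             # nnnnn, plus 2 duplets
    r = (b1 >> 2) & 1             # r flag
    m = ((b1 & 0x03) << 8) | b2   # 10-bit distance in pixels
    last = None
    if r == 0:
        last = stream[pos]        # extra byte p follows
        pos += 1
    repeat_duplets(out, n, m, last)
    return pos
```

The table-driven commands (0xa4 to 0xfb) could reuse <tt>repeat_duplets</tt> with ''n'' and ''r'' looked up from the table and ''m'' taken from the low 2 bits of the command and the following byte.<br />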
<br />
=== Secondary Decompression ===<br />
In all Riven images, and some other images, there is no secondary compression. Instead, the remaining data is just the pixels. For 24bpp images, the pixels are in BGR order. For 8bpp images, each row holds ''bytes per row'' pixels, so you will have to cut off the extra bytes at the end of each row (''bytes per row'' - ''bitmap width'').<br />
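<br />
For the 8bpp case, dropping the padding amounts to keeping the first ''bitmap width'' bytes of every ''bytes per row'' sized row. A minimal sketch (the name <tt>crop_rows</tt> is made up):<br />

```python
def crop_rows(pixels: bytes, width: int, height: int, bytes_per_row: int) -> bytes:
    """Strip the per-row padding from raw 8bpp pixel data."""
    rows = []
    for y in range(height):
        start = y * bytes_per_row
        # Keep only the visible pixels; the rest of the row is padding.
        rows.append(pixels[start:start + width])
    return b"".join(rows)
```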
<br />
However, many non-Riven images use the RLE8 compression.<br />
<br />
==== RLE8 Compression ====<br />
The RLE8 compression is a rather simple form of RLE. The decoder works by decoding one row at a time, so there will be ''bitmap height'' chunks of data. Each chunk '''''always''''' decodes one row of data. <br />
<br />
Per Row:<br />
<br />
{| class="structure"<br />
|unsigned short||byte_count<br />
|}<br />
<br />
* ''byte_count'' is the number of bytes that will be used to decode the current row.<br />
<br />
Set ''byte_count'' aside until the row is finished. Keep decompressing RLE commands until you have produced a full row of pixels.<br />
<br />
Each RLE command starts with a byte. The high bit of this byte selects whether to repeat a pixel or to output direct pixels. The bottom 7 bits are the ''run_length'' minus one.<br />
<br />
If the high bit is set, read in another byte which represents the pixel to repeat and then output the pixel ''run_length'' times. If the high bit is not set, output ''run_length'' absolute pixels from the input stream.<br />
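<br />
The per-row decoding described in this section could be sketched as below. The name <tt>rle8_decode</tt> is hypothetical. Two points are assumptions rather than facts from this page: a "row's length" is taken to be ''bitmap width'' pixels, and ''byte_count'' is measured from the byte just after the ''byte_count'' field.<br />

```python
import struct


def rle8_decode(data: bytes, width: int, height: int) -> bytes:
    """Decode RLE8 image data into width * height pixels (sketch)."""
    out = bytearray()
    pos = 0
    for _ in range(height):
        (byte_count,) = struct.unpack_from(">H", data, pos)
        # Assumption: byte_count counts from just after the count field.
        row_start = pos + 2
        pos = row_start
        row = bytearray()
        while len(row) < width:
            cmd = data[pos]
            pos += 1
            run_length = (cmd & 0x7F) + 1
            if cmd & 0x80:
                # High bit set: repeat the next byte run_length times.
                row += bytes([data[pos]]) * run_length
                pos += 1
            else:
                # High bit clear: copy run_length literal pixels.
                row += data[pos:pos + run_length]
                pos += run_length
        out += row[:width]
        # Use byte_count, not the bytes actually consumed, to find the next row.
        pos = row_start + byte_count
    return bytes(out)
```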
<br />
Once you have output the correct number of pixels, move to ''byte_count'' bytes from the start of the row in the compressed stream, which is where the next row begins.</div>Clone2727