Mohawk archive format

From A look inside The Link @ wiki
Latest revision as of 00:08, 27 January 2010

General layout

Mohawk archives are organized in chunks, and this is the chunk layout:

  • IFF chunk
    • IFF header
    • RSRC header
    • Resource dir
      • Type table
      • Name tables (one for each resource type)
      • Resource tables (one for each resource type)
      • Resource name list
      • File table
    • Actual data (resource contents)

Note that the chunks may appear in a different order: never rely on them being at fixed locations, and always use the offsets to reach them.

Every integer is in big-endian (Motorola) byte order.

IFF header

This is always at the beginning of the file.

4 bytes         chunk signature (MHWK), identifies Mohawk archive format
unsigned long   file size in bytes, not counting this IFF header (that is, file size - 8)
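
As an illustration of the header above, here is a minimal Python sketch that checks the signature and reads the size field. It assumes "unsigned long" means a 32-bit value; the function name `parse_iff_header` and the sample bytes are made up for this example.

```python
import struct

def parse_iff_header(data: bytes) -> int:
    """Return the size field stored in the 8-byte IFF header."""
    # ">4sI": big-endian, 4-byte signature, 32-bit unsigned size (assumed width).
    signature, size = struct.unpack_from(">4sI", data, 0)
    if signature != b"MHWK":
        raise ValueError(f"not a Mohawk archive: {signature!r}")
    return size

# Synthetic 20-byte "file": the size field holds 20 - 8 = 12.
sample = b"MHWK" + struct.pack(">I", 12) + bytes(12)
payload_size = parse_iff_header(sample)
```

Adding 8 to the returned value should give the total file size, which makes a cheap sanity check when opening an archive.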

RSRC header

This is probably better described as the "Resource Dir header". Note that the Resource Dir itself can be anywhere in the file, but this header always follows the IFF header.

4 bytes         chunk signature (RSRC)
unsigned short  version (always 0x100)
unsigned short  compaction (not used in reading from archives)
unsigned long   total file size in bytes
unsigned long   absolute offset of the Resource Dir
unsigned short  offset in Resource Dir of the File Table
unsigned short  File Table size in bytes
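
The fields above can be read in one call. The sketch below assumes 16-bit shorts and 32-bit longs, and assumes the header starts at byte 8, right after the IFF header; `RsrcHeader` and `parse_rsrc_header` are names invented for this example.

```python
import struct
from typing import NamedTuple

class RsrcHeader(NamedTuple):
    version: int
    compaction: int
    file_size: int
    dir_offset: int          # absolute offset of the Resource Dir
    file_table_offset: int   # relative to dir_offset
    file_table_size: int

RSRC_FORMAT = ">4sHHIIHH"  # 20 bytes, big-endian, no padding

def parse_rsrc_header(data: bytes, offset: int = 8) -> RsrcHeader:
    """Parse the RSRC header located right after the 8-byte IFF header."""
    signature, *fields = struct.unpack_from(RSRC_FORMAT, data, offset)
    if signature != b"RSRC":
        raise ValueError(f"missing RSRC signature: {signature!r}")
    return RsrcHeader(*fields)

# Synthetic headers only: a 28-byte file whose directory would start at byte 28.
sample = (b"MHWK" + struct.pack(">I", 20)
          + struct.pack(RSRC_FORMAT, b"RSRC", 0x100, 0, 28, 28, 4, 14))
header = parse_rsrc_header(sample)

# The File Table offset is relative to the Resource Dir, so add the two.
file_table_abs = header.dir_offset + header.file_table_offset
```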

Type table

This table lists the resource types present in the Mohawk file. It is always at the beginning of the Resource Dir.

Header:

unsigned short  offset in Resource Dir of the Resource Name List (maybe this should go with the RSRC Header)
unsigned short  number of resource types in this file

Entry (one for each type):

4 bytes         resource type (tWAV, tBMP, NAME etc.)
unsigned short  offset in Resource Dir of the Resource Table for this type
unsigned short  offset in Resource Dir of the Name Table for this type
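
A possible way to walk the type table, given the absolute offset of the Resource Dir. This is only a sketch: `parse_type_table` is a hypothetical helper, the sample directory is synthetic, and decoding the 4-byte tag as Latin-1 is an assumption made so that any byte value decodes.

```python
import struct

def parse_type_table(data: bytes, dir_offset: int):
    """Return (name_list_offset, {type_tag: (resource_table_off, name_table_off)}).

    All returned offsets are relative to the start of the Resource Dir.
    """
    name_list_offset, type_count = struct.unpack_from(">HH", data, dir_offset)
    types = {}
    pos = dir_offset + 4  # entries follow the 4-byte table header
    for _ in range(type_count):
        tag, res_table_off, name_table_off = struct.unpack_from(">4sHH", data, pos)
        types[tag.decode("latin-1")] = (res_table_off, name_table_off)
        pos += 8  # each entry is 4 + 2 + 2 bytes
    return name_list_offset, types

# Synthetic directory starting at offset 0 with two resource types.
sample = (struct.pack(">HH", 40, 2)
          + struct.pack(">4sHH", b"tBMP", 20, 26)
          + struct.pack(">4sHH", b"NAME", 30, 36))
name_list_offset, types = parse_type_table(sample, 0)
```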

Name table

There is one name table for each resource type. Many types don't have resource names; usually only tBMP, tWAV and tMOV do. Each entry gives, for one resource, the offset of its name within the resource name list.

Header:

unsigned short  number of name entries (can be zero)

Entry:

unsigned short  offset of the name string in Name List
unsigned short  resource index (equal to the resource's index in File Table)
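
Reading a name table could look like the following sketch. `parse_name_table` is a hypothetical helper that takes the absolute offset of the table (Resource Dir offset plus the per-type offset); the sample bytes are synthetic.

```python
import struct

def parse_name_table(data: bytes, table_offset: int):
    """Return a list of (name_offset, file_index) pairs for one resource type."""
    (count,) = struct.unpack_from(">H", data, table_offset)
    return [
        struct.unpack_from(">HH", data, table_offset + 2 + 4 * i)
        for i in range(count)
    ]

# Synthetic table: two named resources, then an empty table.
sample = struct.pack(">H", 2) + struct.pack(">HH", 0, 1) + struct.pack(">HH", 6, 3)
entries = parse_name_table(sample, 0)
empty = parse_name_table(struct.pack(">H", 0), 0)
```

Since the count can be zero, callers should expect an empty list for types without names.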

Resource table

Again, there is one resource table for each resource type. Each entry ties a resource ID of this type to its index in the file table.

Header:

unsigned short  number of resources for this type (number of table entries)

Entry:

unsigned short  resource ID
unsigned short  index in file table (starting from 1). Also used to find the matching name table entry
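
The sketch below reads a resource table and then uses the shared file table index to pair each resource ID with its name table entry, if any. The helper name, the sample bytes and the `name_entries` list are invented for illustration.

```python
import struct

def parse_resource_table(data: bytes, table_offset: int) -> dict:
    """Map resource ID -> 1-based file table index for one resource type."""
    (count,) = struct.unpack_from(">H", data, table_offset)
    table = {}
    for i in range(count):
        res_id, file_index = struct.unpack_from(">HH", data, table_offset + 2 + 4 * i)
        table[res_id] = file_index
    return table

# Synthetic table with two resources: ID 7 -> file 1, ID 12 -> file 3.
sample = struct.pack(">H", 2) + struct.pack(">HH", 7, 1) + struct.pack(">HH", 12, 3)
resources = parse_resource_table(sample, 0)

# Hypothetical name table entries as (name_offset, file_index) pairs.
name_entries = [(0, 3)]
name_offset_by_file = {file_index: name_off for name_off, file_index in name_entries}

# Resources without a name table entry map to None.
named = {res_id: name_offset_by_file.get(idx) for res_id, idx in resources.items()}
```

Because file table indices start from 1, subtract 1 before using them as positions in a zero-based Python list of file table entries.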

Resource name list

This is a simple list of null-terminated strings (C strings), where each string is a resource name. Given a resource, you can get the offset to its name by looking in the name table entry for that resource.
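
Fetching one name then amounts to reading bytes up to the next zero byte. In this sketch `read_name` is a hypothetical helper, `name_list_offset` is the absolute position of the name list (Resource Dir offset plus the offset from the type table header), and Latin-1 decoding is an assumption, since the text encoding is not stated here.

```python
def read_name(data: bytes, name_list_offset: int, name_offset: int) -> str:
    """Read the null-terminated name stored at name_offset inside the name list."""
    start = name_list_offset + name_offset
    end = data.index(b"\x00", start)  # raises ValueError if no terminator is found
    return data[start:end].decode("latin-1")

# Synthetic data: two filler bytes, then a name list holding two names.
sample = b"XX" + b"door\x00lever\x00"
first = read_name(sample, 2, 0)
second = read_name(sample, 2, 5)
```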

File table

This table holds other important data about each resource, notably the location and size of the resource content within the whole Mohawk file. I wonder why this information couldn't go directly inside the resource table entry.

Header:

unsigned long   number of file table entries

Entry:

unsigned long   absolute offset of resource data block
unsigned short  resource data size, bits 15-0
byte            resource data size, bits 23-16
byte            resource flags (unknown)
unsigned short  unknown (usually zero in Riven files)
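
The split size field is the only awkward part of an entry. A minimal sketch of reading the table and joining the two size pieces into one 24-bit value follows; `FileEntry` and `parse_file_table` are names chosen for this example, and the sample data is synthetic.

```python
import struct
from typing import NamedTuple

class FileEntry(NamedTuple):
    offset: int   # absolute offset of the resource data
    size: int     # 24-bit size rebuilt from the two size fields
    flags: int
    unknown: int

ENTRY_FORMAT = ">IHBBH"  # 10 bytes per entry

def parse_file_table(data: bytes, table_offset: int):
    """Return the list of file table entries; list position 0 is file index 1."""
    (count,) = struct.unpack_from(">I", data, table_offset)
    entries = []
    for i in range(count):
        offset, size_lo, size_hi, flags, unknown = struct.unpack_from(
            ENTRY_FORMAT, data, table_offset + 4 + 10 * i)
        # Bits 15-0 come first, bits 23-16 follow in a separate byte.
        entries.append(FileEntry(offset, size_lo | (size_hi << 16), flags, unknown))
    return entries

# Synthetic table with one entry whose size is 0x125678.
sample = struct.pack(">I", 1) + struct.pack(ENTRY_FORMAT, 0x1000, 0x5678, 0x12, 0, 0)
entries = parse_file_table(sample, 0)
```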

Note that the resource data size is incorrect for all tMOVs in the Riven archives. One can compensate by computing each resource's length from the resource offsets instead.
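
One way to derive lengths from offsets is sketched below. It rests on an assumption not stated above: that each resource runs up to the next higher resource offset in the file, or to the end of the file for the last one. The function name and the numbers are illustrative only.

```python
import bisect

def sizes_from_offsets(offsets, file_size):
    """Estimate each resource's size as the gap to the next higher offset.

    Assumes every offset is smaller than file_size and that resources are
    stored back to back (an assumption, not a documented guarantee).
    """
    boundaries = sorted(set(offsets) | {file_size})
    sizes = []
    for off in offsets:
        # First boundary strictly greater than this resource's offset.
        nxt = boundaries[bisect.bisect_right(boundaries, off)]
        sizes.append(nxt - off)
    return sizes

# Offsets in file table order, which need not be sorted by position.
estimated = sizes_from_offsets([100, 400, 250], file_size=1000)
```

Any padding between resources would be counted as part of the preceding resource, so the result is an upper bound rather than an exact size.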