|
|
(2 intermediate revisions by one other user not shown) |
Line 1: |
Line 1: |
| {{Riven}} | | {{Riven}} |
− | This page shows the structure of Riven tWAV resources, which store game sounds and music. Though this document is complete enough to correctly decode Riven sounds (at least "by ear"), the tWAV format has a number of obscure details and unknown fields, especially those connected to data sizes. Thanks to Ron Hayter for providing details on the DVD version of tWAV resources. [[Myst MSND resources]] also use this format.
| |
| | | |
− | tWAV resources are structured in chunks, much like the common WAV audio format. The audio data can be compressed in two ways. The first is the Intel DVI ADPCM format, a (lossy) differential encoding which stores the difference between consecutive samples as 4-bit delta samples, yielding a compression factor of 4:1. The second (used in the DVD version of the game) is MPEG-2 Layer II encoding. It's important to note that the compressed data block contains more data than necessary: the additional data is just garbage at the end of the block, and is skipped by the Riven decoder when the sound is played. Unfortunately, the tWAV headers seem to ignore this excess data: this makes reverse-engineering of these fields very difficult, since every relation with the full resource size is lost. It's necessary to introduce an "effective resource size" excluding the excess data. | + | tWAV resources store Riven's audio data. See the [[Mohawk Sounds]] page for details on the format. |
− | | + | |
− | ==The header==
| + | |
− | The tWAV data block begins with the following header:
| + | |
− | {| class="structure"
| + | |
− | |4 bytes||mhwk_magic
| + | |
− | |-
| + | |
− | |unsigned long||size
| + | |
− | |-
| + | |
− | |4 bytes||wave_magic
| + | |
− | |}
| + | |
− | | + | |
− | *''mhwk_magic'' is the string 'MHWK'. It's equal to the [[Mohawk archive format#IFF header|Mohawk format signature]]!
| + | |
− | *''size'' is the effective resource size, minus the ADPC chunk size, minus 2.
| + | |
− | *''wave_magic'' is the string 'WAVE'.
| + | |
− | | + | |
− | After this header come the chunks. Until now, 3 chunk types have been identified: '[[#The ADPC chunk|ADPC]]', '[[#The Cue.23 chunk|Cue#]]' and '[[#The Data chunk|Data]]'.
| + | |
− | | + | |
− | ==The ADPC chunk==
| + | |
− | This chunk holds some information about the audio sample format. Its size is not constant.
| + | |
− | {| class="structure"
| + | |
− | |4 bytes||chunk_type
| + | |
− | |-
| + | |
− | |unsigned long||chunk_size
| + | |
− | |-
| + | |
− | |unsigned short||u0
| + | |
− | |-
| + | |
− | |unsigned short||channels
| + | |
− | |-
| + | |
− | |unsigned long||u1
| + | |
− | |-
| + | |
− | |unsigned long||u2[channels]
| + | |
− | |}
| + | |
− | | + | |
− | *''chunk_type'' is the string 'ADPC'.
| + | |
− | *''chunk_size'' is the chunk size minus 8.
| + | |
− | *''channels'' is the number of audio channels.
| + | |
− | *''u0'' is 2 when a Cue# chunk is present, and 1 otherwise.
| + | |
− | *''u1'' is always 0.
| + | |
− | *''u2'' is always 0x00400000 for each channel.
| + | |
− | | + | |
− | If a Cue# chunk is present, the ADPC chunk contains additional data:
| + | |
− | {| class="structure"
| + | |
− | |unsigned long||u3
| + | |
− | |-
| + | |
− | |unsigned long||u4[channels]
| + | |
− | |}
| + | |
− | | + | |
− | *''u3'' seems to be in units of samples (maybe it's a position within the audio stream).
| + | |
− | *''u4'' looks more like a record than a single unsigned long value, but values are obscure and I have no idea about its meaning.
| + | |
− | | + | |
− | Finally, the ADPC chunk seems to be absent if the resource contains MP2 audio.
| + | |
− | | + | |
− | ==The Cue# chunk==
| + | |
− | This chunk is rare: only a few tWAV resources have it, and just one resource has an interesting one (tWAV 3 from p_Sounds.mhk). This chunk seems to contain "cue points", in a way similar to the corresponding chunk of the WAVE format.
| + | |
− | | + | |
− | {| class="structure"
| + | |
− | |4 bytes||chunk_type
| + | |
− | |-
| + | |
− | |unsigned long||chunk_size
| + | |
− | |-
| + | |
− | |unsigned short||point_count
| + | |
− | |}
| + | |
− | | + | |
− | *''chunk_type'' is the string 'Cue#'.
| + | |
− | *''chunk_size'' is the chunk size minus 8.
| + | |
− | *''point_count'' is the number of cue points.
| + | |
− | | + | |
− | Following this fixed structure there are ''point_count'' records; each record describes a cue point with a position inside the audio stream and an associated ASCII text string:
| + | |
− | | + | |
− | {| class="structure"
| + | |
− | |unsigned long||position
| + | |
− | |-
| + | |
− | |unsigned char||name_len
| + | |
− | |-
| + | |
− | |unsigned char||name[name_len+1]
| + | |
− | |}
| + | |
− | | + | |
− | *''position'' is the cue point position within the audio stream, in units of samples.
| + | |
− | *''name_len'' is the length of the associated string.
| + | |
− | *''name'' is the associated string (zero-terminated).
| + | |
− | | + | |
− | Most Cue# chunks have ''point_count'' set to 0, so they contain nothing. tWAV 3 from p_Sounds.mhk has two cue points, named <code>Beg Loop</code> and <code>End Loop</code>. Please note that the chunk structure has been guessed from this single case, so the evidence for it is very thin :-\
| + | |
− | | + | |
− | I don't know if the engine uses this chunk at all.
| + | |
− | | + | |
− | ==The Data chunk==
| + | |
− | This chunk is always present since it contains the actual audio samples.
| + | |
− | | + | |
− | {| class="structure"
| + | |
− | |4 bytes||chunk_type
| + | |
− | |-
| + | |
− | |unsigned long||chunk_size
| + | |
− | |-
| + | |
− | |unsigned short||sample_rate
| + | |
− | |-
| + | |
− | |unsigned long||sample_count
| + | |
− | |-
| + | |
− | |unsigned char||bits_per_sample
| + | |
− | |-
| + | |
− | |unsigned char||channels
| + | |
− | |-
| + | |
− | |unsigned short||encoding
| + | |
− | |-
| + | |
− | |unsigned short||loop
| + | |
− | |-
| + | |
− | |unsigned long||loop_start
| + | |
− | |-
| + | |
− | |unsigned long||loop_end
| + | |
− | |-
| + | |
− | |variable||audio_data
| + | |
− | |}
| + | |
− | | + | |
− | *''chunk_type'' is the string 'Data'.
| + | |
− | *''chunk_size'' is the full chunk size, including ''chunk_type'' and ''chunk_size'' itself.
| + | |
− | *''sample_rate'' is the audio sampling rate (always 22050).
| + | |
− | *''sample_count'' is the number of audio samples.
| + | |
− | *''bits_per_sample'' is the number of bits per sample per channel.
| + | |
− | *''channels'' is the number of audio channels.
| + | |
− | *''encoding'' tells how the audio data is stored. It's 0 for PCM (see the [[MSND]] page), 1 for ADPCM, 2 for MPEG-2 Audio Layer II.
| + | |
− | *''loop'' is 0xFFFF if the sound loops (NOTE: Not used in Riven, only Myst!)
| + | |
− | *''loop_start'' is the starting point of the loop (NOTE: Not used in Riven, only Myst!)
| + | |
− | *''loop_end'' is the ending point of the loop (NOTE: Not used in Riven, only Myst!)
| + | |
− | *''audio_data'' is the audio data stream, encoded according to ''encoding''. For 1-channel ADPCM audio, each byte holds 2 compressed samples (higher and lower 4 bits of the byte); for stereo ADPCM tWAVs, each byte stores one compressed sample for each channel (higher and lower 4 bits of the byte).
| + | |
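The Data chunk layout described in the removed revision can be sketched in Python as follows. This is only an illustration, not the engine's code. It assumes the fields are big-endian and tightly packed (the text above does not state the byte order), that the higher nibble comes first in mono streams, and that the higher nibble belongs to the first channel in stereo streams. The names <code>parse_data_chunk</code>, <code>split_nibbles</code> and <code>DataChunk</code> are hypothetical.

```python
import struct
from typing import List, NamedTuple

# chunk_type, chunk_size, sample_rate, sample_count, bits_per_sample,
# channels, encoding, loop, loop_start, loop_end (28 bytes in total).
# ASSUMPTION: big-endian, no padding between fields.
DATA_HEADER = struct.Struct(">4sIHIBBHHII")


class DataChunk(NamedTuple):
    chunk_size: int
    sample_rate: int
    sample_count: int
    bits_per_sample: int
    channels: int
    encoding: int  # 0 = PCM, 1 = ADPCM, 2 = MPEG-2 Audio Layer II
    loop: int
    loop_start: int
    loop_end: int
    audio_data: bytes


def parse_data_chunk(buf: bytes, offset: int = 0) -> DataChunk:
    """Read the fixed Data chunk fields and slice out the audio bytes."""
    fields = DATA_HEADER.unpack_from(buf, offset)
    if fields[0] != b"Data":
        raise ValueError("not a Data chunk: %r" % (fields[0],))
    chunk_size = fields[1]
    # chunk_size is described as the full chunk size, so the audio data
    # ends at offset + chunk_size; any trailing bytes are left out.
    audio = buf[offset + DATA_HEADER.size:offset + chunk_size]
    return DataChunk(*fields[1:], audio_data=audio)


def split_nibbles(audio_data: bytes, channels: int) -> List[List[int]]:
    """Split ADPCM bytes into per-channel lists of 4-bit codes.

    ASSUMPTION: higher nibble first (mono) / first channel (stereo).
    The codes still need an ADPCM decoder to become PCM samples.
    """
    if channels not in (1, 2):
        raise ValueError("only 1 or 2 channels are described")
    out: List[List[int]] = [[] for _ in range(channels)]
    for byte in audio_data:
        hi, lo = byte >> 4, byte & 0x0F
        if channels == 1:
            out[0].extend((hi, lo))
        else:
            out[0].append(hi)
            out[1].append(lo)
    return out
```

Slicing by ''chunk_size'' relies on the statement that it covers the whole chunk; given the excess data mentioned earlier, a decoder may prefer to stop after ''sample_count'' samples instead.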