Umd Structure Revealed
Posted 12 June 2005 - 09:21 AM
I have spent quite a lot of time analyzing the umd file structure and have decoded most of the important structures of the file - enough to allow for extraction of the files within.
What follows is what I have determined so far about the layout of a umd file.
A umd is an archive of many smaller files. The files do not appear to be compressed or encrypted in any way. As anyone who has peeked at a umd with a hex editor will know, there are lots of interesting things visible and editable* in a umd. There are three basic structures in the file: a footer (it does the job of a header, but sits at the end of the file), an index, and the data.
Let's look at the footer first.
The useful parts I have figured out are in the last 20 bytes of the file. Really, only 12 bytes (3 DWORDs) are meaningful so far.
Here are the last 20 bytes from xboxdynamic.umd from Rainbow Six 3 (NTSC).
I've broken them up into DWORDs for illustration:
A3C5E39F E5C92500 181E2600 02000000 A350C44B
The first DWORD (A3C5E39F) is the signature, or magic number if you prefer. This same signature appears in many game archive / package files (Unreal-type games). The second DWORD (E5C92500) is the offset from the beginning of the file to the index. DWORDs are stored little-endian (least significant byte first), so the bytes E5 C9 25 00 read as 0x0025C9E5. The third DWORD (181E2600) is the file size: 0x00261E18, or 2498072 bytes (for xboxdynamic.umd).
So, here's what the layout of the footer is so far:
Offset   Contents   Use
261E04   A3C5E39F   Signature/Magic
261E08   E5C92500   Offset from the beginning of the umd to the index
261E0C   181E2600   File size (in this case it's 2498072 bytes)
261E10   02000000   Don't know yet
261E14   A350C44B   Don't know yet
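To make the footer layout concrete, here is a minimal Python sketch that unpacks those 20 bytes as five little-endian DWORDs. The function name parse_footer and the dictionary keys are made up for illustration, and the last two fields are carried along as unknowns since their meaning hasn't been worked out.

```python
import struct

FOOTER_SIZE = 20  # the footer is the last 20 bytes of the umd


def parse_footer(footer: bytes) -> dict:
    """Unpack the 20-byte footer into five little-endian DWORDs."""
    if len(footer) != FOOTER_SIZE:
        raise ValueError("footer must be exactly 20 bytes")
    # "<5I" = five unsigned 32-bit integers, little-endian
    magic, index_offset, file_size, unknown1, unknown2 = struct.unpack("<5I", footer)
    return {
        "magic": magic,                # signature
        "index_offset": index_offset,  # offset from start of file to the index
        "file_size": file_size,        # total size of the umd
        "unknown1": unknown1,          # meaning not determined yet
        "unknown2": unknown2,          # meaning not determined yet
    }


# The 20 footer bytes from xboxdynamic.umd quoted above.
sample = bytes.fromhex("A3C5E39F E5C92500 181E2600 02000000 A350C44B")
footer = parse_footer(sample)
print(hex(footer["index_offset"]), footer["file_size"])
```

On a real file you would seek to 20 bytes before the end (f.seek(-20, 2)) and pass the result of f.read(20) to the same function.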
Next, since we have an offset to it, let's look at the index.
The index starts with a CompactIndex. A CompactIndex is a series of one or more bytes that encodes a small number in fewer bytes than a full DWORD, to save space. It usually precedes an index, and the number it encodes is the count of items that follow. The encoding is fairly simple.
In the first byte, only the low six bits (bits 0-5) are part of the number. The top two bits are flags: bit 6 indicates that more bytes follow, and bit 7 gives the sign of the number.
The first byte of the CompactIndex for the xboxdynamic.umd is 0x58 and looks like this:
...Bit: 7 6 5 4 3 2 1 0
Binary: 0 1 0 1 1 0 0 0
The 7th bit (bit 6) is a 1 so that indicates another byte follows.
The 8th bit (bit 7) is 0 so we have a positive number.
We strip off the 7th and 8th bit, leaving us with:
....Bit: 7 6 5 4 3 2 1 0
.Binary: 0 0 0 1 1 0 0 0
That is 0x18, or 24 decimal.
Now we go on to the next byte.
The next byte is 0x08
....Bit: 7 6 5 4 3 2 1 0
.Binary: 0 0 0 0 1 0 0 0
This time we look at the 8th bit (bit 7) to see if there are more bytes.
A 0 indicates no more bytes follow.
Next, we throw away the 8th bit and shift this byte left by six bits, so it sits just above the six bits taken from the first byte.
Shifting left we get:
.Binary: 0 0 1 0 0 0 0 0 0 0 0 0 (512 decimal)
Finally, we add the first byte's value to the second byte's shifted value to arrive at our encoded number. This can be done nicely with an OR, but here plain addition shows it: 512 + 24 = 536.
There are 536 items in the index.
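The decoding steps above can be sketched in a few lines of Python. The function name decode_compact_index is hypothetical. Note one assumption: only the one- and two-byte cases are worked through in this post, so the handling of a third or later byte (seven more bits each, with bit 7 again meaning "more bytes follow") is an extrapolation, not something confirmed against a umd.

```python
from typing import Tuple


def decode_compact_index(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode a CompactIndex starting at data[pos].

    Returns (value, position of the first byte after the CompactIndex).
    """
    first = data[pos]
    pos += 1
    negative = bool(first & 0x80)  # bit 7 of the first byte: sign
    more = bool(first & 0x40)      # bit 6 of the first byte: another byte follows
    value = first & 0x3F           # bits 0-5: low six bits of the number
    shift = 6                      # the next byte lands above those six bits

    while more:
        b = data[pos]
        pos += 1
        more = bool(b & 0x80)             # bit 7: another byte follows
        value |= (b & 0x7F) << shift      # bits 0-6 carry the number
        shift += 7                        # ASSUMPTION: later bytes add 7 bits each

    return (-value if negative else value), pos


# The two bytes from xboxdynamic.umd: 0x58 then 0x08.
count, next_pos = decode_compact_index(bytes([0x58, 0x08]))
print(count, next_pos)
```

Returning the next position along with the value makes it easy to carry on reading the first index entry straight after the count.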
The next byte is the first byte of the first entry in the index.
In the case of xboxdynamic.umd it is 0x1A, which is 26 decimal. This is the length of the null-terminated string to follow, counting the null itself.
Next is: Template\Air-H-G3A3-G.tpt<0>
The <0> represents the null (a zero) in the above for illustration.
This string is a path and filename for a file. The folder is Template and the file is Air-H-G3A3-G.tpt.
The template files for Rainbow Six 3 are the terrorist templates. They are text files with entries for the terrorists' personality, behavior, and weapons loadout.
Following the null terminated file string is a DWORD which is the offset from the beginning of the file to the data for the Template\Air-H-G3A3-G.tpt file.
For this file, it is 00000000. (Yes - the beginning of the file - offset 0)
Next is the file size: the bytes AF 01 00 00, which read little-endian as 0x000001AF, or 431 bytes.
Next is a DWORD of padding zeros.
Then the next entry follows, and so on for the remaining 535 files (for xboxdynamic.umd).
So, here's the basic layout of the index:
1 or more BYTEs: CompactIndex (number of entries in the index)
Then, for each entry:
BYTE: Length n of the string to follow (including the null)
n BYTEs: Null-terminated string of the entry (path and filename)
DWORD: Offset from the beginning of the umd to the file's data
DWORD: File size
DWORD: Zero padding
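As a rough illustration of that layout, here is a Python sketch that reads a single index entry. The names IndexEntry and read_entry are invented for this example. It reads the string length as one plain byte, exactly as described above; how a name too long for that would be stored is not covered here. Decoding the name as ASCII is also an assumption.

```python
import struct
from typing import NamedTuple, Tuple


class IndexEntry(NamedTuple):
    name: str    # path and filename, e.g. Template\Air-H-G3A3-G.tpt
    offset: int  # offset from the start of the umd to the file's data
    size: int    # size of the file's data in bytes


def read_entry(data: bytes, pos: int) -> Tuple[IndexEntry, int]:
    """Read one index entry at data[pos]; return (entry, next position)."""
    name_len = data[pos]  # length byte, counts the trailing null
    pos += 1
    raw_name = data[pos:pos + name_len]
    pos += name_len
    # ASSUMPTION: names are plain ASCII; unknown bytes are replaced, not fatal.
    name = raw_name.rstrip(b"\x00").decode("ascii", errors="replace")
    # Three little-endian DWORDs: offset, size, zero padding.
    offset, size, _padding = struct.unpack_from("<3I", data, pos)
    pos += 12
    return IndexEntry(name, offset, size), pos


# The first entry of xboxdynamic.umd, rebuilt from the values in this post.
sample = (
    bytes([0x1A])
    + b"Template\\Air-H-G3A3-G.tpt\x00"
    + struct.pack("<3I", 0, 431, 0)
)
entry, next_pos = read_entry(sample, 0)
print(entry, next_pos)
```

To walk a whole index, you would decode the CompactIndex count first and then call read_entry that many times, feeding each returned position into the next call.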
That's the structure I have determined so far.
Extraction is just a program away. Ha! But that's for another thread's discussion.
This structure holds for Rainbow Six 3, R63 BA, and Splinter Cell - Chaos Theory.
One of the umd files (lipsyncxbox.umd) for SCCT has 10200 files in it! They are .bin files and judging from the name are likely compressed sound files of the characters' voices.
I hope you found this useful.
(* if you don't know what you're doing, I suggest not trying to hex edit your umd.)
Posted 12 June 2005 - 09:10 PM
Yep, I know. It amazes me how people try to send 10 MB files via email.
Anyway, I went ahead and signed up for MSN Messenger.