Processing binary data
Sometimes the data you are working with is not only packed into some structure, but also encoded, obfuscated, encrypted, or compressed. To parse such data, you first have to remove that layer of encoding, obfuscation, encryption, or compression. Kaitai Struct calls this "processing" and supports it through a range of process directives. They can be applied to raw byte buffers or user-typed fields in the following way:
seq:
  - id: buf1
    size: 0x1000
    process: zlib
This declares a field named buf1. When parsing this structure, KS reads exactly 0x1000 bytes from the source stream and then applies zlib processing, i.e. decompresses the zlib-compressed stream. Afterwards, accessing buf1 returns the decompressed data (most likely longer than 0x1000 bytes), and accessing the _raw_buf1 property returns the raw (still compressed) data, exactly 0x1000 bytes long.
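The relationship between the raw and the processed buffer can be illustrated outside Kaitai Struct. The following is a minimal Python sketch, not the code the compiler generates: the variable names raw_buf1 and buf1 merely mirror the properties described above, and the field size is taken from the test data instead of being fixed at 0x1000.

```python
import io
import zlib

# Build a toy stream: one compressed field followed by other data.
payload = b"kaitai " * 1000
compressed = zlib.compress(payload)
size = len(compressed)  # stands in for the fixed `size` of the field (assumption)
stream = io.BytesIO(compressed + b"trailing fields")

# Step 1: read exactly `size` bytes; this is what _raw_buf1 would hold.
raw_buf1 = stream.read(size)

# Step 2: apply the processing; this is what buf1 would hold.
buf1 = zlib.decompress(raw_buf1)

# The stream position advanced by the raw size only, not by the decompressed size.
rest = stream.read()
```

Only the raw length determines how far the parser moves in the stream; the length of the processed buffer has no effect on where the next field starts.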
The following processing directives are available in Kaitai Struct.
The xor directive applies a bitwise XOR (bitwise exclusive "or", written as ^ in most C-like languages) to every byte of the stream. The length of the output is exactly the same as the length of the input. There is one mandatory argument, the key to use for the XOR operation. It can be:
- a single byte value: this value is XORed with every byte of the input stream
- an array of bytes: the first byte of the input is XORed with the first byte of the key, the second byte of the input with the second byte of the key, and so on. If the key is shorter than the input, the key is reused, starting again from its first byte.
For example, given the 3-byte key [b0, b1, b2] and the input [x0, x1, x2, x3, x4, ...], the output will be:
[x0 ^ b0, x1 ^ b1, x2 ^ b2,
 x3 ^ b0, x4 ^ b1, ...]
Examples:
- process: xor(0xaa)
  XORs every byte with 0xaa
- process: xor(7, 42)
  XORs every odd (1st, 3rd, 5th, ...) byte with 7, and every even (2nd, 4th, 6th, ...) byte with 42
- process: xor(key_buf)
  XORs bytes using a key stored in a field named key_buf
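The key-reuse rule described above can be sketched in a few lines of Python. This is an illustration of the semantics only, under the assumption that the key is either a single integer or a non-empty byte sequence; xor_process is a hypothetical helper name, not part of any Kaitai Struct runtime.

```python
def xor_process(data: bytes, key) -> bytes:
    """XOR every byte of `data` with `key`, repeating the key as needed."""
    if isinstance(key, int):
        key = bytes([key])  # a single-byte key is a key array of length 1
    if len(key) == 0:
        raise ValueError("key must not be empty")
    # Byte i of the input is paired with byte (i mod key length) of the key.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


single = xor_process(b"\x00\xff", 0xAA)
multi = xor_process(bytes([1, 2, 3, 4, 5]), [7, 42])
```

Because XOR is its own inverse, applying xor_process twice with the same key returns the original bytes, which is why the same operation serves for both obfuscating and recovering data.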
The rol and ror directives perform a circular shift on a buffer, rotating every byte by key bits left (rol) or right (ror).
Examples:
- process: rol(5)
  rotates every byte 5 bits left: every given bit combination b0-b1-b2-b3-b4-b5-b6-b7 becomes b5-b6-b7-b0-b1-b2-b3-b4
- process: ror(some_val)
  rotates every byte right by the number of bits given by the some_val field (which might be either parsed previously or calculated on the fly)
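A per-byte rotation as described above might look like the following Python sketch. The helper names are hypothetical, and the sketch assumes each byte is rotated independently within its own 8 bits, with b0 in the notation above being the most significant bit.

```python
def rol_byte(b: int, n: int) -> int:
    """Rotate one 8-bit value left by n bits."""
    n %= 8
    return ((b << n) | (b >> (8 - n))) & 0xFF


def ror_byte(b: int, n: int) -> int:
    """Rotate one 8-bit value right by n bits."""
    n %= 8
    return ((b >> n) | (b << (8 - n))) & 0xFF


def rol_process(data: bytes, n: int) -> bytes:
    # Each byte is rotated on its own; bits never cross byte boundaries.
    return bytes(rol_byte(b, n) for b in data)


def ror_process(data: bytes, n: int) -> bytes:
    return bytes(ror_byte(b, n) for b in data)


rotated = rol_process(bytes([0b11100000]), 5)
```

Here the three set bits b0, b1, b2 end up in positions 3 to 5 of the result, matching the b5-b6-b7-b0-b1-b2-b3-b4 pattern. Rotating right by the same amount undoes a left rotation.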
The zlib directive applies zlib decompression to the input buffer, expecting it to be a full-fledged zlib stream, i.e. one that starts with the regular 2-byte zlib header. Decompression parameters are chosen automatically from it. Typical zlib header values:
- 78 01: no compression or low compression
- 78 9C: default compression
- 78 DA: best compression
The output buffer is usually longer than the input. This processing method may throw an exception if the given data is not a valid zlib stream.
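The header values and the failure mode can be explored with Python's standard zlib module. This is a sketch for experimentation, not the Kaitai Struct runtime: zlib_process and looks_like_zlib_header are hypothetical helpers, and the exact header bytes assume the common zlib library behaviour at compression levels 1, 6, and 9.

```python
import zlib

payload = b"Kaitai Struct " * 64

# First two bytes of the stream at low, default, and best compression levels.
headers = {level: zlib.compress(payload, level)[:2].hex().upper()
           for level in (1, 6, 9)}


def looks_like_zlib_header(raw: bytes) -> bool:
    """Cheap plausibility check: method nibble is 8 (deflate) and the
    16-bit big-endian header is a multiple of 31, as the zlib format requires."""
    if len(raw) < 2:
        return False
    return (raw[0] & 0x0F) == 8 and ((raw[0] << 8) | raw[1]) % 31 == 0


def zlib_process(raw: bytes) -> bytes:
    """Inflate a complete zlib stream; raises zlib.error on invalid input."""
    return zlib.decompress(raw)


try:
    zlib_process(b"not a zlib stream")
    invalid_rejected = False
except zlib.error:
    invalid_rejected = True
```

A check like looks_like_zlib_header can help when deciding whether a buffer is worth passing to the decompressor at all, but it does not guarantee that the rest of the stream is valid.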