Wasm Binary Format Deep Dive
A .wasm file is not a text format you can grep — it is a dense, length-prefixed byte stream with a fixed header and a strict section order. When a bundler emits a 200 KB module, when a trap fires at wasm-function[7]:0x42, or when you need to confirm a build actually stripped its debug data, you end up staring at hex. This area teaches you to read that stream directly: the 8-byte preamble, the eleven well-known sections plus custom sections, the LEB128 variable-length integers that glue them together, and the tools that turn raw bytes back into something legible.
Prerequisites
Install the WebAssembly Binary Toolkit (wabt) and a byte-level hex viewer before working through any example here. Every command below assumes these are on your PATH.
- [ ] wabt ≥ 1.0.34: provides wasm-objdump, wasm2wat, and wasm-validate (brew install wabt or apt install wabt)
- [ ] xxd (ships with vim) or hexdump for raw byte dumps
- [ ] A .wasm file to inspect: build one with wat2wasm add.wat -o add.wasm from the snippet below
- [ ] Optional: wasm-opt from binaryen ≥ 116 to compare optimized vs unoptimized output
How the binary is laid out
Every module opens with an 8-byte header: the four-byte magic number 00 61 73 6d (the ASCII bytes for \0asm) followed by the four-byte little-endian version 01 00 00 00. A runtime that does not see those exact eight bytes rejects the module before parsing a single section. After the header comes a flat sequence of sections, each encoded as a one-byte section ID, a LEB128 byte-length prefix, and then that many payload bytes. The validator walks them strictly in the order below; the only sections allowed to appear out of order — or more than once — are custom sections (ID 0).
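The preamble check described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular runtime implements it; the function name `read_preamble` is a hypothetical helper introduced here.

```python
import struct

WASM_MAGIC = b"\x00asm"  # the four bytes 00 61 73 6d


def read_preamble(data: bytes) -> int:
    """Return the module version, or raise ValueError if the preamble is malformed."""
    if len(data) < 8:
        raise ValueError("file is shorter than the 8-byte preamble")
    if data[:4] != WASM_MAGIC:
        raise ValueError(f"bad magic: {data[:4].hex(' ')}")
    # "<I" reads a little-endian unsigned 32-bit integer at byte offset 4.
    (version,) = struct.unpack_from("<I", data, 4)
    return version


print(read_preamble(bytes.fromhex("0061736d01000000")))  # 1
```

Note that the version is read as a fixed-width field, unlike nearly every other integer in the format.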
The known sections fall into three groups. Declarations (Type, Import, Function, Table, Memory, Global) describe the module’s shape: the signatures it uses, what it imports, how many functions it defines and at which type index, and the size of its linear memory. The interface (Export, Start) names what JavaScript can reach and which function, if any, runs at instantiation. The contents (Element, Code, Data) hold the actual bytes: table initializers, function bodies, and the data segments copied into linear memory. Custom sections carry tooling metadata — the name section for symbol names, .debug_info for DWARF — and are skipped by the validator but preserved end to end.
| ID | Section | Holds |
|---|---|---|
| 0 | Custom | names, DWARF, source-map URL |
| 1 | Type | function signatures (deduped) |
| 2 | Import | external functions, tables, memories, globals |
| 3 | Function | type index per defined function |
| 4 | Table | funcref table limits |
| 5 | Memory | linear memory min/max pages |
| 6 | Global | global variable types + init |
| 7 | Export | public symbols |
| 8 | Start | auto-run function index |
| 9 | Element | table initializers |
| 10 | Code | function bodies + locals |
| 11 | Data | linear memory initializers |
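As a rough sketch of how the table drives a parser, the following Python walks the section sequence and enforces the ordering rule for IDs 1 through 11 only. The names `SECTION_NAMES`, `read_uleb128`, and `list_sections` are hypothetical helpers, and `MODULE` is a hand-assembled prefix of a small module rather than a verified tool dump.

```python
SECTION_NAMES = {
    0: "Custom", 1: "Type", 2: "Import", 3: "Function", 4: "Table",
    5: "Memory", 6: "Global", 7: "Export", 8: "Start", 9: "Element",
    10: "Code", 11: "Data",
}


def read_uleb128(data, offset):
    """Decode an unsigned LEB128 value; return (value, next offset)."""
    result = shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7


def list_sections(data):
    """Yield (section_id, payload_offset, payload_size) for each section."""
    offset, last_id = 8, 0  # sections start right after the 8-byte header
    while offset < len(data):
        section_id = data[offset]
        size, payload = read_uleb128(data, offset + 1)
        if section_id != 0:  # custom sections are exempt from the ordering rule
            if section_id <= last_id:
                raise ValueError(f"section {section_id} is out of order or repeated")
            last_id = section_id
        yield section_id, payload, size
        offset = payload + size


# Hand-assembled: header, Type, Function, Memory.
MODULE = bytes.fromhex("0061736d01000000" "010701600" "27f7f017f" "03020100" "0503010001")
for section_id, payload, size in list_sections(MODULE):
    print(SECTION_NAMES.get(section_id, "unknown"), payload, size)
```

The sketch never looks inside a payload: the length prefix alone is enough to hop from one section ID byte to the next.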
This byte-level view is the inverse of the readable WebAssembly text format (WAT) basics: WAT is the s-expression source you write, the binary is what wat2wasm distills it into, and wasm2wat round-trips back. Keep both windows open while learning — every section here maps to a construct there.
Why the section order is fixed
The ascending order is not arbitrary; it is a single-pass parsing contract. The Type section comes first because everything downstream references type indices: the Import and Function sections both name a type by index, and they cannot do that until the type vector exists. Imports precede the Function section because imported functions occupy the low end of the function index space, so a module-defined function’s index is (number of imported functions) + (its position in the Function section). Get that base wrong and every call target resolves to the wrong function. The Code section comes near the end because a function body can call any function and reference any global, table, or memory, so the validator wants those declarations assembled before it type-checks a single instruction. The Data section is the one exception: it follows Code, so a body that names a data segment (memory.init, data.drop) relies on the optional DataCount section (ID 12), which sits just before Code, to learn how many segments exist. This is also why a streaming compiler can begin JIT-compiling function bodies the moment the Code section starts arriving over the network: every declaration it needs has already streamed past.
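The index arithmetic is easy to get wrong by one base offset, so here is a small sketch of it. The function `resolve_function` and the import and function names are hypothetical, chosen only to illustrate how an index from a trap address maps back to either an import or a module-defined function.

```python
def resolve_function(index, imported, defined):
    """Map an index in the function index space to ('import' | 'defined', name)."""
    if index < len(imported):
        return "import", imported[index]
    # Defined functions start where the imports end.
    return "defined", defined[index - len(imported)]


imported = ["env.log", "env.now"]   # hypothetical imports: indices 0 and 1
defined = ["add", "main"]           # hypothetical defined functions: indices 2 and 3

print(resolve_function(1, imported, defined))  # ('import', 'env.now')
print(resolve_function(2, imported, defined))  # ('defined', 'add')
```

With two imports, the first entry of the Function section is function 2, not function 0.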
Vectors: the repeating shape
Most sections are a vector — a LEB128 element count followed by that many elements. The Type section is count followed by count function types; the Export section is count followed by count export entries; the Code section is count followed by count function bodies. Once you internalize “count, then elements”, the whole format collapses into a handful of nested vectors, and decoding becomes mechanical: read a count, loop that many times, recurse into each element. The only structural variation is that some elements (a function type, an export descriptor, a data segment) are themselves small fixed sequences with their own embedded vectors and LEB128 fields.
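The "count, then elements" pattern can be sketched as one generic helper that takes an element reader. The names `read_vec`, `read_byte`, and `read_func_type` are hypothetical, and the payload is a hand-written Type section body holding one signature with two i32 parameters and one i32 result.

```python
def read_uleb128(data, offset):
    """Decode an unsigned LEB128 value; return (value, next offset)."""
    result = shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7


def read_vec(data, offset, read_element):
    """Read a count, then that many elements using read_element."""
    count, offset = read_uleb128(data, offset)
    items = []
    for _ in range(count):
        item, offset = read_element(data, offset)
        items.append(item)
    return items, offset


def read_byte(data, offset):
    """Element reader for one-byte items such as value types."""
    return data[offset], offset + 1


def read_func_type(data, offset):
    """A function type: tag 0x60, a vector of params, a vector of results."""
    if data[offset] != 0x60:
        raise ValueError("expected function-type tag 0x60")
    params, offset = read_vec(data, offset + 1, read_byte)
    results, offset = read_vec(data, offset, read_byte)
    return (params, results), offset


payload = bytes.fromhex("0160027f7f017f")
types, end = read_vec(payload, 0, read_func_type)
print(types)  # [([127, 127], [127])]
```

The outer vector and the two inner vectors all go through the same `read_vec`, which is the nesting the paragraph above describes.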
LEB128: why nothing is a fixed width
Almost every integer in a .wasm file — section lengths, vector counts, type indices, function-body sizes, and instruction immediates — is encoded as LEB128 (Little Endian Base 128), a variable-length scheme. Each byte carries seven payload bits in its low bits; the high bit (0x80) is a continuation flag. While that bit is set, more bytes follow. You decode by masking each byte with 0x7F, shifting it left by an increasing multiple of seven, and stopping at the first byte whose high bit is clear.
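```python
def decode_uleb128(data, offset):
    """Decode an unsigned LEB128 value; return (value, offset of the next byte)."""
    result, shift = 0, 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift  # low seven bits are payload
        if not (byte & 0x80):  # high bit clear: this was the last byte
            return result, offset
        shift += 7
```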
The payoff is density: the value 5 is one byte (05), and small section sizes stay one byte while still allowing a section to grow past 127 bytes without a fixed 4-byte field. Signed values (used by i32.const/i64.const immediates) use signed LEB128, which sign-extends the final group — a detail that bites anyone who decodes a negative constant as if it were unsigned. The key consequence for manual parsing: you cannot seek to a fixed offset. Sizes are data-dependent, so a single mis-decoded continuation bit desynchronizes every byte after it.
To make the boundary cases concrete: the value 127 encodes as a single byte 7f (high bit clear, payload 0x7f). The value 128 needs two bytes — 80 01 — because seven payload bits cannot hold it: the first byte is 0x80 (payload 0, continuation set), the second is 0x01 (payload 1, continuation clear), giving (1 << 7) | 0 = 128. The value 624485 is e5 8e 26. For signed LEB128, -1 is the single byte 7f as well — which is precisely why you must know a field’s signedness before decoding it, since the same byte means 127 unsigned and -1 signed. Section lengths, vector counts, and all indices are unsigned; only i32.const/i64.const immediates and a few block-type encodings are signed.
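The signed variant differs only in its last step. The sketch below is one common way to write it, with `decode_sleb128` as a hypothetical name; an unsigned decoder is included so the two readings of the same byte can be compared side by side.

```python
def decode_uleb128(data, offset=0):
    """Unsigned LEB128: return (value, next offset)."""
    result = shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7


def decode_sleb128(data, offset=0):
    """Signed LEB128: return (value, next offset)."""
    result = shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            # 0x40 is the top payload bit of the final group: the sign bit.
            if byte & 0x40:
                result -= 1 << shift
            return result, offset


print(decode_uleb128(bytes.fromhex("7f"))[0])      # 127
print(decode_sleb128(bytes.fromhex("7f"))[0])      # -1
print(decode_uleb128(bytes.fromhex("8001"))[0])    # 128
print(decode_uleb128(bytes.fromhex("e58e26"))[0])  # 624485
```

Both decoders consume the same number of bytes for a given input; only the interpretation of the final group changes.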
One more subtlety: the spec does not require LEB128 values to be in canonical (shortest) form. It only caps the encoding at ceil(N/7) bytes for an N-bit integer (five bytes for a u32), so a hand-written encoder that pads values with trailing zero groups can still validate. Do not rely on a fixed byte width when re-encoding. Emit the minimal form to match what wat2wasm produces by default, or your round-trip diff will show spurious byte differences even though the decoded values agree.
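To see how a padded and a minimal encoding can disagree byte for byte while agreeing in value, here is a sketch of an encoder with an optional padding width. The `pad_to` parameter is a hypothetical addition for illustration only; a plain encoder would omit it.

```python
def encode_uleb128(value, pad_to=0):
    """Encode value as unsigned LEB128, optionally padded to at least pad_to bytes."""
    if value < 0:
        raise ValueError("unsigned LEB128 cannot encode a negative value")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value or len(out) + 1 < pad_to:
            out.append(byte | 0x80)  # more groups follow (real or padding)
        else:
            out.append(byte)
            return bytes(out)


def decode_uleb128(data, offset=0):
    """Unsigned LEB128: return (value, next offset)."""
    result = shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7


minimal = encode_uleb128(5)
padded = encode_uleb128(5, pad_to=5)
print(minimal.hex(" "))  # 05
print(padded.hex(" "))   # 85 80 80 80 00
print(decode_uleb128(minimal)[0] == decode_uleb128(padded)[0])  # True
```

A byte-level diff would flag these two encodings as different, which is the spurious difference the paragraph above warns about.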
Step-by-step: decode a real module
The following workflow takes a minimal adder from text to bytes and back, so you can see every byte you produce.
- Write a tiny module in WAT. This exports one function and one linear memory page.

  ```wat
  (module
    (memory (export "mem") 1)
    (func (export "add") (param $a i32) (param $b i32) (result i32)
      (i32.add (local.get $a) (local.get $b))))
  ```

- Assemble it to a binary. wat2wasm writes the exact byte stream.

  ```shell
  wat2wasm add.wat -o add.wasm
  ```

- Dump the raw bytes. Group by one byte, sixteen per row, so offsets line up with the spec.

  ```shell
  xxd -g 1 -c 16 add.wasm
  ```

- Read the header. Confirm the first eight bytes are the magic and version before trusting anything downstream.

- List the sections with their IDs and sizes. wasm-objdump -h prints each section’s ID, name, byte offset, and LEB128-decoded length: your map for everything below.

  ```shell
  wasm-objdump -h add.wasm
  ```

- Disassemble the Code section. wasm-objdump -d shows each function body with the per-instruction byte offsets used in trap addresses.

  ```shell
  wasm-objdump -d add.wasm
  ```
Worked hex example
Here is the start of the assembled add.wasm, annotated. The header is the first eight bytes; the Type section (ID 01) follows immediately, then the Function section (ID 03).
```text
00000000: 00 61 73 6d 01 00 00 00  | magic "\0asm", version 1
00000008: 01 07 01 60 02 7f 7f 01  | sec 01 (Type), len 7: 1 type, func, 2 params i32 i32, 1 result
00000010: 7f 03 02 01 00 05 03 01  | ...i32 result | sec 03 (Function) len 2: func[0]=type 0 | sec 05 (Memory)...
```
Reading the Type section byte by byte: 01 is the section ID; 07 is the LEB128 length (7 payload bytes); 01 is the count of types in the vector; 60 is the function-type tag; 02 is the parameter count; 7f 7f are two i32 value types (0x7f is the i32 tag); 01 7f is one i32 result. That is the signature (param i32 i32) (result i32) — identical signatures share this one entry because the Type section deduplicates. The Function section that follows (03 02 01 00) reads: ID 03, length 2, one function in the vector, referencing type index 00. Nothing here is a fixed offset — the Memory section begins only after the Function section’s LEB128 length has been consumed.
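When checking an annotated dump like this by hand, it can help to turn the text back into bytes and assert the boundaries you think you found. The sketch below assumes an `offset: hex bytes | comment` line layout matching the annotated rows in this section; `dump_to_bytes` is a hypothetical helper.

```python
DUMP = """
00000000: 00 61 73 6d 01 00 00 00 | magic, version 1
00000008: 01 07 01 60 02 7f 7f 01 | Type section starts
00000010: 7f 03 02 01 00 05 03 01 | Function, then Memory
"""


def dump_to_bytes(dump: str) -> bytes:
    """Strip offsets and annotations from an annotated hex dump."""
    out = bytearray()
    for line in dump.strip().splitlines():
        _, _, rest = line.partition(":")      # drop the offset column
        hex_part = rest.split("|", 1)[0]      # drop the annotation
        out += bytes.fromhex(hex_part)        # fromhex ignores the spaces
    return bytes(out)


data = dump_to_bytes(DUMP)
type_id, type_len = data[8], data[9]
type_payload = data[10:10 + type_len]
next_section_id = data[10 + type_len]

print(type_id, type_len, type_payload.hex(" "))  # 1 7 01 60 02 7f 7f 01 7f
print(next_section_id)                           # 3
```

Landing on `03` immediately after the seven payload bytes confirms that the Type section length was decoded correctly.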
Validation: what the format guarantees before any code runs
The binary layout exists to be validated cheaply. Before an engine JIT-compiles a single instruction, it walks the sections once and proves a set of structural invariants: every type index points into the Type vector, every function body’s stack is balanced and well-typed at every branch, every block/loop/if is closed by a matching end, and every memory access’s alignment immediate stays within the natural alignment of the access. Because branches are structured (you cannot jump to an arbitrary byte offset, only to an enclosing label), the validator never has to build a general control-flow graph: the nesting is the graph. That is what lets a browser accept untrusted bytes from the network and still guarantee the module cannot escape its sandbox. A trap at runtime (out-of-bounds load, integer divide by zero, unreachable) is the only failure the format permits to slip past validation, and even then it halts the instance rather than corrupting the host.
This shifts a whole class of bugs left into the build. Running wasm-validate in CI catches a malformed Code section or a dangling type index at build time, with a byte offset, instead of as an opaque CompileError in a user’s browser. The validation pass is also part of why size optimization is safe to automate: a tool can delete, reorder, or rewrite instructions, and the validator will reject any transform that breaks the stack discipline. Validation proves the output is well-typed, not that it behaves the same, so keep your tests in the loop.
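To illustrate why structured branches make the check cheap, here is a deliberately tiny sketch that tracks only label depth. It is far from a real validator: it does no type checking, knows a hypothetical handful of opcodes, assumes one-byte block types, and ignores `else`. The name `check_nesting` is introduced here for illustration.

```python
def read_uleb128(data, offset):
    """Decode an unsigned LEB128 value; return (value, next offset)."""
    result = shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7


BLOCK_OPENERS = {0x02, 0x03, 0x04}        # block, loop, if
END, BR, BR_IF = 0x0B, 0x0C, 0x0D
LEB_IMMEDIATE = {0x20, 0x21, 0x41}        # local.get, local.set, i32.const
NO_IMMEDIATE = {0x01, 0x1A, 0x6A, 0x6B}   # nop, drop, i32.add, i32.sub


def check_nesting(body: bytes) -> None:
    """Raise ValueError if blocks are unbalanced or a branch label is too deep."""
    depth = 1  # the function body itself is the outermost label
    offset = 0
    while offset < len(body):
        op = body[offset]
        offset += 1
        if op in BLOCK_OPENERS:
            offset += 1  # skip the block type (assumed to be a single byte)
            depth += 1
        elif op == END:
            depth -= 1
            if depth == 0 and offset != len(body):
                raise ValueError("bytes after the final end")
        elif op in (BR, BR_IF):
            label, offset = read_uleb128(body, offset)
            if label >= depth:
                raise ValueError(f"branch to label {label} escapes depth {depth}")
        elif op in LEB_IMMEDIATE:
            # Skipping works for signed immediates too: continuation bits match.
            _, offset = read_uleb128(body, offset)
        elif op not in NO_IMMEDIATE:
            raise ValueError(f"opcode 0x{op:02x} is outside this sketch's table")
    if depth != 0:
        raise ValueError("unclosed block")


check_nesting(bytes.fromhex("200020016a0b"))   # local.get 0, local.get 1, i32.add, end
check_nesting(bytes.fromhex("02400c010b0b"))   # block, br 1, end, end
```

A single integer is all the state the branch check needs, because a label can only ever name an enclosing block.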
Size on the wire: what each section costs
When you profile a .wasm bundle, the section breakdown tells you where the bytes went. For most compiled modules the Code section dominates — it is the actual instruction stream — followed by Data (string literals and static buffers copied into linear memory) and, in debug builds, the custom sections holding the name table and DWARF. Stripping debug data with wasm-opt --strip-debug commonly removes 40–60% of a development build because those custom sections carry every source path, line number, and local-variable name. The structural sections (Type, Import, Function, Export) are usually a rounding error by comparison, though aggressive export of every internal symbol bloats the Export section and defeats dead-code elimination.
The format’s length-prefixing is what makes this analysis fast and what makes shrinking safe: because each section is self-delimiting, wasm-opt can drop the name section wholesale without touching a byte of Code, and wasm-objdump -h can report exact per-section sizes without parsing the instructions. The mechanics of squeezing those sections — -Oz, --strip-debug, and measuring the result — are the subject of reducing Wasm bundle size with wasm-opt.
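The self-delimiting property can be demonstrated with a sketch that drops every custom section by copying the other sections through untouched. This is not how wasm-opt is implemented; `strip_custom_sections` is a hypothetical helper, and the module is hand-assembled with an empty custom section named "name".

```python
def read_uleb128(data, offset):
    """Decode an unsigned LEB128 value; return (value, next offset)."""
    result = shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7


def strip_custom_sections(data: bytes) -> bytes:
    """Return a copy of the module with every ID-0 section removed."""
    out = bytearray(data[:8])  # keep the header as is
    offset = 8
    while offset < len(data):
        section_id = data[offset]
        size, payload = read_uleb128(data, offset + 1)
        end = payload + size
        if section_id != 0:
            out += data[offset:end]  # copy ID, length prefix, and payload verbatim
        offset = end
    return bytes(out)


header = bytes.fromhex("0061736d01000000")
custom = bytes.fromhex("0005046e616d65")      # ID 0, length 5, name "name", no contents
type_section = bytes.fromhex("010401600000")  # one signature: no params, no results
module = header + custom + type_section

stripped = strip_custom_sections(module)
print(len(module), "->", len(stripped))  # 21 -> 14
```

No payload is ever decoded here, which is why dropping metadata cannot disturb the Code section.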
Binding the parsed memory back to JavaScript
The Memory section you just decoded (min/max pages, where a page is 64 KiB) is exactly what JavaScript receives as instance.exports.mem. Decoding it by hand and then reading the same bytes from JS closes the loop between the binary layout and the runtime.
```javascript
const bytes = await fetch("/add.wasm").then((r) => r.arrayBuffer());
const { instance } = await WebAssembly.instantiate(bytes);
const mem = new Uint8Array(instance.exports.mem.buffer); // one 64 KiB page, as the Memory section declared
console.log(instance.exports.add(2, 40)); // 42: the Code section you disassembled, executed
```
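For the binary side of that loop, here is a sketch that decodes a Memory section payload into page limits and a byte size. It assumes the simple one-byte flag form (0x00 for minimum only, 0x01 for minimum and maximum); `read_memory_limits` is a hypothetical helper, and the payload is the three bytes that follow `05 03` in the worked module.

```python
PAGE_SIZE = 64 * 1024  # one Wasm page is 64 KiB


def read_uleb128(data, offset):
    """Decode an unsigned LEB128 value; return (value, next offset)."""
    result = shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7


def read_memory_limits(payload: bytes):
    """Return a list of (min_pages, max_pages or None) from a Memory section payload."""
    count, offset = read_uleb128(payload, 0)
    memories = []
    for _ in range(count):
        flags = payload[offset]
        offset += 1
        minimum, offset = read_uleb128(payload, offset)
        maximum = None
        if flags & 0x01:  # low bit set: a maximum follows
            maximum, offset = read_uleb128(payload, offset)
        memories.append((minimum, maximum))
    return memories


limits = read_memory_limits(bytes.fromhex("010001"))
print(limits)                     # [(1, None)]
print(limits[0][0] * PAGE_SIZE)   # 65536
```

The 65536 computed here should match the `byteLength` of the exported memory's buffer right after instantiation, before any growth.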
Gotchas & failure modes
- A compression wrapper masquerades as corruption. If xxd shows 1f 8b (gzip) where the magic should be, the file was stored or served compressed. WebAssembly.Module() then fails with a CompileError along the lines of "magic header not detected" (the exact wording varies by engine). Run file add.wasm and decompress before parsing.
- Signed vs unsigned LEB128. Decoding an i32.const -1 immediate (7f) as unsigned yields 127, not -1. Constant immediates are signed; section lengths and indices are unsigned. Use the right decoder per field.
- 0xFC/0xFD prefix bytes are not unknown opcodes. They introduce the bulk-memory/table (0xFC) and SIMD (0xFD) instruction families; a real opcode follows as a second LEB128 value. Treating the prefix as a standalone instruction desynchronizes the stream.
- Custom sections can sit anywhere. A name or .debug_info section may appear between known sections, so a parser that assumes strict monotonic IDs will choke. Always branch on ID 0 and skip by its declared length.
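The first gotcha is cheap to guard against in a loader. The sketch below checks for the gzip signature before checking for the Wasm magic; `load_wasm` is a hypothetical helper, and it only handles gzip, not other compression formats.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"
WASM_MAGIC = b"\x00asm"


def load_wasm(raw: bytes) -> bytes:
    """Return module bytes, transparently undoing a gzip wrapper if present."""
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    if raw[:4] != WASM_MAGIC:
        raise ValueError(f"not a Wasm module, first bytes: {raw[:4].hex(' ')}")
    return raw


module = bytes.fromhex("0061736d01000000")
print(load_wasm(gzip.compress(module)) == module)  # True
print(load_wasm(module) == module)                 # True
```

Checking the signature first gives a clear error message instead of a confusing parse failure several bytes in.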
Verification
Cross-check every manual parse against the toolchain. wasm-validate confirms the module is well formed; wasm-objdump -x prints the full structured view (headers, types, imports, exports, memory); wasm-objdump -d gives the per-offset disassembly that matches trap addresses; and wasm2wat turns the binary back into text you can diff against your source. Expect that diff to show differences in formatting and in $identifiers, which wat2wasm does not write to the binary by default; what should match is the structure.

```shell
wasm-validate add.wasm               # silent on success, error with offset on failure
wasm-objdump -x add.wasm             # structured section-by-section view
wasm2wat add.wasm | diff - add.wat   # compare structure; formatting and $names will differ
```
If your hand-decoded section boundaries disagree with wasm-objdump -h, trust the tool and hunt for a mis-decoded LEB128 length — that is the cause in nearly every case.
In this guide
- Decoding Wasm opcodes for debugging: turn a wasm-function[i]:0xNNNN trap address into the exact failing instruction.
- How to decode .wasm files manually: parse the header, a section, and a LEB128 value by hand from a raw hex dump.
Frequently Asked Questions
Why is the version field four bytes if it is always 1?
The version is a little-endian u32 (01 00 00 00), fixed-width by design so a runtime can reject the module on the first eight bytes without any LEB128 decoding. It has stayed at 1 since the MVP; proposals add sections and opcodes rather than bumping it, so the four bytes remain reserved headroom.
Do sections have to appear in numeric ID order?
The known sections (IDs 1–11) must appear at most once and in ascending ID order. Custom sections (ID 0) are the exception — any number of them may appear at any position, which is how the name section and DWARF debug data are interleaved.
How do I tell the Code section from the Data section in a hex dump?
By the section ID byte that precedes each one: Code is 0x0A, Data is 0x0B. Both are length-prefixed with a LEB128 size, so once you have read the Code section’s declared length you land exactly on the Data section’s ID byte. Use wasm-objdump -h to confirm the offsets.
Why does my parser desync after the first function?
Almost always a mis-decoded LEB128: either you stopped at the wrong continuation bit, or you read a signed immediate as unsigned. Because every subsequent offset is computed from the previous length, one wrong byte cascades. Re-run with the absolute offset logged at each step and compare against wasm-objdump -d.
Related
- WebAssembly text format (WAT) basics — the readable source the binary is assembled from.
- Debugging and profiling Wasm modules — DWARF, source maps, and DevTools on top of the binary layout.
- Stack vs heap execution model — what the decoded Code and Memory sections do at runtime.
- Reducing Wasm bundle size with wasm-opt — shrinking the very sections you just dissected.