Debugging & Profiling Wasm Modules

A release .wasm is a wall of numeric opcodes with no function names, no variable names, and no line numbers. When it traps or returns the wrong answer, the default stack trace points at wasm-function[37] at byte offset 0x2c41 — useless. This area is about wiring up the debug information that turns that wall back into your original Rust or C++ source: embedding DWARF, loading the Chrome DevTools C/C++ extension, setting breakpoints in the original file, reading linear memory at a pointer, and then profiling compile, instantiate, and steady-state run time so you optimize the part that actually costs.

Prerequisites

  • [ ] Chrome 119+ (or Edge) with WebAssembly Debugging: Enable DWARF support turned on in DevTools → Settings → Experiments
  • [ ] The C/C++ DevTools Support (DWARF) extension installed from the Chrome Web Store
  • [ ] A build that emits debug info: Rust via wasm-pack build --dev (or a raw cargo build without --release), C/C++ via emcc -g (DWARF) or -gsource-map (source maps)
  • [ ] wabt installed for verification — wasm-objdump, wasm-validate on your PATH
  • [ ] For Rust panics, the console_error_panic_hook crate added as a dependency
  • [ ] A local dev server serving the .wasm with the correct application/wasm MIME type

How debug info flows from source to DevTools

Debugging Wasm works because the compiler can embed a description of your original source — file names, line numbers, variable layouts — directly into the binary, and DevTools knows how to read it back. There are two encodings. DWARF is the same format native debuggers use; the toolchain packs it into custom sections (.debug_info, .debug_line, and friends) inside the .wasm. A source map is the lighter web-native alternative — a separate .wasm.map JSON file that maps byte offsets to source positions but carries no variable type information.

The other piece is the name section, a custom section that gives each function and local a readable name. Without it, DevTools and wasm-objdump fall back to numeric placeholders such as $func23 or func[23]. With it, even a binary stripped of DWARF shows greet instead of wasm-function[7].

flowchart LR
  subgraph src[Your source]
    A[lib.rs / main.cpp]
  end
  subgraph build[Compile with debug info]
    B["rustc / emcc<br/>-g or --dev"]
  end
  subgraph wasm[".wasm binary"]
    C["code section"]
    D[".debug_* sections (DWARF)"]
    E["name section"]
  end
  subgraph dt[Chrome DevTools]
    F["C/C++ DWARF extension"]
    G["Sources panel<br/>original file + breakpoints"]
    H["Scope / Memory inspector"]
  end
  A --> B --> C
  B --> D
  B --> E
  D --> F --> G
  E --> G
  C --> H
  G --> H

The flow is one-directional: the compiler is the only thing that knows how machine-level offsets relate to your source, so if you do not ask it to record that mapping at build time, no tool can reconstruct it afterward. This is the same custom-section machinery described in the Wasm binary format deep dive — debug data rides in optional sections the engine ignores at runtime.

It helps to be precise about which problem each artifact solves. The name section is the cheapest and most universal: a few kilobytes that map function and local indices to identifiers, enough to turn a stack trace from wasm-function[23] into parse_header. It costs almost nothing and is worth keeping even in many production builds when you want intelligible crash reports. The .debug_line section adds the offset-to-line mapping that makes breakpoints bind. The .debug_info and .debug_str sections add the full type system — struct layouts, enum variants, generic instantiations — which is what lets the Scope pane render a Vec<Token> as an expandable list rather than an opaque i32. You pay for that richness in bytes, and the three layers can be enabled independently, which is why “I have function names but no variables” is a common and explicable state.


Step-by-step: from blind binary to source-level debugging

1. Build with debug information

For Rust, a development build keeps DWARF and the name section; a release build strips both by default.

# Rust: keep DWARF + name section, no optimization stripping
wasm-pack build --dev --target web

# C/C++: emit DWARF into the .wasm
emcc app.c -g -gdwarf-5 -o app.js

2. Confirm the debug sections are present

Before opening DevTools, verify the binary actually carries what you need. A missing section here explains almost every “breakpoints don’t bind” report.

wasm-objdump -h pkg/app_bg.wasm

You want to see name and several .debug_* custom sections in the output (see Verification below). If they are absent, the build stripped them.
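The same check can be scripted when wabt is not at hand. The following is a minimal sketch, not a replacement for wasm-objdump: it walks the section headers of a .wasm byte array and reports which custom sections are present. The function names (readVarUint, listCustomSections, describeDebugLayers) are hypothetical helpers introduced for illustration, and the sketch assumes a well-formed binary.

```javascript
// Read an unsigned LEB128 integer starting at `offset`.
function readVarUint(bytes, offset) {
  let value = 0;
  let shift = 0;
  let pos = offset;
  for (;;) {
    if (pos >= bytes.length) throw new Error("truncated LEB128 value");
    const byte = bytes[pos++];
    value += (byte & 0x7f) * 2 ** shift; // multiplication avoids 32-bit overflow
    shift += 7;
    if ((byte & 0x80) === 0) break;
  }
  return { value, next: pos };
}

// Return [{ name, size }] for every custom section (section id 0).
function listCustomSections(bytes) {
  const magic = [0x00, 0x61, 0x73, 0x6d]; // "\0asm"
  if (bytes.length < 8 || !magic.every((b, i) => bytes[i] === b)) {
    throw new Error("not a wasm binary");
  }
  const found = [];
  let pos = 8; // skip the magic number and the 4-byte version
  while (pos < bytes.length) {
    const id = bytes[pos++];
    const size = readVarUint(bytes, pos);
    const payloadStart = size.next;
    if (id === 0) {
      // A custom section payload begins with its own length-prefixed name.
      const nameLen = readVarUint(bytes, payloadStart);
      const nameBytes = bytes.subarray(nameLen.next, nameLen.next + nameLen.value);
      found.push({ name: new TextDecoder().decode(nameBytes), size: size.value });
    }
    pos = payloadStart + size.value; // jump to the next section header
  }
  return found;
}

// Map the section names onto the three debug layers.
function describeDebugLayers(sections) {
  const names = sections.map((s) => s.name);
  return {
    hasNames: names.includes("name"),
    hasLineTable: names.includes(".debug_line"),
    hasTypeInfo: names.includes(".debug_info"),
  };
}
```

In a page you might feed it new Uint8Array(await (await fetch("/pkg/app_bg.wasm")).arrayBuffer()); a result with hasNames true but hasLineTable false would explain names in stack traces with no bindable breakpoints.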

3. Enable DWARF support and load the extension

In DevTools → Settings → Experiments, tick WebAssembly Debugging: Enable DWARF support, then install the C/C++ DevTools Support (DWARF) extension. The extension is what parses the DWARF sections, maps Wasm offsets back to source lines, and resolves variable types; without it DevTools can still offer line-level breakpoints when a source map is present, but there is no typed variable inspection.

4. Set a breakpoint in the original source

Reload the page. In the Sources panel the original lib.rs / main.cpp now appears under a file tree (not a wasm:// pseudo-path). Click a line number to set a breakpoint. When execution reaches it, the call stack shows real function names and you can step with the usual F10 / F11 controls.

5. Inspect state at the breakpoint

The Scope pane lists locals with their resolved types. For anything behind a pointer — a slice, a Vec, a struct field — open the Memory inspector and jump to the pointer value to read the raw bytes in linear memory. Reading memory directly is covered end-to-end in inspecting Wasm memory in Chrome DevTools.
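The pointer-chasing step can also be done from the console. This is a minimal sketch assuming you already hold a WebAssembly.Memory (for example instance.exports.memory) plus a pointer and length copied from the Scope pane; readBytes, readUtf8, and hexDump are hypothetical helper names, not part of any toolchain.

```javascript
// Return a view (not a copy) over `len` bytes of linear memory at `ptr`.
function readBytes(memory, ptr, len) {
  if (ptr < 0 || len < 0 || ptr + len > memory.buffer.byteLength) {
    throw new RangeError(`range ${ptr}..${ptr + len} is outside linear memory`);
  }
  return new Uint8Array(memory.buffer, ptr, len);
}

// Decode the bytes as UTF-8, e.g. for a Rust &str given as (ptr, len).
function readUtf8(memory, ptr, len) {
  return new TextDecoder("utf-8").decode(readBytes(memory, ptr, len));
}

// Format the bytes as space-separated hex pairs for quick inspection.
function hexDump(memory, ptr, len) {
  return Array.from(readBytes(memory, ptr, len), (b) =>
    b.toString(16).padStart(2, "0")
  ).join(" ");
}
```

Because readBytes returns a view, it reflects later writes by the module, but it should be re-created after any memory growth.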

6. Capture panics with a readable trace

A Rust panic! in Wasm aborts with a bare unreachable trap and an opaque message unless you install a hook that routes the panic payload and backtrace to console.error.

use wasm_bindgen::prelude::*;

#[wasm_bindgen(start)]
pub fn main() {
    console_error_panic_hook::set_once();
}

After this, an out-of-bounds index logs the panic message and a symbolicated stack instead of RuntimeError: unreachable executed. The reason the default is so unhelpful is structural: a Rust panic that is not caught compiles to the unreachable instruction, which traps the engine with no payload. The hook intercepts the panic before it reaches that instruction, formats the message and backtrace as a string, and hands it to console.error through an import — so the readable trace is a JavaScript-side artifact, not something the Wasm runtime produces on its own. For C and C++, the analogous tool is Emscripten’s ASSERTIONS=2 and SAFE_HEAP=1 settings, which compile in runtime checks that fire a descriptive abort instead of silently corrupting memory.
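To see how little the engine reports on its own, the sketch below instantiates a hand-assembled module whose single export, boom (a hypothetical name), executes only the unreachable instruction, and then catches the resulting trap on the JavaScript side. It illustrates the bare trap; it is not how the panic hook itself is implemented.

```javascript
// Hand-assembled module: (func (export "boom") unreachable)
const trapModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x04, 0x01, 0x60, 0x00, 0x00,             // type section: () -> ()
  0x03, 0x02, 0x01, 0x00,                         // function section: one func of type 0
  0x07, 0x08, 0x01, 0x04, 0x62, 0x6f, 0x6f, 0x6d, 0x00, 0x00, // export "boom"
  0x0a, 0x05, 0x01, 0x03, 0x00, 0x00, 0x0b,       // code: no locals, unreachable, end
]);

const trapInstance = new WebAssembly.Instance(
  new WebAssembly.Module(trapModuleBytes),
  {}
);

// Call an export and report whether it trapped. Other errors are re-thrown.
function callAndDescribe(fn) {
  try {
    fn();
    return { trapped: false, message: "" };
  } catch (err) {
    if (err instanceof WebAssembly.RuntimeError) {
      // The message names the trap kind only; no panic payload is attached.
      return { trapped: true, message: err.message };
    }
    throw err;
  }
}

const outcome = callAndDescribe(trapInstance.exports.boom);
console.log("trapped:", outcome.trapped, "message:", outcome.message);
```

The exact message text is engine-specific, which is one more reason to route the panic payload through a hook rather than parse the trap.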


A loading example that keeps debugging working

How you instantiate the module affects what you can debug. Use streaming instantiation so DevTools sees the network response and can attach the source map, and keep the module URL stable so cached breakpoints rebind across reloads.

// Serve app_bg.wasm with Content-Type: application/wasm so streaming works.
const imports = {
  // wasm-bindgen fills the real import object; shown here for illustration.
  wbg: { /* generated glue */ },
};

const response = fetch("/pkg/app_bg.wasm");
const { instance } = await WebAssembly.instantiateStreaming(response, imports);

// Expose the memory so the Memory inspector and your console both see the same buffer.
globalThis.__wasm = instance.exports;
console.log("memory pages:", instance.exports.memory.buffer.byteLength / 65536);

If a source map is present, the compiler writes a sourceMappingURL custom section pointing at app.wasm.map; DevTools fetches it automatically when you open the module. For raw DWARF there is no separate file — everything is already inside the .wasm, which is why those binaries are much larger. The mechanics of embedding and using DWARF versus a source map are detailed in debugging Wasm with DWARF and source maps.
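If you load the module by hand rather than through generated glue, it can help to make the streaming path degrade gracefully. The following is a sketch under stated assumptions: instantiateWithFallback and its getResponse parameter are hypothetical names, and getResponse is assumed to return a fresh Response promise on every call (for example () => fetch("/pkg/app_bg.wasm")), because a response body can only be consumed once.

```javascript
// Try streaming instantiation first; fall back to buffered instantiation
// if streaming is unavailable or rejects (for example on a wrong MIME type).
async function instantiateWithFallback(getResponse, imports = {}) {
  if (typeof WebAssembly.instantiateStreaming === "function") {
    try {
      return await WebAssembly.instantiateStreaming(getResponse(), imports);
    } catch (err) {
      // Any streaming failure lands here. A genuine compile or link error
      // will simply be thrown again by the buffered path below.
      console.warn("streaming instantiation failed, falling back:", err.message);
    }
  }
  const response = await getResponse(); // second request for the buffered path
  const bytes = await response.arrayBuffer();
  return WebAssembly.instantiate(bytes, imports);
}
```

Treat the warning as a signal to fix the server's Content-Type rather than as a permanent state, since the buffered path gives up compile-while-downloading.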


Profiling: compile time, instantiate time, and throughput

Debug info answers “what is wrong”; profiling answers “what is slow”. There are three distinct costs to measure, and they trade off differently.

Compile time is how long the engine spends turning bytes into machine code. With WebAssembly.compileStreaming the browser compiles while the body downloads, so a 2 MB module that takes 40 ms to download may finish compiling within a few milliseconds of the last byte arriving. A non-streaming compile(arrayBuffer) waits for the full download first, then compiles — strictly slower.

Instantiate time is binding the import object and running the start function; it is usually sub-millisecond unless your data section is huge or start does real work.

Throughput is the steady-state cost of the exported functions, the number you actually optimize against the JavaScript baseline. Measure it with performance.now() around a warmed loop.

These three costs pull in different directions, and conflating them produces bad decisions. A module that compiles in 30 ms but runs a hot loop for 200 ms per frame has a throughput problem, not a load problem — shaving the binary size will not help. Conversely, a small module that compiles instantly but is fetched on the critical path of first paint has a load problem best solved by streaming compilation and caching the compiled module, not by micro-optimizing the inner loop. Decide which number you are chasing before you change anything, because the levers barely overlap: compile time responds to binary size and streaming, instantiate time to data-section size and start work, and throughput to optimization level and boundary-crossing frequency.

const t0 = performance.now();
const module = await WebAssembly.compileStreaming(fetch("/pkg/app_bg.wasm"));
const t1 = performance.now();
const { instance } = await WebAssembly.instantiate(module, imports);
const t2 = performance.now();

// Warm the JIT, then time steady-state throughput.
for (let i = 0; i < 1000; i++) instance.exports.hot_path(i);
const t3 = performance.now();
for (let i = 0; i < 1_000_000; i++) instance.exports.hot_path(i);
const t4 = performance.now();

console.log(`compile ${(t1 - t0).toFixed(1)}ms · instantiate ${(t2 - t1).toFixed(1)}ms`);
console.log(`throughput ${((t4 - t3) / 1e6 * 1000).toFixed(3)}µs/call`);

For richer attribution, open the DevTools Performance panel, record a session, and read the flame chart. With debug info loaded, the chart labels Wasm frames with real function names instead of wasm-function[n], so a hot leaf shows up as decode_block rather than a numeric index. Look for wide Wasm frames (compute-bound) versus wide gaps between them (marshaling or JS overhead at the boundary). Crossing that boundary repeatedly with small payloads — rather than the compute itself — is frequently the real cost, which is why allocation strategy in linear memory management & allocators matters as much as the inner loop.

Reading the flame chart well takes a little practice. The horizontal axis is time, not call count, so a function that is wide is one you spent a lot of time in, regardless of how often it was called. A deep, narrow tower of frames is fine — that is just a call chain — but a wide, shallow frame at a leaf is your hot spot. Two patterns recur in Wasm profiles. First, a wide band of wasm-function self-time with no children is genuine compute; the fix is algorithmic or a better optimization level. Second, a sawtooth of narrow Wasm frames separated by JavaScript glue frames means you are paying per-call marshaling — the boundary, not the math, is the bottleneck, and batching more work per call is the lever. The Performance panel also records garbage-collection pauses on the JavaScript side; if your Wasm output is being copied into fresh Uint8Arrays every frame, you will see GC frames stealing time that profiling the Wasm alone would never reveal.

A subtlety worth internalizing: the sampling profiler attributes time to whichever frame was on the stack when it sampled, so very short functions can be under- or over-counted by statistical noise. Run a long enough recording (a few seconds of steady work) that the samples converge, and prefer the Bottom-Up view to aggregate self-time across every call site of a function. That aggregate is usually what you act on — it tells you the single function whose inner loop, if sped up, buys the most wall-clock back.
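Hand-rolled performance.now() timings are subject to similar noise, so a single timed loop is best treated as one sample. The sketch below repeats the measurement and reports the median; the tiny hand-assembled add module stands in for your real export, and median and microsPerCall are hypothetical helper names with arbitrary default counts.

```javascript
// Hand-assembled module: (func (export "add") (param i32 i32) (result i32) ...)
const addModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // body
]);
const { add } = new WebAssembly.Instance(
  new WebAssembly.Module(addModuleBytes),
  {}
).exports;

// Median of a list of numbers (average of the two middle values if even).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Warm up, then time `runs` separate loops and summarize microseconds per call.
function microsPerCall(call, { warmup = 1000, iterations = 100000, runs = 7 } = {}) {
  let sink = 0; // accumulate results so the calls cannot be optimized away
  for (let i = 0; i < warmup; i++) sink += call(i);
  const samples = [];
  for (let r = 0; r < runs; r++) {
    const start = performance.now();
    for (let i = 0; i < iterations; i++) sink += call(i);
    const elapsedMs = performance.now() - start;
    samples.push((elapsedMs / iterations) * 1000); // ms per call -> µs per call
  }
  return {
    median: median(samples),
    min: Math.min(...samples),
    max: Math.max(...samples),
    sink,
  };
}

const timing = microsPerCall((i) => add(i, 1));
console.log(`median ${timing.median.toFixed(4)}µs/call (min ${timing.min.toFixed(4)}, max ${timing.max.toFixed(4)})`);
```

A wide gap between min and max suggests the run was disturbed (tier-up, GC, or other tabs), and the measurement is worth repeating before acting on it.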


Optimization flags & tradeoffs: debug info versus binary size

Debug info is not free. A --dev Rust build with full DWARF can be 5–10× larger than the optimized release binary — a module that ships at 90 KB might balloon past 700 KB once .debug_info, .debug_line, and .debug_str are embedded. The breakdown:

Build                                   Names     DWARF     Typical size          Use for
wasm-pack build --release               stripped  stripped  smallest              production
wasm-pack build --release + keep-debug  yes       yes       large                 profiling release perf
wasm-pack build --dev                   yes       yes       largest, unoptimized  day-to-day debugging
emcc -O2 -g                             yes       yes       large                 optimized + debuggable

The practical workflow is to debug against a --dev or -g build and ship a stripped --release build. If you must profile release-level code (because --dev performance is misleadingly slow), build with optimizations and debug info, then run wasm-strip app.wasm on the copy you actually deploy to remove the debug sections. Never ship DWARF to users: it is dead weight the browser downloads and never executes.

There is a middle ground that many teams settle on. Ship a release binary with the name section intact but DWARF stripped. The name section adds only a small fraction to binary size yet makes production crash reports and any in-the-wild profiling intelligible, while the heavy .debug_* sections — which are what actually bloat the file — stay out of the download. You can achieve this by stripping selectively rather than wholesale:

# Drop the heavy DWARF sections but keep readable names in production.
wasm-strip --keep-section=name app_bg.wasm

Two further size levers interact with debug info. Running wasm-opt -O3 on a binary that still carries DWARF will either drop the debug sections or leave them stale, because optimization rewrites the code they describe, so run the optimizer first and decide what to keep or strip afterward, never the reverse. And gzip/Brotli at the CDN compresses DWARF reasonably well (it is binary data, but its string and line tables are highly repetitive), yet even compressed it is bytes your users wait on. The asymmetry is the whole point: debug info is for you, at your desk, and should rarely reach a browser you do not control.


Gotchas & failure modes

Most debugging failures are not mysterious — they are one of a small set of build or configuration mistakes that strip away the information DevTools needs. The list below covers the ones you will actually hit, each with the symptom that gives it away.

  • No name section, so every frame is numeric. A stripped binary shows wasm-function[23] in stack traces and func[23] in wasm-objdump output. Rebuild with debug info, or at minimum keep names (pass --keep-debug, or strip with wasm-strip --keep-section=name instead of a bare wasm-strip).
  • Breakpoints won’t bind in a release build. Optimization inlines and reorders code, so a source line may have no corresponding machine instruction. The breakpoint shows hollow/grey. Debug against an unoptimized build, or accept that some lines are unbreakable in -O2.
  • DWARF bloats the binary. See the table above — debug builds are several times larger. This is expected; it is the cost of source-level debugging. Strip for production.
  • Extension missing or DWARF experiment off. Symptoms: you see a wasm:// disassembly instead of your .rs file, or variables show as raw i32 with no names. Re-check both the experiment flag and the C/C++ extension.
  • Wrong MIME type breaks streaming. If the server returns text/plain for .wasm, instantiateStreaming rejects with a TypeError. Generated glue such as wasm-bindgen’s usually catches that and falls back to slower buffered compilation, and DevTools may not attach the source map. Serve application/wasm.
  • Memory views go stale after memory.grow. When the module grows memory, the old ArrayBuffer is detached, so a Uint8Array you created in the console drops to length 0; re-read instance.exports.memory.buffer and build a new view before inspecting again.
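The stale-view gotcha is easy to reproduce in isolation. This short sketch uses a standalone WebAssembly.Memory rather than a real module, so the variable names are illustrative only; the behavior is the same when a module calls memory.grow internally.

```javascript
// One 64 KiB page to start, allowed to grow to four.
const memory = new WebAssembly.Memory({ initial: 1, maximum: 4 });

const staleView = new Uint8Array(memory.buffer);
staleView[0] = 42;

// Growing detaches the old ArrayBuffer; existing views become empty.
memory.grow(1);

// Always build a new view from the current memory.buffer after growth.
const freshView = new Uint8Array(memory.buffer);

console.log(staleView.length, freshView.length, freshView[0]);
// prints: 0 131072 42
```

The data survives the growth; only the JavaScript-side view is invalidated, which is why re-reading memory.buffer is enough.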

Verification

Confirm the debug sections survived the build with wasm-objdump -h:

$ wasm-objdump -h pkg/app_bg.wasm

app_bg.wasm:  file format wasm 0x1

Sections:

     Type start=0x0000000b end=0x00000042 (size=0x00000037) count: 11
     ...
   Custom start=0x0004a1c2 end=0x0006f8a1 (size=0x000256df) ".debug_info"
   Custom start=0x0006f8a1 end=0x00081003 (size=0x00011762) ".debug_line"
   Custom start=0x00081003 end=0x000a44f2 (size=0x000234ef) ".debug_str"
   Custom start=0x000a44f2 end=0x000a51b8 (size=0x00000cc6) "name"

Seeing the name custom section means readable function names; seeing .debug_* means full source-level debugging is available. If only name is present you get readable function names in stack traces and profiles, but no source-level breakpoints and no typed variable inspection. Run wasm-validate pkg/app_bg.wasm to confirm the binary is still well-formed after any post-processing.
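The offsets in that listing also tell you roughly how much of the file is debug data. The sketch below copies the start and end offsets from the sample output and sums them; it assumes name is the last section in the file, and it ignores the few header bytes each section carries, so read the result as an estimate.

```javascript
// Start/end offsets copied from the sample wasm-objdump -h listing.
const listedSections = [
  { name: ".debug_info", start: 0x0004a1c2, end: 0x0006f8a1 },
  { name: ".debug_line", start: 0x0006f8a1, end: 0x00081003 },
  { name: ".debug_str", start: 0x00081003, end: 0x000a44f2 },
  { name: "name", start: 0x000a44f2, end: 0x000a51b8 },
];

// Sum DWARF and name-section bytes and relate them to the file size.
function summarizeDebugCost(sections) {
  let dwarfBytes = 0;
  let nameBytes = 0;
  for (const s of sections) {
    const size = s.end - s.start;
    if (s.name.startsWith(".debug_")) dwarfBytes += size;
    else if (s.name === "name") nameBytes += size;
  }
  // Assumption: the highest end offset is the end of the file.
  const fileBytes = Math.max(...sections.map((s) => s.end));
  return { dwarfBytes, nameBytes, fileBytes, dwarfShare: dwarfBytes / fileBytes };
}

const cost = summarizeDebugCost(listedSections);
console.log(
  `DWARF ${cost.dwarfBytes} B, names ${cost.nameBytes} B, ` +
  `${(cost.dwarfShare * 100).toFixed(1)}% of ${cost.fileBytes} B`
);
```

For this sample the DWARF sections come to 369456 bytes against 3270 bytes for names, which is the gap that makes keeping names while stripping DWARF an attractive production compromise.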



Frequently Asked Questions

Why does my release build have no breakpoints or function names? Release optimization strips the name section and DWARF, and inlines code so source lines no longer map to instructions one-to-one. Build with wasm-pack build --dev or emcc -g for debugging, and ship the stripped release build separately.

Do I need the C/C++ DevTools extension for Rust? Yes. Despite the name, the extension parses DWARF regardless of source language, so Rust binaries need it for typed variable inspection. Without it you still get readable function names from the name section, and line-level breakpoints if a source map is present, but the Scope pane shows raw numeric locals.

How do I get a readable panic message from Rust? Call console_error_panic_hook::set_once() once at startup. It installs a panic hook that forwards the panic payload and a symbolicated backtrace to console.error, turning a bare unreachable trap into a real message.

What is the difference between DWARF and a source map here? DWARF is embedded in the .wasm as custom sections and carries full type and variable info but bloats the binary heavily. A source map is a separate .wasm.map file that maps offsets to source positions only — much smaller, but no variable types. DevTools supports both.

How do I profile only the steady-state cost and not JIT warm-up? Run a thousand or more warm-up iterations of the exported function first, then time a large loop with performance.now(). The warm-up lets the engine’s optimizing tier kick in so your measurement reflects steady-state throughput, not first-call compilation.

