@ktock ktock commented Aug 26, 2025

No description provided.

ktock added 29 commits August 26, 2025 15:16
wasm64 target enables 64bit pointers using Emscripten's -sMEMORY64=1
flag[1]. This enables QEMU to run 64bit guests.

Although the configure script uses "uname -m" as the fallback value when
"cpu" is empty, this can't be used for Emscripten, which targets Wasm.
So, for the wasm build, this commit changes configure to require the --cpu
flag to be explicitly specified by the user.

[1] https://emscripten.org/docs/tools_reference/settings_reference.html#memory64

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Some engines currently don't support wasm64 (e.g. Safari[1]). To mitigate
this issue, the configure script allows the user to use Emscripten's
compatibility feature, the "-sMEMORY64=2" flag[2].

Emscripten's "-sMEMORY64=2" flag still enables 64bit pointers in C code, but
it lowers the output binary to wasm32, limiting the maximum memory size to
4GB. So QEMU can run on wasm32 engines.

[1] https://webassembly.org/features/
[2] https://emscripten.org/docs/tools_reference/settings_reference.html#memory64

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
This commit updates the Dockerfile of the wasm build to support both wasm32
and wasm64 builds. The Dockerfile takes the following build arguments and
uses these values for building dependencies.

- TARGET_CPU: target wasm arch (wasm32 or wasm64)
- WASM64_MEMORY64: target -sMEMORY64 mode (1 or 2)

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
The wasm builds are tested for 3 targets: wasm32, wasm64(-sMEMORY64=1) and
wasm64(-sMEMORY64=2). The CI builds the containers using the same Dockerfile
(emsdk-wasm-cross.docker) with different build args.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
The Wasm backend targets wasm64 as the host so TCG_TARGET_REG_BITS is set to
64. Since WebAssembly instructions vary in size and can include single-byte
instructions, TCG_TARGET_INSN_UNIT_SIZE is set to 1.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit adds the register allocation definitions and register names to
the Wasm backend. As in TCI, call arguments are stored on the stack buffer
and the return value is placed in the registers R0 and R1 when needed.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
The Wasm backend integrates a forked TCI so its constraints are defined to
remain compatible with TCI.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Relocation callbacks are used for the TCI instructions to preserve the
original logic of the TCI backend.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit adds support for generating the and, or and xor operations. The
generated Wasm code will be instantiated and executed in the browser.

Browsers typically limit the number of active Wasm instances, and
instantiating Wasm modules introduces overhead. As a result, instantiating
TBs that are rarely called is undesirable. To address this, the Wasm backend
relies on a forked subset of the TCI interpreter (tcg_qemu_tb_exec_tci
function in tcg/wasm.c) for executing such TBs.

The Wasm backend emits both Wasm and TCI instructions. TCI instructions are
emitted to s->code_ptr, while the corresponding Wasm instructions are
generated into a separate buffer allocated via tcg_malloc(). This buffer
is intended to be merged into the TB before tcg_gen_code returns.

In the Wasm code, each TCG variable is mapped to a 64bit Wasm
variable. Execution works by first pushing the operands onto the Wasm
stack using get instructions. The result is left on the stack and can
be assigned to a variable by popping it using a set instruction. The Wasm
binary format is documented at [1].

Additionally, since the Wasm instruction's index operand must be
LEB128-encoded, this commit introduces an encoder function implemented
following [2].
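
The get/operate/set pattern and the LEB128 encoding described above can be
sketched as follows. This is only an illustration: the opcode values come
from the Wasm binary format [1], but the function names (leb128_u32,
emit_and_i64) and the flat output buffer are hypothetical and are not taken
from the actual backend.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcodes as defined in the Wasm binary format specification. */
enum {
    OP_LOCAL_GET = 0x20,
    OP_LOCAL_SET = 0x21,
    OP_I64_AND   = 0x83,
};

/* Encode v as unsigned LEB128; returns the number of bytes written. */
static size_t leb128_u32(uint8_t *out, uint32_t v)
{
    size_t n = 0;
    do {
        uint8_t b = v & 0x7f;
        v >>= 7;
        if (v != 0) {
            b |= 0x80;              /* more bytes follow */
        }
        out[n++] = b;
    } while (v != 0);
    return n;
}

/* Emit "ret = a1 & a2", where each argument is a Wasm local index. */
static size_t emit_and_i64(uint8_t *out, uint32_t ret,
                           uint32_t a1, uint32_t a2)
{
    size_t n = 0;
    out[n++] = OP_LOCAL_GET;        /* push the first operand */
    n += leb128_u32(out + n, a1);
    out[n++] = OP_LOCAL_GET;        /* push the second operand */
    n += leb128_u32(out + n, a2);
    out[n++] = OP_I64_AND;          /* result is left on the stack */
    out[n++] = OP_LOCAL_SET;        /* pop it into the destination */
    n += leb128_u32(out + n, ret);
    return n;
}
```

For example, emit_and_i64(buf, 2, 0, 1) writes the seven bytes
20 00 20 01 83 21 02. The or and xor cases would differ only in the opcode
(0x84 and 0x85).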

[1] https://webassembly.github.io/spec/core/binary/index.html
[2] https://en.wikipedia.org/wiki/LEB128

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
The add, sub and mul operations are implemented using the corresponding
instructions in Wasm. TCI instructions are also generated in the same way as
the original TCI backend.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit implements the shl, shr and sar operations using Wasm
instructions. Since the Wasm backend uses 64bit variables, right shifts on
32bit values extract the lower 32bit of the operand before shifting. TCI
instructions are also generated in the same way as the original TCI backend.
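
The 32bit right-shift handling can be illustrated by its intended semantics
in plain C. This is a sketch, not the emitted Wasm: the function names are
hypothetical, and the sign extension in the arithmetic case is an assumption
about how "extracting the lower 32bit" applies to sar.

```c
#include <stdint.h>

/* Logical right shift of a 32bit value held in a 64bit variable:
 * the upper 32 bits must not leak into the result. */
static uint64_t shr_i32(uint64_t v, unsigned n)
{
    return (v & 0xffffffffu) >> (n & 31);
}

/* Arithmetic right shift: sign-extend the lower 32 bits first.
 * Relies on two's-complement conversion and arithmetic ">>" on signed
 * values, which mainstream compilers provide. */
static uint64_t sar_i32(uint64_t v, unsigned n)
{
    int64_t s = (int64_t)(int32_t)(uint32_t)v;
    return (uint64_t)(s >> (n & 31));
}
```

Without the extraction step, shr_i32(0xdeadbeef80000000, 31) would shift
garbage from the upper half into the result instead of yielding 1.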

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
These TCG instructions are implemented by using Wasm's if and else
instructions. TCI instructions are also generated in the same way as the
original TCI backend. Since support for TCG_COND_TSTEQ and TCG_COND_TSTNE is
not yet implemented, TCG_TARGET_HAS_tst is set to 0.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
The sextract operation is generated only when the corresponding Wasm
instructions are available, as specified by TCG_TARGET_sextract_valid.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Since Wasm load and store instructions don't support negative offsets,
address calculations are performed separately before the memory access.

When Emscripten's -sMEMORY64=2 is enabled, the address size must be
32bits. So this commit updates the build tools to propagate this flag to the
C code via the WASM64_MEMORY64_2 macro. In this case, the emitted code casts
pointers to 32bit before memory operations.
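
As a rough sketch of the two points above, the following emits a 64bit load
whose (possibly negative) offset is added explicitly, with an optional
i32.wrap_i64 for the -sMEMORY64=2 case. The opcodes are from the Wasm
binary format; the function names, the addr32 parameter and the exact
instruction sequence are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    OP_LOCAL_GET    = 0x20,
    OP_I64_LOAD     = 0x29,
    OP_I64_CONST    = 0x42,
    OP_I64_ADD      = 0x7c,
    OP_I32_WRAP_I64 = 0xa7,
};

/* Signed LEB128, used for the immediate of i64.const.
 * Assumes ">>" on a negative value is an arithmetic shift. */
static size_t sleb128_i64(uint8_t *out, int64_t v)
{
    size_t n = 0;
    int more = 1;
    while (more) {
        uint8_t b = v & 0x7f;
        v >>= 7;
        if ((v == 0 && !(b & 0x40)) || (v == -1 && (b & 0x40))) {
            more = 0;               /* sign bit of b already matches v */
        } else {
            b |= 0x80;
        }
        out[n++] = b;
    }
    return n;
}

/* Emit a load from (local base_local) + offset. base_local must be
 * below 128 so that it fits in a single LEB128 byte. */
static size_t emit_ld_i64(uint8_t *out, uint8_t base_local,
                          int64_t offset, int addr32)
{
    size_t n = 0;
    out[n++] = OP_LOCAL_GET;
    out[n++] = base_local;
    out[n++] = OP_I64_CONST;        /* offset is added at run time ... */
    n += sleb128_i64(out + n, offset);
    out[n++] = OP_I64_ADD;
    if (addr32) {
        out[n++] = OP_I32_WRAP_I64; /* 32bit address for -sMEMORY64=2 */
    }
    out[n++] = OP_I64_LOAD;
    out[n++] = 3;                   /* alignment hint: 2^3 bytes */
    out[n++] = 0;                   /* ... so the static offset stays 0 */
    return n;
}
```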

Additionally, the declaration of "--wasm64-32bit-address-limit" flag has
been moved from the configure script to meson.build. So the flag name is
updated to "--enable-wasm64-32bit-address-limit" to follow Meson's naming
conventions.

TCI instructions are also generated in the same way as the original TCI
backend.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
These operations are implemented using the corresponding instructions in
Wasm. TCI instructions are also generated in the same way as the original
TCI backend.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
The ext operations are implemented using the corresponding instructions in
Wasm. TCI instructions are also generated in the same way as the original
TCI backend.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
The div and rem operations are implemented using the corresponding
instructions in Wasm. TCI instructions are also generated in the same way as
the original TCI backend.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
The neg/ctpop operations are implemented using the corresponding
instructions in Wasm. TCI instructions are also generated in the same way as
the original TCI backend.

The Wasm backend implements only TCG_TARGET_REG_BITS=64 so the ctpop
instruction is generated only for 64bit operations, as declared in
cset_ctpop. Therefore, this commit adds only the 64bit version of ctpop
implementation.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
The rot/clz/ctz operations are implemented using the corresponding
instructions in Wasm. TCI instructions are also generated in the same way as
the original TCI backend.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Wasm does not support direct jumps to arbitrary code addresses, so br and
brcond are implemented using Wasm's control flow instructions.

As illustrated in the pseudo-code below, each TB wraps Wasm instructions
inside a large loop. Each run of code delimited by TCG labels is placed
inside an "if" block. Br is implemented by breaking out of the current block
and entering the target block:

loop
  if
    ... code after the first label
  end
  if
    ... code after the second label
  end
  ...
end

Each block is assigned a unique integer ID. The br implementation sets the
destination block's ID in the BLOCK_IDX Wasm variable and breaks from the
current if block. As control flow continues, each if block checks whether
the BLOCK_IDX matches its own ID. If so, execution resumes within that
block.
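
The dispatch scheme can be modelled in C as shown below. This is a
simulation of the control flow only, using a made-up TB with three blocks;
the run_tb function and its contents are hypothetical, and the handling of
fall-through between blocks is an assumption.

```c
#include <stdint.h>

/* Hypothetical TB: sums n + (n-1) + ... + 1 using a backward branch. */
static int64_t run_tb(int64_t n)
{
    int64_t block_idx = 0;      /* plays the role of BLOCK_IDX */
    int64_t sum = 0;

    for (;;) {                  /* the TB-wide "loop" */
        if (block_idx == 0) {
            /* code before the first label; falls through to block 1 */
            block_idx = 1;
        }
        if (block_idx == 1) {   /* code after the first label */
            if (n <= 0) {
                block_idx = 2;  /* brcond forward to the second label */
            } else {
                sum += n;
                n--;
                block_idx = 1;  /* br backward: taken on the next
                                   iteration of the outer loop */
            }
        }
        if (block_idx == 2) {   /* code after the second label */
            return sum;
        }
    }
}
```

A forward branch is picked up by a later "if" in the same iteration, while a
backward branch skips the remaining blocks and is matched after the loop
restarts. Here run_tb(4) evaluates to 10.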

The tcg_out_tb_start function generates the start of the global loop and the
first if block. To properly close these blocks, this commit also introduces
a new callback tcg_out_tb_end which emits the "end" instructions for the
final if block and the loop.

Another new callback tcg_out_label_cb is used to emit block boundaries,
specifically the end of the previous block and the if of the next block, at
label positions. It also records the mapping between label IDs and block IDs
in a LabelInfo list.

Since the block ID for a label might not be known when a br instruction is
generated, a placeholder is emitted instead. These placeholders are tracked
in a BlockPlaceholder list and resolved later using LabelInfo.
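
A minimal sketch of this two-list patching scheme follows. Only the names
LabelInfo and BlockPlaceholder come from the commit; their fields, the
fixed-size arrays, the Patcher wrapper and the single-byte block ID are
assumptions chosen to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ENTRIES 16
#define UNRESOLVED  0xff

typedef struct { int label; uint8_t block; } LabelInfo;       /* fields assumed */
typedef struct { int label; size_t pos; } BlockPlaceholder;   /* fields assumed */

typedef struct {
    LabelInfo labels[MAX_ENTRIES];
    size_t nb_labels;
    BlockPlaceholder holes[MAX_ENTRIES];
    size_t nb_holes;
} Patcher;

/* Called at a label position: remember which block the label starts. */
static int record_label(Patcher *p, int label, uint8_t block)
{
    if (p->nb_labels == MAX_ENTRIES) {
        return -1;
    }
    p->labels[p->nb_labels++] = (LabelInfo){ label, block };
    return 0;
}

/* Called when emitting br: write a placeholder and remember where. */
static int record_br(Patcher *p, uint8_t *code, size_t pos, int label)
{
    if (p->nb_holes == MAX_ENTRIES) {
        return -1;
    }
    code[pos] = UNRESOLVED;
    p->holes[p->nb_holes++] = (BlockPlaceholder){ label, pos };
    return 0;
}

/* Called once all labels are known: patch every placeholder. */
static int resolve(const Patcher *p, uint8_t *code)
{
    for (size_t i = 0; i < p->nb_holes; i++) {
        size_t j;
        for (j = 0; j < p->nb_labels; j++) {
            if (p->labels[j].label == p->holes[i].label) {
                code[p->holes[i].pos] = p->labels[j].block;
                break;
            }
        }
        if (j == p->nb_labels) {
            return -1;          /* branch to a label that was never placed */
        }
    }
    return 0;
}
```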

TCI instructions are also generated in the same way as the original TCI
backend.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
In the Wasm backend, each TB is compiled to a separate Wasm module. Control
transfer between TBs (i.e. from one Wasm module to another) is handled by
the caller of the module.

The goto_tb and goto_ptr operations are implemented by returning control to
the caller using the return instruction. The destination TB's pointer is
passed to the caller via a shared WasmContext structure which is accessible
from both the Wasm module and the caller. This WasmContext must be provided
to the module as an argument.

If the destination TB is the current TB itself, there is no need to return
control to the caller. Instead, execution can jump directly to the top of
the loop within the TB.

The exit_tb operation sets the pointer in WasmContext to 0, indicating that
there is no destination TB.
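
The hand-off through the shared context can be pictured with the simplified
model below. The WasmContext layout, the next_tb and steps fields and the
run function are hypothetical; in particular, the real caller may return to
QEMU's main execution loop rather than chaining TBs directly as done here.

```c
#include <stddef.h>

typedef struct WasmContext WasmContext;
typedef void (*tb_func)(WasmContext *ctx);

struct WasmContext {
    tb_func next_tb;    /* destination TB, NULL when there is none */
    int steps;          /* bookkeeping for this example only */
};

/* Models exit_tb: clear the destination and return to the caller. */
static void tb_b(WasmContext *ctx)
{
    ctx->steps++;
    ctx->next_tb = NULL;
}

/* Models goto_tb: store the destination and return to the caller. */
static void tb_a(WasmContext *ctx)
{
    ctx->steps++;
    ctx->next_tb = tb_b;
}

/* The caller performs the actual transfer between modules. */
static int run(tb_func first)
{
    WasmContext ctx = { .next_tb = first, .steps = 0 };
    while (ctx.next_tb != NULL) {
        tb_func f = ctx.next_tb;
        f(&ctx);
    }
    return ctx.steps;
}
```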

TCI instructions are also generated in the same way as the original TCI
backend.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
To call QEMU functions from a TB's Wasm module, the functions must be
imported into the module.

Wasm's call instruction can invoke an imported function using a locally
assigned function index. When a call TCG operation is generated, the Wasm
backend assigns an ID (starting from 0) to the target function. The mapping
between the function pointer and its assigned ID is recorded in a list of
HelperInfo.
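
One way to picture the ID assignment is the lookup-or-append table below.
The HelperTable type, its fixed capacity and the reuse of an existing ID for
a function seen before are assumptions for illustration; the commit only
states that the pointer-to-ID mapping is kept in a list of HelperInfo.

```c
#include <stdint.h>

#define MAX_HELPERS 32

typedef struct {
    uintptr_t funcs[MAX_HELPERS];   /* funcs[id] is the helper's address */
    uint32_t count;
} HelperTable;

/* Return the ID for func, assigning the next free one on first use.
 * Returns -1 when the table is full. */
static int helper_id(HelperTable *t, uintptr_t func)
{
    for (uint32_t i = 0; i < t->count; i++) {
        if (t->funcs[i] == func) {
            return (int)i;
        }
    }
    if (t->count == MAX_HELPERS) {
        return -1;
    }
    t->funcs[t->count] = func;
    return (int)t->count++;
}
```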

Since Wasm's call instruction requires arguments to be pushed onto the Wasm
stack, the backend retrieves the function arguments from TCG's stack array
and pushes them to the Wasm stack before the call. After the function
returns, the result is retrieved from the Wasm stack and set in the
corresponding TCG variable.

In the Emscripten build configured with !has_int128_type, a 128bit value is
represented by the Int128 struct. Such values are passed to the function via
pointer parameters and returned via a prepended pointer argument, as
described in [1]. For this prepended buffer area, the module expects a
pre-allocated Int128 buffer from the caller via ctx.buf128.

Helper functions expect the target of the return instruction via the GETPC
macro (the tci_tb_ptr variable in TCI). However, unlike other architectures,
Wasm doesn't have a register pointing to the return target. To emulate this
behaviour, the Wasm module sets the instruction pointer to the corresponding
TCI instruction (s->code_ptr) in tci_tb_ptr passed via the WasmContext.

TCI instructions are also generated in the same way as the original TCI
backend.

[1] https://github.com/WebAssembly/tool-conventions/blob/060cf4073e46931160c2e9ecd43177ee1fe93866/BasicCABI.md#function-arguments-and-return-values

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit adds qemu_ld and qemu_st by calling the helper functions
corresponding to MemOp.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit generates the mb operation. In Wasm, it uses the atomic.fence
instruction as the fence operator [1]. The TCI instruction is also
generated in the same way as the original TCI backend using smp_mb().

[1] https://webassembly.github.io/threads/core/syntax/instructions.html#atomic-memory-instructions

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit adds the C_NotImplemented constraint and provides stubs for the
functions that aren't implemented in the Wasm backend.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit adds initialization of TCG_AREG0 and TCG_REG_CALL_STACK at the
beginning of each TB. The CPUArchState struct and the stack array are passed
from the caller via the WasmContext structure. The BLOCK_IDX variable is
initialized to 0 as TB execution begins at the first block.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit updates tcg_out_tb_start and tcg_out_tb_end to emit Wasm
binaries into the TB code buffer. The generated Wasm binary defines a
function of type wasm_tb_func which takes a WasmContext, executes the TB,
and returns a result. In the Wasm backend, each TB starts with a
WasmTBHeader which contains pointers to the following data:

- TCI code
- Wasm code
- Array of helper function pointers imported into the Wasm instance

tcg_out_tb_start writes the WasmTBHeader to the code buffer. tcg_out_tb_end
generates the full Wasm executable binary by creating the Wasm module header
following the spec[1][2][3] and copying the Wasm code body from sub_buf to
the TB. This Wasm binary is placed after the TCI code which was emitted
earlier.

Additionally, an array of imported function pointers is appended to the
TB. They are used during Wasm module instantiation. Functions are imported
into Wasm with names like "helper.0", "helper.1", etc., where the number
corresponds to the array index.

Each function's type signature must also be encoded in the Wasm module
header. To support this, every emission of "call", "qemu_ld" and "qemu_st"
operations also records the target function's type information in a buffer
which will be copied to the code buffer during tcg_out_tb_end.

Memory is shared between QEMU and the TBs and is imported to the Wasm module
with the name "env.memory".
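
To make the module header more concrete, the sketch below shows the fixed
8-byte preamble and the encoding of one function import. The preamble and
the import layout follow the binary format [1]; the helper names
(emit_name, emit_helper_import) are hypothetical, and reading "helper.0" as
module name "helper" plus field name "0" is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Every Wasm binary starts with the magic "\0asm" and version 1. */
static const uint8_t wasm_preamble[8] = {
    0x00, 'a', 's', 'm',
    0x01, 0x00, 0x00, 0x00,
};

/* A Wasm name is a byte vector: LEB128 length followed by the bytes. */
static size_t emit_name(uint8_t *out, const char *s)
{
    size_t len = strlen(s);
    assert(len < 128);          /* keeps the length a single LEB128 byte */
    out[0] = (uint8_t)len;
    memcpy(out + 1, s, len);
    return len + 1;
}

/* Emit one import entry for helper number idx with the given type. */
static size_t emit_helper_import(uint8_t *out, unsigned idx,
                                 uint8_t type_idx)
{
    char field[16];
    size_t n = 0;

    snprintf(field, sizeof(field), "%u", idx);
    n += emit_name(out + n, "helper");  /* module name */
    n += emit_name(out + n, field);     /* field name, e.g. "0" */
    out[n++] = 0x00;                    /* import kind: function */
    out[n++] = type_idx;                /* type index, assumed < 128 */
    return n;
}
```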

[1] https://webassembly.github.io/spec/core/binary/modules.html
[2] https://github.com/WebAssembly/threads/blob/b2567bff61ee6fbe731934f0ed17a5d48dc9ab01/proposals/threads/Overview.md
[3] https://github.com/WebAssembly/memory64/blob/9003cd5e24e53b84cd9027ea3dd7ae57159a6db1/proposals/memory64/Overview.md

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
instantiate_wasm is a function that instantiates a TB's Wasm binary,
importing the functions as specified by its arguments. Following the header
definition in wasm/tcg-target.c.inc, QEMU's memory is imported into the
module as "env.memory", and helper functions are imported as "helper.<idx>".

The instantiated Wasm module is imported to QEMU using Emscripten's
"addFunction" feature[1] which returns a function pointer. This allows QEMU
to call this module directly from C code via that pointer.

Since the subarray() method doesn't accept a BigInt value, which is used for
the 64bit pointer value, the pointer is converted to a Number (i53) using
Emscripten's bigintToI53Checked function. Although this conversion (64bit to
53bit) drops higher bits, the maximum memory size of the engine
implementations is currently limited to 16GiB[2] so we can assume that the
pointers are within the Number's range.
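
The range argument can be checked with a few lines of C. The macro and
function names are made up for this example; the 16GiB figure is the limit
cited above and 2^53 - 1 is the largest integer a JS Number represents
exactly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Largest integer exactly representable as a JS Number (2^53 - 1). */
#define MAX_SAFE_INTEGER ((UINT64_C(1) << 53) - 1)

/* Engine memory limit cited in the commit message: 16GiB = 2^34 bytes. */
#define WASM_MAX_MEMORY (UINT64_C(16) << 30)

/* True if a 64bit pointer survives the conversion to Number unchanged. */
static bool fits_in_i53(uint64_t ptr)
{
    return ptr <= MAX_SAFE_INTEGER;
}
```

Since 2^34 is far below 2^53, every valid address in a 16GiB memory passes
this check.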

Note that since Firefox 138, WebAssembly.Module no longer accepts a
SharedArrayBuffer as input [3] as reported by Nicolas Vandeginste in my
fork[4]. This commit ensures that WebAssembly.Module() is passed a
Uint8Array created from the binary data on a SharedArrayBuffer.

[1] https://emscripten.org/docs/porting/connecting_cpp_and_javascript/Interacting-with-code.html#calling-javascript-functions-as-function-pointers-from-c
[2] https://webassembly.github.io/memory64/js-api/#limits
[3] https://bugzilla.mozilla.org/show_bug.cgi?id=1965217
[4] #25

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Emscripten's Fiber coroutine implements coroutine switching using Asyncify's
stack unwinding and rewinding features [1]. When a coroutine yields
(i.e. switches out), Asyncify unwinds the stack, returning control to
Emscripten's JS code (Fiber.trampoline()). Then execution resumes in the
target coroutine by rewinding the stack. Stack unwinding is implemented by a
sequence of immediate function returns, while rewinding re-enters the
functions in the call stack, skipping any code between the function's entry
point and the original call position [2].

This commit updates the TB's Wasm module to allow helper functions to
trigger coroutine switching. Specifically, the TB handles the unwinding and
rewinding flows as follows:

- The TB checks the Asyncify.state JS object after each helper call. If
  unwinding is in progress, the TB immediately returns to the caller so that
  the unwinding can continue.
- Each function call is preceded by a block boundary and an update of the
  BLOCK_IDX variable. This enables rewinding to skip any code between the
  function's entry point and the original call position.

Additionally, this commit introduces WasmContext.do_init which is a flag
indicating whether the TB should reset the BLOCK_IDX variable to 0
(i.e. start from the beginning). call_wasm_tb is a newly introduced wrapper
function for the Wasm module's entrypoint and this sets "do_init = 1" to
ensure normal TB execution begins at the first block. During a rewinding,
the C code does not set do_init to 1, allowing the TB to preserve the
BLOCK_IDX value from the previous unwinding and correctly resume execution
from the last unwound block.
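
The interplay of do_init, BLOCK_IDX and the unwinding check can be modelled
in C as below. This is a simulation only: the Ctx structure, the rewinding
flag and the helper that yields on its first call are hypothetical stand-ins
for WasmContext, Asyncify's state and a real helper function.

```c
#include <stdbool.h>

typedef struct {
    int do_init;        /* 1: restart the TB from block 0 */
    int block_idx;      /* models BLOCK_IDX, kept across unwinding */
    bool unwinding;     /* models "Asyncify is unwinding" */
    bool rewinding;     /* models "Asyncify is rewinding" */
    int work_before;    /* how often the code before the call ran */
    int work_after;     /* how often the code after the call ran */
} Ctx;

/* A helper that yields on its first call and completes when rewound. */
static void helper(Ctx *c)
{
    if (!c->rewinding) {
        c->unwinding = true;
        return;
    }
    c->rewinding = false;
}

static int tb(Ctx *c)
{
    if (c->do_init) {
        c->block_idx = 0;
        c->do_init = 0;
    }
    if (c->block_idx == 0) {
        c->work_before++;
        c->block_idx = 1;       /* block boundary before the call */
    }
    if (c->block_idx == 1) {
        helper(c);
        if (c->unwinding) {
            return -1;          /* let the unwinding continue */
        }
        c->block_idx = 2;
    }
    if (c->block_idx == 2) {
        c->work_after++;
    }
    return 0;
}

/* Normal entry, in the spirit of call_wasm_tb: always start at block 0. */
static int call_tb(Ctx *c)
{
    c->do_init = 1;
    return tb(c);
}
```

Calling call_tb() runs block 0 and then unwinds at the helper. Re-entering
tb() directly with rewinding set, and without touching do_init, skips block
0 and resumes at the call, so the code before the call runs exactly once.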

[1] https://emscripten.org/docs/api_reference/fiber.h.html
[2] https://kripken.github.io/blog/wasm/2019/07/16/asyncify.html#new-asyncify

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit enables the instantiation and execution of TBs in wasm.c. As in
TCI, the tcg_qemu_tb_exec function serves as the entrypoint for the TB
execution, handling both instantiation and invocation of the Wasm
module. Since browsers raise an out-of-memory error if too many Wasm
instances are created, this commit restricts instantiation to TBs that are
called many times.

This commit adds a counter (or an array of counters if there are multiple
threads) to the TB. Each time a TB is executed on TCI, its counter is
incremented. When the counter reaches a threshold, that TB is instantiated
as Wasm via instantiate_wasm.

The total number of instances is tracked by the instances_global variable
and its maximum is limited by MAX_INSTANCES. When a Wasm module is
instantiated, instances_global is incremented and the instance's function
pointer is recorded in an array of WasmInstanceInfo.

Each TB refers to the WasmInstanceInfo entry via WasmTBHeader's info_ptr (or
its array if there are multiple threads). This allows tcg_qemu_tb_exec to
resolve the instance's function pointer from the TB.

When a new instantiation would exceed the limit, the Wasm backend doesn't
perform instantiation (i.e. TB continues execution on TCI). Instead, it
triggers the removal of older Wasm instances using Emscripten's
removeFunction function. Once the removal is completed and detected via
FinalizationRegistry API[1], instances_global is decremented, allowing new
modules to be instantiated.
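
The counting policy can be summarised by the following sketch. The names
instances_global and MAX_INSTANCES come from the commit, but their values,
the INSTANTIATE_THRESHOLD constant, the TB structure and the
removal_requests counter are hypothetical, as is the choice to run the
triggering execution on Wasm.

```c
#include <stdbool.h>

#define INSTANTIATE_THRESHOLD 3     /* hypothetical value */
#define MAX_INSTANCES         2     /* hypothetical value */

typedef struct {
    int counter;        /* executions on TCI so far */
    bool instantiated;  /* true once a Wasm instance exists */
} TB;

static int instances_global;
static int removal_requests;    /* stands in for triggering removeFunction */

/* Returns true if this execution runs as Wasm, false if it runs on TCI. */
static bool exec_tb(TB *tb)
{
    if (tb->instantiated) {
        return true;
    }
    if (++tb->counter >= INSTANTIATE_THRESHOLD) {
        if (instances_global < MAX_INSTANCES) {
            instances_global++;
            tb->instantiated = true;
            return true;
        }
        removal_requests++;     /* over the limit: ask for cleanup */
    }
    return false;               /* keep executing on TCI */
}
```

Once the cleanup is observed (in the real backend, via
FinalizationRegistry), instances_global is decremented and a later
exec_tb() call on a hot TB can instantiate it.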

[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/FinalizationRegistry

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit enables Wasm module's qemu_ld and qemu_st to perform TLB
lookups, following the approach used in other backends such as
RISC-V. Unlike other backends, the Wasm backend cannot use ldst labels, as
jumping to specific code addresses (e.g. raddr) is not possible in
Wasm. Instead, each TLB lookup is followed by an if branch: if the lookup
succeeds, the memory is accessed directly; otherwise, a fallback helper
function is invoked. Support for MO_BSWAP is not yet implemented, so
has_memory_bswap is set to false.
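
The structure of the generated fast path and fallback can be sketched in C
as follows. The TLB layout here is deliberately simplified and does not
match QEMU's real CPUTLB structures; TLBEntry, slow_ld8 and the page
geometry are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_BITS   12
#define PAGE_MASK   ((UINT64_C(1) << PAGE_BITS) - 1)
#define TLB_ENTRIES 4

typedef struct {
    uint64_t page;      /* guest page number this entry maps */
    uint8_t *host;      /* host address of the page, NULL if invalid */
} TLBEntry;

static int slow_calls;

/* Stand-in for the fallback helper function. */
static uint8_t slow_ld8(uint64_t addr)
{
    (void)addr;
    slow_calls++;
    return 0xff;
}

/* TLB lookup followed by an if/else instead of a jump to an ldst label. */
static uint8_t ld8(const TLBEntry *tlb, uint64_t addr)
{
    uint64_t page = addr >> PAGE_BITS;
    const TLBEntry *e = &tlb[page % TLB_ENTRIES];

    if (e->host != NULL && e->page == page) {
        return e->host[addr & PAGE_MASK];   /* hit: direct access */
    } else {
        return slow_ld8(addr);              /* miss: call the helper */
    }
}
```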

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit adds tcg_target_init, aligning it with the Wasm backend's
register and stack usage.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Enable tcg/wasm as the TCG backend for the WebAssembly (wasm64) build.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Emscripten uses the optimization flag at link time to enable
optimizations via Binaryen [1]. While meson.build currently recognizes the
-Doptimization option, it does not propagate it to the link step. This
commit updates meson.build to pass the optimization flag to the linker when
targeting WebAssembly.

[1] https://emscripten.org/docs/optimizing/Optimizing-Code.html#how-emscripten-optimizes

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit adds the build tests for the wasm backend.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
@ktock ktock closed this Sep 30, 2025
@ktock ktock reopened this Sep 30, 2025