No encoding format documentation in spec #4

@SuperInstance

Audit Finding

Severity: Medium
File(s): isa_unified.py, spec documentation

Problem

The ISA spec lists opcodes and references "Format A" through "Format G", but never formally defines what these formats are. There is no documentation of:

  • Byte layout for each format (which bits are opcode, registers, immediates)
  • Size of each format in bytes
  • Endianness of multi-byte values (big-endian? little-endian?)
  • Alignment requirements, if any
  • Maximum immediate value ranges per format
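To make two of these gaps concrete: an immediate's legal range follows directly from its bit width and signedness, and the same multi-byte value serializes differently depending on byte order. A minimal sketch, assuming hypothetical helper names (`imm_range`, `encode_u16`); the widths used are illustrative and not taken from the spec:

```python
from typing import Tuple


def imm_range(bits: int, signed: bool) -> Tuple[int, int]:
    """Return the inclusive (min, max) range of an immediate field."""
    if bits <= 0:
        raise ValueError("bits must be positive")
    if signed:
        # Two's complement: one bit is spent on the sign.
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def encode_u16(value: int, byteorder: str) -> bytes:
    """Serialize an unsigned 16-bit value as 'little' or 'big' endian."""
    return value.to_bytes(2, byteorder)


print(imm_range(8, signed=True))            # (-128, 127)
print(imm_range(16, signed=False))          # (0, 65535)
print(encode_u16(0x1234, "little").hex())   # 3412
print(encode_u16(0x1234, "big").hex())      # 1234
```

Two runtimes that disagree on either choice will both accept the same source program but produce or consume different bytes, which is why the spec needs to state them explicitly.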

What Happens Because of This

Each runtime (VM, assembler, disassembler, LSP) re-implements encoding independently, leading to:

  • Subtle incompatibilities (see: spec vs VM mismatch issue)
  • No single reference to settle encoding disputes
  • New contributors must read source code to understand the binary format

Suggested Fix

Add a formal ENCODING_FORMATS.md (or a section in the existing spec) that defines each format:

## Format Definitions

### Format A: 2-byte register-register
| Byte 0 | Byte 1 |
|--------|--------|
| opcode (8) | rd:rs (4:4) |

### Format B: 2-byte three-register
| Byte 0 | Byte 1 |
|--------|--------|
| opcode (8) | rd:rs1:rs2 (4:2:2) |

... (C through G)

Include: bit widths, byte ordering, immediate ranges, and an example encoding for each format.
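As a sketch of what an "example encoding" entry could look like, here is the 2-byte layout shown for Format A and Format B above, packed and unpacked by hand. The opcode values (`0x10`, `0x20`), the function names, and the choice to put `rd` in the high bits of byte 1 are all assumptions for illustration; the real spec would need to confirm each of them.

```python
def _check(name: str, value: int, bits: int) -> None:
    """Reject values that do not fit in an unsigned field of `bits` bits."""
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{name}={value} does not fit in {bits} bits")


def encode_format_a(opcode: int, rd: int, rs: int) -> bytes:
    """Byte 0 = opcode (8 bits); byte 1 = rd (high nibble) : rs (low nibble)."""
    _check("opcode", opcode, 8)
    _check("rd", rd, 4)
    _check("rs", rs, 4)
    return bytes([opcode, (rd << 4) | rs])


def decode_format_a(data: bytes) -> tuple:
    """Inverse of encode_format_a: returns (opcode, rd, rs)."""
    if len(data) != 2:
        raise ValueError("Format A is exactly 2 bytes in this sketch")
    return data[0], data[1] >> 4, data[1] & 0xF


def encode_format_b(opcode: int, rd: int, rs1: int, rs2: int) -> bytes:
    """Byte 0 = opcode (8 bits); byte 1 = rd (4) : rs1 (2) : rs2 (2)."""
    _check("opcode", opcode, 8)
    _check("rd", rd, 4)
    _check("rs1", rs1, 2)
    _check("rs2", rs2, 2)
    return bytes([opcode, (rd << 4) | (rs1 << 2) | rs2])


print(encode_format_a(0x10, rd=3, rs=5).hex())          # 1035
print(decode_format_a(bytes.fromhex("1035")))           # (16, 3, 5)
print(encode_format_b(0x20, rd=3, rs1=1, rs2=2).hex())  # 2036
```

Note that with a 4:2:2 split, `rs1` and `rs2` can only address registers 0 through 3 while `rd` can address 0 through 15; a documented immediate/register range per field would make that kind of constraint visible.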

This should be the single source of truth that all runtimes reference.
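One way a single source of truth could work in practice is a shared format table that every tool imports, with generic encode/decode driven by it. This is only a sketch under assumptions: the `FORMATS` table, the `encode`/`decode` names, and the MSB-first field packing are hypothetical, and only A and B are filled in because the other formats are not defined anywhere in this issue.

```python
# Hypothetical shared table: format name -> ordered (field, bit width) pairs.
FORMATS = {
    "A": [("opcode", 8), ("rd", 4), ("rs", 4)],
    "B": [("opcode", 8), ("rd", 4), ("rs1", 2), ("rs2", 2)],
}


def encode(fmt: str, **fields: int) -> bytes:
    """Pack fields MSB-first according to the shared table."""
    word, total = 0, 0
    for name, bits in FORMATS[fmt]:
        value = fields[name]
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name}={value} does not fit in {bits} bits")
        word = (word << bits) | value
        total += bits
    return word.to_bytes(total // 8, "big")


def decode(fmt: str, data: bytes) -> dict:
    """Unpack bytes into a field dict using the same table."""
    layout = FORMATS[fmt]
    total = sum(bits for _, bits in layout)
    if len(data) * 8 != total:
        raise ValueError(f"format {fmt} expects {total // 8} bytes")
    word = int.from_bytes(data, "big")
    out, shift = {}, total
    for name, bits in layout:
        shift -= bits
        out[name] = (word >> shift) & ((1 << bits) - 1)
    return out


raw = encode("B", opcode=0x20, rd=3, rs1=1, rs2=2)
print(raw.hex())         # 2036
print(decode("B", raw))  # {'opcode': 32, 'rd': 3, 'rs1': 1, 'rs2': 2}
```

With a table like this, the assembler and disassembler cannot drift apart on field positions, and a round-trip test (`decode(fmt, encode(fmt, ...))`) becomes a cheap conformance check for any runtime.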
