75 changes: 75 additions & 0 deletions ARCHITECTURE.md
@@ -0,0 +1,75 @@
# Tiny Compiler Architecture - Code Generation

This document describes the architecture of the code generation pipeline in the Tiny compiler.

## Overview

The Tiny compiler follows a traditional multi-stage pipeline to transform source code into an executable binary. The code generation phase lowers the Intermediate Representation (IR) to ARM64 assembly, which is then assembled and linked into a machine-code binary.

## Code Generation Pipeline

```mermaid
graph TD
subgraph Frontend
Source[Tiny Source Code] --> Reader[FileReader]
Reader --> Tok[Tokenizer]
Tok --> Par[Parser]
end

subgraph "Middle-end (IR & Optimization)"
Par --> IRB[IrBuilder]
IRB --> IR[Intermediate Representation]
IR --> Alloc[Allocator]
Alloc --> RegAlloc[Register Allocation]
end

subgraph "Backend (Code Generation)"
IR --> GenCtx[GenContext]
RegAlloc --> GenCtx

subgraph "GenContext Internal Process"
GenCtx --> StackMap[Build Stack Map]
StackMap --> PhiProc[Process Phis]
PhiProc --> InstrGen[Insert Instructions]
InstrGen --> AsmMerge[Merge with Platform Base ASM]
end

AsmMerge --> ASM[ARM64 Assembly]
end

subgraph "Binary Compilation"
ASM --> Assemble[Assembler - as]
Assemble --> Obj[Object File]
Obj --> Link[Linker - ld/gcc]
Link --> Bin[Executable Binary]
end

%% Data Structures
ProgramContext[(ProgramContext)] -.-> IR
AllocationGroup[(AllocationGroup)] -.-> RegAlloc
```

## Key Components

### 1. GenContext (`tiny/src/codegen/arm64/gen.rs`)
The `GenContext` is the central component of the code generator. It maintains the state necessary to translate IR blocks into ARM64 instructions.

- **Stack Management**: `build_stack_map` determines the stack layout for values that were spilled during register allocation.
- **Phi Resolution**: `process_phis` handles the transition of values between basic blocks by identifying "join-want-lists" for Phi nodes.
- **Instruction Generation**: `insert_instrs` recursively traverses the IR's basic blocks and emits the corresponding ARM64 assembly for each one.
- **Platform Support**: Supports both `Apple` and `Linux` ARM64 targets by using different base assembly templates (`apple_base.s` and `linux_base.s`).
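
To illustrate the stack management step, here is a minimal sketch of how a stack map for spilled values might be built. It is not the code in `gen.rs`: the `ValueId` and `StackMap` types, the 8-byte slot size, and the function signature are assumptions made for illustration. The 16-byte rounding reflects the AArch64 requirement that `SP` be 16-byte aligned when used to address memory.

```rust
use std::collections::BTreeMap;

/// Hypothetical identifier for an IR value (the real type is not shown here).
pub type ValueId = u32;

/// Hypothetical stack layout: one slot per spilled value plus the frame size.
#[derive(Debug, PartialEq)]
pub struct StackMap {
    /// Byte offset from SP for each spilled value.
    pub offsets: BTreeMap<ValueId, u32>,
    /// Total bytes to reserve, rounded up to a multiple of 16.
    pub frame_size: u32,
}

/// Assigns each distinct spilled value an 8-byte slot in first-seen order.
pub fn build_stack_map(spilled: &[ValueId]) -> StackMap {
    let mut offsets = BTreeMap::new();
    let mut next = 0u32;
    for &value in spilled {
        // A value that appears twice keeps its first slot.
        if !offsets.contains_key(&value) {
            offsets.insert(value, next);
            next += 8;
        }
    }
    // Round up so SP stays 16-byte aligned after the frame is reserved.
    let frame_size = (next + 15) & !15;
    StackMap { offsets, frame_size }
}
```

With this layout, a function that spills three values reserves 32 bytes, and each spilled value can be addressed as `[sp, #offset]`.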

### 2. Register Allocation (`tiny/src/register_allocation/`)
Before code generation, the `Allocator` assigns virtual IR values to physical registers (X0-X30) or spills them to the stack. The resulting `AllocationGroup` is used by `GenContext` to emit the correct register operands.
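
The relationship between the allocator's output and the operands emitted by the code generator can be sketched as follows. The `Location` enum, the fields and methods of this `AllocationGroup`, and the `operand` helper are hypothetical stand-ins; the real definitions live in `tiny/src/register_allocation/` and may differ.

```rust
use std::collections::HashMap;

/// Where a virtual IR value ended up after allocation (hypothetical shape).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Location {
    /// A physical general-purpose register, numbered 0..=30.
    Reg(u8),
    /// The value lives on the stack and must be reloaded before use.
    Spilled,
}

/// Hypothetical stand-in for the allocator's result.
#[derive(Debug, Default)]
pub struct AllocationGroup {
    locations: HashMap<u32, Location>,
}

impl AllocationGroup {
    /// Records the location chosen for a virtual value.
    pub fn assign(&mut self, value: u32, loc: Location) {
        self.locations.insert(value, loc);
    }

    /// Looks up the location of a virtual value, if it was allocated.
    pub fn location(&self, value: u32) -> Option<Location> {
        self.locations.get(&value).copied()
    }
}

/// Returns the register operand text for a value, or `None` when the value
/// is spilled, unknown, or mapped to a register number outside X0-X30.
pub fn operand(group: &AllocationGroup, value: u32) -> Option<String> {
    match group.location(value)? {
        Location::Reg(n) if n <= 30 => Some(format!("x{}", n)),
        _ => None,
    }
}
```

Returning `None` for spilled values forces the caller to handle the stack path explicitly instead of silently emitting a wrong register name.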

### 3. Binary Compilation (`tiny/src/codegen/bin_compile.rs`)
Once the assembly string is generated, this module handles the final steps:
- **Assemble**: Invokes the system assembler (`as`) to create object files.
- **Link**: Invokes the system linker (`ld` on macOS, `gcc` on Linux) to produce the final executable. This step sets the entry point (`-e bb1`) and links the system libraries.
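
The two steps above could be driven from Rust roughly as shown below. This is a sketch, not the contents of `bin_compile.rs`: the `Target` enum, the function names, and the exact argument lists are assumptions. In particular, applying `-e bb1` to both linkers is a guess, and the flags needed for system library linking on macOS are omitted.

```rust
use std::io;
use std::process::Command;

/// Hypothetical target selector mirroring the two supported platforms.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Target {
    Apple,
    Linux,
}

/// Arguments for `as`: assemble `asm_path` into `obj_path`.
pub fn assemble_args(asm_path: &str, obj_path: &str) -> Vec<String> {
    vec!["-o".to_string(), obj_path.to_string(), asm_path.to_string()]
}

/// Linker program and arguments for the given target.
/// Assumption: the entry symbol `bb1` is passed with `-e` on both platforms.
pub fn link_command(target: Target, obj_path: &str, out_path: &str) -> (&'static str, Vec<String>) {
    let program = match target {
        Target::Apple => "ld",
        Target::Linux => "gcc",
    };
    let args = vec![
        "-o".to_string(),
        out_path.to_string(),
        "-e".to_string(),
        "bb1".to_string(),
        obj_path.to_string(),
    ];
    (program, args)
}

/// Runs an external tool and turns a non-zero exit status into an error.
pub fn run(program: &str, args: &[String]) -> io::Result<()> {
    let status = Command::new(program).args(args).status()?;
    if status.success() {
        Ok(())
    } else {
        Err(io::Error::new(
            io::ErrorKind::Other,
            format!("{} exited with {}", program, status),
        ))
    }
}
```

Keeping argument construction separate from process execution makes the command lines testable without an assembler or linker installed.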

## Data Flow

1. **IR Input**: The `ProgramContext` contains a collection of functions and their basic blocks.
2. **Register Mapping**: For each instruction, `GenContext` looks up the assigned physical register from the `AllocationGroup`.
3. **Spill Handling**: If a value is marked as spilled, `GenContext` generates an `STR` (store) instruction to write it to its stack slot and an `LDR` (load) instruction to bring it back into a register before it is used.
4. **Control Flow**: IR branch and fall-through relationships are converted into assembly labels and branch instructions (`B`, `B.EQ`, etc.).
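
Steps 2 to 4 can be sketched for a single `add` instruction and a conditional branch. Everything here is illustrative: the `Loc` type, the function names, and the use of `x16`/`x17` as scratch registers for spilled operands are assumptions (those two registers are the intra-procedure-call scratch registers in the standard ARM64 calling convention, and the sketch assumes the allocator never hands them out). The `bbN` label format follows the `bb1` entry label mentioned above.

```rust
/// Final location of a value: a register number or a byte offset from SP.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Loc {
    Reg(u8),
    Stack(u32),
}

/// Returns the register holding the operand, emitting an `ldr` into the
/// given scratch register first when the operand is spilled.
fn load_operand(loc: Loc, scratch: u8, out: &mut Vec<String>) -> String {
    match loc {
        Loc::Reg(r) => format!("x{}", r),
        Loc::Stack(offset) => {
            out.push(format!("ldr x{}, [sp, #{}]", scratch, offset));
            format!("x{}", scratch)
        }
    }
}

/// Emits `dst = lhs + rhs`, reloading spilled inputs and storing a spilled result.
pub fn emit_add(dst: Loc, lhs: Loc, rhs: Loc) -> Vec<String> {
    let mut out = Vec::new();
    let a = load_operand(lhs, 16, &mut out);
    let b = load_operand(rhs, 17, &mut out);
    match dst {
        Loc::Reg(r) => out.push(format!("add x{}, {}, {}", r, a, b)),
        Loc::Stack(offset) => {
            // Compute into a scratch register, then write the result back.
            out.push(format!("add x16, {}, {}", a, b));
            out.push(format!("str x16, [sp, #{}]", offset));
        }
    }
    out
}

/// Emits a compare followed by a conditional and an unconditional branch.
pub fn emit_branch_if_equal(lhs: &str, rhs: &str, then_block: u32, else_block: u32) -> Vec<String> {
    vec![
        format!("cmp {}, {}", lhs, rhs),
        format!("b.eq bb{}", then_block),
        format!("b bb{}", else_block),
    ]
}
```

For example, adding a register value to a spilled value at offset 8 yields an `ldr x17, [sp, #8]` followed by the `add`, while a spilled destination adds a trailing `str`.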