TinyCompiler is a fully functional educational compiler written in ~200 lines of JavaScript. It transforms LISP-like function call syntax into C-like function calls, demonstrating all major compiler phases in an easy-to-understand way.
- โ Complete compiler pipeline (Tokenizer โ Parser โ Transformer โ Code Generator)
- โ Educational & well-commented code
- โ Web playground with live preview
- โ REST API for compilation
- โ Fully tested with comprehensive test suite
- โ MIT Licensed
This project serves as a learning tool to understand how compilers work by breaking down complex compilation concepts into digestible, well-commented code.
- Name: TinyCompiler
- Version: 1.0.0
- Author: Ovi Shekh hi@ovishekh.com (ovishekh.com)
- License: MIT
- Repository: github.com/ovishkh/TinyCompiler
- Live Demo: TinyCompiler.ovishekh.com
- Main Entry Point:
./TinyCompiler.js
The compiler follows the standard three-phase architecture:
Converts raw input code into an Abstract Syntax Tree (AST)
- Lexical Analysis (Tokenization): Breaks code into tokens
- Syntactic Analysis: Builds AST from tokens
Converts the source AST to a target AST using the visitor pattern
- Traverses the AST depth-first
- Applies transformations via visitor methods
- Creates a new AST for target language
Converts the transformed AST back into code strings
- Recursively prints each node type
- Generates final output code
(add 2 (subtract 4 2))add(2, subtract(4, 2));Input String
โ
[Tokenizer] โ Tokens Array
โ
[Parser] โ Abstract Syntax Tree (LISP AST)
โ
[Traverser + Visitor] โ Transformed AST (C-like AST)
โ
[Code Generator] โ Output String
Purpose: Lexical analysis - breaks raw code into tokens
Input: String of code
"(add 2 (subtract 4 2))"Output: Array of tokens
[
{ type: 'paren', value: '(' },
{ type: 'name', value: 'add' },
{ type: 'number', value: '2' },
{ type: 'paren', value: '(' },
{ type: 'name', value: 'subtract' },
{ type: 'number', value: '4' },
{ type: 'number', value: '2' },
{ type: 'paren', value: ')' },
{ type: 'paren', value: ')' }
]Token Types Recognized:
paren: Parentheses(and)number: Numeric literals[0-9]+string: String literals enclosed in""name: Function names and identifiers[a-zA-Z]+
Key Algorithm:
- Uses cursor-based iteration (
currentpointer) - Handles multi-character tokens (numbers, names, strings)
- Skips whitespace
- Throws error on unrecognized characters
Purpose: Syntactic analysis - builds AST from tokens
Input: Array of tokens
Output: Abstract Syntax Tree
{
type: 'Program',
body: [{
type: 'CallExpression',
name: 'add',
params: [{
type: 'NumberLiteral',
value: '2'
}, {
type: 'CallExpression',
name: 'subtract',
params: [{
type: 'NumberLiteral',
value: '4'
}, {
type: 'NumberLiteral',
value: '2'
}]
}]
}]
}Node Types Generated:
Program: Root node containing body arrayCallExpression: Function invocation with name and paramsNumberLiteral: Numeric valuesStringLiteral: String values
Key Algorithm:
- Recursive descent parser using
walk()function - Maintains cursor (
current) position - Handles nested expressions recursively
- Program body supports multiple top-level expressions
Purpose: Depth-first tree traversal with visitor pattern
Key Features:
- Visits every node in the AST
- Supports both
enterandexitcallbacks - Maintains parent-child relationships
- Enables non-destructive AST analysis
Traversal Order (Depth-First):
โ Program (enter)
โ CallExpression (enter)
โ NumberLiteral (enter)
โ NumberLiteral (exit)
โ CallExpression (enter)
โ NumberLiteral (enter)
โ NumberLiteral (exit)
โ NumberLiteral (enter)
โ NumberLiteral (exit)
โ CallExpression (exit)
โ CallExpression (exit)
โ Program (exit)
Visitor Pattern Structure:
{
NodeType: {
enter(node, parent) { /* ... */ },
exit(node, parent) { /* ... */ }
}
}Purpose: Converts LISP AST to C-like AST
Transformation Rules:
| LISP AST | โ | C-like AST |
|---|---|---|
CallExpression with name |
โ | CallExpression with Identifier callee |
Top-level CallExpression |
โ | Wrapped in ExpressionStatement |
| Direct params array | โ | Arguments array |
NumberLiteral |
โ | NumberLiteral (unchanged) |
StringLiteral |
โ | StringLiteral (unchanged) |
Input AST (LISP-like):
{
type: 'CallExpression',
name: 'add',
params: [...]
}Output AST (C-like):
{
type: 'ExpressionStatement',
expression: {
type: 'CallExpression',
callee: {
type: 'Identifier',
name: 'add'
},
arguments: [...]
}
}Key Features:
- Uses
_contexthack for parent-to-child linking - Wraps top-level expressions in
ExpressionStatement - Preserves nested structure
- Non-destructive (creates new AST)
Purpose: Converts AST back into executable code strings
Output Generation:
| Node Type | Output Format |
|---|---|
Program |
Maps and joins body with newlines |
ExpressionStatement |
Expression + ; |
CallExpression |
callee(arg1, arg2, ...) |
Identifier |
Function name string |
NumberLiteral |
Number value |
StringLiteral |
"string value" |
Example:
// Input node:
{
type: 'CallExpression',
callee: { type: 'Identifier', name: 'add' },
arguments: [
{ type: 'NumberLiteral', value: '2' },
{ type: 'NumberLiteral', value: '3' }
]
}
// Output string:
// "add(2, 3)"Purpose: Main orchestration function linking all phases
Pipeline:
function compiler(input) {
let tokens = tokenizer(input); // Phase 1a: Lexical Analysis
let ast = parser(tokens); // Phase 1b: Syntactic Analysis
let newAst = transformer(ast); // Phase 2: Transformation
let output = codeGenerator(newAst); // Phase 3: Code Generation
return output;
}Usage:
const result = compiler('(add 2 (subtract 4 2))');
console.log(result); // "add(2, subtract(4, 2));"File: test.js
Test Coverage:
- โ Tokenizer test - Verifies token generation
- โ Parser test - Verifies AST structure
- โ Transformer test - Verifies AST transformation
- โ Code Generator test - Verifies output generation
- โ Compiler integration test - Verifies end-to-end compilation
Running Tests:
node test.jsExpected Output:
All Passed!
- Character-by-character scanning
- Multi-character token handling
- Whitespace handling
- Error detection
- Building AST from tokens
- Handling nested structures
- Recursive function calls
- Tree traversal abstraction
- Enter/exit callbacks
- Separation of concerns
- Pattern matching on node types
- Creating new AST structures
- Context propagation
- Recursive node processing
- String concatenation
- Output formatting
For beginners: Read through the code in this order:
- Tokenizer - Start here to understand lexical analysis
- Parser - Learn recursive descent parsing
- Traverser - Understand tree traversal patterns
- Transformer - See practical AST manipulation
- Code Generator - Learn output generation
- Compiler - See how it all ties together
Recommended approach:
- Read the extensive inline comments in
TinyCompiler.js - Trace through the example in
test.js - Try modifying the input to see how it transforms
- Add new token types or node types
The compiler can be extended to support:
- More operators: Add to tokenizer (e.g.,
+,-,*,/) - More token types: Update tokenizer regex patterns
- More node types: Add cases to parser, traverser, and generator
- Different target languages: Modify transformer and generator
- Type checking: Add validation in transformer
- Optimization: Add optimization passes before code generation
The concepts here are used in:
- Language compilers (JavaScript, Python, Rust, etc.)
- Build tools (Webpack, Babel, TypeScript)
- Template engines (Handlebars, EJS)
- Query languages (GraphQL, SQL parsers)
- Domain-specific languages (Configuration files, markup)
MIT License
Copyright (c) 2025 Ovi Shekh
This project is licensed under the MIT License - see the LICENSE file for details.
You are free to use, modify, and distribute this project with or without attribution (though attribution is appreciated).
Created by: Ovi Shekh (ovishekh.com)
Special Thanks:
- Md. Rashedul Alam - For guidance and support
- James Kyle - For compiler design inspiration
- Gates Smasher - For valuable contributions
- Corrado Bรถhm - For theoretical foundations
- Node.js v14 or higher
- npm or yarn
# Clone the repository
git clone https://github.com/ovishkh/TinyCompiler.git
cd TinyCompiler
# Install dependencies
npm install
# Run tests
npm test
# or
make testconst { compiler } = require('./TinyCompiler');
// Simple example
const input = '(add 2 2)';
const output = compiler(input);
console.log(output); // "add(2, 2);"
// Nested example
const complex = '(add 2 (subtract 4 2))';
console.log(compiler(complex)); // "add(2, subtract(4, 2));"Visit TinyCompiler.ovishekh.com for an interactive playground.
# Compile via HTTP API
curl "https://TinyCompiler.ovishekh.com/api/compiler?code=(add%201%202)"make install # Install dependencies
make test # Run tests
make run # Run with example
make dev # Development mode
make clean # Clean dependencies
make help # Show all commands# Build Docker image
docker build -t tinycompiler .
# Run container
docker run -p 3000:3000 tinycompilerThe project is configured for easy deployment to Vercel:
# Install Vercel CLI
npm i -g vercel
# Deploy
vercel
# Deploy to production
vercel --prodThe vercel.json file is pre-configured with:
- API routes at
/api/compiler - Static file serving
- CORS headers enabled
The project includes:
vercel.json- Vercel configurationapi/compiler.js- Serverless API endpointindex.html- Web playground UI
TinyCompiler/
โโโ TinyCompiler.js # Main compiler (all phases)
โโโ test.js # Test suite
โโโ index.html # Web playground UI
โโโ api/
โ โโโ compiler.js # REST API endpoint
โโโ vercel.json # Vercel config
โโโ Makefile # Build automation
โโโ package.json # Dependencies
โโโ LICENSE # MIT License
โโโ README.md # This file
โโโ CONTRIBUTING.md # Contribution guide
npm test # Run all tests
make test # Using Makefile- 2-space indentation
- Use
'use strict'; - Comprehensive inline comments
- Descriptive variable names
See CONTRIBUTING.md for detailed contribution guidelines.
Q: Why doesn't it support operators like +?
A: This compiler demonstrates core concepts with simple syntax. Operators would require additional tokenizer rules.
Q: Can I use this for real compilation? A: This is an educational tool. Production compilers have significantly more complexity for optimization, error recovery, and language features.
Q: How do I add new features? A: Follow this pattern: Update tokenizer โ Update parser โ Update transformer โ Update generator โ Add tests.
To deepen your compiler knowledge, explore:
- Lexical Analysis: Finite automata, regular expressions
- Parsing: Context-free grammars, parse trees
- Semantic Analysis: Symbol tables, type checking
- Optimization: Dead code elimination, constant folding
- Code Generation: Register allocation, instruction selection
- ๐ Website: ovishekh.com
- ๐ป GitHub: github.com/ovishkh/TinyCompiler
- ๐ฎ Live Demo: TinyCompiler.ovishekh.com
- ๐ API Docs: GitHub Wiki
- The Super Tiny Compiler - Original by James Kyle
- Crafting Interpreters - Interactive guide
- Engineering a Compiler - Advanced reading
- Lexical Analysis: Finite automata, regular expressions
- Parsing: Context-free grammars, parse trees, recursive descent
- Semantic Analysis: Symbol tables, type checking
- Optimization: Dead code elimination, constant folding
- Code Generation: Register allocation, instruction selection
MIT License ยฉ 2025 Ovi Shekh
See LICENSE file for details.
- Issues & Bugs: GitHub Issues
- Discussions: GitHub Discussions
- Email: hi@ovishekh.com
- Website: ovishekh.com
Last Updated: November 21, 2025
Made with โค๏ธ by Ovi Shekh