This document is the authoritative design for NotPlusPlus, a source-code interpreter for a small, explicit, well-defined subset of real C++.
NotPlusPlus is not a new language with C++-like syntax. It is a system that accepts actual C++ source text and interprets programs whose constructs fall entirely within a supported subset of the C++ language. Programs outside that subset are rejected with precise diagnostics.
This document defines:
- the product goal
- the supported language subset
- semantic rules
- architecture
- internal representations
- execution model
- diagnostic behavior
- implementation plan
- testing strategy
- explicit non-goals
- future evolution constraints
NotPlusPlus
NotPlusPlus shall:
- accept source code written in genuine C++ syntax
- lex and parse it according to a subset-compatible grammar
- resolve declarations and names according to subset-compatible C++ rules
- type-check the program according to the subset’s semantics
- interpret the program directly without producing native machine code
- execute a well-formed `main` function and produce observable program output
NotPlusPlus is a subset C++ interpreter, not a compiler, transpiler, static analyzer, or language invention exercise.
The correct mental model is:
“Interpret actual C++ source that belongs to a strict, documented subset of ISO C++.”
This distinction is critical. The parser and semantic rules must align with real C++ wherever the subset overlaps the language, rather than inventing alternate rules for convenience.
A user writes a small C++ program using only supported constructs, for example:
```cpp
int add(int a, int b) {
    return a + b;
}

int main() {
    int x = add(2, 3);
    if (x > 4) {
        print(x);
    }
    return 0;
}
```

NotPlusPlus parses and interprets this program according to the documented subset semantics.
NotPlusPlus shall not attempt to “mostly parse C++” or “best-effort emulate unsupported constructs.” Unsupported features are not partially recognized and ignored. They are rejected.
This project succeeds by being:
- strict
- explicit
- deterministic
- semantically coherent
- faithful where supported
If a construct is supported, its spelling and semantics must correspond to real C++ as closely as practical within the subset.
Every accepted construct must be documented. Everything else is unsupported.
Unsupported constructs shall not be reinterpreted under custom rules.
Bad:
- accepting `std::cout << x;` and secretly treating it as `print(x);`
Good:
- either support real semantics for a narrow form of expression involving `std::cout`, or reject it
For version 1, the recommended design is to reject std::cout entirely and provide a built-in function print(...) defined as part of the interpreter’s runtime environment, because a normal function call is valid C++ syntax.
Subset selection must deliberately avoid notorious C++ ambiguities and front-end complexity where possible.
Given the same source and interpreter build, behavior must be stable and reproducible.
A rejected program must receive actionable diagnostics with source locations and stable error categories.
The implementation shall be staged:
- source management
- lexing
- preprocessing boundary handling
- parsing
- AST formation
- semantic analysis
- interpretation
No phase may embed undocumented behavior from a later phase unless explicitly designed.
The initial supported subset shall include:
- a single translation unit
- zero or more function definitions
- optional function declarations
- no separate compilation
- no headers beyond a limited interpreter-provided prelude model
- `int`
- `bool`
- `void`
- fixed-size one-dimensional arrays of supported element type, optionally deferred to phase 2
- integer literals
- boolean literals `true` and `false`
- identifier references
- parenthesized expressions
- unary operators: `+`, `-`, `!`
- binary arithmetic: `+`, `-`, `*`, `/`, `%`
- binary comparison: `<`, `<=`, `>`, `>=`, `==`, `!=`
- logical operators: `&&`, `||`
- assignment: `=`
- compound assignment: `+=`, `-=`, `*=`, `/=`, `%=`
- function call
- array indexing if arrays are enabled
- comma operator is unsupported in expressions except where grammar requires comma separators
- expression statement
- declaration statement
- compound statement / block
- `if`, `if`/`else`
- `while`
- `for`
- `return`
- `break`
- `continue`
- local variable declarations
- function declarations and definitions
- block scope
- function parameter declarations
- optional local array declarations
- built-in function `print(int)`
- built-in function `print(bool)` (optional)
- built-in function `println(int)` (optional)
- built-ins shall be ordinary function names in the global namespace from the program's perspective
- interpret starting from `int main()`
- allow `int main()` and possibly `int main(int, bool)` only if intentionally designed; default is only `int main()`
Excluded from the initial subset:
- preprocessing beyond minimal policy handling
- macros
- includes with real header loading
- namespaces
- classes
- structs
- enums
- references
- pointers
- dynamic allocation
- strings
- floating point
- character types
- casts
- overload resolution beyond built-in support model
- templates
- exceptions
- function pointers
- lambdas
- recursion limits beyond implementation-defined stack protection
- user-defined operators
- declarations requiring full declarator complexity
- `switch`
- `do`-`while`
- `const`, `constexpr`, `static`, `extern`, `volatile`, `mutable`
- global variables, at least in version 1 baseline
This section is normative.
The input to NotPlusPlus is a UTF-8 text file treated as a single C++ source file.
The implementation may restrict accepted characters to ASCII plus standard whitespace for version 1.
Line endings:
- `\n`: mandatory support
- `\r\n`: normalized support recommended
A program consists of a sequence of top-level declarations. In version 1:
Allowed top-level declarations:
- function declaration
- function definition
- optional built-in declaration injection, performed by interpreter before semantic analysis
Disallowed at top level:
- variable definitions
- namespace declarations
- type definitions
- using directives
- class/struct definitions
- templates
- include directives unless a special preprocessing policy is adopted
The supported keyword set includes:
`int`, `bool`, `void`, `if`, `else`, `while`, `for`, `return`, `true`, `false`
All other C++ keywords are lexed as keywords if the lexer supports them globally, but any occurrence in syntax outside the supported grammar is rejected as unsupported.
Support:
- `//` line comments
- `/* ... */` block comments
Nested block comments are not supported, matching C/C++ behavior.
Supported literals:
- decimal integer literals, non-suffixed
- `true` and `false`
Unsupported:
- hexadecimal
- binary
- octal
- digit separators
- integer suffixes
- character literals
- string literals
- floating literals
- user-defined literals
The integer literal domain shall be bounded by the interpreter integer representation. Recommended baseline: signed 32-bit two’s-complement semantics.
Supported:
- `int`
- `bool`
- `void`
Optional in version 1, but strongly recommended only after core stability.
Supported form:
```cpp
int a[5];
bool flags[10];
```
Constraints:
- one-dimensional only
- size must be a positive integer literal
- no variable-length arrays
- no array parameters by special adjustment unless explicitly modeled
- no decay to pointer semantics
- no initializer lists in version 1 baseline
- `void` may only appear as a function return type
- variables and parameters may not have type `void`
- arrays may not have element type `void`
- if arrays are supported, assignment between arrays is disallowed
To avoid full C++ declarator complexity, the subset supports only a reduced set of declarator forms.
Allowed:
- `int x;`
- `int x = 5;`
- `bool done = false;`
- `int arr[5];` (if arrays enabled)
Disallowed:
- multiple declarators in one declaration, e.g. `int a, b;`
- pointer declarators
- reference declarators
- parenthesized declarators
- initialized arrays
- declarators with qualifiers
Allowed:
- `int f(int a, int b);`
- `int f(int a, int b) { ... }`
- `void g();`
- `bool h(bool x) { ... }`
Disallowed:
- default parameters
- variadic parameters
- function overloading
- member functions
- trailing return types
- noexcept, attributes, requires clauses
- cv/ref qualifiers
- templates
Supported:
```cpp
{
    int x = 1;
    x = x + 1;
}
```

Each block creates a new lexical scope.
A declaration statement is a supported local variable declaration followed by ;.
Any supported expression followed by ;.
Supported:
```cpp
if (cond) stmt
if (cond) stmt else stmt
```

The condition must be of type `bool`, or `int` if integer-to-bool contextual conversion is allowed by policy. Recommended baseline: allow both, matching C++ contextual conversion to `bool`.
Supported:
```cpp
while (cond) stmt
```

Supported:

```cpp
for (init; cond; step) stmt
```

Version 1 recommended support:

- `init` may be empty, an expression statement (without the trailing semicolon inside the `for` syntax), or a single supported variable declaration
- `cond` may be empty or a supported expression
- `step` may be empty or a supported expression

Examples:

```cpp
for (int i = 0; i < 10; i = i + 1) { ... }
for (; x < 10; ) { ... }
```

Supported:

```cpp
return;
return expr;
```

Rules:

- `return;` only valid in `void` functions
- `return expr;` required for non-void functions
- expression type must be convertible to the function return type according to subset rules
Supported:
- identifier
- integer literal
- `true` and `false`
- parenthesized expression
- function call
- array subscript if arrays enabled
Supported:
- `+expr`
- `-expr`
- `!expr`

Unsupported:

- `++`, `--`
- `*`, `&`
- `sizeof`
- `new`, `delete`
- `static_cast`
- C-style cast
- `~`
Supported:
- multiplicative: `*` `/` `%`
- additive: `+` `-`
- relational: `<` `<=` `>` `>=`
- equality: `==` `!=`
- logical and: `&&`
- logical or: `||`
- assignment: `=`
- compound assignment: `+=` `-=` `*=` `/=` `%=`
Unsupported:
- bitwise operators
- shifts
- comma operator
- member access
- pointer-to-member
- spaceship operator
Assignment is supported only for assignable lvalues:
- variable reference
- array element if arrays enabled
Unsupported:
- chained assignment (e.g. `a = b = 1`): the parser naturally accepts it via right associativity, so it is allowed only if the semantic rules support it; recommended baseline: support it, because it follows the normal assignment-expression grammar, but do not advertise it as a primary feature
Calls to declared functions are supported.
Rules:
- exact arity match required
- argument types must be compatible
- no overload resolution
- no implicit function declarations
&& and || must short-circuit exactly as in C++.
This is semantically important and non-negotiable.
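As an illustration of how an AST-walking evaluator guarantees this, here is a minimal Rust sketch over a hypothetical, simplified expression model (not the real NotPlusPlus node types), counting leaf evaluations so the short-circuit is observable:

```rust
// Hypothetical, simplified expression model for illustration only.
enum Expr {
    Bool(bool),
    And(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
}

// Evaluate with C++ short-circuit semantics, counting how many leaf
// expressions were actually evaluated so the short-circuit is observable.
fn eval(e: &Expr, evals: &mut u32) -> bool {
    match e {
        Expr::Bool(b) => { *evals += 1; *b }
        // Right operand is evaluated only when the left does not decide it.
        Expr::And(l, r) => { if !eval(l, evals) { false } else { eval(r, evals) } }
        Expr::Or(l, r) => { if eval(l, evals) { true } else { eval(r, evals) } }
    }
}

fn main() {
    // true || false: the right operand must not be evaluated.
    let e = Expr::Or(Box::new(Expr::Bool(true)), Box::new(Expr::Bool(false)));
    let mut evals = 0;
    assert!(eval(&e, &mut evals));
    assert_eq!(evals, 1); // only the left leaf ran
}
```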
This section defines the runtime-visible and compile-time-visible semantics of the supported subset.
The interpreter shall model at least:
- global function scope
- function parameter scope
- block scope
- `for`-init scope if the declaration form is used
Variables are resolved lexically, innermost scope first.
Shadowing is allowed across nested scopes.
Example:
```cpp
int main() {
    int x = 1;
    {
        int x = 2;
        print(x); // 2
    }
    print(x); // 1
    return 0;
}
```

Two variables with the same name in the same scope are rejected.
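The innermost-first lookup and shadowing rules can be sketched with a parent-linked scope chain (hypothetical, simplified types; a real symbol table would carry spans, types, and slot indices rather than plain values):

```rust
// Simplified scope chain for illustration only.
struct Scope {
    parent: Option<Box<Scope>>,
    vars: Vec<(String, i32)>, // name -> value stand-in
}

impl Scope {
    fn child(self) -> Scope {
        Scope { parent: Some(Box::new(self)), vars: Vec::new() }
    }
    // Same-scope redeclaration is an error; shadowing an outer scope is fine.
    fn declare(&mut self, name: &str, v: i32) -> Result<(), String> {
        if self.vars.iter().any(|(n, _)| n == name) {
            return Err(format!("redeclaration of '{}'", name));
        }
        self.vars.push((name.to_string(), v));
        Ok(())
    }
    // Innermost scope first, then walk outward through parents.
    fn lookup(&self, name: &str) -> Option<i32> {
        self.vars.iter().rev().find(|(n, _)| n == name).map(|(_, v)| *v)
            .or_else(|| self.parent.as_ref().and_then(|p| p.lookup(name)))
    }
}

fn main() {
    let mut outer = Scope { parent: None, vars: Vec::new() };
    outer.declare("x", 1).unwrap();
    let mut inner = outer.child();
    inner.declare("x", 2).unwrap(); // shadows outer x
    assert_eq!(inner.lookup("x"), Some(2));
}
```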
Function declarations may be repeated only if identical in signature and kind. Because overloading is unsupported, any differing function signature with the same name is an error.
The subset is statically typed. Types are determined during semantic analysis.
There is no dynamic typing or value-tag-based operator selection beyond what is already statically determined.
A policy choice is required here. The recommended baseline is:
Allowed:
- `int` to `bool` in contextual conversions only, such as conditions and logical operators where C++ would require a bool-like condition
- `bool` to `int` in arithmetic contexts, only if explicitly aligned with C++ integral promotion semantics
However, to simplify implementation while staying faithful enough, version 1 should adopt:
- `int` and `bool` are distinct types
- arithmetic operators require `int`
- comparison operators on `int` produce `bool`
- equality operators support `int == int` and `bool == bool`
- logical operators require operands contextually convertible to `bool`
- conditions (`if`, `while`, `for`) require an expression contextually convertible to `bool`
- assignment requires exact type match, except possibly `bool = int` and `int = bool` if a limited conversion matrix is adopted
For design clarity, the strictest consistent version is preferred:
- exact-type assignment only
- contextual `bool` conversion allowed from `bool` and `int`
- no other implicit conversions
This gives useful C++ fidelity without opening a large conversion lattice.
Supported:
- default initialization without initializer
- copy initialization with `= expr`
For simplicity and defined behavior, the interpreter should not mimic uninitialized local scalar UB in version 1. Instead choose one of:
- strict C++-style UB model for uninitialized reads, detected dynamically
- explicit interpreter rule: reading an uninitialized variable is a runtime error
Recommended:
- every variable has an initialized flag
- declarations without initializer create uninitialized storage
- reading before initialization is a runtime error with source location
This is closer to a practical interpreter and still semantically honest.
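A minimal sketch of this recommendation, using a hypothetical `Slot` type where `None` stands for declared-but-uninitialized storage:

```rust
// Hypothetical slot model: every local carries an initialized state,
// and reading before the first write is a reported runtime error.
#[derive(Clone, Copy)]
enum Value { Int(i32), Bool(bool) }

struct Slot {
    value: Option<Value>, // None = declared but uninitialized
}

impl Slot {
    fn new() -> Slot { Slot { value: None } }
    fn store(&mut self, v: Value) { self.value = Some(v); }
    fn load(&self, name: &str) -> Result<Value, String> {
        self.value
            .ok_or_else(|| format!("read of uninitialized variable '{}'", name))
    }
}

fn main() {
    let mut x = Slot::new();          // int x;
    assert!(x.load("x").is_err());    // print(x); -> runtime error
    x.store(Value::Int(5));           // x = 5;
    assert!(x.load("x").is_ok());     // print(x); -> fine
}
```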
If supported:
- elements default to uninitialized
- element read before write is a runtime error
- zero-initialization syntax is unsupported in version 1
Operations are performed on interpreter integers. Recommended baseline semantics:
- 32-bit signed range
- overflow is a runtime error, or implementation-defined wraparound
This needs a deliberate choice because real signed overflow in C++ is UB.
Recommended v1 choice:
- detect overflow and raise runtime error
Rationale:
- deterministic
- easier to debug
- safer
- acceptable for interpreter-defined handling of UB-like conditions
Document this explicitly:
NotPlusPlus does not reproduce all undefined behavior of full C++. Certain UB-prone operations are trapped deterministically at runtime.
This is acceptable because the subset is “real C++ syntax and semantics” only within a constrained executable model; UB emulation is not required.
Division by zero and modulo by zero are runtime errors.
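Rust's checked arithmetic makes the trap-on-UB policy straightforward to sketch (hypothetical helper functions, not the project's real evaluator):

```rust
// Sketch of the recommended policy: overflow, division by zero, and
// modulo by zero become deterministic runtime errors instead of C++ UB.
fn add(a: i32, b: i32) -> Result<i32, String> {
    a.checked_add(b).ok_or_else(|| "integer overflow in '+'".to_string())
}

fn div(a: i32, b: i32) -> Result<i32, String> {
    // checked_div returns None for b == 0 and for i32::MIN / -1 (overflow).
    a.checked_div(b)
        .ok_or_else(|| "division by zero or overflow in '/'".to_string())
}

fn main() {
    assert_eq!(add(2, 3), Ok(5));
    assert!(add(i32::MAX, 1).is_err()); // trapped, not wrapped
    assert!(div(1, 0).is_err());        // runtime error
    assert_eq!(div(7, 2), Ok(3));       // truncation toward zero, as in C++
}
```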
Operands are contextually converted to bool. Evaluation short-circuits.
- `int` relational comparison is supported
- `bool` relational comparison is unsupported unless explicitly added; recommended baseline: disallow except equality
- equality on same-type operands is supported
Assignment evaluates RHS, converts if allowed, stores value, and yields the assigned value if assignment expressions are expressions in the grammar.
Condition evaluated once. Then branch chosen accordingly.
Standard loop semantics.
Equivalent to C++ subset semantics, not a purely internal custom loop type. The interpreter may implement by direct execution or desugaring.
If desugared, it must preserve:
- init scope
- condition evaluation timing
- step evaluation timing
- block scoping behavior
A return transfers control immediately to the caller.
Reaching end of function:
- for a `void` function: allowed, implicit return
- for a non-void function other than `main`: semantic error or runtime error
Recommended:
- semantic analysis requires that every control path of a non-void function contains a return, but only if control-flow analysis is implemented
- otherwise, reaching end of a non-void function at runtime is a runtime error
For v1, do both:
- conservative static check when trivially obvious
- definitive runtime check at function end
For main, reaching end may return 0 in full C++, but to keep rules simple:
- require explicit `return 0;` in version 1, or
- allow implicit `return 0;` for `main`
Recommended baseline:
- allow an implicit `return 0` at the end of `int main()`
Functions may be declared and later defined, or directly defined.
Arguments are evaluated left-to-right as a deliberate subset policy. Full C++ has historically complex sequencing rules. Since this project is an interpreter for a subset, choose a fixed order and document it.
Recommended:
- evaluate function arguments left-to-right
This is slightly stricter than some historical C++ behavior, but deterministic and implementable.
Direct and indirect recursion are supported unless explicitly disabled. Recommended: support recursion.
Implementation shall provide:
- configurable max call depth
- runtime error on stack depth exhaustion
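A minimal sketch of the configurable depth guard (hypothetical `CallStack` type; real frames would carry slots and a function id):

```rust
// Sketch of a configurable call-depth guard for the interpreter.
struct CallStack {
    depth: usize,
    max_depth: usize,
}

impl CallStack {
    fn push(&mut self) -> Result<(), String> {
        if self.depth >= self.max_depth {
            return Err("call depth exceeded".to_string()); // runtime error
        }
        self.depth += 1;
        Ok(())
    }
    fn pop(&mut self) {
        self.depth -= 1;
    }
}

fn main() {
    let mut stack = CallStack { depth: 0, max_depth: 2 };
    assert!(stack.push().is_ok());
    assert!(stack.push().is_ok());
    assert!(stack.push().is_err()); // third nested call rejected
    stack.pop();
    assert!(stack.push().is_ok());  // depth freed on return
}
```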
Built-ins are represented as ordinary callable global functions with interpreter-native implementations.
Version 1 required:
- `void print(int)`
- optional: `void print(bool)`
Because overload resolution is unsupported, there are two implementation options:
1. A single polymorphic built-in outside the user function model: simpler runtime, less C++-faithful.
2. Allow limited built-in overloads only, while forbidding user-defined overloads.
Recommended:
- built-ins may have a small internal overload set
- user-defined overloads remain unsupported
This should be explicitly documented as a runtime privilege, not general language support.
The parser need not implement full ISO grammar. It shall implement a reduced grammar that accepts exactly the subset.
The grammar below is normative at the subset level, though implementation may refactor it.
identifier ::= letter (letter | digit | "_")*
int_literal ::= digit+

keywords: "int" "bool" "void" "if" "else" "while" "for" "return" "true" "false"

punctuators: "(" ")" "{" "}" "[" "]" ";" "," "="
"+" "-" "*" "/" "%" "!" "&&" "||"
"==" "!=" "<" "<=" ">" ">="

translation_unit
::= top_level_decl*
top_level_decl
::= function_decl
| function_def
function_decl
::= type identifier "(" parameter_list_opt ")" ";"
function_def
::= type identifier "(" parameter_list_opt ")" compound_stmt
parameter_list_opt
::= /* empty */
| parameter_list
parameter_list
::= parameter ("," parameter)*
parameter
::= type identifier
| type identifier "[" int_literal "]" // only if array params are supported, otherwise omit
type
::= "int"
| "bool"
| "void"
compound_stmt
::= "{" stmt* "}"
stmt
::= compound_stmt
| decl_stmt
| expr_stmt
| if_stmt
| while_stmt
| for_stmt
| return_stmt
decl_stmt
::= local_var_decl ";"
local_var_decl
::= type identifier
| type identifier "=" expr
| type identifier "[" int_literal "]" // if arrays enabled
expr_stmt
::= expr_opt ";"
expr_opt
::= /* empty */
| expr
if_stmt
::= "if" "(" expr ")" stmt ("else" stmt)?
while_stmt
::= "while" "(" expr ")" stmt
for_stmt
::= "for" "(" for_init ";" expr_opt ";" expr_opt ")" stmt
for_init
::= /* empty */
| expr
| local_var_decl
return_stmt
::= "return" expr_opt ";"
expr
::= assignment_expr
assignment_expr
::= logical_or_expr
| unary_lvalue "=" assignment_expr
logical_or_expr
::= logical_and_expr ("||" logical_and_expr)*
logical_and_expr
::= equality_expr ("&&" equality_expr)*
equality_expr
::= relational_expr (("==" | "!=") relational_expr)*
relational_expr
::= additive_expr (("<" | "<=" | ">" | ">=") additive_expr)*
additive_expr
::= multiplicative_expr (("+" | "-") multiplicative_expr)*
multiplicative_expr
::= unary_expr (("*" | "/" | "%") unary_expr)*
unary_expr
::= primary_expr
| "+" unary_expr
| "-" unary_expr
| "!" unary_expr
primary_expr
::= identifier
| int_literal
| "true"
| "false"
| "(" expr ")"
| call_expr
| array_subscript
call_expr
::= identifier "(" argument_list_opt ")"
argument_list_opt
::= /* empty */
| argument_list
argument_list
::= expr ("," expr)*
array_subscript
::= identifier "[" expr "]"
unary_lvalue
::= identifier
| array_subscript

- No expression may start with a type name; this eliminates cast ambiguity in version 1.
- No declaration/expression ambiguity beyond the `for` init should remain.
- Multiple declarators are excluded to simplify the grammar and semantics.
This section is crucial because “actual C++ source text” intersects with the preprocessor.
Recommended baseline:
- NotPlusPlus does not implement the C preprocessor
- Source files containing preprocessing directives are rejected, except optionally a tiny whitelist for built-in headers that are semantically ignored
This is the cleanest design.
The goal is to interpret actual C++ syntax and semantics for a subset. The preprocessor is not part of the core expression/statement/declaration grammar and introduces a separate textual transformation language. Supporting C++ source text does not require full preprocessor support in v1.
If desired, support exactly:
- `#include <npp>` or `#include "npp.hpp"`
Semantics:
- no real file loading
- interpreter injects declarations for built-ins such as `print`
But this should only be added if there is a strong UX reason. Otherwise, it is simpler to treat built-ins as always available.
- `#define`
- `#if`, `#ifdef`, etc.
- `#include` of arbitrary headers
- `#pragma`
- `#line`
Diagnostic category:
unsupported_preprocessor_directive
NotPlusPlus shall be implemented as a staged front-end plus interpreter:
- Source Manager
- Lexer
- Parser
- AST Builder
- Semantic Analyzer
- Lowered Semantic IR or Direct Annotated AST
- Interpreter Runtime
- Diagnostic Engine
Two viable approaches:
Interpret directly over the AST with semantic annotations.
Pros:
- simpler initial implementation
- fewer intermediate representations
- easier source-location propagation
Cons:
- semantic analysis and runtime concerns may get mixed
- harder to optimize later
Parse to AST, analyze semantically, then lower to a small control-flow/statement IR for interpretation.
Pros:
- cleaner separation
- easier execution engine
- easier constant folding, debugging, tracing
- better long-term maintainability
Cons:
- more engineering upfront
Recommended architecture: hybrid.
- Parse into a high-level AST
- Perform semantic analysis on AST and produce a resolved, typed semantic model
- Lower expressions/statements/functions into a typed executable IR for interpretation after semantic analysis succeeds
This keeps the parser close to source while giving the runtime a cleaner structure.
Responsibilities:
- own file contents
- map byte offsets to line/column
- produce source spans
- provide excerpt rendering for diagnostics
Responsibilities:
- tokenize source
- skip comments and whitespace
- recognize keywords and operators
- report invalid tokens
- attach source spans to tokens
Responsibilities:
- consume token stream
- build AST
- distinguish declaration forms from expression forms within subset grammar
- recover from syntax errors where practical
Responsibilities:
- preserve source structure and spans
- represent declarations, statements, expressions, and types
- remain syntax-level, not runtime-level
Responsibilities:
- symbol table construction
- declaration validation
- name resolution
- type checking
- lvalue/rvalue classification
- function signature registration
- built-in injection
- subset rule enforcement
- unsupported construct detection
Responsibilities:
- transform semantically valid AST into execution-friendly nodes
- eliminate parse-only artifacts
- make control flow explicit
- store resolved declaration IDs and type IDs
Responsibilities:
- manage call stack
- manage variable storage
- evaluate expressions
- execute statements
- invoke built-ins
- detect runtime errors
Responsibilities:
- collect and render compile-time diagnostics
- report runtime errors with stack trace and source spans
- provide stable error codes
Every token and AST node shall carry a source span:
- file id
- start offset
- end offset
Derived on demand:
- line
- column
Each token:
- kind
- lexeme slice or interned content
- source span
Token kinds include:
- identifiers
- literals
- keywords
- punctuators
- eof
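One possible Rust shape for these fields (illustrative only; names are assumptions, not the project's final API):

```rust
// Hypothetical token shape matching the fields above; spans are byte
// offsets into the source buffer owned by the source manager.
#[derive(Debug, PartialEq)]
enum TokenKind {
    Identifier,
    IntLiteral,
    Keyword,
    Punctuator,
    Eof,
}

#[derive(Debug, PartialEq)]
struct Span {
    file_id: u32,
    start: u32, // byte offset, line/column derived on demand
    end: u32,
}

#[derive(Debug, PartialEq)]
struct Token {
    kind: TokenKind,
    lexeme: String, // or an interned symbol id
    span: Span,
}

fn main() {
    let t = Token {
        kind: TokenKind::Keyword,
        lexeme: "int".to_string(),
        span: Span { file_id: 0, start: 0, end: 3 },
    };
    assert_eq!(t.kind, TokenKind::Keyword);
    assert_eq!(t.lexeme, "int");
}
```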
A representative AST model:
- list of top-level declarations
- declarations: `FunctionDecl`, `FunctionDef`, `ParamDecl`, `VarDecl`
- statements: `CompoundStmt`, `DeclStmt`, `ExprStmt`, `IfStmt`, `WhileStmt`, `ForStmt`, `ReturnStmt`
- expressions: `IntLiteralExpr`, `BoolLiteralExpr`, `NameExpr`, `UnaryExpr`, `BinaryExpr`, `AssignExpr`, `CallExpr`, `SubscriptExpr`, `ParenExpr`
- types: `BuiltinType(Int | Bool | Void)`, `ArrayType(element_type, size)`
Every expression node shall later carry:
- resolved type
- value category: lvalue or rvalue
- maybe constant-value metadata if constant folding is added
Fields:
- name
- return type
- parameter types
- declaration span
- definition pointer if defined
- builtin flag
- builtin handler id if builtin
Fields:
- name
- type
- scope id
- declaration span
- storage class category: local / parameter
- runtime slot index
Represent types structurally:
- `Int`
- `Bool`
- `Void`
- `Array(TypeId element, uint32 size)`
Intern types in a central table for canonical equality.
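A minimal interning sketch (hypothetical types; a real table would likely use a hash map rather than a linear scan):

```rust
// Sketch of structural type interning: equal types get equal ids, so
// type equality elsewhere in the pipeline is an integer comparison.
#[derive(Clone, PartialEq)]
enum Type {
    Int,
    Bool,
    Void,
    Array(TypeId, u32), // element type id, fixed size
}

#[derive(Clone, Copy, PartialEq, Debug)]
struct TypeId(usize);

struct TypeTable {
    types: Vec<Type>,
}

impl TypeTable {
    fn intern(&mut self, ty: Type) -> TypeId {
        // Canonicalize: reuse an existing entry if one matches structurally.
        if let Some(i) = self.types.iter().position(|t| *t == ty) {
            return TypeId(i);
        }
        self.types.push(ty);
        TypeId(self.types.len() - 1)
    }
}

fn main() {
    let mut table = TypeTable { types: Vec::new() };
    let int_id = table.intern(Type::Int);
    let a1 = table.intern(Type::Array(int_id, 5));
    let a2 = table.intern(Type::Array(int_id, 5));
    assert_eq!(a1, a2); // int[5] interned exactly once
}
```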
Recommended IR granularity:

Function:
- name
- return type
- parameter slots
- body block

Statements:
- block
- local declaration
- store
- if
- while
- for or lowered-for
- return
- expr statement

Expressions:
- literal
- load local
- unary op
- binary op
- short-circuit logical
- call resolved function id
- subscript load/store address form if arrays enabled
Important: use separate lvalue-capable nodes or addressable references for assignable expressions.
Use recursive descent.
This is the correct choice for the subset because:
- grammar is controlled
- precedence handling is straightforward
- diagnostics are readable
- implementation is easy to maintain
Top-level parse logic:
- parse type
- parse identifier
- if the next token is `(`, parse a function declaration/definition
- otherwise reject, because top-level non-function declarations are unsupported
Local scope parse logic:
- if token begins a supported type specifier, parse declaration statement
- else parse expression statement
Because casts, user-defined types, and elaborate declarators are excluded, this remains unambiguous.
Use precedence climbing or hand-written precedence functions. Recommended:
- dedicated functions per precedence level
This makes associativity clear:
- assignment right-associative
- others left-associative
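As an illustration, a dedicated function per precedence level looks like this (hypothetical token and parser types; the sketch folds integer values instead of building AST nodes, to stay short):

```rust
// Minimal illustration of one function per precedence level for
// "+ -" over "* /". A real parser would return AST nodes, not i32.
#[derive(Clone, Copy)]
enum Tok { Num(i32), Plus, Minus, Star, Slash }

struct Parser<'a> { toks: &'a [Tok], pos: usize }

impl<'a> Parser<'a> {
    fn peek(&self) -> Option<Tok> { self.toks.get(self.pos).copied() }
    fn bump(&mut self) -> Option<Tok> { let t = self.peek(); self.pos += 1; t }

    // additive ::= multiplicative (("+" | "-") multiplicative)*
    fn additive(&mut self) -> i32 {
        let mut lhs = self.multiplicative();
        loop {
            match self.peek() {
                Some(Tok::Plus) => { self.bump(); lhs += self.multiplicative(); }
                Some(Tok::Minus) => { self.bump(); lhs -= self.multiplicative(); }
                _ => return lhs, // left-associative by loop construction
            }
        }
    }

    // multiplicative ::= primary (("*" | "/") primary)*
    fn multiplicative(&mut self) -> i32 {
        let mut lhs = self.primary();
        loop {
            match self.peek() {
                Some(Tok::Star) => { self.bump(); lhs *= self.primary(); }
                Some(Tok::Slash) => { self.bump(); lhs /= self.primary(); }
                _ => return lhs,
            }
        }
    }

    fn primary(&mut self) -> i32 {
        match self.bump() {
            Some(Tok::Num(n)) => n,
            _ => panic!("expected integer literal"),
        }
    }
}

fn main() {
    // 2 + 3 * 4 binds as 2 + (3 * 4) = 14 because multiplicative
    // is parsed by the tighter-binding level.
    let toks = [Tok::Num(2), Tok::Plus, Tok::Num(3), Tok::Star, Tok::Num(4)];
    let mut p = Parser { toks: &toks, pos: 0 };
    assert_eq!(p.additive(), 14);
}
```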
Parser should recover at:
- `;`
- `}`
- top-level declaration boundaries
Recovery is important for multi-error reporting in source files.
Semantic analysis should be split into at least three passes.
- collect all top-level function declarations and definitions
- register built-ins
- detect duplicate function names/signatures
For each function:
- establish parameter scope
- analyze statements and expressions
- resolve identifiers
- check types
- assign local storage slots
- verify `main` exists with a valid signature
- verify every non-builtin called function exists
- verify definitions for declared-but-called functions
- verify no unsupported unresolved forms remain
Use nested scope tables:
- each scope has parent
- variables inserted locally
- functions stored globally
Implementation detail:
- do not store functions in the same namespace structure as variables unless later needed for shadowing/lookup fidelity
- since local functions are unsupported, a separate global function table is simpler
When encountering an identifier expression:
- search local scope chain for variable
- if expression form is a call, resolve as function in global function table
- otherwise error if no variable found
A bare function name as value is unsupported because function pointers are unsupported.
Representative rules:

- unary `+` and `-`: operand must be `int`, result `int`
- unary `!`: operand must be contextually convertible to `bool`, result `bool`
- arithmetic `+ - * / %`: both operands `int`, result `int`
- relational `< <= > >=`: both operands `int`, result `bool`
- equality `== !=`: both operands same supported scalar type, result `bool`
- logical `&&` and `||`: operands contextually convertible to `bool`, result `bool`
- assignment: LHS must be an assignable lvalue; RHS must be the same type or an explicitly allowed conversion; result type is the LHS type
- call: function must exist; arity must match; each argument type must match the parameter type
- subscript: base must be an array lvalue; index must be `int`; result is an lvalue of the element type
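These rules can be sketched as a single dispatch function (hypothetical, simplified types, covering one operator per category):

```rust
// Sketch of binary-operator type checking; real code would also carry
// spans and emit structured diagnostics instead of strings.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Ty { Int, Bool }

enum BinOp { Add, Less, Eq, And }

fn check_binary(op: &BinOp, lhs: Ty, rhs: Ty) -> Result<Ty, String> {
    match op {
        // arithmetic: both operands int, result int
        BinOp::Add if lhs == Ty::Int && rhs == Ty::Int => Ok(Ty::Int),
        // relational: both operands int, result bool
        BinOp::Less if lhs == Ty::Int && rhs == Ty::Int => Ok(Ty::Bool),
        // equality: same scalar type, result bool
        BinOp::Eq if lhs == rhs => Ok(Ty::Bool),
        // logical: both int and bool are contextually convertible to bool
        BinOp::And => Ok(Ty::Bool),
        _ => Err("type mismatch".to_string()),
    }
}

fn main() {
    assert_eq!(check_binary(&BinOp::Add, Ty::Int, Ty::Int), Ok(Ty::Int));
    assert_eq!(check_binary(&BinOp::Less, Ty::Int, Ty::Int), Ok(Ty::Bool));
    assert!(check_binary(&BinOp::Add, Ty::Bool, Ty::Int).is_err());
}
```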
Need explicit value category classification.
Lvalues:
- variable references
- array element expressions
Rvalues:
- literals
- arithmetic expressions
- comparison expressions
- function calls returning non-array scalar values
- parenthesized lvalues may preserve lvalue if desired, but for v1 this can be simplified only if parser/analysis tracks it properly
Recommended:
- preserve lvalue-ness through parentheses
Full control-flow analysis is not required for v1. Provide:
- simple structural check where obvious
- runtime guard on falling off end of non-void function
Example runtime guard:
- if the function body completes without a `Return` outcome, emit the runtime error "control reached end of non-void function"
The parser and semantic analyzer must produce specific diagnostics when unsupported but recognizable constructs are used.
Examples:
- `const int x = 1;` → unsupported type qualifier
- `int* p;` → unsupported pointer declarator
- `namespace std {}` → unsupported namespace declaration
- `x++;` → unsupported operator
This is better than generic parse failure when the construct is lexically recognizable.
NotPlusPlus interprets one executable function at a time using a call stack.
Execution starts at main.
Recommended runtime value enum:
- `Int(i32)`
- `Bool(bool)`
- `Array(ArrayObjectId)` or an inline array storage reference
Avoid boxing every scalar if performance matters, but correctness is primary.
Each function activation record has local storage slots. Each local variable symbol is assigned a slot index during semantic analysis or IR lowering.
A frame contains:
- function id
- slots vector
- maybe scope metadata if block-lifetime destruction ever matters
Each slot contains:
- type id
- initialized flag
- value or array object reference
Because variable lifetime is lexical and there are no destructors in v1, there are two implementation strategies:
1. Push/pop runtime maps for each block.
2. Assign each declaration a unique frame slot, valid for the lifetime of the frame; use scope metadata only to block illegal access at compile time.
Recommended:
- fixed slot frame
Rationale:
- simpler runtime
- faster access
- no need to allocate/deallocate per block
- lexical rules already enforced statically
Arrays live in their variable slots.
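A minimal sketch of the fixed-slot frame (hypothetical types; slot indices are assumed to be assigned during semantic analysis or IR lowering):

```rust
// Fixed-slot frame: every local, including shadowed locals in inner
// blocks, gets its own slot index at analysis time.
#[derive(Clone, Copy)]
enum Value { Int(i32), Bool(bool) }

struct Frame {
    function_id: usize,
    slots: Vec<Option<Value>>, // None = uninitialized slot
}

impl Frame {
    fn new(function_id: usize, slot_count: usize) -> Frame {
        Frame { function_id, slots: vec![None; slot_count] }
    }
}

fn main() {
    // int main() { int x = 1; { int x = 2; } }
    // -> two slots, one per declaration, both live for the whole frame.
    let mut f = Frame::new(0, 2);
    f.slots[0] = Some(Value::Int(1)); // outer x
    f.slots[1] = Some(Value::Int(2)); // inner shadowing x
    assert_eq!(f.slots.len(), 2);
    assert!(f.slots[0].is_some());
}
```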
Call procedure:
- evaluate arguments left-to-right
- create new frame
- initialize parameter slots with argument values
- mark non-parameter locals uninitialized
- execute body
- on return, validate return type and yield value
- pop frame
Use an explicit control-flow result type:
ExecOutcome =
Normal
Break
Continue
Return(Value)
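A sketch of how the outcome type threads through statement execution (hypothetical, simplified statement model): compound statements execute children in order and stop at the first non-`Normal` outcome, which unwinds to the enclosing loop or call.

```rust
// Sketch of outcome propagation through statement execution.
enum ExecOutcome {
    Normal,
    Break,
    Continue,
    Return(i32), // stand-in for Return(Value)
}

enum Stmt {
    Nop,
    Return(i32),
    Block(Vec<Stmt>),
}

fn exec(stmt: &Stmt) -> ExecOutcome {
    match stmt {
        Stmt::Nop => ExecOutcome::Normal,
        Stmt::Return(v) => ExecOutcome::Return(*v),
        Stmt::Block(stmts) => {
            for s in stmts {
                match exec(s) {
                    ExecOutcome::Normal => {} // keep executing
                    other => return other,    // unwind break/continue/return
                }
            }
            ExecOutcome::Normal
        }
    }
}

fn main() {
    // { ; return 0; ; } -> the trailing statement is never executed.
    let body = Stmt::Block(vec![Stmt::Nop, Stmt::Return(0), Stmt::Nop]);
    assert!(matches!(exec(&body), ExecOutcome::Return(0)));
}
```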
If arrays are supported:
Each array variable slot contains:
- element type
- fixed size
- element storage array
- initialized bitset per element
- indexing performs bounds check
- out-of-range access is runtime error
- array expression does not decay to pointer
- array value passing is unsupported unless array parameters are explicitly modeled
Built-ins are dispatched by function symbol or handler id.
Example:
- `print(int)` writes the decimal integer to stdout or the interpreter output sink
- `print(bool)` writes `true` or `false`
The runtime must abstract output through an interface for testability:
- real stdout sink
- capture sink for unit tests
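A minimal sketch of such an interface (hypothetical trait and type names):

```rust
// Abstract output sink so print() output can be captured in tests.
trait OutputSink {
    fn write_line(&mut self, s: &str);
}

struct StdoutSink;
impl OutputSink for StdoutSink {
    fn write_line(&mut self, s: &str) { println!("{}", s); }
}

#[derive(Default)]
struct CaptureSink { lines: Vec<String> }
impl OutputSink for CaptureSink {
    fn write_line(&mut self, s: &str) { self.lines.push(s.to_string()); }
}

// The print(int) built-in dispatches through the sink, never
// directly to stdout.
fn builtin_print_int(sink: &mut dyn OutputSink, v: i32) {
    sink.write_line(&v.to_string());
}

fn main() {
    let mut sink = CaptureSink::default();
    builtin_print_int(&mut sink, 42);
    assert_eq!(sink.lines, vec!["42".to_string()]);
}
```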
- invalid character
- malformed token
- unterminated block comment
- unexpected token
- expected token
- malformed declaration
- malformed expression
- unknown identifier
- redeclaration
- type mismatch
- invalid assignment target
- wrong argument count
- wrong argument type
- missing `main`
- invalid `main` signature
- unsupported construct
- division by zero
- modulo by zero
- integer overflow if trapped
- uninitialized read
- array bounds violation
- call depth exceeded
- missing return at runtime
- internal interpreter fault
Recommended structure:
- severity
- error code
- primary source span
- human-readable message
- optional notes
- optional related spans
Example:
error[NPP2004]: use of undeclared identifier 'x'
--> sample.cpp:4:12
|
4 | y = x + 1;
| ^
note: no local variable or parameter named 'x' is visible in this scope
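One possible Rust shape for this record (illustrative; only the header line of the rendering above is sketched, the full excerpt rendering would live in the diagnostic engine):

```rust
// Hypothetical diagnostic record matching the recommended structure.
enum Severity { Error, Warning, Note }

struct Span { file_id: u32, start: u32, end: u32 }

struct Diagnostic {
    severity: Severity,
    code: String,       // e.g. "NPP2004"
    primary_span: Span,
    message: String,
    notes: Vec<String>, // optional notes
}

// Render just the header line, e.g. "error[NPP2004]: ...".
fn header(d: &Diagnostic) -> String {
    let sev = match d.severity {
        Severity::Error => "error",
        Severity::Warning => "warning",
        Severity::Note => "note",
    };
    format!("{}[{}]: {}", sev, d.code, d.message)
}

fn main() {
    let d = Diagnostic {
        severity: Severity::Error,
        code: "NPP2004".to_string(),
        primary_span: Span { file_id: 0, start: 40, end: 41 },
        message: "use of undeclared identifier 'x'".to_string(),
        notes: vec![],
    };
    assert_eq!(header(&d), "error[NPP2004]: use of undeclared identifier 'x'");
}
```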
Recommended:
- `NPP1xxx` lexical
- `NPP2xxx` syntax
- `NPP3xxx` semantic
- `NPP4xxx` runtime
- `NPP9xxx` internal
Runtime errors should emit:
- message
- source span of failing expression/statement
- call stack with function names and call sites where available
Built-ins must be valid C++ function calls, not pseudo-syntax.
This preserves the design goal of accepting real C++ source syntax.
Minimum:
```cpp
void print(int);
void print(bool);
```

If overload support for built-ins is undesirable, alternative names:

```cpp
void print_int(int);
void print_bool(bool);
```

However, `print` overloads are a better user experience and still manageable if isolated to built-ins.
The interpreter internally injects declarations equivalent to:
```cpp
void print(int);
void print(bool);
```

These declarations are reserved. User code may not redefine them.
- `int`: decimal
- `bool`: `true` or `false`
- no automatic newline unless `println` built-ins are added
Exactly one valid definition of `int main()` is required.

Recommended baseline:

- this is the only valid entry signature in version 1

Disallowed:

- `void main()`
- parameterized `main`
- overloaded `main`

Semantics:

- explicit `return int_expr;` supported
- reaching end of `main` returns `0`
A recommended Rust implementation layout:
notplusplus/
src/
main.rs
source/
mod.rs # source_manager + span
lex/
mod.rs # lexer entry point
token.rs
lexer.rs
parse/
mod.rs # parser entry point
ast.rs
parser.rs
sema/
mod.rs
types.rs
symbols.rs
scope.rs
sema.rs
ir/
mod.rs
ir.rs
lower.rs
interp/
mod.rs
value.rs
frame.rs
runtime.rs
builtins.rs
diag/
mod.rs
diagnostic.rs
engine.rs
support/
mod.rs
intern.rs
tests/
lexer/
parser/
sema/
runtime/
integration/
docs/
design.md
Cargo.toml
Each directory is a Rust module rooted at mod.rs. Visibility is controlled via pub and pub(crate) — prefer pub(crate) for cross-module interfaces that are not part of any public API. There is no public library crate surface in v1; the binary is the product.
Rust is the primary implementation language for NotPlusPlus.
Reasoning:
- Rust's enum-based ADTs and exhaustive pattern matching map directly and naturally onto the AST, IR, and value representation. Every node kind, every value variant, and every diagnostic category becomes a type-checked variant. Adding or removing a variant produces compile errors at every unhandled match site, which enforces consistency across the pipeline automatically.
- The ownership model eliminates a class of bugs common in hand-written interpreters: use-after-free in value frames, dangling references into scope stacks, and double-free in runtime environments. These are precisely the failure modes that matter in an interpreter managing its own call stack and variable storage.
- Rust has no garbage collector. The interpreter controls its own memory layout for call frames and runtime values, which is preferable for a system that tracks initialization state per variable slot and enforces configurable call-depth limits.
- The `Result` and `Option` types enforce explicit error handling throughout the pipeline. Diagnostic emission cannot be accidentally silenced; every fallible operation must be handled at the call site.
- The Rust ecosystem provides mature support for the diagnostic infrastructure this project requires. Crates such as `miette` and `codespan-reporting` provide span-aware, terminal-formatted error output without bespoke implementation effort.
- Rust's test infrastructure (`#[test]`, `#[cfg(test)]`, and the integration test convention under `tests/`) maps directly onto the layered test strategy defined in §22 without any additional tooling.
The pipeline stages defined in §3.7 translate to Rust modules as follows. The lexer produces a flat Vec<Token> with span metadata. The parser consumes tokens and produces an owned AST using Box<Expr> and Vec<Stmt> for recursive structure. The semantic analyzer walks the AST and produces a resolved, typed semantic model with symbol tables, scopes, and expression annotations. IR lowering then translates that validated semantic model into executable IR. The interpreter walks the IR using a call stack of Frame values, each holding a slot array for local variables. All inter-stage errors are returned as structured Diagnostic values accumulated in a shared engine rather than raised as panics or printed inline.
panic! is reserved for genuinely impossible internal states — conditions that represent interpreter bugs, not user program errors. All user-facing failures travel through the diagnostic engine.
C++ was considered for its symbolic symmetry with the project's subject matter. It is rejected because building a correct, safe interpreter runtime in C++ requires disciplined manual memory management that adds implementation risk without design benefit. The project's value is in its semantic correctness, not its implementation language irony.
Python is suitable for early prototyping but is not appropriate as the final implementation language. The absence of static types across the pipeline makes it harder to enforce the invariants that the design depends on — particularly around type-checking, IR lowering, and frame management.
Version 1 shall define a deterministic evaluation order even where the full C++ rules are historically unspecified or subtle.
Recommended:
- binary operator operands evaluated left-to-right except short-circuit forms, which obey short-circuit
- function call arguments evaluated left-to-right
- assignment evaluates RHS after LHS addressability check but before store
- subscript evaluates base then index
This is a conscious simplification. It must be documented as a subset semantic choice.
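The left-to-right argument rule above falls out naturally if the interpreter evaluates argument expressions in source order. A simplified sketch; the `Expr` and `Value` types here are placeholders, not the real IR:

```rust
// Placeholder value and expression types for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Value { Int(i64), Bool(bool) }

enum Expr {
    Lit(i64),
    Effect(i64), // stands in for any side-effecting subexpression
}

// Evaluate one expression, recording side effects in `log`.
fn eval(e: &Expr, log: &mut Vec<i64>) -> Value {
    match e {
        Expr::Lit(n) => Value::Int(*n),
        Expr::Effect(n) => {
            log.push(*n); // observable side effect
            Value::Int(*n)
        }
    }
}

// Left-to-right: each argument is fully evaluated before the next begins.
fn eval_args(args: &[Expr], log: &mut Vec<i64>) -> Vec<Value> {
    args.iter().map(|a| eval(a, log)).collect()
}
```

Because `eval_args` iterates the slice in source order, each argument's side effects complete before the next argument is evaluated, making call-site behavior deterministic by construction.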
NotPlusPlus is not required to emulate all undefined behavior of ISO C++. For supported constructs:
- some UB-like conditions are rejected statically where possible
- some are trapped dynamically with deterministic runtime errors
Examples:
- uninitialized read → runtime error
- signed overflow → runtime error if checked arithmetic chosen
- division by zero → runtime error
This is acceptable because the subset contract is explicit and the interpreter semantics are deterministic.
- allocate or locate variable slot
- if initializer present: evaluate, type-check, store, mark initialized
- else mark uninitialized
- evaluate for side effects
- discard result
- evaluate condition
- contextually convert to bool
- execute one branch
- reevaluate condition before every iteration
- short-circuit semantics inside condition preserved
Logical execution model:
- execute init if present
- if cond present, test it; else treat as true
- execute body
- execute step if present
- repeat
If init is a declaration, its scope includes cond, step, and body, and ends after the loop.
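The logical model above can be sketched as a generic loop driver, with closures standing in for the lowered cond/step/body. This is illustrative only; `break`/`continue` and diagnostics are omitted, and the state-threading style is not the required interpreter shape:

```rust
// Sketch of the for-loop logical model. The init clause, if present,
// runs once before this driver is entered; its scope covers cond,
// step, and body, and ends after the loop.
fn exec_for<S>(
    state: &mut S,
    cond: impl Fn(&S) -> bool,  // an absent cond is modeled as |_| true
    step: impl Fn(&mut S),
    body: impl Fn(&mut S),
) {
    while cond(state) { // if cond present, test it; else treat as true
        body(state);    // execute body
        step(state);    // execute step if present
    }                   // repeat: re-test the condition
}
```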
This section is normative if arrays are in scope for v1; otherwise it is phase-2 design.
- local fixed-size arrays only
- element types: `int`, `bool`
- one-dimensional only
int a[5];
bool seen[10];
a[0] = 42;
print(a[0]);

- storage duration: function activation/frame lifetime
- indexing requires integer index
- bounds checked
- no array-to-pointer decay
- arrays are not first-class assignable values
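A bounds-checked local array slot might be sketched as follows. The `RuntimeError` shape is illustrative, and per-element initialization tracking is omitted for brevity:

```rust
// Illustrative runtime error for a trapped bounds violation.
#[derive(Debug, PartialEq)]
enum RuntimeError {
    BoundsViolation { index: i64, len: usize },
}

// A local fixed-size int array living in a frame.
struct IntArray {
    elems: Vec<i64>,
}

impl IntArray {
    fn new(len: usize) -> Self {
        IntArray { elems: vec![0; len] } // init tracking omitted in this sketch
    }

    // Negative and past-the-end indices both trap deterministically; there
    // is no decay to a pointer and no undefined behavior.
    fn checked_index(&self, index: i64) -> Result<usize, RuntimeError> {
        if index < 0 || index as usize >= self.elems.len() {
            Err(RuntimeError::BoundsViolation { index, len: self.elems.len() })
        } else {
            Ok(index as usize)
        }
    }

    fn get(&self, index: i64) -> Result<i64, RuntimeError> {
        self.checked_index(index).map(|i| self.elems[i])
    }

    fn set(&mut self, index: i64, v: i64) -> Result<(), RuntimeError> {
        let i = self.checked_index(index)?;
        self.elems[i] = v;
        Ok(())
    }
}
```

Routing every subscript through a single `checked_index` helper keeps the bounds rule in one place for both loads and stores.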
Strong recommendation for version 1:
- do not support array parameters
Rationale:
- real C++ adjusts array parameters to pointers
- pointers are out of scope
- modeling this faithfully without pointers is awkward
So:
- array types allowed only for local variables
This section must remain explicit for product integrity.
- preprocessing directives
- namespace qualifications like `std::x`
- member access `a.b`
- stream insertion `<<`
- type qualifiers and storage specifiers
- advanced declarators
- pointers/references
- classes/structs
- initializer lists
- string and char literals
- templates
- exceptions
- overloading for user-defined functions
- implicit declarations
- aggregate initialization
The system shall reject unsupported constructs with targeted diagnostics wherever practical.
Example:
std::cout << x;

Preferred diagnostic:
error[NPP3018]: stream insertion expressions are unsupported
not merely:
error: expected ';'
- tokenization correctness
- comment handling
- integer literal scanning
- operator scanning
- source span correctness
- function declarations and definitions
- expression precedence
- statement forms
- syntax error recovery
- scope resolution
- shadowing
- redeclaration errors
- type mismatch errors
- return validity
- call resolution
- unsupported construct rejection
- arithmetic
- conditions
- loops
- recursion
- builtin output
- runtime errors
- end-to-end source file execution
- output capture comparison
- diagnostic golden files
Use golden files for:
- diagnostics
- stack traces
- program output
At minimum:
int main() {
int x = 2 + 3 * 4;
print(x);
return 0;
}

int main() {
int x = 1;
{
int x = 2;
print(x);
}
print(x);
return 0;
}

int add(int a, int b) {
return a + b;
}
int main() {
print(add(10, 20));
return 0;
}

int main() {
int i = 0;
while (i < 3) {
print(i);
i = i + 1;
}
return 0;
}

int main() {
for (int i = 0; i < 3; i = i + 1) {
print(i);
}
return 0;
}

int fact(int n) {
if (n == 0) {
return 1;
}
return n * fact(n - 1);
}
int main() {
print(fact(5));
return 0;
}

int main() {
int x;
print(x);
return 0;
}

int main() {
int* p;
return 0;
}

Deliverables:
- project scaffolding
- source manager
- diagnostics base
- token definitions
Exit criteria:
- build system works
- diagnostic rendering works
Deliverables:
- comments
- identifiers
- literals
- punctuation/operators
- keyword recognition
Exit criteria:
- lexer golden tests pass
Deliverables:
- function parse
- statement parse
- expression precedence parse
- AST generation
Exit criteria:
- parser accepts basic programs
- syntax errors reported correctly
Deliverables:
- function table
- variable scopes
- type checking
- `main` validation
- analyzed AST / semantic model for validated programs
Exit criteria:
- semantic test corpus passes
- unsupported constructs rejected accurately
Deliverables:
- typed IR lowering from the analyzed AST / semantic model
- scalar runtime values
- statements and expressions
- function call stack
- returns
- built-ins
Exit criteria:
- arithmetic, control flow, functions work end-to-end
Deliverables:
- `for`
- recursion
- stack traces
- runtime error reporting
Exit criteria:
- integration tests stable
Deliverables:
- array declaration
- indexing
- bounds checks
- initialization tracking
Exit criteria:
- array tests stable
Deliverables:
- improved diagnostics
- CLI options
- trace mode or debug dump mode
- documentation synchronization
Exit criteria:
- v1 release candidate
npp program.cpp

Supported flags:
- `--dump-tokens`
- `--dump-ast`
- `--dump-sema`
- `--dump-ir`
- `--trace-exec`
- `--no-color`
- `--max-call-depth=N`
- `0`: program ran successfully and returned 0
- non-zero program return code may map to process exit code if desired
- dedicated interpreter failure codes for diagnostics/runtime failures
Recommended:
- compilation/semantic failure → exit 2
- runtime failure → exit 3
- internal failure → exit 4
- successful program execution → program return code modulo process constraints
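The recommended mapping might be sketched as a single match. The `Outcome` variants are illustrative, and the `& 0xFF` clamp reflects typical Unix process exit-status constraints:

```rust
// Illustrative run outcomes; not the required interpreter API.
enum Outcome {
    ProgramReturned(i32), // value returned from main
    SemanticFailure,      // lexical/syntax/semantic diagnostics emitted
    RuntimeFailure,       // trapped runtime error
    InternalFailure,      // interpreter bug surfaced safely
}

fn process_exit_code(outcome: Outcome) -> i32 {
    match outcome {
        // Program return code, modulo what the host process can report
        // (Unix truncates exit statuses to the low 8 bits).
        Outcome::ProgramReturned(code) => code & 0xFF,
        Outcome::SemanticFailure => 2,
        Outcome::RuntimeFailure => 3,
        Outcome::InternalFailure => 4,
    }
}
```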
NotPlusPlus shall avoid behavior depending on:
- unordered container iteration
- host integer overflow semantics
- locale-dependent formatting
- platform-specific newline handling beyond normalized I/O
All runtime-visible semantics must be deterministic.
Performance is secondary to correctness for v1.
Expected scale:
- single-file programs
- tens to low hundreds of functions
- small recursion depths
- small arrays
- low-latency interpretation for educational/demo/scripting workloads
No optimization pipeline is required.
Source code is untrusted input. The interpreter must guard against:
- infinite recursion causing host stack overflow
- pathological parse recursion where practical
- excessive memory allocation from huge array sizes
- integer overflow in internal indexing
Configurable limits:
- maximum source size
- maximum call depth
- maximum array size
- maximum total allocated runtime storage
Use assertions for impossible states, but surface recoverable user-facing failures as diagnostics or runtime errors.
Risk:
- adding just one more feature like pointers or references causes cascading design complexity
Mitigation:
- freeze v1 subset
- require explicit design amendment for each feature addition
Risk:
- parser convenience may accidentally accept syntax that is not C++
Mitigation:
- every grammar addition must map to real C++ syntax
- no custom statements or operators
Risk:
- too many magic functions create non-C++ semantics
Mitigation:
- keep built-ins minimal
- model them as ordinary global functions
Risk:
- partial C++ conversion rules become inconsistent
Mitigation:
- keep conversion lattice deliberately tiny and documented
- prefer strict exact-match rules except contextual bool conversion
A source program is accepted by NotPlusPlus v1 if and only if:
- it consists of supported top-level function declarations/definitions
- every declaration, statement, and expression belongs to the supported subset
- type checking succeeds under the subset rules
- exactly one valid entry point `int main()` exists
- all runtime operations stay within defined execution constraints
A program outside that contract is rejected.
This is the strongest recommended baseline for a coherent first release.
- single translation unit
- comments
- `int`, `bool`, `void`
- function declarations and definitions
- local variables
- block scope
- integer/boolean literals
- arithmetic/comparison/logical expressions
- assignment
- `if`, `while`, `for`
- `return`
- `int main()`
- built-in `print(int)` and `print(bool)`
- recursion
- semantic diagnostics
- runtime error handling
- deterministic evaluation order
- array local variables with indexing and bounds checks
- AST and IR dump modes
- stack traces for runtime errors
- fixed slot allocation per frame
- macros
- headers
- namespaces
- pointers/references
- user overloads
- strings
- classes
- templates
- exceptions
int main() {
int x = 10;
int y = 20;
print(x + y);
return 0;
}

bool gt(int a, int b) {
return a > b;
}
int main() {
if (gt(5, 3)) {
print(true);
} else {
print(false);
}
return 0;
}

int main() {
int x = 0;
for (int i = 0; i < 3; i = i + 1) {
int x = i;
print(x);
}
print(x);
return 0;
}

int main() {
int a[3];
a[0] = 4;
a[1] = 5;
a[2] = a[0] + a[1];
print(a[2]);
return 0;
}

int main() {
int* p;
return 0;
}

Reason: pointers unsupported.
int main() {
print("hello");
return 0;
}

Reason: string literals unsupported.
int main() {
std::cout << 1;
return 0;
}

Reason: namespaces and stream insertion unsupported.
int main() {
int a = 1, b = 2;
return 0;
}

Reason: multi-declarator declarations unsupported in v1.
This document is the design contract for v1. Any feature addition must be recorded as an amendment specifying:
- syntax accepted
- semantics
- diagnostics
- runtime representation impact
- interaction with existing features
- migration impact on tests and docs
No feature should be added informally.
Milestone 3 ends after semantic analysis produces a resolved, typed semantic model over the AST. This model includes function symbols, variable scopes, name-resolution results, and expression type/lvalue annotations.
Typed executable IR lowering is deferred to milestone 4. Milestone 3 therefore validates programs semantically but does not yet require executable IR construction.
The semantic analyzer's variable binding table shall be keyed by ScopeId, not by positional index into a parallel vector. The binding structure shall be a map from ScopeId to a map from name to VarId:
bindings: HashMap<ScopeId, HashMap<String, VarId>>
Using a positional Vec indexed by ScopeId ordinal couples two independent allocation sequences: scope creation in the ScopeTree and entry creation in the bindings table. Any code path that creates a scope without a corresponding bindings push — or vice versa — produces silent index misalignment or a panic. A HashMap<ScopeId, ...> structure makes the association explicit and eliminates this coupling.
Variable lookup (§12.2) and shadowing (§6.1.3) behavior are unchanged. The scope tree's parent chain remains the authority for lexical lookup order. Only the internal storage representation changes.
All semantic analysis code that indexes into self.bindings[scope.0] must be replaced with keyed access. No test semantics change; only the internal data structure changes.
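A sketch of the keyed binding table together with parent-chain lookup; `String` keys stand in for interned names, and the surrounding types are simplified:

```rust
use std::collections::HashMap;

// Simplified identifiers; real code would use interned names.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct ScopeId(u32);
#[derive(Debug, Clone, Copy, PartialEq)]
struct VarId(u32);

// Parent links: None marks the root (function body) scope.
struct ScopeTree {
    parent: HashMap<ScopeId, Option<ScopeId>>,
}

// Keyed by ScopeId, not by positional index: creating a scope and
// creating its binding entry can never fall out of alignment.
struct Bindings {
    bindings: HashMap<ScopeId, HashMap<String, VarId>>,
}

impl Bindings {
    fn declare(&mut self, scope: ScopeId, name: &str, var: VarId) {
        self.bindings.entry(scope).or_default().insert(name.to_string(), var);
    }

    // Walk the parent chain: the innermost declaration wins (shadowing).
    // Scopes that are not ancestors of `scope` are never consulted.
    fn lookup(&self, tree: &ScopeTree, mut scope: ScopeId, name: &str) -> Option<VarId> {
        loop {
            if let Some(v) = self.bindings.get(&scope).and_then(|m| m.get(name)) {
                return Some(*v);
            }
            match tree.parent.get(&scope).copied().flatten() {
                Some(p) => scope = p,
                None => return None,
            }
        }
    }
}
```

Because lookup walks only the parent chain, stale entries from exited sibling scopes are unreachable, which is exactly the invariant the no-pop amendment below relies on.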
Source files containing preprocessing directives shall be rejected with a targeted diagnostic of category unsupported_preprocessor_directive, not with a generic invalid-character error for #.
A line whose first non-whitespace character is # shall be recognized by the lexer as a preprocessing directive line.
The lexer shall emit a diagnostic with a dedicated code in the NPP1xxx range when encountering a #-prefixed directive. The diagnostic message shall identify the specific directive where recognizable (e.g., #include, #define, #ifdef, #pragma) and fall back to a generic "preprocessing directives are unsupported" message otherwise.
Example:
error[NPP1004]: preprocessing directive '#include' is unsupported
--> sample.cpp:1:1
|
1 | #include <iostream>
| ^^^^^^^^^
No preprocessing is performed. The directive line is consumed and skipped after diagnostic emission to allow continued lexing of subsequent source.
This amends §8 (Preprocessing Policy) by specifying the diagnostic mechanism. The lexer's existing invalid-character path for # is superseded by this targeted recognition.
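Directive recognition in the lexer might be sketched as follows; the directive list and message wording are illustrative, and the real lexer would attach spans and an NPP1xxx code rather than return a bare string:

```rust
// Sketch: given a source line, decide whether it is a '#'-prefixed
// directive line and build the diagnostic message text for it.
// Returns None for non-directive lines.
fn directive_message(line: &str) -> Option<String> {
    let trimmed = line.trim_start();
    if !trimmed.starts_with('#') {
        return None; // first non-whitespace character is not '#'
    }
    // Extract the directive name after '#', tolerating "# include" spacing.
    let name: String = trimmed[1..]
        .trim_start()
        .chars()
        .take_while(|c| c.is_ascii_alphanumeric() || *c == '_')
        .collect();
    // Recognizable directives get a specific message (illustrative list).
    let known = ["include", "define", "ifdef", "ifndef", "if", "else", "endif", "undef", "pragma"];
    if known.contains(&name.as_str()) {
        Some(format!("preprocessing directive '#{}' is unsupported", name))
    } else {
        Some("preprocessing directives are unsupported".to_string())
    }
}
```

The caller would emit the diagnostic, then consume the rest of the line so lexing continues with the following source line.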
The AnalyzedExprKind::Paren variant shall hold an owned inner expression, not a clone of a separately stored expression. The semantic model shall avoid cloning expression trees for parenthesized expressions.
Paren(Box<AnalyzedExpr>)
The inner expression is moved into the Paren wrapper. No separate copy exists.
Parenthesized expressions preserve the type and lvalue status of the inner expression exactly. This is unchanged from the existing rule (§12.5).
IR lowering already erases parentheses by recursing through Paren nodes. The semantic model representation change does not affect lowered IR or runtime behavior.
Scope binding entries created during semantic analysis persist for the duration of analysis. There is no requirement to deallocate or "pop" binding entries when leaving a scope.
Because ScopeId values are unique and monotonically assigned, stale binding entries from exited scopes are unreachable through the lookup chain (which walks parent links from the current scope). Retaining them is harmless and simplifies the analyzer.
This amendment makes the existing behavior an explicit design choice rather than an accidental omission.
The lookup procedure (§6.1.2, §12.3) shall never consult a scope that is not an ancestor of the current scope. This invariant is enforced by walking the ScopeTree parent chain and is independent of whether binding entries for unrelated scopes exist.
break and continue are supported statements in version 1.
break_stmt ::= "break" ";"
continue_stmt ::= "continue" ";"
break and continue are added to the supported keyword set (§5.3).
break immediately exits the innermost enclosing while or for loop. continue skips the remainder of the current iteration and proceeds to the loop's condition re-evaluation (for while) or step expression followed by condition re-evaluation (for for).
Both are semantic errors if used outside a loop body.
- `NPP3010`: `break` or `continue` used outside of a loop body.
The executable IR includes Break(Span) and Continue(Span) statement variants. The interpreter's execution flow enum includes Break and Continue variants alongside Normal and Return.
break and continue interact with for and while loops. They do not interact with if or compound statements beyond propagating through them. A break or continue that escapes a function body is an internal error.
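The flow enum and its interaction with a `while` loop might be sketched as follows; the state-threading closure style is illustrative, not the required interpreter shape:

```rust
// Execution-flow result of running a statement or block.
#[derive(Debug, PartialEq)]
enum Flow {
    Normal,
    Break,
    Continue,
    Return(i64), // simplified: real code would carry a Value
}

// Sketch of while-loop execution: the loop consumes Break and Continue
// and propagates Return to the enclosing function.
fn exec_while<S>(
    state: &mut S,
    cond: impl Fn(&S) -> bool,
    body: impl Fn(&mut S) -> Flow,
) -> Flow {
    while cond(state) {
        match body(state) {
            // For while, both Normal and Continue re-test the condition.
            Flow::Normal | Flow::Continue => {}
            // Break exits the loop; the loop statement completes normally.
            Flow::Break => return Flow::Normal,
            // Return propagates past the loop to the function boundary.
            ret @ Flow::Return(_) => return ret,
        }
    }
    Flow::Normal
}
```

A `for` loop differs only in running the step expression before re-testing on `Continue`, matching the semantics stated above.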
A function that is declared but never defined and never called is not an error. The semantic analyzer shall only emit an error for a declared-but-undefined function when it is referenced in a call expression.
This matches C++ behavior where forward declarations without definitions are permitted as long as no definition is required by the linker. Since NotPlusPlus has no separate compilation, the analogue is call-site usage.
No diagnostic is emitted for unused forward declarations. The existing NPP3012 diagnostic ("function '...' is declared but never defined") is emitted only when a call to such a function is encountered.