Eliminate the any type from Casa #147

@frendsick

Problem Statement

Casa today exposes an any type that acts as a wildcard escape hatch in an otherwise statically typed language. It appears in three places:

  1. User-facing surface. The keyword any is a valid type. The cast syntax 42 (any) = x:int x is documented as a narrowing idiom. Empty [] array literals default to array[any]. Builtin signatures use any: drop: any -> None, dup: any -> any any, swap, over, rot, print: any -> None, typeof: any -> str, comparison ops ==/!=/<... all any any -> bool, store8/16/32/64: any ptr -> None, syscall1..6 taking any args.
  2. Compiler internals. The constant ANY_TYPE = "any" is used as a wildcard in types_match (any matches anything), unify_type (any absorbs into the other side), as a stack-underflow recovery placeholder, and as a fallthrough for method dispatch when the receiver type is unknown.
  3. Runtime dispatch. print accepts any and runtime-branches to PrintInt/PrintStr/PrintBool/PrintChar/PrintCstr based on the popped type.

The wildcard masks real bugs: passing a custom struct to print silently emits PrintInt garbage; passing a List[int] to syscall1 corrupts registers without complaint; (any) lets users sidestep static typing on demand. The compiler's wildcard sentinel suppresses cascading type errors and lets latent bugs slip through self-host compilation.

Solution

Remove any from the language entirely. No keyword, no ANY_TYPE constant, no (any) cast. Replace each role with explicit generics and trait bounds:

  • Stack ops (drop, dup, swap, over, rot) become surface generics with explicit type parameters.
  • Comparisons (==, !=, <, <=, >, >=) and print become trait-bounded — Eq, Ord, Display traits gate them.
  • Memory store operations (store8/16/32/64) and syscall arguments require a new Word marker trait, auto-satisfied by all single-slot values.
  • typeof is retained but retyped [T] T -> str (destructive — consumes the value, pushes the static type name).
  • (any) cast is rejected at parse time. Empty [] infers element type lazily; errors only if unresolved at end of inference.
  • The compiler's wildcard sentinel becomes an internal TYPE_UNKNOWN value, never surface-visible. Method-dispatch fallback through ANY_TYPE is removed and becomes a real type error.

Auto-derive for the new traits is out of scope for this PRD (separate follow-up). Until then, user types needing Eq/Ord/Display write the impls manually.

User Stories

  1. As a Casa user, I want print my_value to either compile and print a sensible representation or fail at compile time, so that I never see PrintInt garbage from a struct.
  2. As a Casa user, I want == on a custom enum to require #[derive(Eq)] (in the follow-up PRD) or a manual impl Eq block, so that comparison cannot silently succeed on types where it makes no semantic sense.
  3. As a Casa user, I want comparison operators (< <= > >=) to require T: Ord, so that ordering on types with no defined order is a compile error rather than a wildcard pass.
  4. As a Casa user, I want syscall1..6 to reject argument types that are not single-slot values, so that I cannot accidentally pass a str (heap object) where the kernel expects a register-sized integer.
  5. As a Casa user, I want store8/16/32/64 to reject types that don't fit in the store width, so that I get a compile error instead of silent truncation.
  6. As a Casa user, I want to write polymorphic stack-shaped functions (fn second[T1 T2] T1 T2 -> T1 { swap drop }) using the same generics syntax I already use elsewhere, so that the language has one mechanism for polymorphism.
  7. As a Casa user, I want typeof to remain available for debug output, retyped as [T] T -> str so its signature is explicit and consistent with the rest of the language.
  8. As a Casa user, I want the (any) cast removed so that I can no longer silently bypass type checking; when I need to assert a type I write the concrete type.
  9. As a Casa user, I want empty array literals [] to infer their element type from later use, so that idiomatic patterns like [] = xs:List[int] ... xs.push continue to work without explicit annotation.
  10. As a Casa user, I want a clear error message when an empty [] cannot be inferred, so that I know to add an annotation rather than getting a silent array[any] propagation.
  11. As a Casa user, I want print of an int, str, bool, char, or cstr to keep working with no changes — the existing built-in impls of Display for those primitives mean my code is unchanged.
  12. As a Casa user defining a custom struct or enum, I want a way to make my type printable by writing impl Display for MyType { fn to_str self -> str { ... } }, so that I can opt my types into print without a derive.
  13. As a Casa user, I want Map[K V] and Set[K] to keep working, with Hashable extending Eq so the eq method is reused, so that I don't have to write eq twice.
  14. As a Casa compiler maintainer, I want the wildcard ANY_TYPE sentinel removed from types_match and unify_type, so that latent type errors are surfaced rather than suppressed.
  15. As a Casa compiler maintainer, I want stack-underflow recovery to use a dedicated TYPE_UNKNOWN sentinel (never user-visible), so that error recovery does not bleed into method dispatch or signature matching.
  16. As a Casa compiler maintainer, I want the method-dispatch fallback for ANY_TYPE receivers removed, so that "method not found" errors are clear rather than masked by a name-only search.
  17. As a Casa compiler maintainer, I want is_builtin_type and the parse-type fallback to stop recognizing "any", so that the keyword is gone end-to-end.
  18. As a Casa compiler maintainer, I want self-host bootstrap to remain byte-identical (./casac compiler/main.casa -o new && ./new compiler/main.casa -o new2 && diff new new2), so that the compiler remains self-consistent through this migration.
  19. As a Casa compiler maintainer, I want the LSP signature display strings to show the new generic builtin signatures (no any), so that hover and completion reflect the actual type contracts.
  20. As a Casa documentation reader, I want docs/types-and-literals.md, docs/functions-and-lambdas.md, docs/traits.md, docs/standard-library.md, and docs/STYLE.md updated to remove any references and document the new traits, so that the language reference is current.
  21. As a Casa user reading existing test cases (tests/compiler/test_type_annotations.casa), I want the (any) examples migrated to plain annotations so the tests still pass and demonstrate the supported forms.
  22. As a future PRD author, I want auto-derive for Eq/Ord/Display to be tracked separately so this PRD ships in a focused chunk.
  23. As a Casa user passing struct or array references through store64 or syscall*, I want Word to accept any single-slot value (including struct/array refs), so that existing memory-pointer and reference-passing patterns continue to work without explicit casts.

Implementation Decisions

New trait taxonomy

Four traits added to lib/std.casa:

  • Eq: fn eq self other -> bool (required), fn ne self other -> bool (default impl: eq !).
  • Ord: fn lt self other -> bool (required); default le, gt, ge derived from lt plus Eq::eq. Ord extends Eq (supertrait).
  • Display: fn to_str self -> str (required).
  • Word: empty marker trait. No methods. Satisfied by any single-slot value (int, bool, char, ptr, cstr, enum, struct refs, array refs). Auto-applied by the compiler at type-definition time; users cannot impl Word manually.

Hashable is migrated to extend Eq (trait Hashable: Eq). The standalone Hashable::eq is removed; existing impls of Hashable must also impl Eq. If the trait system does not yet support supertrait constraints (TraitBound machinery), it is extended in this PRD's scope.
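The default methods in this taxonomy can be modelled outside Casa. The sketch below uses Python abstract classes as a stand-in for Casa traits. This is an assumption made for illustration: it is not Casa syntax and not the contents of lib/std.casa. It shows one plausible way ne, le, gt, and ge could be derived from the required eq and lt, assuming a total order. The Version class is hypothetical.

```python
from abc import ABC, abstractmethod


class Eq(ABC):
    """Model of the Eq trait: eq is required, ne has a default."""

    @abstractmethod
    def eq(self, other) -> bool: ...

    def ne(self, other) -> bool:
        # Default impl: the negation of eq.
        return not self.eq(other)


class Ord(Eq):
    """Model of the Ord trait: extends Eq (supertrait), lt is required."""

    @abstractmethod
    def lt(self, other) -> bool: ...

    # Defaults derived from lt plus eq (assumes a total order).
    def le(self, other) -> bool:
        return self.lt(other) or self.eq(other)

    def gt(self, other) -> bool:
        return not self.lt(other) and not self.eq(other)

    def ge(self, other) -> bool:
        return not self.lt(other)


class Version(Ord):
    """Hypothetical user type that implements only eq and lt."""

    def __init__(self, major: int, minor: int):
        self.key = (major, minor)

    def eq(self, other) -> bool:
        return self.key == other.key

    def lt(self, other) -> bool:
        return self.key < other.key
```

Because Ord inherits from Eq in this model, a type cannot provide lt without also providing eq, which mirrors the supertrait requirement that Hashable would gain.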

Builtin signatures

fn drop[T]         T -> None
fn dup[T]          T -> T T
fn swap[T1 T2]     T1 T2 -> T2 T1
fn over[T1 T2]     T1 T2 -> T1 T2 T1
fn rot[T1 T2 T3]   T1 T2 T3 -> T2 T3 T1

fn ==[T: Eq]       T T -> bool
fn !=[T: Eq]       T T -> bool
fn <[T: Ord]       T T -> bool
fn <=[T: Ord]      T T -> bool
fn >[T: Ord]       T T -> bool
fn >=[T: Ord]      T T -> bool

fn print[T: Display]   T -> None
fn typeof[T]           T -> str

fn store8[T: Word]     T ptr -> None
fn store16[T: Word]    T ptr -> None
fn store32[T: Word]    T ptr -> None
fn store64[T: Word]    T ptr -> None

fn syscall1[A1: Word]                                          int A1 -> int
fn syscall2[A1: Word A2: Word]                                 int A1 A2 -> int
fn syscall3[A1: Word A2: Word A3: Word]                        int A1 A2 A3 -> int
fn syscall4[A1: Word A2: Word A3: Word A4: Word]               int A1 A2 A3 A4 -> int
fn syscall5[A1: Word A2: Word A3: Word A4: Word A5: Word]      int A1 A2 A3 A4 A5 -> int
fn syscall6[A1: Word A2: Word A3: Word A4: Word A5: Word A6: Word]
                                                               int A1 A2 A3 A4 A5 A6 -> int
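
One way to read these signatures is as a small checking procedure: bind each type parameter to the argument types found on the stack, then verify the declared bounds. The Python model below is a sketch under assumptions. IMPLS, SIGS, CheckError, and check_call are hypothetical names, the Point type is invented for the example, and none of this mirrors the Casa compiler's actual data structures.

```python
# Traits implemented by each concrete type (illustrative table only).
IMPLS = {
    "int": {"Eq", "Ord", "Display", "Word"},
    "bool": {"Eq", "Display", "Word"},
    "Point": {"Word"},  # hypothetical struct ref with no Eq/Display impl
}

# name -> (type params with optional bound, input types, output types)
SIGS = {
    "swap": ({"T1": None, "T2": None}, ["T1", "T2"], ["T2", "T1"]),
    "==": ({"T": "Eq"}, ["T", "T"], ["bool"]),
    "print": ({"T": "Display"}, ["T"], []),
}


class CheckError(Exception):
    pass


def check_call(name, stack):
    """Type-check one builtin call and return the resulting type stack."""
    params, inputs, outputs = SIGS[name]
    split = len(stack) - len(inputs)
    if split < 0:
        raise CheckError(f"{name}: stack underflow")
    binding = {}
    for formal, actual in zip(inputs, stack[split:]):
        if formal in params:
            # First use binds the parameter; later uses must agree.
            if binding.setdefault(formal, actual) != actual:
                raise CheckError(f"{name}: {formal} is {binding[formal]}, got {actual}")
        elif formal != actual:
            raise CheckError(f"{name}: expected {formal}, got {actual}")
    for param, bound in params.items():
        if bound is not None and bound not in IMPLS.get(binding[param], set()):
            raise CheckError(f"{name}: {binding[param]} does not implement {bound}")
    return stack[:split] + [binding.get(t, t) for t in outputs]
```

In this model, check_call("print", ["Point"]) raises CheckError rather than falling through to an integer print, which is the behaviour the trait bounds are meant to guarantee.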

Compiler-internal changes

  • Remove the ANY_TYPE constant. Add a TYPE_UNKNOWN sentinel string used only by stack-underflow recovery and unbound-type-variable propagation. Never appears in error messages or signatures.
  • types_match and unify_type are updated to treat TYPE_UNKNOWN as a silent pass-through (as the old ANY_TYPE was) and to reject the literal string "any" outright.
  • stack_pop underflow continues to either emit TYPE_UNKNOWN or auto-promote to a fresh inferred type variable on inferred-param call paths (existing behaviour, just a renamed sentinel).
  • parse_type rejects the any keyword with a hard error: "type any does not exist; use a concrete type or a generic parameter."
  • get_op_type_cast rejects (any) at parse time with the same error.
  • is_builtin_type no longer lists "any".
  • Method-dispatch fallback for ANY_TYPE receivers (current find_method_by_name last-resort branch) is removed. A receiver with unknown type is now a real type error.
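
The intended split between the internal sentinel and the removed keyword can be illustrated with a short Python model. This is a sketch, not the compiler's code. The function names types_match and parse_type come from this PRD, but their parameters, the spelling of the sentinel, and the use of ValueError are assumptions.

```python
# Assumed spelling; the PRD only says the sentinel is never surface-visible.
TYPE_UNKNOWN = "<unknown>"


def types_match(expected, actual):
    """Return True when two type names are compatible."""
    # The recovery sentinel passes through silently, like the old wildcard.
    if expected == TYPE_UNKNOWN or actual == TYPE_UNKNOWN:
        return True
    # No wildcard remains: "any" is an ordinary, non-matching string.
    return expected == actual


def parse_type(name, known_types):
    """Resolve a surface type name, rejecting the removed keyword."""
    if name == "any":
        raise ValueError(
            "type any does not exist; use a concrete type or a generic parameter."
        )
    if name not in known_types:
        raise ValueError(f"unknown type {name}")
    return name
```

Since users cannot spell the sentinel, the only way it can reach types_match in this model is through error recovery or an unbound type variable.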

print for user Display types

Primitive Display impls (int, bool, char, str, cstr) continue to dispatch through the existing specialized PrintInt/PrintStr/PrintBool/PrintChar/PrintCstr instructions in check_io. For user types implementing Display, the compiler lowers value print to value to_str followed by the existing PrintStr instruction. No new bytecode instruction is introduced.
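
The lowering rule can be sketched as a small function. The following Python model is illustrative only. The Print* instruction names are taken from this PRD, while lower_print, the "Call T::to_str" pseudo-instruction, and the display_impls set are hypothetical.

```python
# Primitive types keep their specialized print instructions.
PRIMITIVE_PRINT = {
    "int": "PrintInt",
    "str": "PrintStr",
    "bool": "PrintBool",
    "char": "PrintChar",
    "cstr": "PrintCstr",
}


def lower_print(value_type, display_impls):
    """Return the instruction sequence emitted for `value print`."""
    if value_type in PRIMITIVE_PRINT:
        return [PRIMITIVE_PRINT[value_type]]
    if value_type in display_impls:
        # User Display type: call to_str, then reuse PrintStr.
        return [f"Call {value_type}::to_str", "PrintStr"]
    raise TypeError(f"{value_type} does not implement Display")
```

For a hypothetical Point type with a Display impl, lower_print("Point", {"Point"}) yields a to_str call followed by PrintStr; without the impl it raises instead of emitting PrintInt.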

Empty [] literal

The literal pushes array[TYPE_UNKNOWN] onto the type stack. The unifier resolves it lazily on the first push, assignment, or constraint. If it is still unresolved when the inference frame closes, the compiler emits "cannot infer element type of empty array; add a type annotation."
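
A minimal Python model of this lazy resolution follows. It is a sketch under assumptions: ArrayType, InferenceFrame, and InferenceError are hypothetical names, and only the push constraint is modelled. The error text is the one quoted in this section.

```python
class InferenceError(Exception):
    pass


class ArrayType:
    """Array type whose element type may still be unresolved (None)."""

    def __init__(self):
        self.elem = None


class InferenceFrame:
    def __init__(self):
        self.pending = []

    def empty_array(self):
        # An empty [] literal: element type is unknown for now.
        arr = ArrayType()
        self.pending.append(arr)
        return arr

    def push(self, arr, elem_type):
        # The first constraint fixes the element type; later ones must agree.
        if arr.elem is None:
            arr.elem = elem_type
        elif arr.elem != elem_type:
            raise InferenceError(f"expected {arr.elem}, got {elem_type}")

    def close(self):
        # Anything still unresolved at frame close is an error.
        for arr in self.pending:
            if arr.elem is None:
                raise InferenceError(
                    "cannot infer element type of empty array; add a type annotation."
                )
```

The key design point is that the error is deferred to close, so a literal that is constrained anywhere within the frame never produces a diagnostic.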

Migration of (any) cast use sites

The only documented user-facing pattern is 42 (any) = x:int x (test_type_annotations.casa, STYLE.md). The cast is a no-op when an explicit annotation is present at the binding, so migration deletes the cast: 42 = x:int x. Tests and docs are updated accordingly.

Out-of-scope adjacents

  • Auto-derive for Eq, Ord, Display is a separate PRD/issue, layered on the ongoing feat/derive-enum-hashable work.
  • A proper lexicographic Ord impl for str ships as part of this PRD only if the trait bound on < is needed for str comparison; otherwise it can be deferred to a follow-up. (Marked as a sub-decision for the implementation issues.)

Testing Decisions

Good tests assert end-to-end externally observable behaviour: a Casa source program compiles or fails to compile with the expected error, runs and prints the expected output, or self-hosts byte-identically. Tests do not depend on the internal name TYPE_UNKNOWN or the shape of trait dispatch tables.

Modules that need updated or new tests:

  • tests/compiler/test_type_annotations.casa — migrate (any) cases to plain annotations; add a negative test that (any) is now a parse error.
  • tests/compiler/test_typeof.casa — update for the new [T] T -> str signature (still destructive).
  • tests/compiler/test_typechecker.casa and tests/compiler/test_common.casa — remove any matches int / int matches any style tests; add tests that the any keyword is rejected and that bare == on a non-Eq type is a real error.
  • tests/test_compiler.sh and tests/test_examples.sh — must remain green throughout the migration. Run tests/test_compiler.sh < /dev/null to avoid the LSP-test stdin hang.
  • Self-host byte-identity — add (or verify in CI) a check that ./casac compiler/main.casa -o new && ./new compiler/main.casa -o new2 && cmp new new2 succeeds. This is the strongest guarantee that the compiler still self-hosts after any removal.
  • Negative trait-bound tests — for each new trait (Eq, Ord, Display, Word), add a test that compiling code which violates the bound fails with a readable error.

Prior art for these tests lives in tests/compiler/ (pattern-matching style: input casa source plus expected stdout/stderr/exit-code) and tests/test_examples.sh (golden-output regeneration when examples change).
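
The self-host byte-identity check could be wrapped in a small script for CI. The Python sketch below is one possible shape, not an existing project script. The command lines are the ones quoted in this PRD; the function names and the choice of Python are assumptions.

```python
import filecmp
import subprocess


def files_identical(path_a, path_b):
    """Compare two files byte for byte (not just by size and mtime)."""
    return filecmp.cmp(path_a, path_b, shallow=False)


def self_host_is_stable(stage0="./casac", source="compiler/main.casa"):
    """Build the compiler twice and report whether the outputs match."""
    # Stage 1: the existing compiler builds the new compiler.
    subprocess.run([stage0, source, "-o", "new"], check=True)
    # Stage 2: the new compiler builds itself.
    subprocess.run(["./new", source, "-o", "new2"], check=True)
    return files_identical("new", "new2")
```

Passing shallow=False matters here: the default shallow comparison may treat files with matching size and modification time as equal without reading their contents.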

Out of Scope

  • Auto-derive (#[derive(Eq)], #[derive(Ord)], #[derive(Display)]) — separate follow-up PRD layered on the existing derive infrastructure on feat/derive-enum-hashable.
  • Higher-kinded types, variance annotations, dyn-trait objects.
  • Lexicographic Ord for str — may be deferred to a sub-issue if not required for this PRD's compile-green target.
  • Migration of any third-party Casa code — none tracked.
  • A new PrintDisplay bytecode instruction (explicitly avoided: print on a user Display type is lowered at compile time to to_str followed by PrintStr).
  • Changes to the function-pointer / closure type system.

Further Notes

  • Casa is self-hosted. Every change must build the compiler with itself. Migration is sequenced as small commits (one piece of functionality per commit, per project convention) and the compiler must remain green after each commit.
  • The PRD will be broken into independently-grabbable issues via /prd-to-issues, biased toward tracer-bullet vertical slices: e.g. "introduce TYPE_UNKNOWN sentinel as alias of ANY_TYPE", "add Eq/Ord/Display traits + primitive impls", "migrate stack ops to surface generics", "migrate print to T: Display", "migrate ==/< to Eq/Ord", "migrate store/syscall to Word", "retype typeof", "reject (any) cast at parse", "delete ANY_TYPE / any keyword", "update LSP and docs".
  • After each commit: run ./casafmt on all changed .casa files, run tests/test_compiler.sh < /dev/null and tests/test_examples.sh, and regenerate expected outputs for any example whose printed output changes.
  • Project rules: sign commits, no Claude/AI references in commit/PR text, never modify main directly, always use feature-dev:code-reviewer before opening a PR.
