
feat: improve static analysis — type inference, framework awareness, func refs#56

Open
Zarvhos wants to merge 8 commits into harshkedia177:main from Zarvhos:feat/improve-static-analysis-55


Zarvhos commented Mar 12, 2026

Summary

Improves Axon's static analysis pipeline across four dimensions, addressing #55:

  • Type inference for receiver resolution: builds per-file {variable: class} mappings from annotations (user: User), parameter types (def f(user: User)), and constructor calls (user = User()). Resolves user.save() to User.save instead of dropping unresolved receiver calls.
  • First-class function references: detects handler = my_func and JSX onClick={handleClick} patterns, creating CALLS edges with a 0.7 confidence discount.
  • Module-level call attribution: calls outside any function body (e.g. app = FastAPI()) are now attributed to the File node instead of being silently dropped.
  • Framework-aware entry points & dead code: expanded detection for FastAPI verbs (@app.get/post/put/delete/patch), Next.js data fetching (getServerSideProps, generateStaticParams, etc.), Alembic migrations (upgrade/downgrade), and Pydantic decorators (computed_field, model_serializer, field_serializer).

Also extracts shared path utilities (is_alembic_migration, is_test_file) into a dedicated path_utils module to eliminate duplication.

Benchmark results (4 real-world repos)

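| Repo | Branch | Files | Syms | Calls | Entry | Dead | FRef |
|---|---|---:|---:|---:|---:|---:|---:|
| FastAPI Template | main | 131 | 470 | 649 | 222 | 84 | |
| FastAPI Template | feat | 131 | 470 | 998 | 145 | 34 | 64 |
| Bulletproof React | main | 447 | 493 | 359 | 241 | 81 | |
| Bulletproof React | feat | 447 | 493 | 879 | 121 | 14 | 282 |
| AI Hedge Fund | main | 42 | 253 | 264 | 3 | 78 | |
| AI Hedge Fund | feat | 42 | 253 | 294 | 2 | 63 | 7 |
| Axon | main | 137 | 1630 | 2754 | 729 | 46 | |
| Axon | feat | 137 | 1698 | 3228 | 740 | 9 | 131 |
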
  • Calls edges: +11% to +145% more resolved, led by FastAPI Template (+54%) and Bulletproof React (+145%) (type inference + func refs + module-level)
  • Entry points: better calibrated (fewer false positives from "no incoming calls" heuristic)
  • Dead code: -19% to -83% flagged symbols (-60% to -83% on three of the four repos), from fewer false positives (framework decorators + Alembic + EXTENDS check)
  • No regressions: file counts unchanged, symbol counts stable (only Axon's differs, 1630 vs. 1698), performance unchanged
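
The deltas can be recomputed from the benchmark table as relative change, (feat - main) / main. The sketch below copies the Calls and Dead counts from the table; the helper name `pct_change` is made up for this illustration.

```python
# Counts copied from the benchmark table as (main, feat) pairs.
calls = {
    "FastAPI Template": (649, 998),
    "Bulletproof React": (359, 879),
    "AI Hedge Fund": (264, 294),
    "Axon": (2754, 3228),
}
dead = {
    "FastAPI Template": (84, 34),
    "Bulletproof React": (81, 14),
    "AI Hedge Fund": (78, 63),
    "Axon": (46, 9),
}


def pct_change(main: int, feat: int) -> float:
    """Relative change from main to feat, in percent."""
    return (feat - main) / main * 100


# Round to whole percentages, as the summary bullets do.
calls_delta = {repo: round(pct_change(m, f)) for repo, (m, f) in calls.items()}
dead_delta = {repo: round(pct_change(m, f)) for repo, (m, f) in dead.items()}

for repo in calls:
    print(f"{repo}: calls {calls_delta[repo]:+d}%, dead {dead_delta[repo]:+d}%")
```

The two web repos show the largest gains in resolved calls, while AI Hedge Fund moves the least on both metrics.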

Changes by commit

  1. feat: add FuncRef, assignment_target, and variable_name to parser models
  2. feat: extract func refs, assignment targets, and variable names in Python parser
  3. feat: extract JSX calls, func refs, and assignment targets in TypeScript parser
  4. feat: add type inference, module-level calls, and func ref resolution
  5. feat: add FastAPI, Next.js, and Alembic entry point detection
  6. feat: reduce dead code false positives with new exemptions
  7. test: add comprehensive tests for static analysis improvements
  8. refactor: extract shared path utilities into path_utils module

Test plan

  • All 738 tests pass (including ~30 new tests)
  • Benchmarked on 4 real-world repos (FastAPI, React, Python ML, Axon itself)
  • No regressions in symbol counts, file counts, or performance
  • Ruff lint and format clean

Zarvhos force-pushed the feat/improve-static-analysis-55 branch from 4039ed3 to 230cb7e on March 12, 2026 16:10
Zarvhos added 8 commits March 12, 2026 17:11
feat: add FuncRef, assignment_target, and variable_name to parser models

Extend the parser data model to support three new static analysis capabilities:

- FuncRef dataclass: represents first-class function references like
  `handler = my_func`, enabling call graph edges for indirect references.
- CallInfo.assignment_target: captures the variable name when a call result
  is assigned (e.g. `user` in `user = User()`), enabling type inference
  for receiver method resolution.
- TypeRef.variable_name: captures the annotated variable name (e.g. `config`
  in `config: AppConfig`), enabling type-aware method resolution.

Closes part of harshkedia177#55.
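
As a rough illustration of the model changes in this commit, the three additions might look like the dataclasses below. Only `FuncRef`, `CallInfo.assignment_target`, and `TypeRef.variable_name` come from the commit message; every other field name is a placeholder assumed for the sketch, not Axon's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FuncRef:
    """A first-class function reference such as `handler = my_func`."""
    name: str    # referenced function, e.g. "my_func" (assumed field)
    target: str  # variable it is bound to, e.g. "handler" (assumed field)


@dataclass
class CallInfo:
    """A call site; `assignment_target` is the field this commit adds."""
    name: str                                # callee, e.g. "User" (assumed field)
    assignment_target: Optional[str] = None  # "user" in `user = User()`


@dataclass
class TypeRef:
    """A type annotation; `variable_name` is the field this commit adds."""
    type_name: str                       # e.g. "AppConfig" (assumed field)
    variable_name: Optional[str] = None  # "config" in `config: AppConfig`


ref = FuncRef(name="my_func", target="handler")
call = CallInfo(name="User", assignment_target="user")
annotation = TypeRef(type_name="AppConfig", variable_name="config")
```

Defaulting the new fields to `None` would keep existing parser call sites valid, which is one plausible reason to add them as optional attributes.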
…thon parser

Enhance the Python tree-sitter parser with three new extraction capabilities:

- _try_extract_func_ref(): detects `handler = my_func` patterns and emits
  FuncRef entries, filtering out constants (ALL_CAPS) and private names (_...).
- _find_assignment_target(): extracts the variable name from `x = Foo()`,
  `x: T = Foo()`, and `x = await Foo()` assignment patterns.
- Variable annotation extraction now captures the variable name (e.g.
  `config` from `config: AppConfig = ...`) into TypeRef.variable_name.

These enable downstream type inference and first-class function tracking.
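
The func ref filter described above can be approximated with the standard-library `ast` module. This is a swapped-in technique for illustration: the PR uses tree-sitter, and `find_func_refs` is a hypothetical name. The sketch also assumes the ALL_CAPS and underscore filters apply to the referenced name, which the commit message does not state explicitly.

```python
import ast
from typing import List, Tuple


def find_func_refs(source: str) -> List[Tuple[str, str]]:
    """Return (target, referenced_name) pairs for `target = name` assignments."""
    refs = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign):
            continue
        # Only a bare identifier on the right-hand side counts;
        # calls, attribute accesses, and literals are skipped.
        if not isinstance(node.value, ast.Name):
            continue
        name = node.value.id
        # Filter constants (ALL_CAPS) and private names (_...).
        if name.isupper() or name.startswith("_"):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name):
                refs.append((target.id, name))
    return refs


sample = "handler = my_func\nlimit = MAX_SIZE\ncb = _private\nuser = User()\n"
func_refs = find_func_refs(sample)
```

A purely syntactic check like this cannot tell whether `my_func` is actually a function, so a resolver would still need to match the name against known symbols before creating an edge.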
…ipt parser

Enhance the TypeScript tree-sitter parser with five new capabilities:

- JSX component calls: `<MyComponent />` and `<Ns.Component />` emit
  CallInfo entries (PascalCase only, skipping HTML intrinsics).
- JSX callback props: `onClick={handleClick}` emits FuncRef entries
  for bare identifier props.
- First-class function refs: `const handler = myFunc` in variable
  declarations emits FuncRef entries.
- Assignment targets: `const x = Foo()`, `x = Foo()`, and
  `const x = await Foo()` capture the variable name for type inference.
- Variable type annotations now capture variable_name into TypeRef.

Also adds `export default SomeName` handling for named default exports.
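
The two JSX rules above (PascalCase tags become calls, bare identifier props become func refs) reduce to small predicates. The sketch below is written in Python and operates on plain strings rather than tree-sitter nodes; both function names are hypothetical, and `str.isidentifier()` only approximates JavaScript identifier rules (it rejects `$`, for example).

```python
from typing import Optional


def is_component_tag(tag: str) -> bool:
    """True for `MyComponent` or `Ns.Component`, False for intrinsics like `div`."""
    # For namespaced tags, inspect the last segment only.
    last_segment = tag.rsplit(".", 1)[-1]
    return last_segment[:1].isupper()


def callback_func_ref(prop_value: str) -> Optional[str]:
    """Return the referenced name when a JSX prop value is a bare identifier."""
    value = prop_value.strip()
    # Arrow functions, calls, and member expressions are not bare identifiers.
    return value if value.isidentifier() else None
```

For example, `callback_func_ref("handleClick")` yields a name to resolve, while `callback_func_ref("() => save()")` yields nothing, matching the "bare identifier props" restriction.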
feat: add type inference, module-level calls, and func ref resolution

Enhance the call resolution pipeline with three major capabilities:

- Type inference: _build_type_table() constructs per-file {var: class}
  mappings from annotations, parameter types, and constructor calls.
  _resolve_receiver_method() uses this table to resolve `user.save()`
  to `User.save` when `user: User` or `user = User()` is known.

- Module-level calls: calls outside any function/method body are now
  attributed to the File node instead of being silently dropped. This
  captures top-level initialization patterns like `app = FastAPI()`.

- First-class function references: FuncRef entries (e.g. `handler = my_func`)
  create CALLS edges with a 0.7 confidence discount, connecting indirect
  references into the call graph.
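
A minimal sketch of the type table and receiver lookup, assuming the parser has already produced (variable, type) pairs. The function names mirror `_build_type_table()` and `_resolve_receiver_method()` from the commit message, but the signatures, the "annotation wins over constructor" rule, and the treatment of 0.7 as a multiplier are assumptions.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

FUNC_REF_DISCOUNT = 0.7  # value from the PR; using it as a multiplier is an assumption


def build_type_table(
    annotations: Iterable[Tuple[str, str]],
    constructor_calls: Iterable[Tuple[str, str]],
    known_classes: Set[str],
) -> Dict[str, str]:
    """Map variable names to class names for a single file."""
    table: Dict[str, str] = {}
    # Annotations and parameter types: `user: User`, `def f(user: User)`.
    for var, type_name in annotations:
        if type_name in known_classes:
            table[var] = type_name
    # Constructor calls: `user = User()`. Assumed: an annotation takes priority.
    for var, callee in constructor_calls:
        if callee in known_classes:
            table.setdefault(var, callee)
    return table


def resolve_receiver_method(
    receiver: str, method: str, table: Dict[str, str], methods: Set[str]
) -> Optional[str]:
    """Resolve `receiver.method()` to `Class.method`, or None without type info."""
    cls = table.get(receiver)
    if cls is None:
        return None
    qualified = f"{cls}.{method}"
    return qualified if qualified in methods else None


def func_ref_confidence(base: float = 1.0) -> float:
    """Confidence for a CALLS edge created from a function reference."""
    return base * FUNC_REF_DISCOUNT


table = build_type_table(
    annotations=[("user", "User")],
    constructor_calls=[("repo", "Repo"), ("thing", "make_thing")],
    known_classes={"User", "Repo"},
)
resolved = resolve_receiver_method("user", "save", table, {"User.save"})
```

Returning `None` for unknown receivers leaves room for the fallback behavior the tests mention, rather than guessing a class.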
feat: add FastAPI, Next.js, and Alembic entry point detection

Expand framework-aware entry point detection to cover:

- FastAPI HTTP verb decorators: @app.get, @app.post, @app.put,
  @app.delete, @app.patch (previously only @app.route was matched).
- Next.js data fetching functions: getServerSideProps, getStaticProps,
  getStaticPaths, generateStaticParams, generateMetadata.
- Alembic migrations: upgrade() and downgrade() in files under
  versions/ or migrations/ directories.

This reduces false positive entry points by correctly classifying
framework-registered functions rather than relying solely on the
"no incoming calls" heuristic.
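
The three rules above can be combined into one predicate. This is a sketch under stated assumptions: decorators are given as source strings, only the `app` receiver named in the commit message is matched, and `is_framework_entry_point` and `in_migration_dir` are hypothetical names.

```python
import re
from typing import Sequence

# FastAPI verbs from the commit message, plus the previously supported route.
_FASTAPI_DECORATOR = re.compile(r"^@?app\.(get|post|put|delete|patch|route)\b")

NEXTJS_DATA_FUNCS = frozenset({
    "getServerSideProps",
    "getStaticProps",
    "getStaticPaths",
    "generateStaticParams",
    "generateMetadata",
})

ALEMBIC_FUNCS = frozenset({"upgrade", "downgrade"})


def in_migration_dir(path: str) -> bool:
    """True when any parent directory is named versions or migrations."""
    dirs = path.replace("\\", "/").split("/")[:-1]
    return "versions" in dirs or "migrations" in dirs


def is_framework_entry_point(name: str, decorators: Sequence[str], path: str) -> bool:
    """Classify a function as a framework-registered entry point."""
    if any(_FASTAPI_DECORATOR.match(d) for d in decorators):
        return True
    if name in NEXTJS_DATA_FUNCS:
        return True
    # upgrade()/downgrade() only count inside migration directories.
    return name in ALEMBIC_FUNCS and in_migration_dir(path)
```

Tying the Alembic rule to the file path is what keeps an ordinary `upgrade()` helper elsewhere in the codebase from being treated as an entry point.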
feat: reduce dead code false positives with new exemptions

Improve dead code detection accuracy with several new exemptions:

- Alembic migrations: upgrade() and downgrade() in migration files
  are never flagged as dead code.
- Pydantic decorators: computed_field, model_serializer, field_serializer
  added to framework decorator allowlist.
- EXTENDS relationship: classes with incoming EXTENDS edges are no longer
  flagged (base classes used via inheritance are alive).

Also aligns _is_test_file() path splitting with processes.py by using
string split instead of PurePosixPath for consistent cross-platform
behavior across the codebase.
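
The exemptions above amount to a few early returns before a symbol is flagged. In this sketch the `Symbol` shape, the edge-kind strings, and the function name `is_dead` are all assumptions made for illustration; only the decorator names, the Alembic rule, and the EXTENDS rule come from the commit message.

```python
from dataclasses import dataclass, field
from typing import List, Set

EXEMPT_DECORATORS = {"computed_field", "model_serializer", "field_serializer"}


@dataclass
class Symbol:
    name: str
    kind: str  # "function" or "class" (assumed values)
    path: str
    decorators: List[str] = field(default_factory=list)
    incoming: Set[str] = field(default_factory=set)  # kinds of incoming edges


def _decorator_name(decorator: str) -> str:
    """Reduce `@pkg.name(args)` to `name`."""
    return decorator.lstrip("@").split("(")[0].split(".")[-1]


def is_dead(sym: Symbol) -> bool:
    """Flag a symbol as dead unless one of the exemptions applies."""
    if "CALLS" in sym.incoming:
        return False
    # Base classes that are only inherited from are still alive.
    if sym.kind == "class" and "EXTENDS" in sym.incoming:
        return False
    if any(_decorator_name(d) in EXEMPT_DECORATORS for d in sym.decorators):
        return False
    # Alembic invokes upgrade()/downgrade() in migration files itself.
    dirs = sym.path.replace("\\", "/").split("/")[:-1]
    if sym.name in {"upgrade", "downgrade"} and (
        "versions" in dirs or "migrations" in dirs
    ):
        return False
    return True
```

Each check only ever removes a symbol from the dead set, so adding an exemption cannot introduce new dead-code reports.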
test: add comprehensive tests for static analysis improvements

Cover all new capabilities with focused test cases:

Parser tests:
- Python: assignment target extraction (sync/await), variable annotation
  names, func ref detection (with constant/private filtering)
- TypeScript: assignment targets, variable annotation names, func refs,
  JSX component calls, JSX callback props as func refs

Call resolution tests:
- Module-level calls attributed to File nodes
- Type inference via constructor (user = User()), annotation (user: User),
  and parameter (def f(user: User)) patterns
- Fallback behavior when no type info available
- Func ref edge creation with 0.7 confidence discount
- Blocklisted func refs create no edges

Dead code tests:
- Alembic upgrade/downgrade exemption in migration paths
- Non-migration upgrade() still flagged as dead
- Pydantic computed_field decorator exemption

Entry point tests:
- Alembic migration entry points
- FastAPI verb decorators (@app.get, @app.post)
- Next.js data fetching functions (getServerSideProps, generateStaticParams)
- Non-migration upgrade() not treated as entry point
refactor: extract shared path utilities into path_utils module

Deduplicate is_alembic_migration() which was identically defined in
both dead_code.py and processes.py. Also move is_test_file() from
dead_code.py to the shared module since it was already imported
externally by mcp/tools.py with a private-name code smell.

Both functions now live in path_utils.py with public names, reducing
the risk of silent divergence if one copy is updated without the other.
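
A shared module along these lines might look as follows. The function names and the string-split approach come from the commit messages; the exact test-file conventions checked in `is_test_file` are assumptions, since the PR does not list them.

```python
from typing import List


def _split(path: str) -> List[str]:
    """Split on both separators with plain string ops, not PurePosixPath."""
    return path.replace("\\", "/").split("/")


def is_alembic_migration(path: str) -> bool:
    """True for files under a versions/ or migrations/ directory."""
    dirs = _split(path)[:-1]
    return "versions" in dirs or "migrations" in dirs


def is_test_file(path: str) -> bool:
    """True for common test layouts (the conventions here are assumed)."""
    parts = _split(path)
    filename = parts[-1]
    if any(d in ("tests", "__tests__") for d in parts[:-1]):
        return True
    return (
        filename.startswith("test_")
        or filename.endswith("_test.py")
        or ".test." in filename
        or ".spec." in filename
    )
```

With a single definition, callers such as the dead code pass, the entry point pass, and `mcp/tools.py` would import the same public functions instead of maintaining their own copies.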
Zarvhos force-pushed the feat/improve-static-analysis-55 branch from 230cb7e to e1df828 on March 12, 2026 16:11