feat(parsers): accelerate new language support using ast-grep patterns #414

@vitali87

Summary

Use ast-grep's pattern syntax to simplify adding new language support to CGR. A generic handler driven by per-language YAML pattern configs would replace hand-written tree-sitter traversal code and reduce the implementation effort for each language.

Motivation

CGR fully supports 7 languages, has 4 more "in development" (Go, Scala, C#, PHP), and lacks 20+ languages that ast-grep supports (Ruby, Kotlin, Swift, Bash, C, etc.). Currently, each language needs a LanguageSpec, a LanguageHandler, and an FQNSpec, along with hundreds of lines of tree-sitter traversal code.

ast-grep patterns like def $FUNC($$$ARGS): $$$BODY can replace complex tree-sitter query and traversal logic for basic extraction. A "basic language support" tier built on such patterns could cover function, class, and import extraction without a full handler.

Implementation

  • codebase_rag/parsers/handlers/ast_grep_handler.py (~200 lines) implementing the LanguageHandler protocol with ast-grep patterns instead of tree-sitter traversal
  • Language pattern configs in YAML (one per language) defining function/class/import patterns
  • Register new handler as fallback in registry.py for languages without specialized handlers
  • Use this approach to finish Go, C#, PHP, Scala support
  • Provide a template for community contributors to add new languages easily
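
The config-plus-fallback idea in the list above could be sketched roughly as follows. This is only an illustration under stated assumptions: the config schema (a `language` key plus a `patterns` mapping), the names `PatternConfig`, `HandlerRegistry`, `register_config`, and `get_handler`, and the Go patterns themselves are hypothetical and have not been checked against CGR's registry.py or against ast-grep. The YAML file is assumed to be parsed elsewhere into a plain mapping, and the actual ast-grep matching is left out.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Mapping

# Hypothetical schema: each entity kind maps to a list of ast-grep patterns.
REQUIRED_KINDS = ("function", "class", "import")


@dataclass(frozen=True)
class PatternConfig:
    """Patterns for one language, built from an already-parsed YAML mapping."""

    language: str
    patterns: Mapping[str, tuple[str, ...]]

    @classmethod
    def from_mapping(cls, raw: Mapping[str, Any]) -> "PatternConfig":
        """Validate the mapping and reject configs with missing entity kinds."""
        language = raw.get("language")
        if not isinstance(language, str) or not language:
            raise ValueError("config needs a non-empty 'language' string")
        raw_patterns = raw.get("patterns")
        if not isinstance(raw_patterns, Mapping):
            raise ValueError(f"{language}: config needs a 'patterns' mapping")
        patterns: dict[str, tuple[str, ...]] = {}
        for kind in REQUIRED_KINDS:
            values = raw_patterns.get(kind)
            if not isinstance(values, list) or not values:
                raise ValueError(f"{language}: no patterns for {kind!r}")
            if not all(isinstance(v, str) and v.strip() for v in values):
                raise ValueError(f"{language}: {kind!r} patterns must be non-empty strings")
            patterns[kind] = tuple(values)
        return cls(language=language, patterns=patterns)


class AstGrepHandler:
    """Generic handler stub; pattern matching itself is omitted in this sketch."""

    def __init__(self, config: PatternConfig) -> None:
        self.config = config


class HandlerRegistry:
    """Prefer a specialized handler and fall back to the pattern-driven one."""

    def __init__(self) -> None:
        self._specialized: dict[str, object] = {}
        self._configs: dict[str, PatternConfig] = {}

    def register_specialized(self, language: str, handler: object) -> None:
        self._specialized[language] = handler

    def register_config(self, config: PatternConfig) -> None:
        self._configs[config.language] = config

    def get_handler(self, language: str) -> object:
        if language in self._specialized:
            return self._specialized[language]
        if language in self._configs:
            return AstGrepHandler(self._configs[language])
        raise KeyError(f"no handler or pattern config for {language!r}")


# Illustrative Go config; the pattern strings are untested placeholders.
go_config = PatternConfig.from_mapping(
    {
        "language": "go",
        "patterns": {
            "function": ["func $NAME($$$PARAMS) { $$$BODY }"],
            "class": ["type $NAME struct { $$$FIELDS }"],
            "import": ["import $PATH"],
        },
    }
)
```

Validating the config at load time means a contributor's incomplete YAML file fails with a clear message instead of silently extracting nothing, and keeping the fallback lookup after the specialized one leaves the existing hand-written handlers untouched.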

Acceptance Criteria

  • Generic AstGrepHandler implementing LanguageHandler protocol
  • YAML-based pattern config format documented and validated
  • At least one language (Go or C#) fully supported via this approach
  • Handler registered as fallback in parser registry
  • Extracted entities (functions, classes, imports) match the quality of hand-written handlers for basic use cases
  • Template/guide for adding new language support via YAML config
  • Unit tests comparing ast-grep handler output vs existing handlers for supported languages
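
One way the comparison tests in the last criterion might be structured is to reduce each handler's output to a set of (kind, name) entities and diff the two sets. The `Entity` type and `diff_entities` helper below are hypothetical and do not reflect CGR's actual data model; the entity lists are hand-written stand-ins for real handler output.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True, order=True)
class Entity:
    """Minimal stand-in for an extracted entity (hypothetical shape)."""

    kind: str  # "function", "class", or "import"
    name: str


def diff_entities(
    reference: Iterable[Entity], candidate: Iterable[Entity]
) -> dict[str, list[Entity]]:
    """Report entities the candidate handler missed or added versus the reference."""
    ref, cand = set(reference), set(candidate)
    return {
        "missing": sorted(ref - cand),  # found by the hand-written handler only
        "extra": sorted(cand - ref),    # found by the ast-grep handler only
    }


# Hand-written stand-ins for the output of the two handlers on the same file.
reference_output = [
    Entity("function", "parse"),
    Entity("class", "Parser"),
    Entity("import", "os"),
]
candidate_output = [
    Entity("function", "parse"),
    Entity("import", "os"),
    Entity("function", "helper"),
]

report = diff_entities(reference_output, candidate_output)
```

A test would then assert that both lists in the report are empty for each fixture file, so a failure names exactly which entities diverged.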

Related

Part of the ast-grep integration initiative:

Labels

enhancement (New feature or request), language support (Related to programming language support)
