feat(tree_path): add multi-language support with LanguageConfig refactor #2
Merged
Force-pushed from 550690e to 2a5fecb
Refactor tree_path.rs to use a data-driven LanguageConfig struct instead
of hardcoded Rust-specific logic. This enables adding new languages by
only adding a new LanguageConfig constant and Cargo feature.
Changes:
- Add LanguageConfig struct with ts_language, extensions, kind_map,
name_field, name_overrides, and body_fields
- Convert Rust-specific KIND_MAP, node_name(), and body resolution to
LanguageConfig methods
- Update all resolve/compute functions to take &LanguageConfig
- Use fn() -> TSLanguage for lazy language initialization
All existing tests pass. No behavioral changes.
Original prompt:
> Let's first work on refactoring and multi-language support. Please
> remember to commit frequently as you make changes. And you must respect
> the commit message convention. Work in a new branch and PR when ready.
AI-assisted-by: Kimi K2.5 (OpenClaw)
Signed-off-by: WANG Xuerui <git@xen0n.name>
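For readers following the refactor, here is a minimal sketch of what a data-driven config along these lines might look like. Only the field names come from the commit message. The field types, the `TSLanguage` stand-in (used so the sketch compiles without the tree-sitter crate), the two helper methods, and the `RUST_CONFIG` values are assumptions for illustration, not the code in tree_path.rs.

```rust
/// Stand-in for the tree-sitter language handle (assumption, so that the
/// sketch does not depend on the tree-sitter crate).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TSLanguage(pub &'static str);

/// Hypothetical layout; field names follow the commit message, types are guesses.
pub struct LanguageConfig {
    /// A function pointer, so the grammar is only initialized when called.
    pub ts_language: fn() -> TSLanguage,
    pub extensions: &'static [&'static str],
    /// (tree_path shorthand, grammar node kind) pairs.
    pub kind_map: &'static [(&'static str, &'static str)],
    /// Field that holds a node's name unless overridden below.
    pub name_field: &'static str,
    /// (node kind, field name) pairs for kinds named through another field.
    pub name_overrides: &'static [(&'static str, &'static str)],
    pub body_fields: &'static [&'static str],
}

impl LanguageConfig {
    /// Map a grammar node kind to its tree_path shorthand, if it has one.
    pub fn shorthand_for(&self, kind: &str) -> Option<&'static str> {
        self.kind_map
            .iter()
            .find(|(_, k)| *k == kind)
            .map(|(shorthand, _)| *shorthand)
    }

    /// Pick the field that carries the name for a given node kind.
    pub fn name_field_for(&self, kind: &str) -> &'static str {
        self.name_overrides
            .iter()
            .find(|(k, _)| *k == kind)
            .map(|(_, field)| *field)
            .unwrap_or(self.name_field)
    }
}

fn rust_language() -> TSLanguage {
    TSLanguage("rust")
}

/// Illustrative values only.
pub static RUST_CONFIG: LanguageConfig = LanguageConfig {
    ts_language: rust_language,
    extensions: &["rs"],
    kind_map: &[("fn", "function_item"), ("struct", "struct_item"), ("impl", "impl_item")],
    name_field: "name",
    name_overrides: &[("impl_item", "type")],
    body_fields: &["body"],
};
```

Under this shape, supporting another language would mean adding one more static of the same type, which is the property the commit message describes.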
Add tree_path support for Python via optional lang-python Cargo feature.
Changes:
- Add tree-sitter-python dependency (0.25.0) as optional feature
- Add PYTHON_CONFIG LanguageConfig with function_definition and
class_definition mappings
- Update Language enum with Python variant (cfg-gated)
- Update detect_language() to recognize .py and .pyi files
- Add comprehensive Python tests (function, class, method resolution)
Usage: cargo build --features lang-python
Original prompt:
> Let's first work on refactoring and multi-language support. Please
> remember to commit frequently as you make changes. And you must respect
> the commit message convention. Work in a new branch and PR when ready.
AI-assisted-by: Kimi K2.5 (OpenClaw)
Signed-off-by: WANG Xuerui <git@xen0n.name>
Add tree_path support for Go via optional lang-go Cargo feature.
Changes:
- Add tree-sitter-go dependency (0.25.0) as optional feature
- Add GO_CONFIG LanguageConfig with function_declaration and
method_declaration mappings
- Update Language enum with Go variant (cfg-gated)
- Update detect_language() to recognize .go files
- Add comprehensive Go tests (function, method resolution)
Note: struct and interface types are not yet supported due to Go's
nested type_declaration/type_spec AST structure.
Usage: cargo build --features lang-go
Original prompt:
> Let's first work on refactoring and multi-language support. Please
> remember to commit frequently as you make changes. And you must respect
> the commit message convention. Work in a new branch and PR when ready.
AI-assisted-by: Kimi K2.5 (OpenClaw)
Signed-off-by: WANG Xuerui <git@xen0n.name>
Add tree_path support for JavaScript and TypeScript via optional
Cargo features.
Changes:
- Add tree-sitter-javascript (0.25.0) and tree-sitter-typescript
(0.23.2) as optional dependencies
- Add JAVASCRIPT_CONFIG with function_declaration, class_declaration,
and method_definition mappings (.js, .mjs, .cjs, .jsx)
- Add TYPESCRIPT_CONFIG and TSX_CONFIG with additional interface,
type alias, and enum mappings (.ts, .mts, .cts, .tsx)
- Update Language enum with JavaScript, TypeScript, and Tsx variants
- Update detect_language() to recognize JS/TS file extensions
- Add comprehensive JS/TS tests for all node types
Features:
- lang-javascript: JavaScript support
- lang-typescript: TypeScript/TSX support (implies lang-javascript)
Usage: cargo build --features lang-javascript
cargo build --features lang-typescript
Original prompt:
> Let's first work on refactoring and multi-language support. Please
> remember to commit frequently as you make changes. And you must respect
> the commit message convention. Work in a new branch and PR when ready.
AI-assisted-by: Kimi K2.5 (OpenClaw)
Signed-off-by: WANG Xuerui <git@xen0n.name>
Make lang-python, lang-go, lang-javascript, and lang-typescript enabled
by default. Users can opt out with --no-default-features if needed.
Original prompt:
> make all lang features default-enabled
AI-assisted-by: Kimi K2.5 (OpenClaw)
Signed-off-by: WANG Xuerui <git@xen0n.name>
Fix clippy warning (collapsible_if) and run cargo fmt.
Reanchor sidecar specs after tree_path.rs changes.
Original prompt:
> did you forget to check clippy, cargo fmt, and sync 立意?
AI-assisted-by: Kimi K2.5 (OpenClaw)
Signed-off-by: WANG Xuerui <git@xen0n.name>
Remove #[cfg(feature = "...")] from Language enum variants to ensure API
stability. The enum variants are now always present, but languages
report whether they're supported via is_supported().
Changes:
- Remove #[cfg] gates from Language enum variants
- Add Language::is_supported() method for runtime feature checking
- Change Language::config() to return Option<&LanguageConfig>
- Change Language::ts_language() to return Option<TSLanguage>
- Update make_parser(), resolve_tree_path(), compute_tree_path() to
handle unsupported languages gracefully by returning None/empty
- Update detect_language() to only return supported languages
This ensures downstream code can match on all Language variants without
conditional compilation, while still gracefully handling unsupported
languages at runtime.
Original prompt:
> Can you adversarially review the PR branch changes?
AI-assisted-by: Kimi K2.5 (OpenClaw)
Signed-off-by: WANG Xuerui <git@xen0n.name>
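The pattern this commit describes (variants always present, support decided per build) can be sketched roughly as follows. The three-variant enum, the one-field `LanguageConfig`, and the `supported_extensions` helper are simplified assumptions. Only the names `Language`, `config()`, `is_supported()`, and the `lang-python` / `lang-go` features come from the commits.

```rust
/// Variants are never cfg-gated, so downstream `match` statements compile
/// the same way regardless of enabled features.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Language {
    Rust,
    Python,
    Go,
}

/// Reduced to one field for the sketch (assumption).
pub struct LanguageConfig {
    pub extensions: &'static [&'static str],
}

static RUST_CONFIG: LanguageConfig = LanguageConfig { extensions: &["rs"] };
#[cfg(feature = "lang-python")]
static PYTHON_CONFIG: LanguageConfig = LanguageConfig { extensions: &["py", "pyi"] };
#[cfg(feature = "lang-go")]
static GO_CONFIG: LanguageConfig = LanguageConfig { extensions: &["go"] };

impl Language {
    /// `None` means the grammar was not compiled into this build.
    pub fn config(self) -> Option<&'static LanguageConfig> {
        match self {
            Language::Rust => Some(&RUST_CONFIG),
            #[cfg(feature = "lang-python")]
            Language::Python => Some(&PYTHON_CONFIG),
            #[cfg(feature = "lang-go")]
            Language::Go => Some(&GO_CONFIG),
            // Catches every variant whose feature is disabled.
            #[allow(unreachable_patterns)]
            _ => None,
        }
    }

    pub fn is_supported(self) -> bool {
        self.config().is_some()
    }
}

/// Hypothetical caller: degrades to an empty result for unsupported languages.
pub fn supported_extensions(lang: Language) -> &'static [&'static str] {
    match lang.config() {
        Some(config) => config.extensions,
        None => &[],
    }
}
```

The cfg attributes move from the public enum into the private statics and match arms, so the public API stays the same shape across feature combinations.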
Make LanguageConfig fields private to hide implementation details and
expose a cleaner public API.
Changes:
- Remove pub from all LanguageConfig fields
- Add matches_extension(&self, ext: &str) -> bool public method
- Update detect_language() to use the new method
This prevents external code from depending on internal struct layout
while still allowing the necessary operations.
Original prompt:
> Fix them one by one.
AI-assisted-by: Kimi K2.5 (OpenClaw)
Signed-off-by: WANG Xuerui <git@xen0n.name>
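As a rough illustration of the encapsulation described here: the method signature is the one given in the commit message, while the module layout, the single field, the exact-match comparison, and the `PYTHON_CONFIG` values are assumptions.

```rust
mod tree_path {
    pub struct LanguageConfig {
        // Private: code outside this module cannot read or construct it.
        extensions: &'static [&'static str],
    }

    impl LanguageConfig {
        /// The only public way to ask about extensions.
        /// Exact, case-sensitive comparison is an assumption of this sketch.
        pub fn matches_extension(&self, ext: &str) -> bool {
            self.extensions.iter().any(|known| *known == ext)
        }
    }

    /// Illustrative values only.
    pub static PYTHON_CONFIG: LanguageConfig = LanguageConfig {
        extensions: &["py", "pyi"],
    };
}

/// Hypothetical external caller: works without knowing the struct layout.
pub fn is_python_file_extension(ext: &str) -> bool {
    tree_path::PYTHON_CONFIG.matches_extension(ext)
}
```

With this layout, the extension list could later change representation without breaking callers.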
Add documentation for known limitations and test coverage:
- Document Go method naming collision in GO_CONFIG doc comment
- Note that methods resolve as method::Name without receiver type
disambiguation, which can cause tree_path collisions
- Add TSX test module with tests for function, class, interface, and
file extension detection
Original prompt:
> Fix them one by one.
AI-assisted-by: Kimi K2.5 (OpenClaw)
Signed-off-by: WANG Xuerui <git@xen0n.name>
Add documentation to detect_language() explaining the behavior when two
languages share an extension (first match wins).
Original prompt:
> Fix them one by one.
AI-assisted-by: Kimi K2.5 (OpenClaw)
Signed-off-by: WANG Xuerui <git@xen0n.name>
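A small sketch of what "first match wins" means in practice. The table, its ordering, and the choice of `.h` as the shared extension are hypothetical (the languages in this PR at this commit do not necessarily share any extension). Only the function name `detect_language` comes from the commit.

```rust
use std::path::Path;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Language {
    C,
    Cpp,
}

/// Hypothetical registry. Order matters: earlier entries take priority.
const LANGUAGES: &[(Language, &[&str])] = &[
    (Language::C, &["c", "h"]),
    (Language::Cpp, &["cpp", "hpp", "h"]),
];

/// Returns the first language whose extension list contains the file's
/// extension, or `None` if the path has no (UTF-8) extension or no match.
pub fn detect_language(path: &Path) -> Option<Language> {
    let ext = path.extension()?.to_str()?;
    LANGUAGES
        .iter()
        .find(|(_, exts)| exts.iter().any(|known| *known == ext))
        .map(|(lang, _)| *lang)
}
```

Here `.h` appears in both lists, and the lookup resolves it to `Language::C` only because that entry comes first in the table.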
Drop Cargo feature gates for tree-sitter grammars: all five languages
(Rust, Python, Go, JavaScript, TypeScript) are now compiled into the
binary unconditionally. The binary-size cost is modest relative to the
universality benefit; Python/Go/JS/TS codebases vastly outnumber Rust
codebases and requiring opt-in per language would hinder adoption.
Go tree_path support:
- Add `custom_name` callback to `LanguageConfig` for languages with
non-trivial name extraction.
- Change `node_name` return type to `Cow<str>` to support both borrowed
and owned (composite) names.
- Encode method receivers: `method::(*Type).Method` (pointer) vs
`method::Type.Method` (value).
- Navigate type_declaration → type_spec, const_declaration → const_spec,
var_declaration → var_spec indirection via custom_name.
- Use unified `type` shorthand for structs, interfaces, and type
aliases; Go type names are unique per package.
Code changes (tree_path.rs, Cargo.toml):
- Remove all #[cfg(feature = "...")] gates from statics, Language impl,
detect_language, and test modules.
- Make all tree-sitter-* dependencies unconditional; remove [features]
section from Cargo.toml.
Doc updates (liyi-design.md, liyi-01x-roadmap.md):
- Update design doc: language support is built-in (not feature-gated),
binary is ~6000 lines / 11 MiB (not "small"), remains single binary.
- Update roadmap: mark M1 milestones complete, remove feature-gate
references from headings and acceptance criteria, document resolved
Go receiver encoding design.
All 114 tests pass (90 unit + 20 golden + 4 proptest).
Original prompt:
> Review the current branch's changes against the roadmap and
> design doc on the main branch.
>
> Regarding M1.3, this pattern is prevalent so support should be
> added. Regarding conditional features, having them not built-in by
> default would hinder adoption (orders of magnitude more than Rust
> codebases in the wild), so to fulfill the project's promise as a
> universal tool I'd suggest dropping "conditional" altogether in
> the docs. The design doc may also need updating in that the linter
> is already >6000 lines of Rust and a 11 MiB release build, by no
> means "small". It is expected to remain as one binary, though.
AI-assisted-by: Claude Opus 4.6 (GitHub Copilot)
Signed-off-by: WANG Xuerui <git@xen0n.name>
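The receiver encoding and the `Cow<str>` return type can be illustrated with a toy model. The `GoDecl` enum below is a hypothetical stand-in for the tree-sitter nodes the real callback inspects, and `tree_path_segment` is an invented helper. Only the name `go_node_name` and the output formats `method::(*Type).Method` and `method::Type.Method` are taken from the commit message.

```rust
use std::borrow::Cow;

/// Hypothetical, simplified view of a Go declaration.
pub enum GoDecl<'a> {
    Function { name: &'a str },
    Method { receiver_type: &'a str, pointer_receiver: bool, name: &'a str },
}

/// Plain names are borrowed from the source text; composite names that
/// include the receiver have to be built, so they are owned.
pub fn go_node_name<'a>(decl: &GoDecl<'a>) -> Cow<'a, str> {
    match *decl {
        GoDecl::Function { name } => Cow::Borrowed(name),
        GoDecl::Method { receiver_type, pointer_receiver: true, name } => {
            Cow::Owned(format!("(*{receiver_type}).{name}"))
        }
        GoDecl::Method { receiver_type, pointer_receiver: false, name } => {
            Cow::Owned(format!("{receiver_type}.{name}"))
        }
    }
}

/// Prefix the name with its shorthand to form one tree_path segment.
pub fn tree_path_segment(decl: &GoDecl<'_>) -> String {
    let shorthand = match decl {
        GoDecl::Function { .. } => "fn",
        GoDecl::Method { .. } => "method",
    };
    format!("{shorthand}::{}", go_node_name(decl))
}
```

Returning `Cow` lets the common case (a name that already exists verbatim in the source) avoid allocation, while still allowing composite names where disambiguation requires them.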
Reanchor all specs after the LanguageConfig refactor and Go support
addition. Fix two misidentified specs that reanchor shifted into wrong
items:
- KIND_MAP → LanguageConfig (struct replaced the static array)
- node_name at matches_extension span → node_name at actual method span
Update Language enum intent from "only Rust" to list all six built-in
variants. Add go_node_name spec covering receiver encoding and
type/const/var spec indirection.
liyi check: 85 current, 0 stale, 0 shifted.
Original prompt:
> please sync 立意
AI-assisted-by: Claude Opus 4.6 (GitHub Copilot)
Signed-off-by: WANG Xuerui <git@xen0n.name>
Add tree_path structural identity support for C, C++, Java, C#, PHP,
Objective-C, Kotlin, and Swift.
Each language gets a LanguageConfig with kind mappings and, where
needed, custom name extraction callbacks:
- C/C++: declarator-chain unwrapping for function_definition, C++ adds
template_declaration transparency and alias_declaration
- Objective-C: class_interface/implementation/protocol name extraction,
selector composition for methods
- Kotlin: property_declaration and type_alias name extraction,
class_body positional-child handling in find_body()
- PHP: const_declaration name via const_element child
- Java/C#/Swift: standard field-based extraction (no custom callback
needed)
Extends detect_language() with all new file extensions. Generalizes
find_body() to search body_fields as child node kinds (not just field
names), enabling Kotlin class_body and C++ field_declaration_list.
All 103 tree_path tests pass, including 8 new per-language test modules.
Full test suite (unit, golden, proptest) green.
Updates M2 section of docs/liyi-01x-roadmap.md from placeholder
"Deferred languages" to comprehensive documentation of all 8 language
integrations.
Original prompt:
> Let's build them into the roadmap docs and implement.
> (Referring to C, C++, Objective-C, Java, C#, PHP, Kotlin, Swift
> tree-sitter language support for tree_path.)
AI-assisted-by: Claude Opus 4.6 (GitHub Copilot)
Signed-off-by: WANG Xuerui <git@xen0n.name>
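The find_body() generalization might look something like the sketch below, which uses a toy `Node` type in place of tree-sitter nodes. The type, the lookup order (field name first, then child kind), and the example trees are assumptions. The commit only states that body_fields entries are also matched against child node kinds.

```rust
/// Toy syntax-tree node: each child carries an optional field name.
pub struct Node {
    pub kind: &'static str,
    pub children: Vec<(Option<&'static str>, Node)>,
}

/// For each entry in `body_fields`, look for a child attached under that
/// field name, then for a child whose node kind equals the entry.
pub fn find_body<'n>(node: &'n Node, body_fields: &[&str]) -> Option<&'n Node> {
    for wanted in body_fields {
        // Case 1: grammars that expose the body as a named field.
        if let Some((_, child)) = node
            .children
            .iter()
            .find(|(field, _)| field.is_some_and(|f| f == *wanted))
        {
            return Some(child);
        }
        // Case 2: grammars where the body is a positional child,
        // identifiable only by its node kind.
        if let Some((_, child)) = node.children.iter().find(|(_, c)| c.kind == *wanted) {
            return Some(child);
        }
    }
    None
}
```

In this model a class node whose body is a positional `class_body` child resolves through the second case, while a node with a `body` field resolves through the first.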
Extract each language configuration into its own file under
crates/liyi/src/tree_path/:
mod.rs – core infrastructure (LanguageConfig, Language enum,
detect_language, resolve/compute functions) — 752 lines
lang_rust.rs – Rust config
lang_python.rs – Python config + tests
lang_go.rs – Go config + go_node_name callback + tests
lang_c.rs – C config + c_extract_declarator_name (shared) + tests
lang_cpp.rs – C++ config + tests (imports c_extract_declarator_name)
lang_objc.rs – Objective-C config + tests (imports c_extract_declarator_name)
lang_java.rs – Java config + tests
lang_csharp.rs – C# config + tests
lang_php.rs – PHP config + php_node_name callback + tests
lang_kotlin.rs – Kotlin config + kotlin_node_name callback + tests
lang_swift.rs – Swift config + tests
lang_javascript.rs – JavaScript config + tests
lang_typescript.rs – TypeScript + TSX configs + tests
No behavioral changes. All 168 tests pass (144 unit + 20 golden + 4 proptest).
Sidecar moved to tree_path/mod.rs.liyi.jsonc and reanchored (85 current).
Original prompt:
> The tree_path module is getting large, please refactor so
> every language lives its own file and commit.
AI-assisted-by: Claude Opus 4.6 (GitHub Copilot)
Signed-off-by: WANG Xuerui <git@xen0n.name>
The linter now has 14 tree-sitter grammars and ~7 k lines of Rust.
Update the "self-contained" bullet and the reimplementation cost
estimate to reflect the current state.
Original prompt:
> Regarding the design doc (line 1900 and 2034) -- the linter is now
> 7348 lines of Rust and 33 MiB, arguably not "lightweight" any more.
> We need to reword a bit.
>
> For line 1900, I think this version is still a bit too much. Just
> "The linter is a single binary with tree-sitter grammars built in,
> no runtime dependencies" would be enough, because the size or
> complexity doesn't matter for self-containedness.
Human note: added the second turn.
AI-assisted-by: Claude Opus 4.6 (GitHub Copilot)
Signed-off-by: WANG Xuerui <git@xen0n.name>
Apply cargo fmt across tree_path/lang_*.rs files and fix two clippy
warnings:
- collapsible_if in tree_path/mod.rs (nested if let → combined chain)
- cloned_ref_to_slice_refs in discovery.rs (&[sub.clone()] → from_ref)
Reanchor mod.rs.liyi.jsonc after the code changes.
Original-prompt: please fix cargo fmt and cargo clippy and sync sidecars
AI-assisted-by: Claude Opus 4.6 (GitHub Copilot)
Signed-off-by: WANG Xuerui <git@xen0n.name>
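For context on the second fix, the `&[sub.clone()]` to `from_ref` rewrite swaps a cloned one-element array for a borrowed one-element slice via `std::slice::from_ref`. The `Root` type and `total_len` consumer below are hypothetical and are not the code in discovery.rs.

```rust
use std::slice;

#[derive(Clone, Debug, PartialEq)]
pub struct Root(pub String);

/// Hypothetical consumer that expects a slice of roots.
pub fn total_len(roots: &[Root]) -> usize {
    roots.iter().map(|r| r.0.len()).sum()
}

/// Before: clones `sub` only to build a temporary one-element array.
pub fn before(sub: &Root) -> usize {
    total_len(&[sub.clone()])
}

/// After: views the existing value as a one-element slice, with no clone.
pub fn after(sub: &Root) -> usize {
    total_len(slice::from_ref(sub))
}
```

Both functions return the same result. The second avoids an allocation when the element type owns heap data, as `String` does here.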
Force-pushed from 2a5fecb to 11c6d53
Scaffold and populate .liyi.jsonc sidecars for 13 language support files
(lang_c, lang_cpp, lang_csharp, lang_go, lang_java, lang_javascript,
lang_kotlin, lang_objc, lang_php, lang_python, lang_rust, lang_swift,
lang_typescript) with intent specs for all non-trivial items (CONFIG
statics, custom name extractors).
Original prompt:
> okay, commit them first. also don't the newly added language
> support files need sidecars? if you decide to add, do it with
> subagents
Human note: fixed formatting and content of "original prompt".
AI-assisted-by: Claude Opus 4.6 (GitHub Copilot)
Signed-off-by: WANG Xuerui <git@xen0n.name>
The bullet said "the linter doesn't parse source code", but since
semantic anchors (tree_path) were added, the linter does parse source
via tree-sitter. Reword to say only the checking process skips parsing.
Original prompt:
> In line 1902, "the linter doesn't parse source code" -- this is
> no longer true, since support for semantic anchors was added. Please
> reword to say only the checking process doesn't parse code. Resync
> sidecar and commit.
AI-assisted-by: Claude Opus 4.6 (GitHub Copilot)
Signed-off-by: WANG Xuerui <git@xen0n.name>
Force-pushed from 11c6d53 to de7ea3a
This PR adds support for Python, Go, JavaScript, and TypeScript tree_path resolution via a refactored LanguageConfig abstraction.
Changes
refactor(tree_path): extract LanguageConfig abstraction
- Add `LanguageConfig` struct to replace hardcoded Rust-specific logic
- Fields: `ts_language`, `extensions`, `kind_map`, `name_field`, `name_overrides`, `body_fields`
- New languages need only a `LanguageConfig` constant; no changes to resolve/compute logic needed
feat: Python support (`lang-python` feature)
- Node kinds: `function_definition`, `class_definition`
- Extensions: `.py`, `.pyi`
- Paths: `fn::name`, `class::Name`, `class::Name::fn::method`
feat: Go support (`lang-go` feature)
- Node kinds: `function_declaration`, `method_declaration`
- Extensions: `.go`
- Paths: `fn::Name`, `method::Name`
feat: JavaScript/TypeScript support (`lang-javascript`, `lang-typescript` features)
- JavaScript: `function_declaration`, `class_declaration`, `method_definition` (.js, .mjs, .cjs, .jsx)
- TypeScript: `interface_declaration`, `type_alias_declaration`, `enum_declaration` (.ts, .mts, .cts)
- TSX: `.tsx` files
- Paths: `fn::name`, `class::Name`, `class::Name::method::name`, `interface::Name`, `type::Name`, `enum::Name`
Usage
Testing
All 55+ tests pass with all features enabled.
Roadmap Alignment
Implements M1.1–M1.5 from `docs/liyi-01x-roadmap.md`:
- `LanguageConfig` refactor