
fix(extract): resolve TS/JS barrel re-exports and extensionless imports #403

Open
abhishekkolge-design wants to merge 1 commit into safishamsi:main from abhishekkolge-design:fix/ts-barrel-reexport-and-index-resolution

Conversation

@abhishekkolge-design

Problem

Two gaps in the JS/TS AST extractor caused barrel-exported symbols to appear as isolated nodes with no inbound edges in the graph, even when they were actively imported by other files.

Gap 1 — Re-export statements were invisible

export * from './X' and export { A } from './X' produce export_statement nodes in the tree-sitter AST. These were never in import_types, so the walk() function never processed them. In any TypeScript project using barrel index.ts files, every component re-exported through a barrel appeared as a disconnected island.

Gap 2 — Extensionless imports didn't resolve to real file IDs

import { X } from './components' (no extension) resolved to a target node ID for the bare path (e.g. …_components), which never matched any actual file node (e.g. …_components_index_ts). The same issue affected import { Y } from './UnverifiedEmailBanner', where the target ID lacked the _tsx suffix.

Fix

1. LanguageConfig — new reexport_types field

reexport_types: frozenset = frozenset()  # e.g. {"export_statement"} for JS/TS barrel re-exports

2. walk() — re-export handling

After the import_types block, check for re-export nodes. The branch triggers only when the node has a from keyword among its children. This distinguishes export * from './X' from regular export const/function/class declarations, which must still walk their children for node extraction.

if config.reexport_types and t in config.reexport_types:
    has_from = any(c.type == "from" for c in node.children)
    if has_from and config.import_handler:
        config.import_handler(node, source, file_nid, stem, edges, str_path)
        return
    # Regular exports fall through to walk children normally
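
To illustrate the dispatch rule, here is a minimal, self-contained sketch. Node, LanguageConfig, and record_edge below are simplified stand-ins invented for this example, not the project's real classes, and the handler signature is shortened. Real tree-sitter nodes expose a richer API.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a tree-sitter node (type, children, text only)."""
    type: str
    children: list["Node"] = field(default_factory=list)
    text: str = ""


@dataclass
class LanguageConfig:
    """Simplified stand-in holding only the fields this sketch needs."""
    reexport_types: frozenset = frozenset()
    import_handler: Optional[Callable[[Node, list], None]] = None


def record_edge(node: Node, edges: list) -> None:
    # Hypothetical handler: record the quoted module path of the statement.
    for child in node.children:
        if child.type == "string":
            edges.append(child.text.strip("'\""))


def walk(node: Node, config: LanguageConfig, edges: list, visited: list) -> None:
    if config.reexport_types and node.type in config.reexport_types:
        has_from = any(c.type == "from" for c in node.children)
        if has_from and config.import_handler:
            config.import_handler(node, edges)
            return  # re-export handled as an import; do not descend
    # Regular exports (and every other node type) keep walking their children.
    visited.append(node.type)
    for child in node.children:
        walk(child, config, edges, visited)


config = LanguageConfig(
    reexport_types=frozenset({"export_statement"}),
    import_handler=record_edge,
)

# Shape of: export * from './X'
reexport = Node("export_statement", [Node("*"), Node("from"), Node("string", text="'./X'")])
# Shape of: export function f() {}
plain = Node("export_statement", [Node("function_declaration")])

edges, visited = [], []
walk(reexport, config, edges, visited)
walk(plain, config, edges, visited)
```

In this sketch the re-export contributes one edge and is not descended into, while the plain export records no edge and its function_declaration child is still visited.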

3. _import_js() — extensionless + index file resolution

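if not resolved.suffix:
    # Extensionless import: first try the path as a file with each known extension.
    for ext in (".ts", ".tsx", ".js", ".jsx"):
        candidate = resolved.with_suffix(ext)
        if candidate.exists():
            resolved = candidate
            break
    else:
        # No file matched (the loop ended without break): treat the path as a
        # directory and look for a barrel index file inside it.
        for ext in (".ts", ".tsx", ".js", ".jsx"):
            candidate = resolved / ("index" + ext)
            if candidate.exists():
                resolved = candidate
                break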

4. Enabled for _JS_CONFIG and _TS_CONFIG

reexport_types=frozenset({"export_statement"}),

Verification

Tested on a large TypeScript monorepo (~2,800 files, Next.js + React). The chain:

NavigationSidebar.tsx  →  components/index.ts  →  UnverifiedEmailBanner.tsx

Previously: all three nodes appeared isolated (0 cross-file edges between them).
After fix: NavigationSidebar.tsx --imports_from→ index.ts --imports_from→ UnverifiedEmailBanner.tsx

The graph went from treating every barrel-exported component as a disconnected island to correctly tracing the full import chain through index.ts files.
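
A check of this kind can be sketched as a reachability query over the edge list. The (source, relation, target) tuple format and the file labels below are assumptions for illustration; the extractor's actual node IDs and edge records may look different.

```python
from collections import defaultdict

# Hypothetical edge tuples mirroring the chain above.
edges = [
    ("NavigationSidebar.tsx", "imports_from", "components/index.ts"),
    ("components/index.ts", "imports_from", "UnverifiedEmailBanner.tsx"),
]


def reachable(edges, start, relation="imports_from"):
    """Return every node reachable from `start` along `relation` edges."""
    graph = defaultdict(set)
    for src, rel, dst in edges:
        if rel == relation:
            graph[src].add(dst)
    seen, stack = set(), [start]
    while stack:
        for nxt in graph[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def without_inbound(edges, nodes):
    """Return the nodes that no edge points at (the 'island' symptom)."""
    targets = {dst for _, _, dst in edges}
    return {n for n in nodes if n not in targets}


files = ["NavigationSidebar.tsx", "components/index.ts", "UnverifiedEmailBanner.tsx"]
after_fix = without_inbound(edges, files)
before_fix = without_inbound([], files)
```

With the two edges present, only the entry file lacks an inbound edge; with an empty edge list, all three files do, which matches the "Previously" state described above.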

🤖 Generated with Claude Code

@vhsantos26

Empirical signal that this fix matters in practice: running graphify on a TS codebase with path aliases gives imports_ratio: 0.0%. On a codebase without aliases but with .mjs relative imports, it's 11.6%. Both cases point to the same class of problem this PR addresses — silent dropping of import edges when target IDs don't match file node IDs.

The extensionless + barrel resolution logic here should recover a significant portion of the missing signal, especially in monorepos with barrel index.ts files.

The PR is large. If it would help review, consider splitting Gap 1 (re-export handling) and Gap 2 (extensionless resolution) into separate commits or PRs. Happy to help benchmark either path.

@abhishekkolge-design
Author

@vhsantos26
Thanks for the review; the imports_ratio data points were really helpful for validation.

The PR isn't just the fixes; it's a refactor with the fixes bundled in.

The bulk of the diff comes from the refactor. Specifically, it consolidates the per-language extractors (extract_python, extract_js_ts, extract_go, etc.) into a shared LanguageConfig abstraction with a walk() dispatcher.

The actual fix logic is quite small, roughly 25 lines in total:

Gap 1: Adds a reexport_types field to LanguageConfig and handles it in walk()
Gap 2: Adds extensionless resolution + index.{ts,tsx,js,jsx} handling in the new _import_js helper

Both fixes rely on the abstractions introduced in the refactor, which is why they ended up in the same PR.

If you’d prefer a cleaner review flow, the proper split would be:

  1. Refactor (introduce LanguageConfig + walk())
  2. Gap 1 fix
  3. Gap 2 fix

I've already prepared this as stacked PRs locally and verified that the final result is byte-identical to the current PR. Happy to push that version, or to keep this as-is and expand the PR description to better explain the refactor scope, whichever you prefer.


2 participants