📝 Walkthrough

The symbol table cache now implements incremental invalidation by tracking per-file modification times. Changed files are detected via metadata comparison, removed from the cache, and re-analyzed; deleted files are identified and pruned. The cache persists only when updates occur.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Builder as SymbolTableBuilder
    participant FS as Filesystem
    participant Cache as Cache Storage
    participant SymTable as SymbolTable
    Builder->>Cache: Load cached SymbolTable & metadata
    activate Cache
    Cache-->>Builder: Return cached data
    deactivate Cache
    Builder->>FS: Scan *.py files in project
    activate FS
    FS-->>Builder: Return file list & current mtimes
    deactivate FS
    loop For each scanned file
        alt File unchanged (mtime matches metadata)
            Builder->>SymTable: Skip (use cached symbols)
        else File changed or new
            Builder->>SymTable: remove_file(file_path)
            Builder->>FS: Re-analyze file
            Builder->>SymTable: Add updated symbols
            Builder->>SymTable: Update metadata[file_path]
        end
    end
    loop Detect deleted files
        Builder->>SymTable: Check cached symbols vs current files
        alt File no longer exists
            Builder->>SymTable: remove_file(file_path)
        end
    end
    alt Any updates made
        Builder->>Cache: Save updated SymbolTable with metadata
        activate Cache
        Cache-->>Builder: Cache persisted
        deactivate Cache
    else No updates
        Builder->>Builder: Skip cache save
    end
```
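The scan-compare-prune loop in the diagram can be sketched in plain Python. This is a minimal illustration, not the actual Refactron code: the `refresh` name, the `analyze_file` callback, and the dictionary shapes are assumptions made for the sketch.

```python
import os

def refresh(symbol_table, metadata, python_files, analyze_file):
    """Re-analyze only files whose mtime changed; prune deleted ones.

    symbol_table: dict mapping file path -> extracted symbols
    metadata:     dict mapping file path -> last-seen mtime (int ns)
    """
    updated = False
    current = set()
    for path in python_files:
        key = str(path)
        current.add(key)
        mtime = os.stat(path).st_mtime_ns  # integer ns avoids float compares
        if metadata.get(key) == mtime:
            continue  # unchanged: keep cached symbols
        symbol_table.pop(key, None)        # purge the stale footprint first
        symbol_table[key] = analyze_file(path)
        metadata[key] = mtime
        updated = True
    for key in list(metadata):             # prune files deleted from disk
        if key not in current:
            symbol_table.pop(key, None)
            del metadata[key]
            updated = True
    return updated  # caller persists the cache only when True
```

A second run over unchanged files performs no analysis and returns `False`, which is what lets the builder skip the cache save.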
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
Actionable comments posted: 1
🧹 Nitpick comments (3)
refactron/analysis/symbol_table.py (3)
180-180: Redundant `getattr` usage.

`self.cache_dir` is always initialized in `__init__` (line 136), so `getattr(self, "cache_dir", None)` is unnecessary. Using `self.cache_dir` directly is cleaner and more consistent with line 142.

♻️ Suggested fix

```diff
-        if getattr(self, "cache_dir", None) and updated:
+        if self.cache_dir and updated:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@refactron/analysis/symbol_table.py` at line 180, The condition uses getattr(self, "cache_dir", None) even though cache_dir is always set in __init__; replace the getattr call with a direct attribute check (use self.cache_dir) in the conditional inside the method where updated is checked (the if statement currently written as getattr(self, "cache_dir", None) and updated). Update the condition to use self.cache_dir and keep the rest of the logic unchanged so it matches the style used elsewhere (e.g., the direct access on line 142).
72-76: Unused loop variable `symbol`.

The `symbol` variable from the iteration is not used; only `name` is needed for the exports lookup. Static analysis flagged this (B007).

♻️ Suggested fix

```diff
 for scope, names in self.symbols[file_path].items():
     if scope == "global":
-        for name, symbol in list(names.items()):
+        for name in list(names.keys()):
             if name in self.exports and self.exports[name].file_path == file_path:
                 del self.exports[name]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@refactron/analysis/symbol_table.py` around lines 72 - 76, The loop declares an unused variable `symbol` in "for name, symbol in list(names.items())" which triggers a linter warning; change the loop to only use the name (e.g., "for name in list(names.keys())" or "for name, _ in list(names.items())") so the unused variable is removed, keeping the existing logic that checks self.exports[name].file_path == file_path and deletes from self.exports.
149-151: Line exceeds 100-character limit.

This list comprehension line appears to exceed the 100-character limit specified in the coding guidelines.

♻️ Suggested fix

```diff
-    python_files = [
-        f for f in all_python_files if not any(excluded in f.parts for excluded in excluded_dirs)
-    ]
+    python_files = [
+        f for f in all_python_files
+        if not any(excluded in f.parts for excluded in excluded_dirs)
+    ]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@refactron/analysis/symbol_table.py` around lines 149 - 151, The list comprehension assigning python_files is over 100 chars; split the expression across multiple lines or extract the condition into a named helper to stay within the limit. For example, move the any(...) check into a small function or a separate variable (e.g., is_excluded = lambda p: any(excluded in p.parts for excluded in excluded_dirs)) and then build python_files = [f for f in all_python_files if not is_excluded(f)] so the line lengths for python_files, the condition, and the any(...) call are each under 100 characters; references: python_files, all_python_files, excluded_dirs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@refactron/verification/checks/syntax.py`:
- Around line 1-26: The new fast-path must be merged into the existing
SyntaxVerifier without changing the BaseCheck contract: update the existing
SyntaxVerifier.verify method to keep the signature verify(self, original: str,
transformed: str, file_path: Path) -> CheckResult, and implement the
VerificationContext optimization inside that method by retrieving or accepting a
VerificationContext instance from the surrounding engine (use the existing
engine/context helper rather than adding a new optional parameter); if a cached
context indicates transformed_ast is None and transformed.strip() != "" return a
failing CheckResult, otherwise attempt ast.parse(transformed) and return the
appropriate CheckResult on parse failure or success; also wrap/shorten the long
import/type line to <=100 chars and ensure you reference the types
VerificationContext and CheckResult where required.
---
Nitpick comments:
In `@refactron/analysis/symbol_table.py`:
- Line 180: The condition uses getattr(self, "cache_dir", None) even though
cache_dir is always set in __init__; replace the getattr call with a direct
attribute check (use self.cache_dir) in the conditional inside the method where
updated is checked (the if statement currently written as getattr(self,
"cache_dir", None) and updated). Update the condition to use self.cache_dir and
keep the rest of the logic unchanged so it matches the style used elsewhere
(e.g., the direct access on line 142).
- Around line 72-76: The loop declares an unused variable `symbol` in "for name,
symbol in list(names.items())" which triggers a linter warning; change the loop
to only use the name (e.g., "for name in list(names.keys())" or "for name, _ in
list(names.items())") so the unused variable is removed, keeping the existing
logic that checks self.exports[name].file_path == file_path and deletes from
self.exports.
- Around line 149-151: The list comprehension assigning python_files is over 100
chars; split the expression across multiple lines or extract the condition into
a named helper to stay within the limit. For example, move the any(...) check
into a small function or a separate variable (e.g., is_excluded = lambda p:
any(excluded in p.parts for excluded in excluded_dirs)) and then build
python_files = [f for f in all_python_files if not is_excluded(f)] so the line
lengths for python_files, the condition, and the any(...) call are each under
100 characters; references: python_files, all_python_files, excluded_dirs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e13da794-0cb8-41b3-80f4-1ed88f9154ae
📒 Files selected for processing (4)
refactron/analysis/symbol_table.pyrefactron/verification/checks/imports.pyrefactron/verification/checks/syntax.pyrefactron/verification/engine.py
```python
import ast
from typing import Optional
from refactron.verification.engine import BaseCheck, VerificationContext


class SyntaxVerifier(BaseCheck):
    def verify(self, original: str, transformed: str, context: Optional[VerificationContext] = None) -> bool:
        """
        Verify the syntax of the transformed code.
        Uses shared VerificationContext to avoid redundant AST parsing.
        """
        # If the VerificationEngine already tried to parse it and failed, the syntax is broken
        if context and context.transformed_ast is None and transformed.strip() != "":
            return False

        # Optional: libcst parsing could also be cached, but for now we fall back
        # or rely strictly on ast if libcst isn't required by the context immediately

        # Fallback to local parsing for backward compatibility
        # (e.g. if run independently without the pre-configured context)
        if not context:
            try:
                ast.parse(transformed)
            except Exception:
                return False

        return True
```
🧩 Analysis chain
🏁 Script executed:
```shell
# First, let's see the full file structure to confirm the duplicate
cat -n refactron/verification/checks/syntax.py | head -50
```

Repository: Refactron-ai/Refactron_lib
Length of output: 1971
🏁 Script executed:
```shell
# Check the BaseCheck abstract class contract
cat -n refactron/verification/engine.py | grep -A 15 "class BaseCheck"
```

Repository: Refactron-ai/Refactron_lib
Length of output: 585
🏁 Script executed:
```shell
# Check the result.py for CheckResult and VerificationResult definitions
cat -n refactron/verification/result.py | head -30
```

Repository: Refactron-ai/Refactron_lib
Length of output: 1078
🏁 Script executed:
```shell
# Check line length issues with the current snippet
python3 << 'EOF'
code = '''import ast
from typing import Optional
from refactron.verification.engine import BaseCheck, VerificationContext


class SyntaxVerifier(BaseCheck):
    def verify(self, original: str, transformed: str, context: Optional[VerificationContext] = None) -> bool:
        """
        Verify the syntax of the transformed code.
        Uses shared VerificationContext to avoid redundant AST parsing.
        """
        # If the VerificationEngine already tried to parse it and failed, the syntax is broken
        if context and context.transformed_ast is None and transformed.strip() != "":
            return False

        # Optional: libcst parsing could also be cached, but for now we fall back
        # or rely strictly on ast if libcst isn't required by the context immediately

        # Fallback to local parsing for backward compatibility
        # (e.g. if run independently without the pre-configured context)
        if not context:
            try:
                ast.parse(transformed)
            except Exception:
                return False

        return True'''
for i, line in enumerate(code.split('\n'), 1):
    if len(line) > 100:
        print(f"Line {i} ({len(line)} chars): {line[:100]}...")
EOF
```

Repository: Refactron-ai/Refactron_lib
Length of output: 318
Merge this fast-path into the existing SyntaxVerifier instead of creating a duplicate.
The existing SyntaxVerifier is at line 12, not line 38. This proposed implementation would shadow it and violates the BaseCheck contract: verify() must accept file_path: Path and return CheckResult, not context: Optional[VerificationContext] and bool. This mismatch causes the linting failures (F811/E402/no-redef) and breaks the verification pipeline.
Additionally, line 6 exceeds the 100-character limit (109 chars).
Please refactor the VerificationContext optimization into the existing class while preserving the correct (original, transformed, file_path: Path) -> CheckResult signature per repo standards.
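One way the requested merge could look is sketched below, keeping the `(original, transformed, file_path) -> CheckResult` contract. This is an assumption-laden sketch, not the repo's actual code: the `CheckResult` dataclass here is a stand-in for the real type in `refactron/verification/result.py`, and how the engine hands the verifier its `VerificationContext` (here, via the constructor) is illustrative.

```python
import ast
from dataclasses import dataclass
from pathlib import Path


@dataclass
class CheckResult:
    """Stand-in for the repo's real CheckResult type (assumed shape)."""
    passed: bool
    message: str = ""


class SyntaxVerifier:
    """Keeps the BaseCheck-style signature; the fast path lives inside verify()."""

    def __init__(self, context=None):
        # Engine-provided VerificationContext, if any (wiring is an assumption)
        self.context = context

    def verify(self, original: str, transformed: str, file_path: Path) -> CheckResult:
        ctx = self.context
        # Fast path: the engine already tried to parse and failed -> syntax is broken
        if ctx is not None and ctx.transformed_ast is None and transformed.strip():
            return CheckResult(False, f"{file_path}: cached parse failed")
        # Fast path: the engine already parsed successfully -> reuse, skip re-parse
        if ctx is not None and ctx.transformed_ast is not None:
            return CheckResult(True)
        # Fallback: parse locally for backward compatibility
        try:
            ast.parse(transformed)
        except SyntaxError as e:
            return CheckResult(False, f"{file_path}: {e}")
        return CheckResult(True)
```

The key point is that callers never see the context in the signature, so the `BaseCheck` contract and the existing pipeline stay intact.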
🧰 Tools
🪛 GitHub Actions: Pre-commit
[error] 15-15: flake8 (E501) line too long (104 > 100 characters)
🪛 Ruff (0.15.9)
[warning] 23-23: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@refactron/verification/checks/syntax.py` around lines 1 - 26, The new
fast-path must be merged into the existing SyntaxVerifier without changing the
BaseCheck contract: update the existing SyntaxVerifier.verify method to keep the
signature verify(self, original: str, transformed: str, file_path: Path) ->
CheckResult, and implement the VerificationContext optimization inside that
method by retrieving or accepting a VerificationContext instance from the
surrounding engine (use the existing engine/context helper rather than adding a
new optional parameter); if a cached context indicates transformed_ast is None
and transformed.strip() != "" return a failing CheckResult, otherwise attempt
ast.parse(transformed) and return the appropriate CheckResult on parse failure
or success; also wrap/shorten the long import/type line to <=100 chars and
ensure you reference the types VerificationContext and CheckResult where
required.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
refactron/analysis/symbol_table.py (1)
67-301: ⚠️ Potential issue | 🟡 Minor. CI is currently blocked by formatting hooks.
Pre-commit reports trailing whitespace and black reformatting for this file. Please run the formatting hooks before merge to unblock the pipeline.

As per coding guidelines: "**/*.py: Use black formatter with target-version set to py38, py39, py310, py311" and "Flake8 linting must use max-line-length of 100".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@refactron/analysis/symbol_table.py` around lines 67 - 301, The file has trailing whitespace and needs Black/Flake8 formatting; run the repo pre-commit hooks (or run black and flake8 manually) to fix whitespace and style issues: remove trailing whitespace and run black with target-version options for py38, py39, py310, py311, then run flake8 with max-line-length=100 to apply lint rules; re-run the tests/CI to ensure hooks pass and commit the reformatted changes (this affects functions like remove_file, add_symbol, build_for_project, _save_cache, and _load_cache referenced in the diff).
🧹 Nitpick comments (1)
refactron/analysis/symbol_table.py (1)
183-184: Narrow cache-save exception handling and keep traceback context.

Line 183 catches `Exception` broadly. Prefer explicit exception classes and structured logging with traceback (`exc_info`) for diagnosability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@refactron/analysis/symbol_table.py` around lines 183 - 184, The broad except in build_for_project (the except Exception as e around the cache save) should be narrowed and include traceback context: catch only likely I/O/serialization errors (e.g., OSError/IOError and pickle.PicklingError or the specific serializer error used) instead of Exception, and log the failure with exc_info=True (or use logger.exception) so the traceback is preserved; update the import list if you reference pickle.PicklingError or other specific exception classes and replace the current logger.warning(...) line with the narrowed except block and structured logging call.
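The narrowed-except pattern this comment asks for could look roughly like the sketch below. The `save_cache` helper and its path/payload arguments are illustrative, not the repo's actual function; only the exception classes and the `logger.exception` call are the point.

```python
import logging
import pickle

logger = logging.getLogger("symbol_table")

def save_cache(path, payload):
    """Persist a cache payload, catching only I/O and serialization errors."""
    try:
        with open(path, "wb") as fh:
            pickle.dump(payload, fh)
        return True
    except (OSError, pickle.PicklingError):
        # logger.exception logs at ERROR level with exc_info=True, so the
        # full traceback is preserved for diagnosis
        logger.exception("Failed to persist symbol table cache at %s", path)
        return False
```

A bug elsewhere (say, a `TypeError` from a malformed payload) now propagates instead of being silently swallowed, which is the diagnosability win the review comment is after.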
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@refactron/analysis/symbol_table.py`:
- Around line 170-172: The metadata advance is happening regardless of whether
_analyze_file succeeded; modify the flow so that symbol_table.metadata[file_str]
= mtime and updated = True are only executed after a successful run of
_analyze_file (i.e., no exception was raised). Concretely, in the blocks that
call _analyze_file (referencing _analyze_file, file_str, mtime,
symbol_table.metadata, and the updated flag), move the metadata assignment and
updated=True into the success path (or after try/except only if no exception
occurred) and ensure caught exceptions do not update metadata so failed analyses
will be retried later; apply the same change for both occurrences that wrap
_analyze_file (the first block and the block around lines 188-199).
- Around line 67-80: The early return in remove_file prevents cleaning up
self.metadata for files that had no extracted symbols; update remove_file (in
symbol_table.SymbolTable) to always delete metadata for file_path even when
file_path not in self.symbols—either remove the early return and wrap the
symbol-specific cleanup in an if block (if file_path in self.symbols: ...del
self.symbols[file_path]) or keep the early return but perform metadata deletion
before returning (if file_path not in self.symbols: if file_path in
self.metadata: del self.metadata[file_path]; return); preserve the existing
export-cleanup logic for files that do exist in self.symbols.
- Around line 64-65: The metadata cache currently types values as Dict[str,
float] and uses st_mtime (float); change the metadata annotation to Dict[str,
int], replace any os.stat(...).st_mtime reads with os.stat(...).st_mtime_ns so
stored mtimes are integer nanoseconds, and update the deserialization path that
loads saved metadata to convert values to ints (e.g., {k: int(v) for k, v in
data.get("metadata", {}).items()}) so comparisons use integer ns timestamps.
Ensure you update the declaration of the metadata field, the place(s) that
set/check file mtime (use st_mtime_ns), and the load/deserialize code that
reconstructs metadata from persisted data.
---
Outside diff comments:
In `@refactron/analysis/symbol_table.py`:
- Around line 67-301: The file has trailing whitespace and needs Black/Flake8
formatting; run the repo pre-commit hooks (or run black and flake8 manually) to
fix whitespace and style issues: remove trailing whitespace and run black with
target-version options for py38, py39, py310, py311, then run flake8 with
max-line-length=100 to apply lint rules; re-run the tests/CI to ensure hooks
pass and commit the reformatted changes (this affects functions like
remove_file, add_symbol, build_for_project, _save_cache, and _load_cache
referenced in the diff).
---
Nitpick comments:
In `@refactron/analysis/symbol_table.py`:
- Around line 183-184: The broad except in build_for_project (the except
Exception as e around the cache save) should be narrowed and include traceback
context: catch only likely I/O/serialization errors (e.g., OSError/IOError and
pickle.PicklingError or the specific serializer error used) instead of
Exception, and log the failure with exc_info=True (or use logger.exception) so
the traceback is preserved; update the import list if you reference
pickle.PicklingError or other specific exception classes and replace the current
logger.warning(...) line with the narrowed except block and structured logging
call.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: df6dac6b-b756-4d21-97d4-82f104f50b8c
📒 Files selected for processing (1)
refactron/analysis/symbol_table.py
```python
    # Metadata for caching (Map: file_path -> mtime)
    metadata: Dict[str, float] = field(default_factory=dict)
```
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
# Verify all mtime usage points to ensure precision consistency.
rg -n --type=py -C2 '\bst_mtime\b|\bst_mtime_ns\b'
```

Repository: Refactron-ai/Refactron_lib
Length of output: 2427
🏁 Script executed:
```shell
cat -n refactron/analysis/symbol_table.py | head -80
```

Repository: Refactron-ai/Refactron_lib
Length of output: 3232
🏁 Script executed:
```shell
sed -n '155,170p' refactron/analysis/symbol_table.py
```

Repository: Refactron-ai/Refactron_lib
Length of output: 647
🏁 Script executed:
```shell
sed -n '260,275p' refactron/analysis/symbol_table.py
```

Repository: Refactron-ai/Refactron_lib
Length of output: 754
🏁 Script executed:
```shell
sed -n '295,310p' refactron/analysis/symbol_table.py
```

Repository: Refactron-ai/Refactron_lib
Length of output: 552
Use st_mtime_ns (integer nanoseconds) instead of st_mtime (float seconds) for reliable cache invalidation.
The current code uses st_mtime (float seconds) and compares with ==, which can miss rapid file edits on filesystems with coarse timestamp resolution (e.g., 1-second granularity on some systems). Switching to st_mtime_ns (integer nanoseconds) provides sub-microsecond precision and avoids floating-point comparison issues.
Update:
- Line 65: Change `metadata: Dict[str, float]` to `Dict[str, int]`
- Line 159: Change `st_mtime` to `st_mtime_ns`
- Line 301: Convert deserialized values with `{k: int(v) for k, v in data.get("metadata", {}).items()}`
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@refactron/analysis/symbol_table.py` around lines 64 - 65, The metadata cache
currently types values as Dict[str, float] and uses st_mtime (float); change the
metadata annotation to Dict[str, int], replace any os.stat(...).st_mtime reads
with os.stat(...).st_mtime_ns so stored mtimes are integer nanoseconds, and
update the deserialization path that loads saved metadata to convert values to
ints (e.g., {k: int(v) for k, v in data.get("metadata", {}).items()}) so
comparisons use integer ns timestamps. Ensure you update the declaration of the
metadata field, the place(s) that set/check file mtime (use st_mtime_ns), and
the load/deserialize code that reconstructs metadata from persisted data.
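The integer-nanosecond comparison recommended above can be demonstrated in isolation. This is a standalone sketch; the `changed` helper name is illustrative and does not exist in the repo.

```python
import os
import tempfile

def changed(stored_ns: int, path: str) -> bool:
    """Exact integer comparison against the stored nanosecond mtime."""
    return os.stat(path).st_mtime_ns != stored_ns

# Demonstrate: unchanged file compares equal, touched file does not
fd, path = tempfile.mkstemp()
os.close(fd)
stored = os.stat(path).st_mtime_ns
assert isinstance(stored, int)          # int, so == / != is exact
assert changed(stored, path) is False   # no edit -> cache hit
os.utime(path, ns=(stored + 1_000_000_000, stored + 1_000_000_000))  # +1s
assert changed(stored, path) is True    # edit detected -> re-analyze
os.unlink(path)
```

Because `st_mtime_ns` is an integer, there is no floating-point rounding to mask two edits that land inside the same coarse timestamp tick.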
```python
def remove_file(self, file_path: str) -> None:
    """Remove all symbols associated with a specific file."""
    if file_path not in self.symbols:
        return

    for scope, names in self.symbols[file_path].items():
        if scope == "global":
            for name, symbol in list(names.items()):
                if name in self.exports and self.exports[name].file_path == file_path:
                    del self.exports[name]

    del self.symbols[file_path]
    if file_path in self.metadata:
        del self.metadata[file_path]
```
Fix stale metadata cleanup in remove_file.
At Line 69, the early return prevents metadata cleanup for files that had no extracted symbols. Deleted symbol-less files can remain forever in metadata.
Proposed fix
```diff
 def remove_file(self, file_path: str) -> None:
     """Remove all symbols associated with a specific file."""
-    if file_path not in self.symbols:
-        return
-
-    for scope, names in self.symbols[file_path].items():
-        if scope == "global":
-            for name, symbol in list(names.items()):
-                if name in self.exports and self.exports[name].file_path == file_path:
-                    del self.exports[name]
-
-    del self.symbols[file_path]
-    if file_path in self.metadata:
-        del self.metadata[file_path]
+    if file_path in self.symbols:
+        for scope, names in self.symbols[file_path].items():
+            if scope == "global":
+                for name in list(names):
+                    if name in self.exports and self.exports[name].file_path == file_path:
+                        del self.exports[name]
+        del self.symbols[file_path]
+    self.metadata.pop(file_path, None)
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
def remove_file(self, file_path: str) -> None:
    """Remove all symbols associated with a specific file."""
    if file_path in self.symbols:
        for scope, names in self.symbols[file_path].items():
            if scope == "global":
                for name in list(names):
                    if name in self.exports and self.exports[name].file_path == file_path:
                        del self.exports[name]
        del self.symbols[file_path]
    self.metadata.pop(file_path, None)
```
🧰 Tools
🪛 Ruff (0.15.9)
[warning] 74-74: Loop control variable symbol not used within loop body
Rename unused symbol to _symbol
(B007)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@refactron/analysis/symbol_table.py` around lines 67 - 80, The early return in
remove_file prevents cleaning up self.metadata for files that had no extracted
symbols; update remove_file (in symbol_table.SymbolTable) to always delete
metadata for file_path even when file_path not in self.symbols—either remove the
early return and wrap the symbol-specific cleanup in an if block (if file_path
in self.symbols: ...del self.symbols[file_path]) or keep the early return but
perform metadata deletion before returning (if file_path not in self.symbols: if
file_path in self.metadata: del self.metadata[file_path]; return); preserve the
existing export-cleanup logic for files that do exist in self.symbols.
```python
            self._analyze_file(file_path)
            self.symbol_table.metadata[file_str] = mtime
            updated = True
```
Don’t advance metadata when analysis fails.
At Line 171, metadata is updated even if _analyze_file failed (Line 197-198 catches and suppresses). That can lock in missing symbols and skip future retries until file mtime changes again.
Proposed fix
```diff
-            self._analyze_file(file_path)
-            self.symbol_table.metadata[file_str] = mtime
+            analyzed = self._analyze_file(file_path)
+            if analyzed:
+                self.symbol_table.metadata[file_str] = mtime
+            else:
+                self.symbol_table.metadata.pop(file_str, None)
             updated = True
 ...
-    def _analyze_file(self, file_path: Path) -> None:
+    def _analyze_file(self, file_path: Path) -> bool:
         """Analyze a single file and populate symbols."""
         try:
             # We use astroid for better inference capabilities later
             tree = self.inference_engine.parse_file(str(file_path))
@@
             # Walk the tree
             self._visit_node(tree, str(file_path), "global")
+            return True
         except Exception as e:
             logger.warning(f"Failed to build symbol table for {file_path}: {e}")
+            return False
```

Also applies to: 188-199
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@refactron/analysis/symbol_table.py` around lines 170 - 172, The metadata
advance is happening regardless of whether _analyze_file succeeded; modify the
flow so that symbol_table.metadata[file_str] = mtime and updated = True are only
executed after a successful run of _analyze_file (i.e., no exception was
raised). Concretely, in the blocks that call _analyze_file (referencing
_analyze_file, file_str, mtime, symbol_table.metadata, and the updated flag),
move the metadata assignment and updated=True into the success path (or after
try/except only if no exception occurred) and ensure caught exceptions do not
update metadata so failed analyses will be retried later; apply the same change
for both occurrences that wrap _analyze_file (the first block and the block
around lines 188-199).
Solves #149.

This change optimizes the scanning process to make it significantly faster and more resource-efficient. Previously, the system blindly scanned the entire project tree, often wasting valuable compute cycles indexing virtual environments, git histories, and cache directories. We resolved this by implementing a targeted directory filter that actively excludes irrelevant folders such as `.git`, `.venv`, `__pycache__`, and `node_modules` during the file discovery phase.

In addition to minimizing the search area, we introduced a robust incremental caching mechanism that tracks each file's modification timestamp (mtime) in a new persistent metadata dictionary. Rather than executing an expensive, all-or-nothing cache reload, the builder now loads the existing symbol table and precisely targets only the files that have been created, modified, or deleted since the last run. When a file change is detected, the old footprint, including any globally exported references, is cleanly purged before the newly updated contents are re-analyzed. This ensures that the application's underlying symbol table stays strictly in sync with on-disk changes without forcing users to pay the performance cost of a full rebuild.
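The directory filter described above can be sketched as follows. This is an illustrative sketch, not the PR's actual code: the `discover_python_files` name and `EXCLUDED_DIRS` constant are assumptions, though the excluded folder names come from the description.

```python
from pathlib import Path

# Directory names skipped during discovery (mirrors the filter described above)
EXCLUDED_DIRS = {".git", ".venv", "__pycache__", "node_modules"}

def discover_python_files(root: Path):
    """Yield project .py files, skipping vendored and cache directories."""
    for path in root.rglob("*.py"):
        # A file is excluded if any component of its path is an excluded dir
        if not any(part in EXCLUDED_DIRS for part in path.parts):
            yield path
```

Checking `path.parts` rather than just the parent directory name means a file nested anywhere under an excluded folder (e.g. `.venv/lib/site-packages/...`) is skipped without ever being read.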