159 changes: 126 additions & 33 deletions refactron/analysis/symbol_table.py
@@ -3,12 +3,13 @@
Maps classes, functions, variables, and their relationships across the codebase.
"""

import hashlib
import json
import logging
from dataclasses import dataclass, field
from enum import Enum
from pathlib import Path
from typing import Any, Dict, List, Optional
from typing import Any, Dict, List, Optional, Set

from refactron.core.inference import InferenceEngine

@@ -61,30 +62,59 @@ class SymbolTable:
symbols: Dict[str, Dict[str, Dict[str, Symbol]]] = field(default_factory=dict)
# Map: global_name -> Symbol (for easy cross-file lookup of exports)
exports: Dict[str, Symbol] = field(default_factory=dict)
# Map: file_path -> { "mtime": float, "size": int, "sha256": str }
file_metadata: Dict[str, Dict[str, Any]] = field(default_factory=dict)

Comment on lines +65 to 67

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Verify Any-typed metadata declaration
rg -n 'file_metadata:\s*Dict\[str,\s*Dict\[str,\s*Any\]\]' refactron/analysis/symbol_table.py

# Verify bool-returning function currently returns expressions based on Any lookups
rg -n 'return current_hash != stored_hash|return stat\.st_mtime != metadata\.get\("mtime"\)' \
  refactron/analysis/symbol_table.py

Repository: Refactron-ai/Refactron_lib



🏁 Script executed:

# View the import statements and file_metadata declaration (around line 66)
head -70 refactron/analysis/symbol_table.py | tail -25



🏁 Script executed:

# View the function containing the problematic return statements (around lines 176-189)
sed -n '170,195p' refactron/analysis/symbol_table.py



🏁 Script executed:

# Check Python version targets and type annotations in the file
head -20 refactron/analysis/symbol_table.py



🏁 Script executed:

# Check if TypedDict is already imported or if we need typing_extensions
grep -n "TypedDict\|typing_extensions" refactron/analysis/symbol_table.py



🏁 Script executed:

# Check Python version targets in project configuration
fd -e pyproject.toml -e setup.py -e setup.cfg | head -3 | xargs grep -l "python_requires\|target-version" 2>/dev/null | head -1 | xargs cat | grep -A5 -B5 "python\|version" | head -30



🏁 Script executed:

# Find and read pyproject.toml to check Python version targets
find . -name "pyproject.toml" -type f | head -1 | xargs cat | grep -A10 "python" | head -15



🏁 Script executed:

# Also verify the complete context of the problematic lines one more time with broader view
sed -n '176,195p' refactron/analysis/symbol_table.py



Tighten file_metadata typing to resolve mypy no-any-return errors.

Dict[str, Dict[str, Any]] types metadata values as Any, causing comparisons at lines 186 and 189 to return Any instead of bool, violating the function's return type annotation.

Suggested fix
-from typing import Any, Dict, List, Optional
+from typing import Any, Dict, List, Optional, TypedDict
+
+
+class FileMetadata(TypedDict, total=False):
+    mtime: float
+    size: int
+    sha256: str
@@
-    file_metadata: Dict[str, Dict[str, Any]] = field(default_factory=dict)
+    file_metadata: Dict[str, FileMetadata] = field(default_factory=dict)
@@
-        metadata = self.symbol_table.file_metadata[file_path_str]
+        metadata: FileMetadata = self.symbol_table.file_metadata[file_path_str]
@@
-            stored_hash = metadata.get("sha256")
-            if stored_hash:
+            stored_hash = metadata.get("sha256")
+            if isinstance(stored_hash, str) and stored_hash:
                 current_hash = self._calculate_hash(file_path)
-                return current_hash != stored_hash
+                return bool(current_hash != stored_hash)
@@
-            return stat.st_mtime != metadata.get("mtime")
+            stored_mtime = metadata.get("mtime")
+            return bool(stored_mtime is None or stat.st_mtime != stored_mtime)

Required to satisfy mypy disallow_untyped_defs in refactron/ per coding guidelines.
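
The suggested change can be sketched in isolation. Note this is a hypothetical, self-contained example mirroring the review's suggestion (names like `has_changed` and the sample `file_metadata` dict are illustrative, not the actual refactron code): with a `TypedDict`, `.get()` returns a concrete type, so the comparisons below are `bool` under mypy rather than `Any`.

```python
from typing import Dict, Optional, TypedDict


class FileMetadata(TypedDict, total=False):
    mtime: float
    size: int
    sha256: str


# Sample data standing in for SymbolTable.file_metadata
file_metadata: Dict[str, FileMetadata] = {
    "a.py": {"mtime": 1.0, "size": 10, "sha256": "abc"}
}


def has_changed(path: str, current_hash: str, current_mtime: float) -> bool:
    meta: Optional[FileMetadata] = file_metadata.get(path)
    if meta is None:
        return True
    stored_hash = meta.get("sha256")
    if stored_hash:
        # stored_hash is str under the TypedDict, so this comparison is bool
        return current_hash != stored_hash
    stored_mtime = meta.get("mtime")
    return stored_mtime is None or current_mtime != stored_mtime


print(has_changed("a.py", "abc", 1.0))  # False: hash matches
print(has_changed("a.py", "xyz", 1.0))  # True: hash differs
print(has_changed("missing.py", "abc", 1.0))  # True: no metadata recorded
```

Since `total=False` keys may be absent, `.get()` still yields `Optional[...]`, which is why the sketch checks truthiness and `None` explicitly before comparing.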

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@refactron/analysis/symbol_table.py` around lines 65-67, define a concrete
TypedDict (e.g., FileMetadata with keys "mtime": float, "size": int, "sha256":
str) and replace the loose type Dict[str, Dict[str, Any]] on file_metadata with
Dict[str, FileMetadata]; update any imports (from typing import TypedDict) and
usages so comparisons that read file_metadata[...] .get("mtime"/"size"/"sha256")
yield the correct types (float/int/str) instead of Any, ensuring functions in
symbol_table.py that compare metadata values (references to the file_metadata
variable) now return proper bools for mypy.

@staticmethod
def _normalize_path(path: str) -> str:
"""Standardize path format for consistent keys/storage."""
return Path(path).resolve().as_posix()

def add_symbol(self, symbol: Symbol) -> None:
"""Add a symbol to the table."""
if symbol.file_path not in self.symbols:
self.symbols[symbol.file_path] = {}
path = self._normalize_path(symbol.file_path)
# Ensure the symbol itself stores the normalized path
symbol.file_path = path

if path not in self.symbols:
self.symbols[path] = {}

if symbol.scope not in self.symbols[symbol.file_path]:
self.symbols[symbol.file_path][symbol.scope] = {}
if symbol.scope not in self.symbols[path]:
self.symbols[path][symbol.scope] = {}

self.symbols[symbol.file_path][symbol.scope][symbol.name] = symbol
self.symbols[path][symbol.scope][symbol.name] = symbol

# Track global exports (top-level functions and classes)
if symbol.scope == "global" and symbol.type in (
SymbolType.CLASS,
SymbolType.FUNCTION,
SymbolType.VARIABLE,
):
# Key by module path + name? Or just name for now?
# Using simple name collision strategy for MVP
self.exports[symbol.name] = symbol

def remove_file(self, file_path: str) -> None:
"""Remove all symbols and metadata associated with a file."""
norm_path = self._normalize_path(file_path)

if norm_path in self.symbols:
del self.symbols[norm_path]

# Remove from exports
names_to_remove = [
name
for name, sym in self.exports.items()
if self._normalize_path(sym.file_path) == norm_path
]
for name in names_to_remove:
self.exports.pop(name, None)

if norm_path in self.file_metadata:
del self.file_metadata[norm_path]

def get_symbol(self, file_path: str, name: str, scope: str = "global") -> Optional[Symbol]:
"""Retrieve a symbol."""
return self.symbols.get(file_path, {}).get(scope, {}).get(name)
norm_path = self._normalize_path(file_path)
return self.symbols.get(norm_path, {}).get(scope, {}).get(name)

def resolve_reference(
self, name: str, current_file: str, current_scope: str
@@ -106,8 +136,7 @@ def resolve_reference(
if file_global:
return file_global

# 3. Cross-file exports (Naive implementation)
# TODO: Enhance this with proper import resolution
# 3. Cross-file exports
return self.exports.get(name)


@@ -120,42 +149,94 @@ def __init__(self, cache_dir: Optional[Path] = None):
self.inference_engine = InferenceEngine()

def build_for_project(self, project_root: Path) -> SymbolTable:
"""Scan project and build symbol table."""
"""Scan project and build symbol table incrementally."""
if self.cache_dir:
cached = self._load_cache()
if cached:
# TODO: Implement incremental update logic here
return cached
cached_table = self._load_cache()
if cached_table:
self.symbol_table = cached_table

python_files = list(project_root.rglob("*.py"))
current_file_paths = {fp.resolve().as_posix() for fp in python_files}

# 1. Remove deleted files
cached_files = list(self.symbol_table.file_metadata.keys())
for cached_path in cached_files:
if cached_path not in current_file_paths:
logger.debug(f"Removing deleted file from symbol table: {cached_path}")
self.symbol_table.remove_file(cached_path)

Comment on lines +161 to +167

⚠️ Potential issue | 🟠 Major

Deleted-file cleanup misses stale symbols when an old cache lacks file_metadata.

Line 151 derives cached_files only from self.symbol_table.file_metadata. If an older cache has symbols but no
metadata, deleted files are never purged and stale exports can persist.

Suggested fix
-        cached_files = list(self.symbol_table.file_metadata.keys())
+        cached_files = set(self.symbol_table.file_metadata.keys()) | set(
+            self.symbol_table.symbols.keys()
+        )
         for cached_path in cached_files:
             if cached_path not in current_file_paths:
                 logger.debug(f"Removing deleted file from symbol table: {cached_path}")
                 self.symbol_table.remove_file(cached_path)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@refactron/analysis/symbol_table.py` around lines 150-156, the deletion
cleanup currently uses only self.symbol_table.file_metadata to build
cached_files, so when an older cache has symbols but no file_metadata deleted
files aren't found; update the logic in the cleanup loop to compute cached_files
by first using self.symbol_table.file_metadata if present and non-empty,
otherwise fall back to deriving file paths from other symbol stores on the
symbol_table (e.g., keys from self.symbol_table.exported_symbols or any
self.symbol_table.files / self.symbol_table.symbols mapping), then proceed to
call self.symbol_table.remove_file(cached_path) for paths not in
current_file_paths; ensure you reference the existing symbols (exported_symbols,
remove_file, file_metadata) to locate the right structures.
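
A minimal standalone sketch of the suggested fallback, using plain dicts in place of the real `SymbolTable` (all names here are illustrative): deriving the cached-path set from both the metadata and the symbols mapping purges stale files even when an older cache carries no `file_metadata`.

```python
# Stand-ins for SymbolTable.symbols and SymbolTable.file_metadata.
# Older cache: symbols are present, but metadata was never recorded.
cached_symbols = {"old.py": {"global": {}}, "kept.py": {"global": {}}}
cached_metadata: dict = {}

# Files found on disk during the current scan
current_file_paths = {"kept.py"}

# Union of both key sets, so "old.py" is discovered despite missing metadata
cached_files = set(cached_metadata) | set(cached_symbols)
for path in cached_files:
    if path not in current_file_paths:
        # Mirrors SymbolTable.remove_file: drop symbols and metadata together
        cached_symbols.pop(path, None)
        cached_metadata.pop(path, None)

print(sorted(cached_symbols))  # ['kept.py']
```

With the original metadata-only iteration, `old.py` would never appear in `cached_files` and its symbols (and any exports pointing at it) would survive the cleanup.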

# 2. Analyze new or modified files
for file_path in python_files:
self._analyze_file(file_path)
abs_path = file_path.resolve()
path_str = abs_path.as_posix()
if self._has_file_changed(abs_path, path_str):
logger.debug(f"Analyzing changed file: {path_str}")
self.symbol_table.remove_file(path_str)
self._analyze_file(abs_path)
self._update_file_metadata(abs_path, path_str)

if self.cache_dir:
self._save_cache()

return self.symbol_table

def _analyze_file(self, file_path: Path) -> None:
"""Analyze a single file and populate symbols."""
def _has_file_changed(self, file_path: Path, file_path_str: str) -> bool:
"""Check if file has changed since last analysis."""
if file_path_str not in self.symbol_table.file_metadata:
return True

metadata = self.symbol_table.file_metadata[file_path_str]
try:
stat = file_path.stat()
if stat.st_size != metadata.get("size"):
return True

# Authoritative check: compare SHA-256 hashes
stored_hash = metadata.get("sha256")
if stored_hash:
current_hash = self._calculate_hash(file_path)
return current_hash != stored_hash

return stat.st_mtime != metadata.get("mtime")
except Exception:
return True

def _calculate_hash(self, file_path: Path) -> str:
"""Calculate SHA-256 hash of file content."""
try:
# We use astroid for better inference capabilities later
tree = self.inference_engine.parse_file(str(file_path))
return hashlib.sha256(file_path.read_bytes()).hexdigest()
except Exception:
return ""
Comment on lines +204 to +209

⚠️ Potential issue | 🟠 Major

Do not persist empty SHA-256 values; doing so weakens rollback-safe invalidation.

If _calculate_hash() fails, it returns "", and that value is stored in metadata. On later runs this disables the
hash path and falls back to mtime, which can miss changes after timestamp rollback.

Suggested fix
-def _calculate_hash(self, file_path: Path) -> str:
+def _calculate_hash(self, file_path: Path) -> Optional[str]:
@@
-        except Exception:
-            return ""
+        except OSError as e:
+            logger.warning(f"Failed to hash {file_path}: {e}")
+            return None
@@
-            self.symbol_table.file_metadata[file_path_str] = {
+            file_hash = self._calculate_hash(file_path)
+            if not file_hash:
+                self.symbol_table.file_metadata.pop(file_path_str, None)
+                return
+            self.symbol_table.file_metadata[file_path_str] = {
                 "mtime": stat.st_mtime,
                 "size": stat.st_size,
-                "sha256": self._calculate_hash(file_path),
+                "sha256": file_hash,
             }

Also applies to: 206-210

🧰 Tools
🪛 Ruff (0.15.9)

[warning] 199-199: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@refactron/analysis/symbol_table.py` around lines 195-200, the
_calculate_hash function currently returns an empty string on exception which
gets persisted and weakens rollback-safe invalidation; change its signature to
return Optional[str] (or None on failure) instead of "", log or capture the
underlying exception, and ensure callers (the code that stores metadata) treat
None as "no hash available" and skip persisting any hash value; apply the same
change to the other similar try/except block referenced (lines 206-210) so no
empty string hash is ever written to metadata.
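
A hedged sketch of the Optional-return approach described above (self-contained, with illustrative names; not the actual refactron code): hashing failures surface as `None`, and the caller skips persisting metadata rather than storing `""`.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, Optional


def calculate_hash(file_path: Path) -> Optional[str]:
    """Return the SHA-256 hex digest, or None when the file is unreadable."""
    try:
        return hashlib.sha256(file_path.read_bytes()).hexdigest()
    except OSError:
        return None  # signal "no hash available" instead of persisting ""


metadata: Dict[str, Dict[str, str]] = {}

# Unreadable path: nothing is persisted, so later runs re-analyze the file
# instead of silently falling back to an mtime-only comparison.
missing = Path("no_such_file.py")
h = calculate_hash(missing)
if h is not None:
    metadata[str(missing)] = {"sha256": h}

# Readable file: a real digest is stored.
with tempfile.NamedTemporaryFile(suffix=".py", delete=False) as f:
    f.write(b"x = 1\n")
real = Path(f.name)
h = calculate_hash(real)
if h is not None:
    metadata[str(real)] = {"sha256": h}

print(str(missing) in metadata)  # False
print(len(metadata[str(real)]["sha256"]))  # 64 hex characters
```

Catching `OSError` rather than a blind `Exception` also addresses the Ruff BLE001 warning flagged above.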


# Walk the tree
self._visit_node(tree, str(file_path), "global")
def _update_file_metadata(self, file_path: Path, path_str: str) -> None:
"""Update file metadata in symbol table."""
try:
stat = file_path.stat()
self.symbol_table.file_metadata[path_str] = {
"mtime": stat.st_mtime,
"size": stat.st_size,
"sha256": self._calculate_hash(file_path),
}
except Exception as e:
logger.warning(f"Failed to update metadata for {path_str}: {e}")

def _analyze_file(self, file_path: Path) -> None:
"""Analyze a single file and populate symbols."""
path_str = file_path.resolve().as_posix()
try:
tree = self.inference_engine.parse_file(path_str)
self._visit_node(tree, path_str, "global")
except Exception as e:
logger.warning(f"Failed to build symbol table for {file_path}: {e}")
logger.warning(f"Failed to build symbol table for {path_str}: {e}")

def _visit_node(self, node: Any, file_path: str, scope: str) -> None:
"""Recursive node visitor."""
import astroid.nodes as nodes

new_scope = scope

if isinstance(node, (nodes.ClassDef, nodes.FunctionDef)):
# Register the definition itself in the CURRENT scope
# Recognize both FunctionDef and AsyncFunctionDef
if isinstance(node, (nodes.ClassDef, nodes.FunctionDef, nodes.AsyncFunctionDef)):
symbol_type = (
SymbolType.CLASS if isinstance(node, nodes.ClassDef) else SymbolType.FUNCTION
)
@@ -192,9 +273,8 @@ def _visit_node(self, node: Any, file_path: str, scope: str) -> None:
self.symbol_table.add_symbol(symbol)

# Recurse children
if hasattr(node, "get_children"):
for child in node.get_children():
self._visit_node(child, file_path, new_scope)
for child in node.get_children():
self._visit_node(child, file_path, new_scope)

def _save_cache(self) -> None:
"""Save symbol table to cache."""
@@ -214,6 +294,7 @@ def _save_cache(self) -> None:
for f, scopes in self.symbol_table.symbols.items()
},
"exports": {n: sym.to_dict() for n, sym in self.symbol_table.exports.items()},
"file_metadata": self.symbol_table.file_metadata,
}

with open(cache_file, "w") as f:
@@ -238,15 +319,27 @@ def _load_cache(self) -> Optional[SymbolTable]:

# Reconstruct symbols
for f_path, scopes in data.get("symbols", {}).items():
table.symbols[f_path] = {}
# Normalize path on load just in case
norm_f_path = SymbolTable._normalize_path(f_path)
table.symbols[norm_f_path] = {}
for scope_name, names in scopes.items():
table.symbols[f_path][scope_name] = {}
table.symbols[norm_f_path][scope_name] = {}
for name, sym_data in names.items():
table.symbols[f_path][scope_name][name] = Symbol.from_dict(sym_data)
sym = Symbol.from_dict(sym_data)
sym.file_path = norm_f_path
table.symbols[norm_f_path][scope_name][name] = sym

# Reconstruct exports
for name, sym_data in data.get("exports", {}).items():
table.exports[name] = Symbol.from_dict(sym_data)
sym = Symbol.from_dict(sym_data)
sym.file_path = SymbolTable._normalize_path(sym.file_path)
table.exports[name] = sym

# Reconstruct metadata
file_metadata = data.get("file_metadata", {})
table.file_metadata = {
SymbolTable._normalize_path(k): v for k, v in file_metadata.items()
}

return table

67 changes: 63 additions & 4 deletions refactron/core/inference.py
@@ -3,6 +3,8 @@
Provides capabilities to infer types, values, and resolve symbols.
"""

import os
from pathlib import Path
from typing import Any, List, Optional

import astroid
@@ -28,10 +30,67 @@ def parse_string(code: str, module_name: str = "") -> nodes.Module:
@staticmethod
def parse_file(file_path: str) -> nodes.Module:
"""Parse a file into an astroid node tree."""
builder = astroid.builder.AstroidBuilder(astroid.MANAGER)
with open(file_path, "r", encoding="utf-8") as f:
code = f.read()
return builder.string_build(code, modname=file_path)
# Use canonical path (resolved and posix-style for consistency)
abs_path = Path(file_path).resolve().as_posix()
manager = astroid.MANAGER

# Aggressively clear cache for this file to ensure fresh AST
# Try both resolved and absolute paths to handle symlinks and normalization differences
raw_abs = os.path.abspath(file_path)
manager.astroid_cache.pop(abs_path, None)
manager.astroid_cache.pop(raw_abs, None)
manager.astroid_cache.pop(file_path, None)

# 2. Find and clear by module name if it exists in caches
file_to_mod = getattr(manager, "file_to_module_cache", {})
# Some versions use _mod_file_cache
if not file_to_mod:
file_to_mod = getattr(manager, "_mod_file_cache", {})

modname = (
file_to_mod.get(abs_path) or file_to_mod.get(raw_abs) or file_to_mod.get(file_path)
)
if modname:
manager.astroid_cache.pop(modname, None)

# 3. Exhaustive search in astroid_cache for any module pointing to this file
for key, val in list(manager.astroid_cache.items()):
if hasattr(val, "file") and val.file:
val_path = Path(val.file).resolve().as_posix()
if val_path == abs_path or val_path == raw_abs.replace("\\", "/"):
manager.astroid_cache.pop(key, None)

# 4. Clear the mappings themselves
for attr in ("file_to_module_cache", "_mod_file_cache"):
cache = getattr(manager, attr, None)
if isinstance(cache, dict):
cache.pop(abs_path, None)
cache.pop(raw_abs, None)
cache.pop(file_path, None)

# 5. Read file and parse directly to bypass astroid's file cache
try:
with open(abs_path, "r", encoding="utf-8") as f:
code = f.read()

# Resolve module name to keep astroid's state consistent
modname = ""
try:
from astroid import modutils

modname = modutils.modname_from_path(abs_path)
except Exception:
pass

# Use string_build via parse to avoid manager.ast_from_file's internal caching
return astroid.parse(code, module_name=modname, path=abs_path)
Comment on lines +77 to +86

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

find . -type f -name "inference.py" | head -20



🏁 Script executed:

# Look for refactron directory structure
fd -type f "inference\.py" | grep refactron



🏁 Script executed:

# Get the file location and content
ls -la refactron/core/inference.py 2>/dev/null && wc -l refactron/core/inference.py



🏁 Script executed:

cat -n refactron/core/inference.py | sed -n '65,100p'



🏁 Script executed:

# Also check the full function signature and any related context
cat -n refactron/core/inference.py | sed -n '1,90p'



🏁 Script executed:

# Check if BackupManager is imported or used in this file
grep -n "BackupManager\|validate_backup_integrity" refactron/core/inference.py



🏁 Script executed:

# Check the broader context - what calls parse_file and expects?
rg -n "parse_file" --type py



🏁 Script executed:

# Look at modutils to understand what modname_from_path returns
rg -A 5 "modutils.modname_from_path" --type py



🏁 Script executed:

# Check how symbol_table.py uses the result from parse_file()
cat -n refactron/analysis/symbol_table.py | sed -n '240,270p'



🏁 Script executed:

# Look at the test cases to understand expected behavior
cat -n tests/test_rag_parser.py | sed -n '84,110p'



🏁 Script executed:

# Check if there are any files with relative imports that would be problematic
rg "from \.\." --type py | head -20



🏁 Script executed:

# Search for any relative import issues, bugs, or TODOs related to module_name
rg "relative import|package context" --type py -i



🏁 Script executed:

# Check for any test failures or issues related to module_name or modutils
rg "modname|module_name" refactron/core/inference.py -B 2 -A 2



🏁 Script executed:

# Look for git history or comments about this change
git log --oneline -20 -- refactron/core/inference.py 2>/dev/null || echo "Git history unavailable"



🏁 Script executed:

# Check if there are any conftest or fixture files that test relative imports
rg "from \.\." tests/ --type py | head -5



🏁 Script executed:

# Look at the test for relative import detection
cat -n tests/test_analyzer_coverage_supplement.py | grep -A 10 "Test relative import"



🏁 Script executed:

# Check what dependency_analyzer does with relative imports
cat -n refactron/analyzers/dependency_analyzer.py | sed -n '1,50p'



🏁 Script executed:

# Search for any error logs or issues with module inference
rg "Failed to.*infer|infer.*failed" --type py -i



🏁 Script executed:

# Check if there's documentation about astroid's parse function behavior
rg "astroid.parse" refactron/ --type py -B 2 -A 2



🏁 Script executed:

# Check if there are actual test cases that would fail with empty module_name
rg "from \.\." tests/ --type py -B 2 -A 2



🏁 Script executed:

# Look at the actual test for relative import detection more closely
cat -n tests/test_analyzer_coverage_supplement.py | sed -n '195,220p'



🏁 Script executed:

# Check if symbol_table.py actually needs to resolve relative imports
rg "infer\|relative import\|package" refactron/analysis/symbol_table.py --type py | head -20



🏁 Script executed:

# Look for any actual issues in tests related to module resolution
rg "modname|module_name" tests/ --type py



🏁 Script executed:

# Check if modutils.modname_from_path can actually fail, and in what scenarios
python3 << 'EOF'
try:
    from astroid import modutils
    # Test with a random path
    result = modutils.modname_from_path("/some/random/path/not/in/sys/path.py")
    print(f"Result for path not in sys.path: {result}")
except Exception as e:
    print(f"Exception type: {type(e).__name__}, message: {e}")
EOF



🏁 Script executed:

# Check the actual astroid documentation or source comments about module_name and path
python3 << 'EOF'
import inspect
import astroid
# Get the signature and docstring of astroid.parse
sig = inspect.signature(astroid.parse)
print("astroid.parse signature:")
print(sig)
print("\nDocstring:")
print(astroid.parse.__doc__)
EOF



🏁 Script executed:

# Look for any actual reported issues or TODOs in the code about this
rg "TODO|FIXME|BUG|XXX" refactron/core/inference.py --type py



🏁 Script executed:

# Check if there are any test failures or skip markers related to this
rg "@pytest.mark.skip|@pytest.mark.xfail|@unittest.skip" tests/ --type py | grep -i "import\|module\|inference"



🏁 Script executed:

# Check what version of astroid is being used and what functions are available
python3 << 'EOF'
import astroid
print(f"Astroid version: {astroid.__version__}")
print(f"modutils attributes:")
from astroid import modutils
print([attr for attr in dir(modutils) if 'modname' in attr.lower() or 'path' in attr.lower()])
EOF



🏁 Script executed:

# Check if there's a requirements file or setup.py that specifies astroid version
cat setup.py pyproject.toml requirements.txt 2>/dev/null | grep -A 2 -B 2 "astroid" | head -20



🏁 Script executed:

# Look at what the code is actually importing from modutils
rg "from astroid import modutils|from astroid.modutils import" --type py



🏁 Script executed:

# Check if there's any version compatibility handling in the codebase
rg "astroid.__version__|astroid.version" --type py



🏁 Script executed:

# Verify the actual modpath_from_file function and how it works
python3 << 'EOF'
from astroid import modutils
import inspect
print("modpath_from_file signature:")
print(inspect.signature(modutils.modpath_from_file))
print("\nDocstring:")
print(modutils.modpath_from_file.__doc__)
EOF



🏁 Script executed:

# Test modpath_from_file to see what it returns
python3 << 'EOF'
from astroid import modutils
try:
    result = modutils.modpath_from_file("/some/random/path.py")
    print(f"modpath_from_file result: {result}")
except Exception as e:
    print(f"Exception: {type(e).__name__}: {e}")

# Try with an actual file
import tempfile
import os
with tempfile.NamedTemporaryFile(suffix=".py", delete=False) as f:
    temp_path = f.name
try:
    result = modutils.modpath_from_file(temp_path)
    print(f"modpath_from_file for temp file: {result}")
except Exception as e:
    print(f"Exception for temp file: {type(e).__name__}: {e}")
finally:
    os.unlink(temp_path)
EOF



🏁 Script executed:

# Check if modname_from_path was available in older astroid versions
git log -p --all -S "modname_from_path" -- "*.py" | head -50 2>/dev/null || echo "Not found in git history"



🏁 Script executed:

# Check the actual error handling in parse_file - what exceptions are expected?
cat -n refactron/core/inference.py | sed -n '71,95p'



🏁 Script executed:

# Test if astroid can resolve relative imports with empty module_name but valid path
python3 << 'EOF'
import astroid
import tempfile
import os
from pathlib import Path

# Create a test package structure
temp_dir = tempfile.mkdtemp()
pkg_dir = os.path.join(temp_dir, "mypackage")
os.makedirs(pkg_dir)

# Create __init__.py
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("# init\n")

# Create utils.py
with open(os.path.join(pkg_dir, "utils.py"), "w") as f:
    f.write("def helper(): return 42\n")

# Create test.py with relative import
test_file = os.path.join(pkg_dir, "test.py")
with open(test_file, "w") as f:
    f.write("""from . import utils
def main():
    return utils.helper()
""")

# Test 1: Parse with empty module_name
print("Test 1: Parse with empty module_name and path")
with open(test_file) as f:
    code = f.read()
try:
    module1 = astroid.parse(code, module_name="", path=test_file)
    print(f"  Success. Module name: '{module1.name}'")
    # Try to infer the import
    for node in module1.nodes_of_class(astroid.ImportFrom):
        print(f"  Found ImportFrom: {node.as_string()}")
except Exception as e:
    print(f"  Error: {e}")

# Test 2: Parse with correct module_name
print("\nTest 2: Parse with correct module_name='mypackage.test' and path")
try:
    module2 = astroid.parse(code, module_name="mypackage.test", path=test_file)
    print(f"  Success. Module name: '{module2.name}'")
    for node in module2.nodes_of_class(astroid.ImportFrom):
        print(f"  Found ImportFrom: {node.as_string()}")
except Exception as e:
    print(f"  Error: {e}")

# Cleanup
import shutil
shutil.rmtree(temp_dir)
EOF



🏁 Script executed:

# Check what the correct way to get module name from file path should be
python3 << 'EOF'
from astroid import modutils
import os

# Show the correct usage of modpath_from_file
print("Correct function: modpath_from_file()")
print("It returns a list of module path components")
print("\nExample of how to convert modpath to modname:")
modpath = ["mypackage", "test"]
modname = ".".join(modpath)
print(f"  modpath: {modpath}")
print(f"  modname: {modname}")
EOF



🏁 Script executed:

# Test if inference actually works with empty module_name for relative imports
python3 << 'EOF'
import astroid
import tempfile
import os
from pathlib import Path

# Create a test package structure
temp_dir = tempfile.mkdtemp()
pkg_dir = os.path.join(temp_dir, "mypackage")
os.makedirs(pkg_dir)

# Create __init__.py
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("# init\n")

# Create utils.py
with open(os.path.join(pkg_dir, "utils.py"), "w") as f:
    f.write("def helper(): return 42\n")

# Create test.py with relative import
test_file = os.path.join(pkg_dir, "test.py")
with open(test_file) as f:
    code = f.read()

# Parse with empty module_name (current behavior)
print("Test: Inference with empty module_name")
module1 = astroid.parse(code, module_name="", path=test_file)
print(f"Module name: '{module1.name}'")

# Try to infer the imported symbol
for node in module1.nodes_of_class(astroid.nodes.ImportFrom):
    print(f"ImportFrom node: level={node.level}, names={node.names}")
    try:
        # Try to infer what 'utils' refers to
        inferred = list(node.infer_name("utils", None))
        print(f"  Inferred 'utils': {inferred}")
    except Exception as e:
        print(f"  Inference failed: {type(e).__name__}: {e}")

print("\n---\n")

# Parse with correct module_name (desired behavior)
print("Test: Inference with correct module_name='mypackage.test'")
module2 = astroid.parse(code, module_name="mypackage.test", path=test_file)
print(f"Module name: '{module2.name}'")

for node in module2.nodes_of_class(astroid.nodes.ImportFrom):
    print(f"ImportFrom node: level={node.level}, names={node.names}")
    try:
        inferred = list(node.infer_name("utils", None))
        print(f"  Inferred 'utils': {inferred}")
    except Exception as e:
        print(f"  Inference failed: {type(e).__name__}: {e}")

# Cleanup
import shutil
shutil.rmtree(temp_dir)
EOF


Length of output: 236


🏁 Script executed:

# Check if symbol_table.py actually needs to infer relative imports
cat -n refactron/analysis/symbol_table.py | sed -n '1,50p'


Length of output: 1841


🏁 Script executed:

# Check if there are any actual test failures when running tests
rg "test.*parse_file|test.*inference" tests/ --type py -l


Length of output: 94


🏁 Script executed:

# Fix the test - create the file first, then read it
python3 << 'EOF'
import astroid
import tempfile
import os
from pathlib import Path

# Create a test package structure
temp_dir = tempfile.mkdtemp()
pkg_dir = os.path.join(temp_dir, "mypackage")
os.makedirs(pkg_dir)

# Create __init__.py
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("# init\n")

# Create utils.py
with open(os.path.join(pkg_dir, "utils.py"), "w") as f:
    f.write("def helper(): return 42\n")

# Create test.py with relative import - BEFORE reading it
test_file = os.path.join(pkg_dir, "test.py")
with open(test_file, "w") as f:
    f.write("""from . import utils
def main():
    return utils.helper()
""")

# NOW read it
with open(test_file) as f:
    code = f.read()

# Parse with empty module_name (current behavior)
print("Test: Inference with empty module_name")
module1 = astroid.parse(code, module_name="", path=test_file)
print(f"Module name: '{module1.name}'")

# Try to infer the imported symbol
for node in module1.nodes_of_class(astroid.nodes.ImportFrom):
    print(f"ImportFrom node: level={node.level}, names={node.names}")
    try:
        inferred = list(node.infer_name("utils", None))
        print(f"  Inferred 'utils': {inferred}")
    except Exception as e:
        print(f"  Inference failed: {type(e).__name__}: {e}")

print("\n---\n")

# Parse with correct module_name (desired behavior)
print("Test: Inference with correct module_name='mypackage.test'")
module2 = astroid.parse(code, module_name="mypackage.test", path=test_file)
print(f"Module name: '{module2.name}'")

for node in module2.nodes_of_class(astroid.nodes.ImportFrom):
    print(f"ImportFrom node: level={node.level}, names={node.names}")
    try:
        inferred = list(node.infer_name("utils", None))
        print(f"  Inferred 'utils': {inferred}")
    except Exception as e:
        print(f"  Inference failed: {type(e).__name__}: {e}")

# Cleanup
import shutil
shutil.rmtree(temp_dir)
EOF


Length of output: 490


🏁 Script executed:

# Check what the test_rag_parser tests actually do
cat -n tests/test_rag_parser.py | sed -n '84,120p'


Length of output: 1826


🏁 Script executed:

# Look for any test that calls InferenceEngine.parse_file directly
rg "InferenceEngine\|parse_file" tests/ --type py -B 2 -A 5 | head -50


Length of output: 52


🏁 Script executed:

# Check if there are any actual test failures by examining test output or fixtures
cat -n tests/test_rag_parser.py | sed -n '1,50p'


Length of output: 1516


🏁 Script executed:

# Look at what parse_file returns and how it's used in RAG parser
cat -n refactron/rag/parser.py | sed -n '185,210p'


Length of output: 1234


🏁 Script executed:

# The key question: does symbol_table need correct module_name for its use case?
cat -n refactron/analysis/symbol_table.py | sed -n '240,280p'


Length of output: 2045


🏁 Script executed:

# Verify if modname_from_path is truly non-existent or if it's from a different import
rg "modname_from_path" . --type py


Length of output: 162


🏁 Script executed:

# Check if there's maybe a compatibility layer or if this is just dead code
rg "def modname_from_path" . --type py


Length of output: 52


🏁 Script executed:

# Check the git commit history for this specific function
git log --oneline -p refactron/core/inference.py | grep -A 10 -B 5 "modname_from_path" | head -40


Length of output: 731


🏁 Script executed:

# Check if maybe there was an older version of astroid that had this function
python3 << 'EOF'
# See if there's any way to resolve module name correctly from a path
from astroid import modutils
import os
import sys

test_path = "/some/nonexistent/path/mymodule.py"
print("Available functions in modutils related to module/path:")
for attr in sorted(dir(modutils)):
    if not attr.startswith('_'):
        print(f"  - {attr}")
EOF


Length of output: 1044


🏁 Script executed:

# Check if the code even runs - try to import and use it
python3 << 'EOF'
import sys
sys.path.insert(0, '.')
try:
    from refactron.core.inference import InferenceEngine
    print("Import successful")
    # Try to use parse_file
    import tempfile
    with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
        f.write("x = 1\n")
        f.flush()
        try:
            result = InferenceEngine.parse_file(f.name)
            print(f"parse_file succeeded: {result}")
        except Exception as e:
            print(f"parse_file failed: {type(e).__name__}: {e}")
        finally:
            import os
            os.unlink(f.name)
except Exception as e:
    print(f"Import failed: {e}")
EOF


Length of output: 341


🏁 Script executed:

# Final verification: Check if symbol_table actually needs correct module_name
# by checking what operations it does on the parsed tree
cat -n refactron/analysis/symbol_table.py | sed -n '264,320p'


Length of output: 2504


🏁 Script executed:

# Verify the actual impact: does empty module_name cause problems for inference in practice?
python3 << 'EOF'
import astroid
import tempfile
import os

# Create a simple test
temp_dir = tempfile.mkdtemp()
test_file = os.path.join(temp_dir, "test.py")
with open(test_file, "w") as f:
    f.write("""
class MyClass:
    def method(self):
        return 42

def my_function():
    pass
""")

# Parse with empty module_name (current broken behavior)
with open(test_file) as f:
    code = f.read()
    
module = astroid.parse(code, module_name="", path=test_file)
print(f"Module name: '{module.name}'")
print(f"Module file: {module.file}")
print(f"Classes: {[c.name for c in module.nodes_of_class(astroid.nodes.ClassDef)]}")
print(f"Functions: {[f.name for f in module.nodes_of_class(astroid.nodes.FunctionDef)]}")

# Can we visit the nodes like symbol_table does?
for node in module.body:
    if isinstance(node, (astroid.nodes.ClassDef, astroid.nodes.FunctionDef)):
        print(f"Node: {node.name}, line: {node.lineno}")

import shutil
shutil.rmtree(temp_dir)
EOF


Length of output: 231


Fix the non-existent function call and properly resolve module names.

The code calls modutils.modname_from_path(), which does not exist in astroid 4.0.4 (or any other version). The resulting AttributeError is silently caught, leaving modname as an empty string. The correct approach is to use modpath_from_file() to derive the module name from the file path.

With an empty module_name, astroid still parses the code but loses package context. This breaks relative import resolution and package-aware inference, especially for files outside sys.path, and undermines the cache-correctness goal.

Instead of silently falling back to an empty module name, do one of the following:

  1. Use modutils.modpath_from_file() to properly resolve the module path and convert it to a module name (handling ImportError for files outside sys.path), or
  2. Pass a repository-derived module name into parse_file, or
  3. Fall back to manager.ast_from_file() only when module resolution fails.
🧰 Tools
🪛 GitHub Actions: CI

[error] Command failed with exit code 1: flake8 refactron --count --select=E9,F63,F7,F82 --show-source --statistics

🪛 Ruff (0.15.9)

[error] 82-83: try-except-pass detected, consider logging the exception

(S110)


[warning] 82-82: Do not catch blind exception: Exception

(BLE001)
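Both Ruff findings point the same way: catch narrow exception types and log the failure instead of passing silently. A minimal illustrative pattern (the function and names here are placeholders, not the library's code):

```python
import logging

logger = logging.getLogger(__name__)


def resolve_or_default(resolver, default: str = "") -> str:
    """Run a resolver callable, logging narrow failures instead of swallowing them."""
    try:
        return resolver()
    except (ImportError, ValueError) as exc:  # specific types, not a blind Exception
        logger.debug("module name resolution failed: %s", exc)
        return default
```

Applied to the flagged block, the bare try-except-pass around the module-name lookup would become a logged, narrowly typed fallback to the empty string.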

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@refactron/core/inference.py` around lines 77-86: the code calls the non-existent modutils.modname_from_path and silently falls back to an empty module_name, which loses package context. Replace that call with modutils.modpath_from_file(abs_path) and join the returned path components into a dotted module name, assigning it to modname, while handling ImportError/ValueError for files outside sys.path. If resolution still fails, either accept a repository-derived module name passed into parse_file or call manager.ast_from_file(abs_path) as a fallback instead of parsing with an empty module_name. Update both the block around the modutils usage and the return statement that calls astroid.parse(code, module_name=modname, path=abs_path).

        except (OSError, UnicodeDecodeError):
            # Fallback to manager if manual read fails
            try:
                return manager.ast_from_file(abs_path)
            except Exception as e:
                # Fallback for virtual/non-existent files if needed
                raise ValueError(f"Failed to parse {abs_path}: {e}")

    @staticmethod
    def infer_node(node: nodes.NodeNG, context: Optional[InferenceContext] = None) -> List[Any]: