Skip to content

perf: share parsed AST between verification checks#144

Open
shrutu0929 wants to merge 2 commits intoRefactron-ai:mainfrom
shrutu0929:perf/share-parsed-ast
Open

perf: share parsed AST between verification checks#144
shrutu0929 wants to merge 2 commits intoRefactron-ai:mainfrom
shrutu0929:perf/share-parsed-ast

Conversation

@shrutu0929
Copy link
Copy Markdown

@shrutu0929 shrutu0929 commented Apr 3, 2026

solve #140
The recent commit optimizes performance by eliminating redundant text-to-AST parsing operations across multiple independent verification modules. We achieved this by introducing a new, immutable VerificationContext data structure combined with an updated VerificationEngine that parses the original and transformed source code trees exactly once per execution cycle. Instead of relying on individual checks to independently parse raw string inputs, the engine now passes this pre-parsed VerificationContext downstream to all active verifiers, such as the SyntaxVerifier and ImportIntegrityVerifier. These individual modules have been refactored to lazily extract the pre-computed syntax trees from the context directly, while still gracefully maintaining 100% backward compatibility via local fallback parsing should the context be omitted by customized extensions.

Summary by CodeRabbit

  • New Features
    • Introduced a verification framework with multiple validation checks for code transformations. It validates syntax correctness of transformed code and verifies import integrity to ensure module dependencies remain intact. The framework runs sequential checks and ensures all validations pass before confirming transformation success.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 3, 2026

Warning

Rate limit exceeded

@shrutu0929 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 12 minutes and 44 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 12 minutes and 44 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5ee7a885-d520-40cc-b3ca-19c08b60cd75

📥 Commits

Reviewing files that changed from the base of the PR and between fb0d008 and d996573.

📒 Files selected for processing (3)
  • refactron/verification/checks/imports.py
  • refactron/verification/checks/syntax.py
  • refactron/verification/engine.py
📝 Walkthrough

Walkthrough

A verification framework is introduced with a VerificationEngine that orchestrates checks across original and transformed code, utilizing shared AST parsing through a VerificationContext to avoid redundant parsing operations.

Changes

Cohort / File(s) Summary
Engine Framework
refactron/verification/engine.py
New module defining core verification infrastructure: VerificationContext dataclass for sharing parsed ASTs, BaseCheck abstract base class with verify() interface, and VerificationEngine orchestrator that parses inputs, constructs context, and sequentially runs checks with fallback parameter handling.
Check Implementations
refactron/verification/checks/syntax.py, refactron/verification/checks/imports.py
Two new check classes: SyntaxVerifier validates transformed code syntax using shared AST context; ImportIntegrityVerifier extracts and compares imports from both ASTs (comparison logic currently unimplemented).

Sequence Diagram

sequenceDiagram
    participant Client
    participant Engine as VerificationEngine
    participant Context as VerificationContext
    participant Check1 as SyntaxVerifier
    participant Check2 as ImportIntegrityVerifier

    Client->>Engine: verify(original, transformed)
    Engine->>Engine: Parse original with ast.parse()
    Engine->>Engine: Parse transformed with ast.parse()
    Engine->>Context: Create with code + ASTs
    
    Engine->>Check1: verify(original, transformed, context)
    Check1->>Check1: Check transformed AST validity
    Check1-->>Engine: Return validation result
    
    Engine->>Check2: verify(original, transformed, context)
    Check2->>Check2: Extract imports from both ASTs
    Check2->>Check2: Compare import integrity
    Check2-->>Engine: Return validation result
    
    Engine-->>Client: Return True (all passed) or False
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Poem

🐰 Through the verification warren they leap,
Where syntax checks and imports creep,
A shared context guards the way,
One parse per run, hooray, hooray!
The Engine guides them, check by check,
No duplicates—a perfect deck! 🎉

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'perf: share parsed AST between verification checks' directly and concisely captures the main change: a performance optimization that reuses parsed AST across multiple verification checks by using a shared context.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (2)
refactron/verification/engine.py (2)

55-56: Consider early exit for fail-fast behavior.

The loop continues checking all verifiers even after one fails. If verification failures are expected to be rare and checks are expensive, adding break after setting all_passed = False would improve performance. However, if you want to collect all failures for reporting, the current approach is correct.

♻️ Optional: Add early exit
             if not passed:
                 all_passed = False
+                break  # Fail fast - no need to run remaining checks
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@refactron/verification/engine.py` around lines 55 - 56, The loop that
iterates over verifiers (the code using the variables passed and all_passed)
continues checking after a failure; to implement fail-fast behavior, add a break
immediately after setting all_passed = False inside that loop so execution exits
on first failed verifier; optionally, make this conditional by introducing a
fail_fast flag/parameter to the surrounding function or method (the one that
runs the verifier loop) and only break when fail_fast is true.

47-56: TypeError fallback may mask unrelated errors.

The try-except TypeError block on lines 48-53 catches any TypeError, not just those from signature mismatches. If a check's verify() implementation has an internal TypeError bug, it will be silently retried without context, potentially producing incorrect results or masking the real error.

Consider using inspect.signature to check parameter support upfront, or catch more specifically:

♻️ Proposed fix using signature inspection
+import inspect
+
 class VerificationEngine:
     def __init__(self, checks: List[BaseCheck]):
         self.checks = checks
+        # Pre-compute which checks support context
+        self._check_supports_context = {
+            id(check): 'context' in inspect.signature(check.verify).parameters
+            for check in checks
+        }
         
     def verify(self, original: str, transformed: str) -> bool:
         # ... AST parsing ...
         
         all_passed = True
         for check in self.checks:
-            try:
-                # Attempt to pass context for optimized routines
+            if self._check_supports_context.get(id(check), False):
                 passed = check.verify(original, transformed, context=context)
-            except TypeError:
-                # Fallback for older checks that don't accept context
+            else:
                 passed = check.verify(original, transformed)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@refactron/verification/engine.py` around lines 47 - 56, The current
try/except around calling check.verify(original, transformed, context=context)
can hide unrelated TypeError; instead inspect the verify() signature before
invoking: use inspect.signature(check.verify) (or getattr to the underlying
function for bound methods) to detect whether a "context" parameter is accepted
and call check.verify with context only when present, otherwise call without it;
update the loop that iterates self.checks (the code invoking check.verify) to
perform this signature check per-check so no broad TypeError is caught and
internal verify errors surface normally.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@refactron/verification/checks/imports.py`:
- Around line 43-51: The import-comparison block currently does nothing;
implement logic in the ImportIntegrityVerifier method that contains orig_imports
and trans_imports (using the already extracted lists from _extract_imports) to
compare them and fail when imports present in orig_imports are missing from
trans_imports. Specifically, compute the set difference between orig_imports and
trans_imports (or otherwise normalize/import-name forms) and if any originals
are missing, return False (or raise the verifier's expected VerificationError)
with a clear message listing the missing imports; otherwise return True. Ensure
you reference and use the existing variables orig_imports and trans_imports and
keep behavior compatible with the surrounding VerificationEngine expectations.
- Around line 31-41: The try-except-pass around the fallback ast.parse calls for
orig_ast and trans_ast in the imports check silently swallows errors; update the
logic in the function containing orig_ast/trans_ast so that if
ast.parse(original) and ast.parse(transformed) both raise (i.e., both orig_ast
and trans_ast remain None) you return False (or at minimum log both exceptions
via the module logger) instead of continuing and returning True; capture the
exceptions from ast.parse and attach them to the log message (or store them
locally) so you can include them in the error log when deciding to return False.

In `@refactron/verification/engine.py`:
- Around line 12-18: Move the three verifier implementations SyntaxVerifier,
ImportIntegrityVerifier, and TestSuiteGate into verification/engine.py and
remove their originals from checks/ so all BaseCheck implementations live
alongside BaseCheck; update TestSuiteGate.verify to match BaseCheck.verify by
accepting (original: str, transformed: str, context:
Optional[VerificationContext] = None) and returning bool (convert any existing
CheckResult logic into a boolean success/failure), use the supplied
VerificationContext to obtain file_path or other metadata instead of a direct
Path parameter, and adjust any imports/usages of CheckResult to expect a bool
(or translate where they call TestSuiteGate.verify) so signatures are consistent
with BaseCheck.verify.
- Around line 20-24: Remove the duplicate, incomplete definitions of BaseCheck
and VerificationEngine and the stray module docstring so only the correct
implementations remain: delete the old BaseCheck class and the earlier
VerificationEngine that defines __init__(self, checks) and verify(self,
original, transformed) and remove any orphaned docstring and redundant imports
they introduced; ensure the remaining VerificationEngine uses the intended
constructor signature (__init__(project_root, checks)) and verify(original,
transformed, file_path) behavior and still instantiates TestSuiteGate,
SyntaxVerifier, and ImportIntegrityVerifier as in the correct implementation.

---

Nitpick comments:
In `@refactron/verification/engine.py`:
- Around line 55-56: The loop that iterates over verifiers (the code using the
variables passed and all_passed) continues checking after a failure; to
implement fail-fast behavior, add a break immediately after setting all_passed =
False inside that loop so execution exits on first failed verifier; optionally,
make this conditional by introducing a fail_fast flag/parameter to the
surrounding function or method (the one that runs the verifier loop) and only
break when fail_fast is true.
- Around line 47-56: The current try/except around calling
check.verify(original, transformed, context=context) can hide unrelated
TypeError; instead inspect the verify() signature before invoking: use
inspect.signature(check.verify) (or getattr to the underlying function for bound
methods) to detect whether a "context" parameter is accepted and call
check.verify with context only when present, otherwise call without it; update
the loop that iterates self.checks (the code invoking check.verify) to perform
this signature check per-check so no broad TypeError is caught and internal
verify errors surface normally.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 879e8231-cb7b-4b95-8d09-3e1249a5f0f7

📥 Commits

Reviewing files that changed from the base of the PR and between c89cfdb and fb0d008.

📒 Files selected for processing (3)
  • refactron/verification/checks/imports.py
  • refactron/verification/checks/syntax.py
  • refactron/verification/engine.py

Comment on lines +31 to +41
if orig_ast is None:
try:
orig_ast = ast.parse(original)
except Exception:
pass
if trans_ast is None:
try:
trans_ast = ast.parse(transformed)
except Exception:
pass
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Silent exception swallowing hides parsing failures.

The try-except-pass blocks silently discard parsing errors. If both ASTs fail to parse (and no context was provided), the method skips to line 51 and returns True, incorrectly indicating valid imports for unparseable code.

Consider returning False when both fallback parses fail, or at least logging the exception for debuggability.

🛡️ Proposed fix to handle fallback parse failures
         # Fallback for backward compatibility
         if orig_ast is None:
             try:
                 orig_ast = ast.parse(original)
-            except Exception:
-                pass
+            except SyntaxError:
+                return False  # Can't verify imports without valid AST
                 
         if trans_ast is None:
             try:
                 trans_ast = ast.parse(transformed)
-            except Exception:
-                pass
+            except SyntaxError:
+                return False  # Can't verify imports without valid AST
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if orig_ast is None:
try:
orig_ast = ast.parse(original)
except Exception:
pass
if trans_ast is None:
try:
trans_ast = ast.parse(transformed)
except Exception:
pass
if orig_ast is None:
try:
orig_ast = ast.parse(original)
except SyntaxError:
return False # Can't verify imports without valid AST
if trans_ast is None:
try:
trans_ast = ast.parse(transformed)
except SyntaxError:
return False # Can't verify imports without valid AST
🧰 Tools
🪛 Ruff (0.15.7)

[error] 34-35: try-except-pass detected, consider logging the exception

(S110)


[warning] 34-34: Do not catch blind exception: Exception

(BLE001)


[error] 40-41: try-except-pass detected, consider logging the exception

(S110)


[warning] 40-40: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@refactron/verification/checks/imports.py` around lines 31 - 41, The
try-except-pass around the fallback ast.parse calls for orig_ast and trans_ast
in the imports check silently swallows errors; update the logic in the function
containing orig_ast/trans_ast so that if ast.parse(original) and
ast.parse(transformed) both raise (i.e., both orig_ast and trans_ast remain
None) you return False (or at minimum log both exceptions via the module logger)
instead of continuing and returning True; capture the exceptions from ast.parse
and attach them to the log message (or store them locally) so you can include
them in the error log when deciding to return False.

Comment on lines +12 to +18
class BaseCheck:
def verify(self, original: str, transformed: str, context: Optional[VerificationContext] = None) -> bool:
"""
Verify the transformation.
Backward compatible Signature allows checks to drop the context argument if unsupported by custom checks.
"""
raise NotImplementedError
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if TestSuiteGate exists elsewhere in the codebase
rg -n "class TestSuiteGate" --type py

Repository: Refactron-ai/Refactron_lib

Length of output: 147


🏁 Script executed:

# Check what's in engine.py
cat -n refactron/verification/engine.py | head -50

Repository: Refactron-ai/Refactron_lib

Length of output: 2023


🏁 Script executed:

# Check the TestSuiteGate implementation
cat -n refactron/verification/checks/test_gate.py | head -30

Repository: Refactron-ai/Refactron_lib

Length of output: 1207


🏁 Script executed:

# Search for all three implementations
rg -n "class (SyntaxVerifier|ImportIntegrityVerifier|TestSuiteGate)" --type py

Repository: Refactron-ai/Refactron_lib

Length of output: 469


All three BaseCheck implementations should be in engine.py, not scattered in checks/ subdirectory.

TestSuiteGate exists in refactron/verification/checks/test_gate.py, but per project guidelines, all three implementations (SyntaxVerifier, ImportIntegrityVerifier, TestSuiteGate) must be consolidated in verification/engine.py. Additionally, TestSuiteGate.verify() has an incompatible signature: it uses file_path: Path and returns CheckResult, while BaseCheck.verify() expects context: Optional[VerificationContext] and returns bool.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@refactron/verification/engine.py` around lines 12 - 18, Move the three
verifier implementations SyntaxVerifier, ImportIntegrityVerifier, and
TestSuiteGate into verification/engine.py and remove their originals from
checks/ so all BaseCheck implementations live alongside BaseCheck; update
TestSuiteGate.verify to match BaseCheck.verify by accepting (original: str,
transformed: str, context: Optional[VerificationContext] = None) and returning
bool (convert any existing CheckResult logic into a boolean success/failure),
use the supplied VerificationContext to obtain file_path or other metadata
instead of a direct Path parameter, and adjust any imports/usages of CheckResult
to expect a bool (or translate where they call TestSuiteGate.verify) so
signatures are consistent with BaseCheck.verify.

Comment on lines 20 to +24
class VerificationEngine:
"""Orchestrates verification checks in a short-circuit pipeline."""

def __init__(
self,
project_root: Optional[Path] = None,
checks: Optional[List[BaseCheck]] = None,
):
self.project_root = project_root
self.checks: List[BaseCheck]
if checks is not None:
self.checks = checks
else:
from refactron.verification.checks.imports import ImportIntegrityVerifier
from refactron.verification.checks.syntax import SyntaxVerifier
from refactron.verification.checks.test_gate import TestSuiteGate

self.checks = [
SyntaxVerifier(),
ImportIntegrityVerifier(),
TestSuiteGate(project_root=project_root),
]

def verify(self, original: str, transformed: str, file_path: Path) -> VerificationResult:
"""Run all checks in order, short-circuiting on first failure."""
start = time.monotonic()
check_results: List[CheckResult] = []
checks_run: List[str] = []
checks_passed: List[str] = []
checks_failed: List[str] = []
skipped_checks: List[tuple] = []
blocking_reason: Optional[str] = None
passed = True

for i, check in enumerate(self.checks):
try:
cr = check.verify(original, transformed, file_path)
except Exception as exc:
cr = CheckResult(
check_name=check.name,
passed=False,
blocking_reason=f"Check raised exception: {exc}",
confidence=0.0,
duration_ms=0,
details={"exception": str(exc)},
)
passed = False

check_results.append(cr)
checks_run.append(check.name)

if cr.passed:
checks_passed.append(check.name)
else:
checks_failed.append(check.name)
if blocking_reason is None:
blocking_reason = cr.blocking_reason
# Short-circuit: skip remaining checks
for remaining in self.checks[i + 1 :]:
skipped_checks.append(
(remaining.name, f"Short-circuited after {check.name} failed")
)
break

elapsed_ms = int((time.monotonic() - start) * 1000)
confidence = self._compute_confidence(check_results)
safe = passed and len(checks_failed) == 0

return VerificationResult(
safe_to_apply=safe,
passed=passed,
checks_run=checks_run,
checks_passed=checks_passed,
checks_failed=checks_failed,
skipped_checks=skipped_checks,
blocking_reason=blocking_reason,
confidence_score=confidence,
verification_ms=elapsed_ms,
check_results=check_results,
def __init__(self, checks: List[BaseCheck]):
self.checks = checks

def verify(self, original: str, transformed: str) -> bool:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if there are existing usages of VerificationEngine with the documented API
rg -n "VerificationEngine\(" --type py -C 3

Repository: Refactron-ai/Refactron_lib

Length of output: 10031


🏁 Script executed:

#!/bin/bash
# Get the actual VerificationEngine class definition
rg -n "class VerificationEngine" --type py -A 30

Repository: Refactron-ai/Refactron_lib

Length of output: 4535


🏁 Script executed:

#!/bin/bash
# Check if TestSuiteGate is implemented and what it's called
rg -n "TestSuiteGate\|test_gate" --type py -C 2

Repository: Refactron-ai/Refactron_lib

Length of output: 52


🏁 Script executed:

#!/bin/bash
# Verify the verify() method signature in engine.py
rg -n "def verify\(" refactron/verification/engine.py -A 5

Repository: Refactron-ai/Refactron_lib

Length of output: 1208


🏁 Script executed:

#!/bin/bash
# Get the full engine.py file to see the context between the two class definitions
cat -n refactron/verification/engine.py | head -110

Repository: Refactron-ai/Refactron_lib

Length of output: 4436


🏁 Script executed:

#!/bin/bash
# Check if there's any comment or transition between the two class definitions
sed -n '70,85p' refactron/verification/engine.py | cat -n

Repository: Refactron-ai/Refactron_lib

Length of output: 585


Remove duplicate class definitions and unreachable code.

The file contains duplicate definitions of both BaseCheck (lines 12–18) and VerificationEngine (lines 20–58), followed by a floating docstring at line 59, redundant imports, and then the correct definitions (lines 70–104).

Lines 20–58 define an incomplete VerificationEngine that takes only checks and returns a bool, which is shadowed by the correct implementation at lines 81–104 that matches the documented API: __init__(project_root, checks) and verify(original, transformed, file_path) -> VerificationResult. The first definitions are unreachable and should be deleted entirely.

Delete lines 12–59 (both old BaseCheck and old VerificationEngine definitions, plus the orphaned docstring). Keep only the correct implementation at lines 70–104, which properly instantiates TestSuiteGate, SyntaxVerifier, and ImportIntegrityVerifier, and matches all test expectations and project guidelines.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@refactron/verification/engine.py` around lines 20 - 24, Remove the duplicate,
incomplete definitions of BaseCheck and VerificationEngine and the stray module
docstring so only the correct implementations remain: delete the old BaseCheck
class and the earlier VerificationEngine that defines __init__(self, checks) and
verify(self, original, transformed) and remove any orphaned docstring and
redundant imports they introduced; ensure the remaining VerificationEngine uses
the intended constructor signature (__init__(project_root, checks)) and
verify(original, transformed, file_path) behavior and still instantiates
TestSuiteGate, SyntaxVerifier, and ImportIntegrityVerifier as in the correct
implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant