⚡ Bolt: [performance improvement] Optimize closure status query #554

RohanExploit wants to merge 2 commits into main from
Conversation
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me. New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task.
✅ Deploy Preview for fixmybharat canceled.
🙏 Thank you for your contribution, @RohanExploit!

PR Details:
Quality Checklist:
Review Process:
Note: The maintainers will monitor code quality and ensure the overall project flow isn't broken.
📝 Walkthrough

This PR implements a database query optimization in grievance closure status computation by replacing multiple aggregate queries with a single grouped query, reducing database roundtrips. The optimization is documented in the bolt.md technical notes and validated through a new benchmark script.

Changes
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
🧹 Nitpick comments (2)
backend/tests/benchmark_closure_status.py (2)
102-134: Close the SQLAlchemy session on all exit paths.
`run_benchmark()` should close the `SessionLocal()` session in a `finally` block so assertion failures or exceptions do not leave the session open.

🧹 Suggested session cleanup
```diff
 def run_benchmark():
-    session = SessionLocal()
-    grievances = setup_data(session)
+    session = SessionLocal()
+    try:
+        grievances = setup_data(session)

-    g_id = grievances[0].id
+        g_id = grievances[0].id

-    # Warmup
-    for _ in range(10):
-        original_get_closure_status(session, g_id)
-        optimized_get_closure_status(session, g_id)
+        # Warmup
+        for _ in range(10):
+            original_get_closure_status(session, g_id)
+            optimized_get_closure_status(session, g_id)

-    ITERATIONS = 500
+        ITERATIONS = 500

-    ...
+        ...
+    finally:
+        session.close()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/tests/benchmark_closure_status.py` around lines 102 - 134, The run_benchmark function currently creates a SQLAlchemy session via SessionLocal() and never guarantees it is closed on exceptions or assertion failures; wrap the body that uses session (grievances = setup_data(session) and all benchmarking code) in a try/finally (or use a context manager) and call session.close() in the finally block so the session is always closed; reference the run_benchmark function and the SessionLocal/session variables when making this change.
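The same guarantee can also be expressed with `contextlib.closing`, which calls `close()` on exit no matter how the block terminates. A minimal sketch using a stand-in session object (not the project's real `SessionLocal`):

```python
from contextlib import closing

class FakeSession:
    """Stand-in for a SQLAlchemy session; records whether close() ran."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

session = FakeSession()
try:
    with closing(session):
        # Simulate an assertion failing mid-benchmark.
        raise AssertionError("simulated benchmark failure")
except AssertionError:
    pass

assert session.closed  # close() ran even though the body raised
```

Either form (try/finally or `closing`) removes the leaked-session path; the context-manager version keeps the benchmark body unindented by only one level.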
115-123: Avoid fixed execution order in benchmark timing.

Running original first and optimized second every time can skew the reported delta. Alternate order (or run paired rounds and aggregate) to reduce warm-cache/order bias.
📊 Suggested benchmark-order fix
```diff
-    start_time = time.time()
-    for _ in range(ITERATIONS):
-        res1 = original_get_closure_status(session, g_id)
-    original_time = time.time() - start_time
-
-    start_time = time.time()
-    for _ in range(ITERATIONS):
-        res2 = optimized_get_closure_status(session, g_id)
-    optimized_time = time.time() - start_time
+    original_time = 0.0
+    optimized_time = 0.0
+    res1 = res2 = None
+    for i in range(ITERATIONS):
+        if i % 2 == 0:
+            t0 = time.time()
+            res1 = original_get_closure_status(session, g_id)
+            original_time += time.time() - t0
+
+            t0 = time.time()
+            res2 = optimized_get_closure_status(session, g_id)
+            optimized_time += time.time() - t0
+        else:
+            t0 = time.time()
+            res2 = optimized_get_closure_status(session, g_id)
+            optimized_time += time.time() - t0
+
+            t0 = time.time()
+            res1 = original_get_closure_status(session, g_id)
+            original_time += time.time() - t0
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/tests/benchmark_closure_status.py` around lines 115 - 123, The benchmark currently calls original_get_closure_status then optimized_get_closure_status in fixed order which can bias timing; update the loop to alternate the call order per iteration (or perform paired runs and aggregate) so both original_get_closure_status and optimized_get_closure_status see similar warm/caching effects—use ITERATIONS to drive a loop that on even iterations calls original first and on odd calls optimized first (still recording times into original_time/optimized_time or accumulating totals), keep using session and g_id and preserve res1/res2 captures to avoid optimizer elision.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@backend/tests/benchmark_closure_status.py`:
- Around line 102-134: The run_benchmark function currently creates a SQLAlchemy
session via SessionLocal() and never guarantees it is closed on exceptions or
assertion failures; wrap the body that uses session (grievances =
setup_data(session) and all benchmarking code) in a try/finally (or use a
context manager) and call session.close() in the finally block so the session is
always closed; reference the run_benchmark function and the SessionLocal/session
variables when making this change.
- Around line 115-123: The benchmark currently calls original_get_closure_status
then optimized_get_closure_status in fixed order which can bias timing; update
the loop to alternate the call order per iteration (or perform paired runs and
aggregate) so both original_get_closure_status and optimized_get_closure_status
see similar warm/caching effects—use ITERATIONS to drive a loop that on even
iterations calls original first and on odd calls optimized first (still
recording times into original_time/optimized_time or accumulating totals), keep
using session and g_id and preserve res1/res2 captures to avoid optimizer
elision.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d48867d9-c353-48f4-a2a7-23bf47042d85
📒 Files selected for processing (3)
- .jules/bolt.md
- backend/routers/grievances.py
- backend/tests/benchmark_closure_status.py
Pull request overview
Optimizes the grievance closure-status endpoint by consolidating confirmation/dispute counting into a single GROUP BY aggregate query, reducing redundant DB round-trips. Adds a standalone benchmark script and documents the optimization as a Bolt learning.
Changes:
- Replace two separate confirmation/dispute count queries with one `GROUP BY confirmation_type` query in `/api/grievances/{grievance_id}/closure-status`.
- Add a benchmark script to compare original vs optimized approaches.
- Document the "single-column mutually exclusive counts via GROUP BY" optimization in `.jules/bolt.md`.
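The consolidation described above can be sketched with plain SQL against an in-memory SQLite table. The table and column names below are assumptions drawn from the PR description, not the project's actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE closure_confirmations (grievance_id INTEGER, confirmation_type TEXT)"
)
conn.executemany(
    "INSERT INTO closure_confirmations VALUES (?, ?)",
    [(1, "confirmed")] * 4 + [(1, "disputed")] * 2 + [(2, "confirmed")],
)

# Before: two separate aggregate queries -> two roundtrips, two scans
confirmed = conn.execute(
    "SELECT COUNT(*) FROM closure_confirmations"
    " WHERE grievance_id = 1 AND confirmation_type = 'confirmed'"
).fetchone()[0]
disputed = conn.execute(
    "SELECT COUNT(*) FROM closure_confirmations"
    " WHERE grievance_id = 1 AND confirmation_type = 'disputed'"
).fetchone()[0]

# After: one GROUP BY query computes every count in a single pass
counts = dict(conn.execute(
    "SELECT confirmation_type, COUNT(*) FROM closure_confirmations"
    " WHERE grievance_id = 1 GROUP BY confirmation_type"
).fetchall())

assert counts.get("confirmed", 0) == confirmed  # 4
assert counts.get("disputed", 0) == disputed    # 2
```

Because `confirmed` and `disputed` are mutually exclusive values of one column, the grouped query gives the database a single scan instead of two, which is the pattern the PR captures in bolt.md.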
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| backend/routers/grievances.py | Consolidates confirmation/dispute counting into a single grouped aggregate query. |
| backend/tests/benchmark_closure_status.py | Adds a local benchmark harness to measure query round-trip reduction. |
| .jules/bolt.md | Captures the optimization guidance for future reference. |
```python
stats = db.query(
    ClosureConfirmation.confirmation_type,
    func.count(ClosureConfirmation.id)
).filter(
    ClosureConfirmation.grievance_id == grievance_id
).group_by(ClosureConfirmation.confirmation_type).all()
```
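The grouped query returns one `(confirmation_type, count)` row per type that actually exists, so the caller still has to default absent groups to zero. A minimal sketch of that fold, with made-up values for illustration:

```python
# Rows have the shape returned by .all() on the grouped query:
# a list of (confirmation_type, count) tuples. Values are illustrative.
rows = [("confirmed", 12), ("disputed", 3)]

counts = dict(rows)
confirmations_count = counts.get("confirmed", 0)  # 12
disputes_count = counts.get("disputed", 0)        # 3
```

The `.get(..., 0)` defaults matter: a grievance with no disputes produces no `("disputed", n)` row at all, so indexing the dict directly would raise `KeyError`.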
```python
import uuid
import random
from datetime import datetime, timezone
from sqlalchemy import create_engine, select
from sqlalchemy.orm import sessionmaker
from backend.models import Base, Grievance, GrievanceFollower, ClosureConfirmation, Jurisdiction, JurisdictionLevel, SeverityLevel, GrievanceStatus
from backend.database import get_db
```
```python
return total_followers, confirmations_count, disputes_count


def optimized_get_closure_status(session, grievance_id):
    from sqlalchemy import func, case
```
💡 What: Consolidated two separate `.filter().count()` aggregate queries into a single database roundtrip using a `.group_by()` query in the `/api/grievances/{grievance_id}/closure-status` endpoint.

🎯 Why: Performing multiple separate queries for mutually exclusive conditions on the same column (`confirmation_type`) causes redundant database scans and increases network latency. A single `GROUP BY` query allows the database engine to calculate both counts in a single pass.

📊 Impact: The optimization yields a ~35% performance improvement in execution time for this specific endpoint path, based on local benchmarks.
🔬 Measurement: A standalone benchmark test was constructed to compare `test_get_closure_status()` across 500 iterations. Results showed a time reduction from 0.76s down to 0.49s. The test suite also confirms no breaking changes to the expected output schema or logic.

PR created automatically by Jules for task 4741734040081525529 started by @RohanExploit
Summary by cubic
Optimized the closure status endpoint by consolidating multiple aggregate queries into a single `.group_by()` aggregation, cutting DB roundtrips. Benchmarks show ~35% faster execution (0.76s → 0.49s over 500 iterations) with no API changes.

- `confirmed`/`disputed` counts merged into one `GROUP BY` query in backend/routers/grievances.py, with zero defaults when a group is absent.
- Documented the pattern of using `.group_by()` for mutually exclusive values on a single column.

Written for commit 099bca9. Summary will update on new commits.
Summary by CodeRabbit
Performance
Tests