Skip to content

fix/temporary tables#1257

Open
gfyrag wants to merge 2 commits intomainfrom
fix/temporary-tables
Open

fix/temporary tables#1257
gfyrag wants to merge 2 commits intomainfrom
fix/temporary-tables

Conversation

@gfyrag
Copy link
Contributor

@gfyrag gfyrag commented Feb 6, 2026

  • feat: add replica identity on tables without primary key
  • fix: potentially orphan temporary tables

@gfyrag gfyrag requested a review from a team as a code owner February 6, 2026 22:41
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 6, 2026

Walkthrough

Database migrations updated to add idempotent temporary table cleanup using DROP TABLE IF EXISTS statements. An additional migration introduces replica identity configuration for ledger tables via dynamic SQL execution in a DO block.

Changes

Cohort / File(s) Summary
Temporary Table Idempotency
internal/storage/bucket/migrations/11-make-stateless/up.sql, internal/storage/bucket/migrations/17-moves-fill-transaction-id/up.sql, internal/storage/bucket/migrations/18-transactions-fill-inserted-at/up.sql, internal/storage/bucket/migrations/20-accounts-volumes-fill-history/up.sql
Add DROP TABLE IF EXISTS before temporary table creation and ON COMMIT DROP clauses to ensure safe, idempotent cleanup and auto-removal at transaction commit.
Replica Identity Configuration
internal/storage/bucket/migrations/45-add-replica-identity/up.sql
New migration that iterates over ledgers and sets REPLICA IDENTITY using specific indexes for transactions, accounts, and logs tables via dynamic SQL execution within a DO block.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

Poem

🐰 Hop, hop, temporary tables clean,
DROP IF EXISTS keeps our ledgers keen,
ON COMMIT drops them, neat and quick,
And REPLICA IDENTITY's the final trick!

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'fix/temporary tables' is vague and generic, using non-descriptive terms that don't convey meaningful information about what was actually changed. Provide a more specific title that describes the main changes, such as 'Add replica identity configuration and fix temporary table cleanup in migrations' or split into separate PRs for each concern.
✅ Passed checks (2 passed)
Check name Status Explanation
Description check ✅ Passed The description lists two distinct changes (replica identity feature and temporary table fixes), which aligns with the actual changeset content.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/temporary-tables

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@internal/storage/bucket/migrations/20-accounts-volumes-fill-history/up.sql`:
- Around line 8-9: Remove the ON COMMIT DROP clause from the temporary table
creation so tmp_volumes survives intermediate COMMITs inside the DO block;
locate the CREATE TEMPORARY TABLE tmp_volumes ON COMMIT DROP AS statement and
change it to a plain temporary table creation (matching migration 17's pattern),
and ensure an explicit DROP TABLE IF EXISTS tmp_volumes is executed after the
loop/DO block to clean up.

In `@internal/storage/bucket/migrations/45-add-replica-identity/up.sql`:
- Around line 8-17: Remove the unnecessary loop "for ledger in select * from
_system.ledgers loop" (and its loop-end) and the per-iteration use of the vsql
variable so the three ALTER TABLE statements run exactly once for the current
schema; specifically keep/emit the three EXECUTE statements that run "ALTER
TABLE transactions REPLICA IDENTITY USING INDEX transactions_ledger", "ALTER
TABLE accounts REPLICA IDENTITY USING INDEX accounts_ledger", and "ALTER TABLE
logs REPLICA IDENTITY USING INDEX logs_ledger" outside of any loop (remove the
loop and the unused loop variable `ledger` and leave the EXECUTE calls once).
🧹 Nitpick comments (1)
internal/storage/bucket/migrations/45-add-replica-identity/up.sql (1)

9-16: Missing idempotency guard — ALTER TABLE ... REPLICA IDENTITY USING INDEX will error if the index doesn't exist or the identity is already set to something incompatible.

Consider wrapping each ALTER TABLE in an IF NOT EXISTS-style check (e.g., querying pg_class.relreplident) or using an exception handler, especially since this migration may be re-run on schemas in different states.

Comment on lines +8 to +9
drop table if exists tmp_volumes;
create temporary table tmp_volumes on commit drop as
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: ON COMMIT DROP will destroy tmp_volumes after the first batch COMMIT at Line 47, breaking all subsequent loop iterations.

Since the DO block issues intermediate COMMIT statements (Line 47) for batch processing, each commit ends the current transaction and starts a new one. The ON COMMIT DROP clause causes PostgreSQL to drop the temp table at each commit boundary. The second batch iteration will fail with a "relation tmp_volumes does not exist" error.

Remove ON COMMIT DROP to match the pattern used in migration 17 (plain temp table + explicit drop after the loop):

Proposed fix
 		drop table if exists tmp_volumes;
-		create temporary table tmp_volumes on commit drop as
+		create temporary table tmp_volumes as
 		select distinct on (ledger, accounts_address, asset)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
drop table if exists tmp_volumes;
create temporary table tmp_volumes on commit drop as
drop table if exists tmp_volumes;
create temporary table tmp_volumes as
select distinct on (ledger, accounts_address, asset)
🤖 Prompt for AI Agents
In `@internal/storage/bucket/migrations/20-accounts-volumes-fill-history/up.sql`
around lines 8 - 9, Remove the ON COMMIT DROP clause from the temporary table
creation so tmp_volumes survives intermediate COMMITs inside the DO block;
locate the CREATE TEMPORARY TABLE tmp_volumes ON COMMIT DROP AS statement and
change it to a plain temporary table creation (matching migration 17's pattern),
and ensure an explicit DROP TABLE IF EXISTS tmp_volumes is executed after the
loop/DO block to clean up.

Comment on lines +8 to +17
for ledger in select * from _system.ledgers where bucket = current_schema loop
vsql = 'ALTER TABLE transactions REPLICA IDENTITY USING INDEX transactions_ledger';
execute vsql;

vsql = 'ALTER TABLE accounts REPLICA IDENTITY USING INDEX accounts_ledger';
execute vsql;

vsql = 'ALTER TABLE logs REPLICA IDENTITY USING INDEX logs_ledger';
execute vsql;
end loop;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, read the complete migration file to see full context
cat internal/storage/bucket/migrations/45-add-replica-identity/up.sql

Repository: formancehq/ledger

Length of output: 546


🏁 Script executed:

# Search for where these indexes are created to understand if they're per-ledger
rg -n "CREATE.*INDEX.*(transactions_ledger|accounts_ledger|logs_ledger)" --type sql -B2 -A2

Repository: formancehq/ledger

Length of output: 43


🏁 Script executed:

# Look for other migrations to understand table and ledger structure
ls -la internal/storage/bucket/migrations/ | head -20

Repository: formancehq/ledger

Length of output: 1491


🏁 Script executed:

# Search for how ledger is used in similar migration patterns
rg -n "for ledger in select" internal/storage/bucket/migrations/ -A10

Repository: formancehq/ledger

Length of output: 9897


🏁 Script executed:

# Search for the index creation for these specific indexes
rg -n "CREATE.*INDEX.*transactions_ledger|accounts_ledger|logs_ledger" internal/storage/bucket/migrations/ --type sql

Repository: formancehq/ledger

Length of output: 1472


🏁 Script executed:

# Also check the initial schema to see if these indexes are defined there
cat internal/storage/bucket/migrations/0-init-schema/up.sql | grep -n "transactions_ledger\|accounts_ledger\|logs_ledger"

Repository: formancehq/ledger

Length of output: 276


The ledger loop variable is never used — the loop redundantly executes identical ALTER TABLE statements once per ledger when they should execute once per schema.

The indexes transactions_ledger, accounts_ledger, and logs_ledger are schema-global (created in migration 0-init-schema as unique indexes on the shared tables, with the ledger column as part of the composite key). They do not differ per ledger, so the loop is unnecessary overhead.

Remove the loop and execute the three ALTER TABLE statements once. (Note: migration 45-fix-missing-primary-keys already uses these same indexes without looping, confirming this is the correct approach.)

🤖 Prompt for AI Agents
In `@internal/storage/bucket/migrations/45-add-replica-identity/up.sql` around
lines 8 - 17, Remove the unnecessary loop "for ledger in select * from
_system.ledgers loop" (and its loop-end) and the per-iteration use of the vsql
variable so the three ALTER TABLE statements run exactly once for the current
schema; specifically keep/emit the three EXECUTE statements that run "ALTER
TABLE transactions REPLICA IDENTITY USING INDEX transactions_ledger", "ALTER
TABLE accounts REPLICA IDENTITY USING INDEX accounts_ledger", and "ALTER TABLE
logs REPLICA IDENTITY USING INDEX logs_ledger" outside of any loop (remove the
loop and the unused loop variable `ledger` and leave the EXECUTE calls once).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant