fix: coalesce ledger WAL flushes and extend to non-authority nodes by skylar-simoncelli · Pull Request #1218 · midnightntwrk/midnight-node

skylar-simoncelli · 2026-04-01T19:47:44Z

Overview

Note: This PR is based on feat/ledger_enact_parity_db_logs and should be reviewed alongside that branch. It adds three improvements to the ledger parity-db WAL flush task.

Changes

1. Flush coalescing (prevents task queue buildup)

The current code in feat/ledger_enact_parity_db_logs spawns a new spawn_blocking task for every BlockOrigin::Own notification. If a flush takes longer than the 6-second block interval, tasks accumulate without bound.

This PR adds an AtomicBool flag (flush_in_progress). If a flush is already running when the next notification arrives, the notification is skipped rather than spawning another task, which guarantees that at most one flush runs at any time.
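
The coalescing described above can be sketched with the standard library alone. This is a minimal illustration, not the code in this PR: it uses std::thread in place of spawn_blocking, and the names try_spawn_flush and FlushGuard are hypothetical. The drop guard follows the idea named in the follow-up commit title (a drop guard for flush_in_progress), so that a panicking flush does not leave the flag set.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Hypothetical guard: clears the flag when dropped, so a panicking flush
/// cannot leave `flush_in_progress` stuck at `true`.
struct FlushGuard(Arc<AtomicBool>);

impl Drop for FlushGuard {
    fn drop(&mut self) {
        self.0.store(false, Ordering::Release);
    }
}

/// Runs `flush` on a background thread unless a flush is already running.
/// Returns `None` when the notification was coalesced (skipped).
fn try_spawn_flush<F>(flush_in_progress: &Arc<AtomicBool>, flush: F) -> Option<JoinHandle<()>>
where
    F: FnOnce() + Send + 'static,
{
    // Atomically flip false -> true; this fails if a flush is already in flight.
    if flush_in_progress
        .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
        .is_err()
    {
        return None;
    }
    let guard = FlushGuard(Arc::clone(flush_in_progress));
    Some(thread::spawn(move || {
        // Dropped on normal return and on unwind, which resets the flag.
        let _guard = guard;
        flush();
    }))
}
```

Calling try_spawn_flush once per block notification means a slow flush causes later notifications to be dropped instead of queued. The trade-off is that a skipped notification is not retried; the next notification after the flush completes picks up the work.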

2. Non-authority node coverage

The current code only flushes on BlockOrigin::Own, which means non-authority nodes (RPCs, bootnodes, bridges, semi-trusted RPCs) never trigger a flush. Their ledger WAL grows until it reaches the 64 MB threshold and forces a synchronous stall, which can lead to:

RPC request timeouts
Peer disconnections when the node appears unresponsive

Non-authority nodes now flush every 50 imported blocks (roughly every 5 minutes at the 6-second block interval). This is frequent enough to prevent WAL buildup and infrequent enough to avoid excessive I/O.
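
The flush trigger can be pictured as a small counter-based policy. The sketch below is an assumption about how such a policy could look, not the code in this PR: Origin is a simplified stand-in for BlockOrigin, FlushPolicy and on_block_imported are hypothetical names, and it assumes that self-authored blocks still flush immediately and reset the counter.

```rust
/// Simplified stand-in for Substrate's `BlockOrigin` (an assumption for this
/// sketch); only "authored by this node" versus "anything else" matters here.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Origin {
    Own,
    Network,
}

/// Interval taken from the PR description.
const NON_AUTHORITY_FLUSH_INTERVAL: u32 = 50;

/// Hypothetical policy object: counts imported blocks and decides when a
/// ledger WAL flush is due.
struct FlushPolicy {
    imported_since_flush: u32,
}

impl FlushPolicy {
    fn new() -> Self {
        Self { imported_since_flush: 0 }
    }

    /// Returns `true` if this import notification should trigger a flush.
    fn on_block_imported(&mut self, origin: Origin) -> bool {
        self.imported_since_flush += 1;
        // Own blocks flush right away; other blocks flush once the interval is reached.
        let due = origin == Origin::Own
            || self.imported_since_flush >= NON_AUTHORITY_FLUSH_INTERVAL;
        if due {
            self.imported_since_flush = 0;
        }
        due
    }
}
```

With this policy a node that never authors blocks flushes on its 50th, 100th, 150th imported block and so on, while an authority keeps flushing after each block it produces.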

3. Task moved outside authority gate

The flush task is no longer inside if role.is_authority(), so it runs on all node types.

Relationship to other PRs

Based on: feat/ledger_enact_parity_db_logs (ledger WAL flush on block import)
Companion to: fix: ensure parity-db WAL is drained on SIGTERM shutdown #1140 (Substrate chain DB shutdown drain fix)

Together these two fixes address the full parity-db WAL problem:

feat/ledger_enact_parity_db_logs + this PR → prevents runtime WAL stalls on both authority and non-authority nodes
fix: ensure parity-db WAL is drained on SIGTERM shutdown #1140 → prevents chain-state truncation on shutdown

Context

Discovered while investigating chain-state truncation after unclean shutdown (#1140). While testing on guardnet, we measured roughly 9,000 to 10,000 blocks' worth of metadata sitting only in the WAL on every node at any given time, which confirms the WAL accumulation problem.

📌 Submission Checklist

Changes are backward-compatible (or flagged if breaking)
Pull request description explains why the change is needed
Self-reviewed the diff
I have included a change file, or skipped for this reason: improvement to unreleased feature branch
If the changes introduce a new feature, I have bumped the node minor version
No new todos introduced

🧪 Testing Evidence

Logic-only change to an unreleased feature. The coalescing and interval flush are safe additive behaviors on top of the existing flush mechanism.

🔱 Fork Strategy

Node Client Update

Three improvements to the ledger parity-db WAL flush task:

1. Add flush coalescing via AtomicBool: if a flush is already in progress when the next block notification arrives, skip rather than spawning another spawn_blocking task. Prevents unbounded task queue buildup when flush duration exceeds the 6-second block interval.
2. Extend WAL flushing to non-authority nodes (RPCs, bootnodes, bridges). These nodes never author blocks so BlockOrigin::Own never matches, leaving their WAL to grow until the 64 MB threshold causes a synchronous stall. Non-authority nodes now flush every 50 imported blocks.
3. Move the flush task outside the `if role.is_authority()` block so it runs for all node types.

Klapeyron and others added 2 commits March 30, 2026 10:04

feat: added flush of parity-db logs after own block import

61b15b9

skylar-simoncelli requested a review from a team as a code owner April 1, 2026 19:47

skylar-simoncelli added skip-changes-check-all skip-changes-check-jira labels Apr 1, 2026

fix: use drop guard for flush_in_progress to prevent permanent disabl…

d7f96f4

…e on panic

skylar-simoncelli wants to merge 3 commits into main from skylar/improve-ledger-wal-flush
