Skip to content

feat: Add maintenance RPC endpoints for coordinated cache flushing#574

Open
AnkushinDaniil wants to merge 4 commits intomainfrom
daniil/feat/maintenance-rpc
Open

feat: Add maintenance RPC endpoints for coordinated cache flushing#574
AnkushinDaniil wants to merge 4 commits intomainfrom
daniil/feat/maintenance-rpc

Conversation

@AnkushinDaniil
Copy link
Copy Markdown
Collaborator

@AnkushinDaniil AnkushinDaniil commented Jan 21, 2026

Fixes #565
Implements Maintenance - coordinated state flush: Nitro and Nethermind coordinate trie persistence without disrupting block production.

sequenceDiagram
      participant Nitro as Nitro (Consensus)
      participant RPC as ArbitrumRpcModule
      participant Tracker as ProcessingTimeTracker
      participant Semaphore as CreateBlocksSemaphore
      participant WSM as WorldStateManager

      Note over Nitro,WSM: Normal Block Production

      loop Every Block
          Nitro->>RPC: DigestMessage(block)
          RPC->>Semaphore: WaitAsync(0)
          Semaphore-->>RPC: acquired
          RPC->>RPC: Process block
          RPC->>Semaphore: Release()
          RPC-->>Nitro: MessageResult
          Note over Tracker: AccumulatedTime += ProcessingMs
      end

      Note over Nitro,WSM: Maintenance Check (periodic)

      Nitro->>RPC: ShouldTriggerMaintenance()
      RPC->>Tracker: TimeBeforeFlush
      Tracker-->>RPC: TrieTimeLimitMs - (Accumulated + RandomOffset)

      alt TimeBeforeFlush ≤ Threshold
          RPC-->>Nitro: true

          Note over Nitro,WSM: Maintenance Triggered

          Nitro->>RPC: TriggerMaintenance()
          RPC->>RPC: _runningMaintenance = true
          RPC-->>Nitro: "OK" (immediate return)

          Note over RPC: Background Task Started

          RPC->>Semaphore: Wait() [blocking]
          Note over Semaphore: Blocks DigestMessage

          RPC->>WSM: FlushCache()
          WSM-->>RPC: success
          RPC->>Tracker: Reset()
          Note over Tracker: AccumulatedTime = 0<br/>RandomOffset = new random

          RPC->>Semaphore: Release()
          RPC->>RPC: _runningMaintenance = false

      else TimeBeforeFlush > Threshold
          RPC-->>Nitro: false
          Note over Nitro: Continue normal operation
      end

      Note over Nitro,WSM: Error Handling

      rect rgb(80, 40, 40)
          Note over RPC,WSM: If FlushCache throws:
          Note over RPC: _runningMaintenance = false<br/>Tracker NOT reset<br/>Can retry maintenance
      end

      Note over Nitro,WSM: Follower Mode Collision

      rect rgb(80, 70, 30)
          Note over Nitro,Semaphore: If DigestMessage during maintenance:
          Nitro->>RPC: DigestMessage(block)
          RPC->>Semaphore: WaitAsync(0)
          Semaphore-->>RPC: not acquired
          RPC-->>Nitro: "CreateBlock mutex held"
      end
Loading

@AnkushinDaniil AnkushinDaniil force-pushed the daniil/feat/maintenance-rpc branch from 61e0d72 to 60c8892 Compare January 22, 2026 08:56
@AnkushinDaniil AnkushinDaniil marked this pull request as ready for review January 22, 2026 10:34
Copilot AI review requested due to automatic review settings January 22, 2026 10:34
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds maintenance RPC endpoints for coordinated cache flushing in Arbitrum nodes. The implementation introduces a time-based tracking mechanism to monitor accumulated block processing time and provides RPC endpoints for external coordination of trie cache flush operations to prevent simultaneous flushes across multiple nodes.

Changes:

  • Added three new RPC endpoints: TriggerMaintenance(), ShouldTriggerMaintenance(), and MaintenanceStatus()
  • Implemented ProcessingTimeTracker to track accumulated block processing time with randomized offset support
  • Added configuration options for controlling maintenance timing thresholds and intervals

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
src/Nethermind.Arbitrum/Modules/ArbitrumRpcModule.cs Implements the three new maintenance RPC endpoints with background task execution
src/Nethermind.Arbitrum/Modules/IArbitrumRpcModule.cs Defines the RPC method signatures for maintenance endpoints
src/Nethermind.Arbitrum/Execution/ProcessingTimeTracker.cs Tracks accumulated processing time and calculates time before flush with random offset
src/Nethermind.Arbitrum/Execution/ArbitrumInitializeProcessingTimeTracker.cs Subscribes to blockchain processor events to track processing time
src/Nethermind.Arbitrum/Data/MaintenanceStatus.cs Simple DTO for maintenance status response
src/Nethermind.Arbitrum/Config/IArbitrumConfig.cs Adds configuration properties for maintenance thresholds and intervals
src/Nethermind.Arbitrum/Config/ArbitrumConfig.cs Implements default configuration values
src/Nethermind.Arbitrum/ArbitrumPlugin.cs Registers ProcessingTimeTracker as singleton and adds initialization step
src/Nethermind.Arbitrum/ArbitrumRpcModuleFactory.cs Updates factory to inject new dependencies
src/Nethermind.Arbitrum/Modules/ArbitrumRpcModuleWithComparison.cs Updates constructor signature for new dependencies
src/Nethermind.Arbitrum/Properties/scripts/generate-system-test-config.sh Adds Arbitrum configuration section to test config
src/Nethermind.Arbitrum.Test/Rpc/ArbitrumRpcModuleMaintenanceTests.cs Comprehensive test coverage for maintenance endpoints including edge cases
src/Nethermind.Arbitrum.Test/Execution/ProcessingTimeTrackerTests.cs Unit tests for time tracking logic
src/Nethermind.Arbitrum.Test/Execution/ArbitrumInitializeProcessingTimeTrackerTests.cs Tests for event subscription and time accumulation
src/Nethermind.Arbitrum.Test/Infrastructure/ThrowingWorldStateManagerDecorator.cs Test decorator for simulating FlushCache failures
src/Nethermind.Arbitrum.Test/Infrastructure/ArbitrumRpcTestBlockchain.cs Updates test infrastructure to support new dependencies
src/Nethermind.Arbitrum.Test/Rpc/ArbitrumRpcModuleTests.cs Updates existing tests with new constructor parameters

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

if (Logger.IsInfo)
Logger.Info("Starting trie flush maintenance");

worldStateManager.FlushCache(CancellationToken.None);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I really don't understand why do we need to flush caches... I'd ask some knowledgeable person (Damian, Ashraf, Ben...) to verify we need any maintenance at all.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me it looks like manually overriding pruning behaviour. From the looks of what is done, I would say it's the correct thing, although maybe I do not understand the full picture. Is the overall goal here to ensure that we don't get any mismatch between DB state on execution and consensus side? Wondering if this can cause any issues with pruning? cc: @asdacap

@AnkushinDaniil AnkushinDaniil force-pushed the daniil/feat/maintenance-rpc branch from 05b571c to 2013718 Compare January 25, 2026 03:48
@github-actions
Copy link
Copy Markdown
Contributor

Code Coverage

Package Line Rate Branch Rate Health
Nethermind.Arbitrum 81% 58%
Summary 81% (9323 / 11541) 58% (2351 / 4024)

Minimum allowed line rate is 60%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Maintenance

4 participants