Skip to content

Conversation

@thlorenz
Copy link
Contributor

@thlorenz thlorenz commented Nov 6, 2025

Summary by CodeRabbit

  • New Features

    • Improved reconnection with backoff, health checks, per-connection abort signaling, and APIs to list subscription counts and current subscriptions.
    • Config option to enable/disable subscription metrics with a fallible constructor; background updater for monitored subscriptions.
    • Exposed subscription cache inspection: counts, pubkeys, never-evicted.
  • Bug Fixes

    • Reduced noisy warnings to debug for account-ensure failures.
    • Clarified delegation-record decoding error message.
  • Tests

    • Added large-scale subscription-limits integration test and updated tests for reconnection/metrics behavior.
  • Chores

    • Metrics API renamed: adjust_monitored_accounts_count → set_monitored_accounts_count.
    • Removed an optional logging feature flag from workspace dependency.

Enhances subscription management and reliability by adding metrics, and robust reconnection logic with automatic resubscription.

Details

Subscription Management Improvements

  • Atomic Subscribe/Unsubscribe: Improved subscribe logic to handle evictions and rollbacks atomically, preventing race conditions between LRU cache and pubsub client
  • Enhanced Error Handling: Better error propagation and logging for subscription failures

Metrics and Monitoring

  • Subscription Metrics: Added subscription_count method to ChainPubsubClient trait for tracking active subscriptions across all clients
  • Monitored Accounts Gauge: Updated metrics to accurately report monitored accounts count
  • Configurable Metrics: Added metrics enable/disable flag to RemoteAccountProviderConfig

Reconnection and Reliability

  • Reconnection Logic: Replaced periodic connection recycling with event-driven reconnection using the new ReconnectableClient trait
  • Fibonacci Backoff: Implemented exponential backoff with fibonacci delays for reconnection attempts (max 10 minutes)
  • Automatic Resubscription: Clients automatically resubscribe to existing subscriptions after successful reconnection
  • Improved SubMuxClient: Simplified constructor and removed deprecated recycling logic

Testing and Integration

  • Integration Test: Added 07_subscription_limits.rs to test large-scale subscription scenarios (400 accounts)
  • Reconnection Tests: Comprehensive tests for reconnection logic with failed resubscription recovery
  • Updated Test Configurations: All tests now use metrics-enabled configurations

Code Quality

  • Trait Refactoring: Introduced ReconnectableClient trait for better abstraction
  • Logging Improvements: Enhanced debug/trace logging throughout subscription and reconnection flows
  • Dependency Updates: Added ureq and url for metrics fetching in integration tests

* master:
  hotfix: unregistered metric COMMITTOR_INTENT_CU_USAGE (#612)
  feat: add metrics for intent execution time (#607)
  fix: don't increment subscription id for dups (#610)
  chore: broaden dependency range on the magic program api (#609)
  feat: parallel CI integration tests (#605)
  Handle uncleaned buffers + generalized tx send retries (#597)
  fix: support binary opcode (#604)
  fix: correct log imports (#606)
@github-actions
Copy link

github-actions bot commented Nov 6, 2025

Manual Deploy Available

You can trigger a manual deploy of this PR branch to testnet:

Deploy to Testnet 🚀

Alternative: Comment /deploy on this PR to trigger deployment directly.

⚠️ Note: Manual deploy requires authorization. Only authorized users can trigger deployments.

Comment updated automatically when the PR is synchronized.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 6, 2025

Walkthrough

Replace PubsubClient with PubSubConnection; add reconnect/abort flow, health probes, per-client reconnect orchestration, and CancellationTokens; switch subscription storage to an LRU with optional subscription metrics and a background updater; extend ChainPubsubClient with reconnection APIs; update logging, tests, and workspace manifests. (50 words)

Changes

Cohort / File(s) Summary
Workspace manifest
Cargo.toml, test-integration/Cargo.toml, test-integration/test-tools/Cargo.toml
Removed features = ["release_max_level_info"] from workspace log dependency; added ureq and url to test workspaces.
Aperture logging
magicblock-aperture/src/requests/http/mod.rs, magicblock-aperture/src/requests/http/send_transaction.rs
Switched a warn!debug! in read_account_with_ensure; replaced explicit logging macro imports with use log::*;.
Chainlink deps
magicblock-chainlink/Cargo.toml
Added arc-swap = "1.7"; added magicblock-metrics = { workspace = true }; re-added magicblock-delegation-program = { workspace = true }.
Chainlink error text
magicblock-chainlink/src/chainlink/errors.rs
Updated InvalidDelegationRecord display string to "Delegation record could not be decoded: ...".
Fetch cloner
magicblock-chainlink/src/chainlink/fetch_cloner.rs
Iterate pubkeys by value; skip accounts already in bank that are delegated/blacklisted/watched; adjust pending-request/oneshot handling and receiver error logging; update tests to use try_new_with_metrics(...).unwrap().
PubSub actor
magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs
Replace PubsubClient with PubSubConnection; add client_id, is_connected, abort_sender; per-subscription CancellationToken; Reconnect flow, abort signaling, health probe, 2s unsubscribe timeout, subscription_count/subscriptions helpers, and expanded logging.
PubSub client & mocks
magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs
Added PubSubConnection abstraction and ReconnectableClient trait; extended ChainPubsubClient with subscription_count/subscriptions/try_reconnect/resub_multiple; enhanced mock to simulate disconnects/failures and new APIs; added reconnect/backoff and probe logic.
Remote provider config
magicblock-chainlink/src/remote_account_provider/config.rs
Added enable_subscription_metrics: bool; new fallible constructor try_new_with_metrics(...) -> Result<...>; try_new delegates to it; added getter and default_with_lifecycle_mode; Default updated.
LRU cache API
magicblock-chainlink/src/remote_account_provider/lru_cache.rs
Added len(), never_evicted_accounts(), and pubkeys() accessors; unit test for never-evicted set.
Remote provider core
magicblock-chainlink/src/remote_account_provider/mod.rs
Replace subscribed_accounts with Arc<AccountsLruCache>; add background start_active_subscriptions_updater and optional handle; add register_subscription flow; change fetch channels to carry FetchResult = Result<RemoteAccount, RemoteAccountProviderError>; refactor subscription/fetch flows and tests.
SubMux reconnection
magicblock-chainlink/src/submux/mod.rs
Bound SubMuxClient<T> by ReconnectableClient; clients passed as (Arc<T>, mpsc::Receiver<()>); added reconnectors/backoff/resubscribe logic; added subscription_count/subscriptions; removed legacy recycler/config.
Testing actor changes
magicblock-chainlink/src/testing/chain_pubsub.rs, other testing/*
Tests updated to pass an abort_sender into ChainPubsubActor::new_from_url; replaced recycle helper with reconnect; updated message variant usage.
Metrics API
magicblock-metrics/src/metrics/mod.rs
Removed SUBSCRIPTIONS_COUNT_GAUGE and set_subscriptions_count; renamed adjust_monitored_accounts_countset_monitored_accounts_count.
Test tooling: metrics fetcher
test-integration/test-tools/src/integration_test_context.rs
Added IntegrationTestContext::get_monitored_accounts_count(&self, port: u16) -> Result<usize> using ureq + url to fetch/parse mbv_monitored_accounts.
Test config updates
test-integration/test-chainlink/src/test_context.rs, various test-integration tests
Switched construction of RemoteAccountProviderConfig to try_new_with_metrics(...).unwrap() and updated callsites.
New integration test
test-integration/test-cloning/tests/07_subscription_limits.rs
Added (ignored) integration test exercising many subscriptions, delegations, and monitored-account metric validation.
Chainlink account reset
magicblock-chainlink/src/chainlink/mod.rs
Reworked reset logic to use AtomicU64 counters (delegated/dlp_owned_not_delegated/blacklisted/remaining) and log a detailed info summary; adjusted some fetch logs from trace→debug.
Accounts DB small change
magicblock-accounts-db/src/lib.rs
remove_where now collects matching keys into a HashSet instead of a Vec.
Processor filtering
magicblock-processor/src/executor/processing.rs
Skip persisting accounts with lamports == 0 in commit path; added ReadableAccount import and comments.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant SubMux
    participant ChainActor as ChainPubsubActor
    participant PubSubConn as PubSubConnection
    participant Abort as Abort Sender

    rect rgb(240,248,255)
      Note over Client,ChainActor: Subscribe path (LRU + register)
      Client->>SubMux: subscribe(pubkey)
      SubMux->>ChainActor: register_subscription(pubkey)
      ChainActor->>PubSubConn: account_subscribe(pubkey)
      PubSubConn-->>ChainActor: subscription_stream (updates)
    end

    rect rgb(255,240,245)
      Note over PubSubConn,Abort: Connection error -> abort & reconnect
      PubSubConn--xChainActor: connection failure
      ChainActor->>Abort: send abort()
      Abort->>SubMux: notify reconnection task
      SubMux->>SubMux: reconnect_client_with_backoff()
      SubMux->>ChainActor: try_reconnect()
      ChainActor->>PubSubConn: reconnect probe (subscribe/unsubscribe)
      PubSubConn-->>ChainActor: probe success
      ChainActor->>SubMux: resubscribe restored subs
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Areas to inspect closely:
    • magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs — reconnect state machine, CancellationTokens, abort signaling, subscription cleanup and unsubscribe timeout.
    • magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs — PubSubConnection, try_reconnect/resub_multiple, backoff and concurrency guards, mock behavior.
    • magicblock-chainlink/src/remote_account_provider/mod.rs — LRU eviction atomicity, register_subscription, FetchResult propagation to waiting requesters.
    • magicblock-chainlink/src/submux/mod.rs — per-client abort channel wiring and resubscription orchestration.
    • Tests and call sites changed to use fallible try_new_with_metrics(...).unwrap() — ensure unwraps are acceptable.

Possibly related issues

Possibly related PRs

Suggested reviewers

  • GabrielePicco
  • bmuddha

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 54.47% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: enhance subscription management with metrics and reconnection' clearly and concisely summarizes the main changes—improved subscription handling, metrics tracking, and reconnection logic—which aligns with the primary objectives of the changeset.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch thlorenz/subscription-metrics

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
magicblock-chainlink/src/chainlink/fetch_cloner.rs (1)

1027-1040: Fix the function name in the log message and remove duplicate error logging.

Line 1032: The log message references "FetchCloner::clone_accounts" but the actual function is fetch_and_clone_accounts_with_dedup.

Lines 1031-1038: There's redundant error logging—inspect_err logs a detailed warning (including the pubkey), and then the error! macro logs the same error again without the pubkey. This creates noise and the second log provides less information.

Apply this diff to fix the function name and remove the redundant log:

                 if let Err(err) = receiver
                     .await
                     .inspect_err(|err| {
-                        warn!("FetchCloner::clone_accounts - RecvError occurred while awaiting account {}: {err:?}. This indicates the account fetch sender was dropped without sending a value.", pubkey);
+                        warn!("FetchCloner::fetch_and_clone_accounts_with_dedup - RecvError occurred while awaiting account {}: {err:?}. This indicates the account fetch sender was dropped without sending a value.", pubkey);
                     })
                 {
-                    // The sender was dropped, likely due to an error in the other request
-                    error!(
-                        "Failed to receive account from pending request: {err}"
-                    );
+                    // Error already logged by inspect_err above
                 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c7955c3 and cf9082a.

⛔ Files ignored due to path filters (2)
  • Cargo.lock is excluded by !**/*.lock
  • test-integration/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (21)
  • Cargo.toml (1 hunks)
  • magicblock-aperture/src/requests/http/mod.rs (1 hunks)
  • magicblock-aperture/src/requests/http/send_transaction.rs (1 hunks)
  • magicblock-chainlink/Cargo.toml (2 hunks)
  • magicblock-chainlink/src/chainlink/errors.rs (1 hunks)
  • magicblock-chainlink/src/chainlink/fetch_cloner.rs (2 hunks)
  • magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs (8 hunks)
  • magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs (11 hunks)
  • magicblock-chainlink/src/remote_account_provider/config.rs (4 hunks)
  • magicblock-chainlink/src/remote_account_provider/lru_cache.rs (2 hunks)
  • magicblock-chainlink/src/remote_account_provider/mod.rs (24 hunks)
  • magicblock-chainlink/src/submux/mod.rs (20 hunks)
  • magicblock-chainlink/src/testing/chain_pubsub.rs (1 hunks)
  • magicblock-metrics/src/metrics/mod.rs (1 hunks)
  • test-integration/Cargo.toml (1 hunks)
  • test-integration/test-chainlink/src/test_context.rs (1 hunks)
  • test-integration/test-chainlink/tests/ix_exceed_capacity.rs (1 hunks)
  • test-integration/test-chainlink/tests/ix_remote_account_provider.rs (1 hunks)
  • test-integration/test-cloning/tests/07_subscription_limits.rs (1 hunks)
  • test-integration/test-tools/Cargo.toml (1 hunks)
  • test-integration/test-tools/src/integration_test_context.rs (2 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-10-21T14:00:54.642Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 578
File: magicblock-aperture/src/requests/websocket/account_subscribe.rs:18-27
Timestamp: 2025-10-21T14:00:54.642Z
Learning: In magicblock-aperture account_subscribe handler (src/requests/websocket/account_subscribe.rs), the RpcAccountInfoConfig fields data_slice, commitment, and min_context_slot are currently ignored—only encoding is applied. This is tracked as technical debt in issue #579: https://github.com/magicblock-labs/magicblock-validator/issues/579

Applied to files:

  • magicblock-chainlink/src/remote_account_provider/lru_cache.rs
  • magicblock-aperture/src/requests/http/mod.rs
  • test-integration/test-chainlink/src/test_context.rs
  • test-integration/test-cloning/tests/07_subscription_limits.rs
  • magicblock-chainlink/src/remote_account_provider/config.rs
  • magicblock-chainlink/src/remote_account_provider/mod.rs
  • magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs
  • magicblock-chainlink/src/submux/mod.rs
  • magicblock-metrics/src/metrics/mod.rs
  • magicblock-chainlink/src/testing/chain_pubsub.rs
📚 Learning: 2025-10-14T09:56:14.047Z
Learnt from: taco-paco
Repo: magicblock-labs/magicblock-validator PR: 564
File: test-integration/programs/flexi-counter/src/processor/call_handler.rs:122-125
Timestamp: 2025-10-14T09:56:14.047Z
Learning: The file test-integration/programs/flexi-counter/src/processor/call_handler.rs contains a test smart contract used for integration testing, not production code.

Applied to files:

  • test-integration/test-chainlink/tests/ix_remote_account_provider.rs
  • test-integration/test-cloning/tests/07_subscription_limits.rs
📚 Learning: 2025-10-26T16:53:29.820Z
Learnt from: thlorenz
Repo: magicblock-labs/magicblock-validator PR: 587
File: magicblock-chainlink/src/remote_account_provider/mod.rs:134-0
Timestamp: 2025-10-26T16:53:29.820Z
Learning: In magicblock-chainlink/src/remote_account_provider/mod.rs, the `Endpoint::separate_pubsub_url_and_api_key()` method uses `split_once("?api-key=")` because the api-key parameter is always the only query parameter right after `?`. No additional query parameter parsing is needed for this use case.

Applied to files:

  • magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs
🧬 Code graph analysis (12)
magicblock-chainlink/src/remote_account_provider/lru_cache.rs (1)
magicblock-chainlink/src/remote_account_provider/mod.rs (1)
  • new (262-317)
test-integration/test-chainlink/src/test_context.rs (2)
magicblock-chainlink/src/remote_account_provider/config.rs (2)
  • try_new_with_metrics (27-42)
  • lifecycle_mode (51-53)
magicblock-chainlink/src/remote_account_provider/mod.rs (3)
  • try_from_clients_and_mode (173-192)
  • rpc_client (1016-1018)
  • rpc_client (1028-1030)
magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs (2)
magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs (6)
  • new (126-158)
  • subscription_count (177-192)
  • subscriptions (165-170)
  • subscriptions (194-203)
  • try_reconnect (454-492)
  • new_from_url (116-124)
magicblock-chainlink/src/submux/mod.rs (3)
  • new (147-158)
  • subscription_count (620-636)
  • subscriptions (641-643)
test-integration/test-chainlink/tests/ix_exceed_capacity.rs (1)
magicblock-chainlink/src/remote_account_provider/config.rs (2)
  • try_new_with_metrics (27-42)
  • subscribed_accounts_lru_capacity (55-57)
test-integration/test-tools/src/integration_test_context.rs (1)
magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs (1)
  • url (56-58)
test-integration/test-chainlink/tests/ix_remote_account_provider.rs (1)
magicblock-chainlink/src/remote_account_provider/config.rs (1)
  • try_new_with_metrics (27-42)
magicblock-chainlink/src/chainlink/fetch_cloner.rs (1)
magicblock-chainlink/src/remote_account_provider/config.rs (1)
  • try_new_with_metrics (27-42)
magicblock-chainlink/src/remote_account_provider/mod.rs (6)
magicblock-metrics/src/metrics/mod.rs (1)
  • set_monitored_accounts_count (320-322)
magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs (1)
  • new (126-158)
magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs (3)
  • new (46-54)
  • new (311-322)
  • try_new_from_url (165-180)
magicblock-chainlink/src/submux/mod.rs (1)
  • new (147-158)
magicblock-chainlink/src/remote_account_provider/lru_cache.rs (4)
  • new (25-34)
  • pubkeys (38-41)
  • pubkeys (129-135)
  • len (117-123)
magicblock-chainlink/src/remote_account_provider/config.rs (2)
  • default (65-72)
  • try_new_with_metrics (27-42)
magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs (4)
magicblock-chainlink/src/remote_account_provider/mod.rs (4)
  • new (262-317)
  • default (115-121)
  • unsubscribe (771-799)
  • update (422-422)
magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs (16)
  • new (46-54)
  • new (311-322)
  • url (56-58)
  • subscription_count (135-138)
  • subscription_count (237-248)
  • subscription_count (414-429)
  • subscriptions (140-140)
  • subscriptions (250-252)
  • subscriptions (431-434)
  • try_reconnect (147-147)
  • try_reconnect (257-266)
  • try_reconnect (439-442)
  • unsubscribe (123-126)
  • unsubscribe (219-235)
  • unsubscribe (402-410)
  • send (335-344)
magicblock-chainlink/src/submux/mod.rs (4)
  • new (147-158)
  • subscription_count (620-636)
  • subscriptions (641-643)
  • unsubscribe (587-595)
magicblock-chainlink/src/testing/chain_pubsub.rs (1)
  • unsubscribe (45-57)
magicblock-chainlink/src/submux/mod.rs (2)
magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs (4)
  • new (126-158)
  • subscription_count (177-192)
  • subscriptions (165-170)
  • subscriptions (194-203)
magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs (8)
  • new (46-54)
  • new (311-322)
  • subscription_count (135-138)
  • subscription_count (237-248)
  • subscription_count (414-429)
  • subscriptions (140-140)
  • subscriptions (250-252)
  • subscriptions (431-434)
magicblock-metrics/src/metrics/mod.rs (1)
magicblock-aperture/src/state/subscriptions.rs (2)
  • count (335-337)
  • count (370-372)
magicblock-chainlink/src/testing/chain_pubsub.rs (1)
magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs (1)
  • new_from_url (116-124)
🪛 GitHub Actions: Run CI - Format
test-integration/test-chainlink/tests/ix_remote_account_provider.rs

[error] 41-41: Rustfmt formatting check failed. Diff detected; run 'cargo +nightly fmt' (or 'cargo fmt') to fix formatting, then re-run the check.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build Project
  • GitHub Check: run_make_ci_lint
  • GitHub Check: run_make_ci_test
🔇 Additional comments (11)
magicblock-chainlink/src/chainlink/errors.rs (1)

21-21: LGTM! Error message improved for clarity.

The addition of "record" makes the error message more specific and precise without any breaking changes to the error variant structure.

magicblock-chainlink/src/chainlink/fetch_cloner.rs (1)

1507-1512: LGTM!

The test configuration correctly uses the new try_new_with_metrics API with appropriate parameters (LRU capacity: 1000, lifecycle mode: Ephemeral, metrics disabled) and properly handles the Result with unwrap() in the test context.

magicblock-aperture/src/requests/http/mod.rs (1)

115-115: The review comment is valid; maintain with minor refinement.

The verification confirms the inconsistency exists:

  • Line 115 uses debug! for single account ensure failures
  • Line 137 uses warn! for multiple accounts ensure failures
  • Both call the same chainlink.ensure_accounts() method
  • Both have identical fallback behavior and error handling comments

The logging level mismatch between semantically similar operations is a legitimate code quality concern worth addressing. The review comment's suggestions are reasonable and actionable.

test-integration/test-tools/Cargo.toml (1)

14-15: Metrics HTTP deps look good

Adding ureq and url via the workspace keeps the new metrics helper lean and consistent with the rest of the tooling.

test-integration/Cargo.toml (1)

100-101: Workspace pin aligns with new helper

Thanks for pinning ureq and url here—this lets the test tooling reuse the common versions cleanly.

magicblock-aperture/src/requests/http/send_transaction.rs (1)

1-1: Import change is fine

Pulling in the full log prelude keeps the new trace/debug statements compiling without extra edits elsewhere.

magicblock-chainlink/Cargo.toml (1)

7-17: Dependency additions check out

arc-swap, magicblock-metrics, and magicblock-delegation-program line up with the cache/metrics work in this crate.

Cargo.toml (1)

94-94: Good call on log feature

Dropping the release max-level gate matches the added trace instrumentation so it remains available in release profiles.

test-integration/test-tools/src/integration_test_context.rs (1)

1156-1203: Metric fetch helper looks solid

Parsing the RPC URL to derive the metrics endpoint and validating the Prometheus gauge gives the tests the observability they need.

magicblock-chainlink/src/remote_account_provider/lru_cache.rs (2)

117-135: Accessor additions make sense

The new helpers expose the cache state safely for metrics/reporting without breaking the locking strategy.


261-269: Test covers never-evicted set

Nice to see the invariant enforced so the clock sysvar always survives LRU churn.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6283d23 and 22c1817.

📒 Files selected for processing (4)
  • magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs (8 hunks)
  • magicblock-chainlink/src/testing/chain_pubsub.rs (2 hunks)
  • test-integration/test-chainlink/tests/chain_pubsub_actor.rs (3 hunks)
  • test-integration/test-chainlink/tests/chain_pubsub_client.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-10-21T14:00:54.642Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 578
File: magicblock-aperture/src/requests/websocket/account_subscribe.rs:18-27
Timestamp: 2025-10-21T14:00:54.642Z
Learning: In magicblock-aperture account_subscribe handler (src/requests/websocket/account_subscribe.rs), the RpcAccountInfoConfig fields data_slice, commitment, and min_context_slot are currently ignored—only encoding is applied. This is tracked as technical debt in issue #579: https://github.com/magicblock-labs/magicblock-validator/issues/579

Applied to files:

  • magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs
📚 Learning: 2025-10-26T16:53:29.820Z
Learnt from: thlorenz
Repo: magicblock-labs/magicblock-validator PR: 587
File: magicblock-chainlink/src/remote_account_provider/mod.rs:134-0
Timestamp: 2025-10-26T16:53:29.820Z
Learning: In magicblock-chainlink/src/remote_account_provider/mod.rs, the `Endpoint::separate_pubsub_url_and_api_key()` method uses `split_once("?api-key=")` because the api-key parameter is always the only query parameter right after `?`. No additional query parameter parsing is needed for this use case.

Applied to files:

  • magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs
🧬 Code graph analysis (4)
test-integration/test-chainlink/tests/chain_pubsub_actor.rs (2)
magicblock-chainlink/src/testing/chain_pubsub.rs (4)
  • reconnect (59-68)
  • setup_actor_and_client (14-29)
  • subscribe (31-43)
  • unsubscribe (45-57)
test-integration/test-tools/src/integration_test_context.rs (1)
  • airdrop (610-635)
test-integration/test-chainlink/tests/chain_pubsub_client.rs (1)
magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs (1)
  • try_new_from_url (165-180)
magicblock-chainlink/src/testing/chain_pubsub.rs (2)
magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs (1)
  • new_from_url (116-124)
magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs (1)
  • reconnect (81-111)
magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs (3)
magicblock-chainlink/src/remote_account_provider/mod.rs (4)
  • new (262-317)
  • default (115-121)
  • unsubscribe (771-799)
  • update (422-422)
magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs (16)
  • new (46-54)
  • new (311-322)
  • url (56-58)
  • subscription_count (135-138)
  • subscription_count (237-248)
  • subscription_count (414-429)
  • subscriptions (140-140)
  • subscriptions (250-252)
  • subscriptions (431-434)
  • try_reconnect (147-147)
  • try_reconnect (257-266)
  • try_reconnect (439-442)
  • unsubscribe (123-126)
  • unsubscribe (219-235)
  • unsubscribe (402-410)
  • send (335-344)
magicblock-chainlink/src/submux/mod.rs (4)
  • new (147-158)
  • subscription_count (620-636)
  • subscriptions (641-643)
  • unsubscribe (587-595)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: run_make_ci_test
  • GitHub Check: Build Project
🔇 Additional comments (8)
test-integration/test-chainlink/tests/chain_pubsub_actor.rs (2)

93-100: LGTM: Correctly validates reconnection behavior.

The test now properly validates that after reconnection, existing subscriptions are cleared and updates are not received until clients resubscribe. This aligns with the event-driven reconnection model described in the PR objectives.


154-167: LGTM: Comprehensive multi-account reconnection test.

The test validates that after reconnection:

  • All subscriptions are cleared (no updates for any account)
  • Selective resubscription works correctly (only first 3 accounts)
  • Previously unsubscribed accounts remain unsubscribed after reconnection
magicblock-chainlink/src/testing/chain_pubsub.rs (1)

59-68: LGTM: Clean API rename from recycle to reconnect.

The function correctly sends the renamed Reconnect message variant and handles the response appropriately.

magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs (5)

115-158: LGTM: Constructor properly initializes connection management state.

The constructor correctly:

  • Adds abort_sender parameter for signaling connection issues
  • Uses static atomic CLIENT_ID for unique instance identification
  • Initializes PubSubConnection and new tracking fields (is_connected, client_id)

177-203: LGTM: Connectivity-aware subscription tracking.

The methods correctly return zero/empty results when disconnected, representing that no active subscriptions exist from an external perspective even if the internal map hasn't been cleaned up yet. This design prevents callers from making incorrect assumptions about subscription state during connection issues.


280-307: Verify: Subscribe/unsubscribe are ignored when disconnected.

When is_connected is false, both subscribe and unsubscribe requests are acknowledged but not processed. While this prevents operations on a dead connection, it means:

  1. Subscribe requests during disconnection are silently ignored (caller thinks subscription succeeded)
  2. Unsubscribe requests leave entries in the subscriptions map until reconnection cleanup

This behavior may be intentional to avoid errors during reconnection, but consider whether callers should be notified of the disconnected state rather than receiving an Ok(()) response.

Does the higher-level code (RemoteAccountProvider, SubMux) handle retry logic when these operations succeed but don't actually modify subscription state? The silent acknowledgment could mask issues if callers expect immediate effect.


442-454: LGTM: Timeout prevents hanging on dead sockets during cleanup.

The 2-second timeout for unsubscribe prevents the cleanup task from hanging indefinitely when the connection is dead. The subscription is removed from the map regardless of whether the unsubscribe succeeds or times out, ensuring proper cleanup.


497-534: LGTM: Robust abort signaling with duplicate prevention.

The method correctly:

  • Uses atomic swap to prevent duplicate aborts (line 504)
  • Drains and cancels all subscriptions before signaling
  • Uses try_send to avoid blocking, treating channel-full as expected during active reconnection (lines 528-529)

The ordering ensures consistent state: subscriptions are cleaned up before signaling the connection issue, preventing races where new subscribe attempts could occur before cleanup completes.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1058443 and 31b49e5.

📒 Files selected for processing (1)
  • magicblock-chainlink/src/chainlink/fetch_cloner.rs (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
magicblock-chainlink/src/chainlink/fetch_cloner.rs (2)
magicblock-chainlink/src/remote_account_provider/lru_cache.rs (2)
  • pubkeys (38-41)
  • pubkeys (129-135)
magicblock-chainlink/src/remote_account_provider/config.rs (1)
  • try_new_with_metrics (27-42)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build Project
  • GitHub Check: run_make_ci_test
  • GitHub Check: run_make_ci_lint
🔇 Additional comments (2)
magicblock-chainlink/src/chainlink/fetch_cloner.rs (2)

1037-1044: Excellent improvement to error logging.

The addition of pubkey context in the error message significantly improves debuggability. Using inspect_err to log before propagating the error is idiomatic and helps trace which specific account fetch failed when multiple requests are pending.


1517-1522: LGTM!

The test correctly adopts the new try_new_with_metrics constructor with appropriate parameters for the test context (capacity=1000, ephemeral lifecycle, metrics disabled).

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 908733f and 37b9d32.

📒 Files selected for processing (1)
  • magicblock-chainlink/src/chainlink/mod.rs (3 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-11-07T13:20:13.754Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 589
File: magicblock-processor/src/scheduler/coordinator.rs:227-238
Timestamp: 2025-11-07T13:20:13.754Z
Learning: In magicblock-processor's ExecutionCoordinator (scheduler/coordinator.rs), the `account_contention` HashMap intentionally does not call `shrink_to_fit()`. Maintaining slack capacity is beneficial for performance by avoiding frequent reallocations during high transaction throughput. As long as empty entries are removed from the map (which `clear_account_contention` does), the capacity overhead is acceptable.

Applied to files:

  • magicblock-chainlink/src/chainlink/mod.rs
📚 Learning: 2025-10-21T14:00:54.642Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 578
File: magicblock-aperture/src/requests/websocket/account_subscribe.rs:18-27
Timestamp: 2025-10-21T14:00:54.642Z
Learning: In magicblock-aperture account_subscribe handler (src/requests/websocket/account_subscribe.rs), the RpcAccountInfoConfig fields data_slice, commitment, and min_context_slot are currently ignored—only encoding is applied. This is tracked as technical debt in issue #579: https://github.com/magicblock-labs/magicblock-validator/issues/579

Applied to files:

  • magicblock-chainlink/src/chainlink/mod.rs
🧬 Code graph analysis (1)
magicblock-chainlink/src/chainlink/mod.rs (1)
magicblock-chainlink/src/remote_account_provider/mod.rs (2)
  • new (262-317)
  • pubkeys_str (1076-1082)
🔇 Additional comments (4)
magicblock-chainlink/src/chainlink/mod.rs (4)

1-4: LGTM! Necessary imports for atomic counters.

The addition of AtomicU64 and Ordering supports the enhanced metrics tracking introduced in the reset_accounts_bank method.


143-147: LGTM! Clear categorization with atomic counters.

The atomic counters provide excellent observability into the account removal process, tracking five distinct categories for detailed metrics.


149-175: Verify the TODO regarding log level adjustment.

The removal logic is correct and well-structured with clear priority handling (blacklisted → delegated → DLP-owned → remaining). However, line 163 has a TODO comment suggesting the debug log should eventually be moved to trace level.

Consider whether the debug-level logging for each removed non-delegated, non-DLP-owned account (lines 164-167) might be too verbose in production. The TODO suggests moving it to trace level—confirm whether this adjustment should be made now or tracked separately.


324-339: LGTM! Enhanced observability for fetch operations.

Elevating the log level from trace to debug and including the mark_empty_if_not_found parameter provides valuable context for debugging account fetch operations, aligning with the PR's observability improvements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
magicblock-chainlink/src/remote_account_provider/mod.rs (1)

617-624: Don’t clobber in-flight fetch waiters

Overwriting an existing fetching entry drops the previous waiters’ senders, so their receivers observe RecvError even though the original fetch is still running. Sharing the pending vector is required when two callers request the same pubkey concurrently.

Apply this diff to merge with in-flight requests:

         {
             let mut fetching = self.fetching_accounts.lock().unwrap();
             for &pubkey in pubkeys {
                 let (sender, receiver) = oneshot::channel();
-                fetching.insert(pubkey, (fetch_start_slot, vec![sender]));
+                if let Some((_slot, waiters)) = fetching.get_mut(&pubkey) {
+                    waiters.push(sender);
+                } else {
+                    fetching.insert(pubkey, (fetch_start_slot, vec![sender]));
+                }
                 subscription_overrides.push((pubkey, receiver));
             }
         }
♻️ Duplicate comments (2)
test-integration/test-cloning/tests/07_subscription_limits.rs (1)

71-73: Good: JoinSet failures are surfaced

Switch to iterating join_all results propagates panics/cancellations. Matches prior feedback.

magicblock-chainlink/src/remote_account_provider/mod.rs (1)

715-724: Reinsert the evicted key when unsubscribe fails

We now remove the new key on failure, but the evicted key never goes back into the LRU even though the pubsub unsubscribe didn’t happen. That leaves the cache claiming we aren’t watching the evicted account while the client still is.

Apply this diff to restore the prior state before returning:

             if let Err(err) = self.pubsub_client.unsubscribe(evicted).await {
                 warn!(
                     "Failed to unsubscribe from pubsub for evicted account {evicted}: {err:?}");
                 // Rollback the LRU add since eviction failed
                 self.lrucache_subscribed_accounts.remove(pubkey);
+                let _ = self.lrucache_subscribed_accounts.add(evicted);
                 return Err(err);
             }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b284277 and fdbaed2.

📒 Files selected for processing (2)
  • magicblock-chainlink/src/remote_account_provider/mod.rs (24 hunks)
  • test-integration/test-cloning/tests/07_subscription_limits.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-10-14T09:56:14.047Z
Learnt from: taco-paco
Repo: magicblock-labs/magicblock-validator PR: 564
File: test-integration/programs/flexi-counter/src/processor/call_handler.rs:122-125
Timestamp: 2025-10-14T09:56:14.047Z
Learning: The file test-integration/programs/flexi-counter/src/processor/call_handler.rs contains a test smart contract used for integration testing, not production code.

Applied to files:

  • test-integration/test-cloning/tests/07_subscription_limits.rs
📚 Learning: 2025-10-21T14:00:54.642Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 578
File: magicblock-aperture/src/requests/websocket/account_subscribe.rs:18-27
Timestamp: 2025-10-21T14:00:54.642Z
Learning: In magicblock-aperture account_subscribe handler (src/requests/websocket/account_subscribe.rs), the RpcAccountInfoConfig fields data_slice, commitment, and min_context_slot are currently ignored—only encoding is applied. This is tracked as technical debt in issue #579: https://github.com/magicblock-labs/magicblock-validator/issues/579

Applied to files:

  • magicblock-chainlink/src/remote_account_provider/mod.rs
📚 Learning: 2025-10-26T16:53:29.820Z
Learnt from: thlorenz
Repo: magicblock-labs/magicblock-validator PR: 587
File: magicblock-chainlink/src/remote_account_provider/mod.rs:134-0
Timestamp: 2025-10-26T16:53:29.820Z
Learning: In magicblock-chainlink/src/remote_account_provider/mod.rs, the `Endpoint::separate_pubsub_url_and_api_key()` method uses `split_once("?api-key=")` because the api-key parameter is always the only query parameter right after `?`. No additional query parameter parsing is needed for this use case.

Applied to files:

  • magicblock-chainlink/src/remote_account_provider/mod.rs
🧬 Code graph analysis (1)
magicblock-chainlink/src/remote_account_provider/mod.rs (5)
magicblock-metrics/src/metrics/mod.rs (1)
  • set_monitored_accounts_count (320-322)
magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs (3)
  • new (46-54)
  • new (311-322)
  • try_new_from_url (165-180)
magicblock-chainlink/src/submux/mod.rs (1)
  • new (147-158)
magicblock-chainlink/src/remote_account_provider/lru_cache.rs (4)
  • new (25-34)
  • pubkeys (38-41)
  • pubkeys (129-135)
  • len (117-123)
magicblock-chainlink/src/remote_account_provider/config.rs (2)
  • default (65-72)
  • try_new_with_metrics (27-42)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: run_make_ci_lint
  • GitHub Check: Build Project
  • GitHub Check: run_make_ci_test
🔇 Additional comments (2)
test-integration/test-cloning/tests/07_subscription_limits.rs (2)

16-17: No action required: metrics port constant matches configuration The hardcoded port 9000 in the test aligns with metrics.port = 9000 in test-integration/configs/cloning-conf.ephem.toml.


37-37: Ignore zero-lamport clamp—unnecessary
Rent::default() uses non-zero lamports_per_byte_year and exemption_threshold, so minimum_balance(0) is guaranteed to return > 0—no clamp needed. citeturn3web0

Likely an incorrect or invalid review comment.

Comment on lines +35 to +37
// TODO: need to delegate half those instead as part of airdropping
// that way we can test unsub
let rent_exempt_amount = Rent::default().minimum_balance(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove or reword stale TODO

The code already delegates half the accounts. Update the comment or drop it to avoid confusion.

-    // TODO: need to delegate half those instead as part of airdropping
-    // that way we can test unsub
+    // Delegate half the accounts; only the non-delegated half should be monitored.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// TODO: need to delegate half those instead as part of airdropping
// that way we can test unsub
let rent_exempt_amount = Rent::default().minimum_balance(0);
// Delegate half the accounts; only the non-delegated half should be monitored.
let rent_exempt_amount = Rent::default().minimum_balance(0);
🤖 Prompt for AI Agents
In test-integration/test-cloning/tests/07_subscription_limits.rs around lines 35
to 37, the TODO comment claiming "need to delegate half those instead as part of
airdropping" is stale because the code already delegates half the accounts;
remove or reword that TODO to accurately reflect current behavior (e.g., delete
the comment or change it to note that delegation is already performed), leaving
the rent_exempt_amount line unchanged.

Comment on lines +56 to +69
if idx % 2 == 0 {
ctx.airdrop_chain_and_delegate(
&payer_chain,
&keypair,
rent_exempt_amount,
)
.expect(
"failed to airdrop and delegate to on-chain account",
);
} else {
ctx.airdrop_chain(&keypair.pubkey(), rent_exempt_amount)
.expect("failed to airdrop to on-chain account");
}
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Include pubkey in task error messages for faster triage

Enrich the failure messages so we know which account failed and why.

-            join_set.spawn(async move {
-                if idx % 2 == 0 {
-                    ctx.airdrop_chain_and_delegate(
-                        &payer_chain,
-                        &keypair,
-                        rent_exempt_amount,
-                    )
-                    .expect(
-                        "failed to airdrop and delegate to on-chain account",
-                    );
-                } else {
-                    ctx.airdrop_chain(&keypair.pubkey(), rent_exempt_amount)
-                        .expect("failed to airdrop to on-chain account");
-                }
-            });
+            join_set.spawn(async move {
+                let pk = keypair.pubkey();
+                if idx % 2 == 0 {
+                    ctx.airdrop_chain_and_delegate(
+                        &payer_chain,
+                        &keypair,
+                        rent_exempt_amount,
+                    )
+                    .unwrap_or_else(|e| panic!("failed to airdrop+delegate {pk}: {e}"));
+                } else {
+                    ctx.airdrop_chain(&pk, rent_exempt_amount)
+                        .unwrap_or_else(|e| panic!("failed to airdrop {pk}: {e}"));
+                }
+            });

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +111 to +117
// Wait 1 second for metrics update
tokio::time::sleep(Duration::from_secs(5)).await;

let monitored_accounts_after =
ctx.get_monitored_accounts_count(PORT).unwrap();
let diff = monitored_accounts_after - monitored_accounts_before;
debug!("Monitored accounts count total: {monitored_accounts_after}, diff: {diff}");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Avoid fixed sleeps; poll metrics with timeout to reduce flakiness

Replace the hard-coded sleep with a short poll + deadline.

-    // Wait 1 second for metrics update
-    tokio::time::sleep(Duration::from_secs(5)).await;
-
-    let monitored_accounts_after =
-        ctx.get_monitored_accounts_count(PORT).unwrap();
+    // Poll until metrics reflect expected increase or timeout (30s)
+    let monitored_accounts_after = {
+        let deadline = tokio::time::Instant::now() + Duration::from_secs(30);
+        loop {
+            let v = ctx.get_monitored_accounts_count(PORT).unwrap();
+            if v - monitored_accounts_before >= EXTRA_MONITORED_ACCOUNTS
+                || tokio::time::Instant::now() >= deadline
+            {
+                break v;
+            }
+            tokio::time::sleep(Duration::from_millis(250)).await;
+        }
+    };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Wait 1 second for metrics update
tokio::time::sleep(Duration::from_secs(5)).await;
let monitored_accounts_after =
ctx.get_monitored_accounts_count(PORT).unwrap();
let diff = monitored_accounts_after - monitored_accounts_before;
debug!("Monitored accounts count total: {monitored_accounts_after}, diff: {diff}");
// Poll until metrics reflect expected increase or timeout (30s)
let monitored_accounts_after = {
let deadline = tokio::time::Instant::now() + Duration::from_secs(30);
loop {
let v = ctx.get_monitored_accounts_count(PORT).unwrap();
if v - monitored_accounts_before >= EXTRA_MONITORED_ACCOUNTS
|| tokio::time::Instant::now() >= deadline
{
break v;
}
tokio::time::sleep(Duration::from_millis(250)).await;
}
};
let diff = monitored_accounts_after - monitored_accounts_before;
debug!("Monitored accounts count total: {monitored_accounts_after}, diff: {diff}");
🤖 Prompt for AI Agents
In test-integration/test-cloning/tests/07_subscription_limits.rs around lines
111 to 117, replace the fixed tokio::time::sleep(Duration::from_secs(5)).await
with a polling loop that repeatedly calls ctx.get_monitored_accounts_count(PORT)
until the expected change is observed or a deadline is reached; implement a
short poll interval (e.g., 100–200ms) and a total timeout (e.g., 2–5s) using
tokio::time::sleep inside the loop or tokio::time::timeout/Instant for the
deadline, and fail the test (return Err or panic) if the timeout elapses without
the expected metric change to avoid flakiness.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (2)
test-integration/test-cloning/tests/07_subscription_limits.rs (1)

71-73: Propagate JoinError to detect task failures.

The loop iterates over join_all() results but ignores them with the _result binding. If a spawned task panics or is cancelled, the JoinError is silently dropped, allowing the test to proceed with incomplete operations and produce misleading results. The comment "handled by join_all" is inaccurate—join_all() collects errors but does not propagate them.

Apply this diff to fail the test immediately on any task failure:

-        for _result in join_set.join_all().await {
-            // spawned task panicked or was cancelled - handled by join_all
+        for result in join_set.join_all().await {
+            result.expect("airdrop/delegate task failed");
         }
magicblock-chainlink/src/remote_account_provider/mod.rs (1)

715-725: Restore the evicted key in LRU when unsubscribe fails

When unsubscribe(evicted) errors we roll back the new pubkey, but the evicted key stays removed from the cache even though the pubsub client never dropped it. That leaves is_watching(evicted) false while the subscription is still live and can trigger duplicate subscribe attempts later—this was flagged previously.

         if let Err(err) = self.pubsub_client.unsubscribe(evicted).await {
             warn!(
                 "Failed to unsubscribe from pubsub for evicted account {evicted}: {err:?}");
             // Rollback the LRU add since eviction failed
             self.lrucache_subscribed_accounts.remove(pubkey);
+            self.lrucache_subscribed_accounts.add(evicted);
             return Err(err);
         }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b284277 and 3bc011d.

📒 Files selected for processing (3)
  • magicblock-chainlink/src/chainlink/mod.rs (3 hunks)
  • magicblock-chainlink/src/remote_account_provider/mod.rs (24 hunks)
  • test-integration/test-cloning/tests/07_subscription_limits.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (6)
📓 Common learnings
Learnt from: thlorenz
Repo: magicblock-labs/magicblock-validator PR: 621
File: magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs:457-495
Timestamp: 2025-11-07T14:20:31.425Z
Learning: In magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs, the unsubscribe closure returned by PubSubConnection::account_subscribe(...) resolves to () (unit), not a Result. Downstream code should not attempt to inspect an unsubscribe result and can optionally wrap it in a timeout to guard against hangs.
📚 Learning: 2025-11-07T13:20:13.793Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 589
File: magicblock-processor/src/scheduler/coordinator.rs:227-238
Timestamp: 2025-11-07T13:20:13.793Z
Learning: In magicblock-processor's ExecutionCoordinator (scheduler/coordinator.rs), the `account_contention` HashMap intentionally does not call `shrink_to_fit()`. Maintaining slack capacity is beneficial for performance by avoiding frequent reallocations during high transaction throughput. As long as empty entries are removed from the map (which `clear_account_contention` does), the capacity overhead is acceptable.

Applied to files:

  • magicblock-chainlink/src/chainlink/mod.rs
📚 Learning: 2025-10-21T14:00:54.642Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 578
File: magicblock-aperture/src/requests/websocket/account_subscribe.rs:18-27
Timestamp: 2025-10-21T14:00:54.642Z
Learning: In magicblock-aperture account_subscribe handler (src/requests/websocket/account_subscribe.rs), the RpcAccountInfoConfig fields data_slice, commitment, and min_context_slot are currently ignored—only encoding is applied. This is tracked as technical debt in issue #579: https://github.com/magicblock-labs/magicblock-validator/issues/579

Applied to files:

  • magicblock-chainlink/src/chainlink/mod.rs
  • test-integration/test-cloning/tests/07_subscription_limits.rs
  • magicblock-chainlink/src/remote_account_provider/mod.rs
📚 Learning: 2025-11-07T14:20:31.425Z
Learnt from: thlorenz
Repo: magicblock-labs/magicblock-validator PR: 621
File: magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs:457-495
Timestamp: 2025-11-07T14:20:31.425Z
Learning: In magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs, the unsubscribe closure returned by PubSubConnection::account_subscribe(...) resolves to () (unit), not a Result. Downstream code should not attempt to inspect an unsubscribe result and can optionally wrap it in a timeout to guard against hangs.

Applied to files:

  • magicblock-chainlink/src/chainlink/mod.rs
  • test-integration/test-cloning/tests/07_subscription_limits.rs
  • magicblock-chainlink/src/remote_account_provider/mod.rs
📚 Learning: 2025-10-14T09:56:14.047Z
Learnt from: taco-paco
Repo: magicblock-labs/magicblock-validator PR: 564
File: test-integration/programs/flexi-counter/src/processor/call_handler.rs:122-125
Timestamp: 2025-10-14T09:56:14.047Z
Learning: The file test-integration/programs/flexi-counter/src/processor/call_handler.rs contains a test smart contract used for integration testing, not production code.

Applied to files:

  • test-integration/test-cloning/tests/07_subscription_limits.rs
📚 Learning: 2025-10-26T16:53:29.820Z
Learnt from: thlorenz
Repo: magicblock-labs/magicblock-validator PR: 587
File: magicblock-chainlink/src/remote_account_provider/mod.rs:134-0
Timestamp: 2025-10-26T16:53:29.820Z
Learning: In magicblock-chainlink/src/remote_account_provider/mod.rs, the `Endpoint::separate_pubsub_url_and_api_key()` method uses `split_once("?api-key=")` because the api-key parameter is always the only query parameter right after `?`. No additional query parameter parsing is needed for this use case.

Applied to files:

  • magicblock-chainlink/src/remote_account_provider/mod.rs
🧬 Code graph analysis (2)
magicblock-chainlink/src/chainlink/mod.rs (1)
magicblock-chainlink/src/remote_account_provider/mod.rs (1)
  • pubkeys_str (1079-1085)
magicblock-chainlink/src/remote_account_provider/mod.rs (6)
magicblock-metrics/src/metrics/mod.rs (1)
  • set_monitored_accounts_count (320-322)
magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs (1)
  • new (126-158)
magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs (3)
  • new (46-54)
  • new (311-322)
  • try_new_from_url (165-180)
magicblock-chainlink/src/submux/mod.rs (1)
  • new (147-158)
magicblock-chainlink/src/remote_account_provider/lru_cache.rs (4)
  • new (25-34)
  • pubkeys (38-41)
  • pubkeys (129-135)
  • len (117-123)
magicblock-chainlink/src/remote_account_provider/config.rs (2)
  • default (65-72)
  • try_new_with_metrics (27-42)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build Project
🔇 Additional comments (4)
test-integration/test-cloning/tests/07_subscription_limits.rs (4)

1-17: LGTM! Clear setup and well-documented constants.

The imports are appropriate for an integration test, and the constants are well-documented. The math is correct: 400 accounts with half delegated results in 200 monitored accounts.


19-44: LGTM! Appropriate test configuration.

The test is correctly marked as ignored for manual execution, uses multi-threaded Tokio runtime for parallelism, and the payer funding (10 SOL) should be sufficient for the planned operations.


76-107: LGTM! Appropriate logging and account fetching.

The batch processing logs progress clearly, and the account fetching verifies that accounts are accessible via the ephemeral RPC. The trace-level pubkey logging is appropriately verbose for debugging without cluttering normal test output.


114-123: LGTM! Clear verification logic.

The final assertion correctly validates that the monitored accounts increased by exactly 200 (the non-delegated accounts), which is the core objective of this subscription limits test.

Comment on lines +2 to 8
collections::{HashMap, HashSet},
num::NonZeroUsize,
sync::{
atomic::{AtomicU64, Ordering},
Arc, Mutex,
},
time::Duration,
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Don't drop existing fetch waiters

fetching.insert replaces any in-flight entry, dropping the earlier waiters so their receivers resolve with RecvError even though the fetch eventually succeeds. Concurrent try_get_multi callers will start failing under load. Please append to the existing waiter list instead of clobbering it.

-use std::{
-    collections::{HashMap, HashSet},
+use std::{
+    collections::{hash_map::Entry, HashMap, HashSet},
@@
-            for &pubkey in pubkeys {
-                let (sender, receiver) = oneshot::channel();
-                fetching.insert(pubkey, (fetch_start_slot, vec![sender]));
-                subscription_overrides.push((pubkey, receiver));
-            }
+            for &pubkey in pubkeys {
+                let (sender, receiver) = oneshot::channel();
+                match fetching.entry(pubkey) {
+                    Entry::Occupied(mut entry) => {
+                        entry.get_mut().1.push(sender);
+                    }
+                    Entry::Vacant(entry) => {
+                        entry.insert((fetch_start_slot, vec![sender]));
+                    }
+                }
+                subscription_overrides.push((pubkey, receiver));
+            }

Also applies to: 618-645

Comment on lines +35 to +36
// TODO: need to delegate half those instead as part of airdropping
// that way we can test unsub
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove obsolete TODO comment.

The TODO suggests delegating half the accounts as part of airdropping, but this functionality is already implemented in lines 56-68 (even-indexed accounts are delegated via airdrop_chain_and_delegate, odd-indexed are not). The comment should be removed to avoid confusion.

Apply this diff:

-    // TODO: need to delegate half those instead as part of airdropping
-    // that way we can test unsub
     let rent_exempt_amount = Rent::default().minimum_balance(0);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// TODO: need to delegate half those instead as part of airdropping
// that way we can test unsub
let rent_exempt_amount = Rent::default().minimum_balance(0);
🤖 Prompt for AI Agents
In test-integration/test-cloning/tests/07_subscription_limits.rs around lines
35-36, remove the obsolete TODO comment "// TODO: need to delegate half those
instead as part of airdropping // that way we can test unsub" because the code
already delegates even-indexed accounts via airdrop_chain_and_delegate (lines
56-68); simply delete these two comment lines to avoid confusion.

Comment on lines +111 to +112
// Wait 1 second for metrics update
tokio::time::sleep(Duration::from_secs(5)).await;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix comment to match wait duration.

The comment states "Wait 1 second" but the code waits 5 seconds. Update the comment for consistency.

Apply this diff:

-    // Wait 1 second for metrics update
+    // Wait 5 seconds for metrics update
     tokio::time::sleep(Duration::from_secs(5)).await;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Wait 1 second for metrics update
tokio::time::sleep(Duration::from_secs(5)).await;
// Wait 5 seconds for metrics update
tokio::time::sleep(Duration::from_secs(5)).await;
🤖 Prompt for AI Agents
In test-integration/test-cloning/tests/07_subscription_limits.rs around lines
111 to 112, the inline comment incorrectly says "Wait 1 second" while the code
sleeps for 5 seconds; update the comment to accurately reflect the wait duration
(e.g., change to "Wait 5 seconds for metrics update") so the comment matches the
implementation.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
magicblock-chainlink/src/chainlink/mod.rs (1)

175-181: Remove unnecessary fetch_sub operation.

The fetch_sub at lines 178-181 is dead code—its result is unused and the mutation is never observed since into_inner() is called on all atomics in the logging statement. The non_empty calculation at lines 175-177 is already correct and sufficient for logging.

Apply this diff to remove the unnecessary operation:

     let non_empty = remaining
         .load(Ordering::Relaxed)
         .saturating_sub(remaining_empty.load(Ordering::Relaxed));
-    remaining.fetch_sub(
-        remaining_empty.load(Ordering::Relaxed),
-        Ordering::Relaxed,
-    );
-
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3bc011d and 14ecd09.

📒 Files selected for processing (2)
  • magicblock-chainlink/src/chainlink/mod.rs (3 hunks)
  • magicblock-processor/src/executor/processing.rs (2 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: thlorenz
Repo: magicblock-labs/magicblock-validator PR: 621
File: magicblock-chainlink/src/remote_account_provider/chain_pubsub_actor.rs:457-495
Timestamp: 2025-11-07T14:20:31.425Z
Learning: In magicblock-chainlink/src/remote_account_provider/chain_pubsub_client.rs, the unsubscribe closure returned by PubSubConnection::account_subscribe(...) resolves to () (unit), not a Result. Downstream code should not attempt to inspect an unsubscribe result and can optionally wrap it in a timeout to guard against hangs.
📚 Learning: 2025-11-07T13:20:13.793Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 589
File: magicblock-processor/src/scheduler/coordinator.rs:227-238
Timestamp: 2025-11-07T13:20:13.793Z
Learning: In magicblock-processor's ExecutionCoordinator (scheduler/coordinator.rs), the `account_contention` HashMap intentionally does not call `shrink_to_fit()`. Maintaining slack capacity is beneficial for performance by avoiding frequent reallocations during high transaction throughput. As long as empty entries are removed from the map (which `clear_account_contention` does), the capacity overhead is acceptable.

Applied to files:

  • magicblock-processor/src/executor/processing.rs
  • magicblock-chainlink/src/chainlink/mod.rs
📚 Learning: 2025-10-21T14:00:54.642Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 578
File: magicblock-aperture/src/requests/websocket/account_subscribe.rs:18-27
Timestamp: 2025-10-21T14:00:54.642Z
Learning: In magicblock-aperture account_subscribe handler (src/requests/websocket/account_subscribe.rs), the RpcAccountInfoConfig fields data_slice, commitment, and min_context_slot are currently ignored—only encoding is applied. This is tracked as technical debt in issue #579: https://github.com/magicblock-labs/magicblock-validator/issues/579

Applied to files:

  • magicblock-processor/src/executor/processing.rs
  • magicblock-chainlink/src/chainlink/mod.rs
🧬 Code graph analysis (1)
magicblock-chainlink/src/chainlink/mod.rs (2)
magicblock-chainlink/src/chainlink/fetch_cloner.rs (1)
  • new (134-158)
magicblock-chainlink/src/remote_account_provider/mod.rs (2)
  • new (262-317)
  • pubkeys_str (1079-1085)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: run_make_ci_test
  • GitHub Check: Build Project
🔇 Additional comments (6)
magicblock-chainlink/src/chainlink/mod.rs (4)

1-4: LGTM: Atomic imports added correctly.

The imports are appropriate for the atomic counter operations introduced in reset_accounts_bank.


143-147: LGTM: Atomic counters initialized correctly.

The atomic counters provide clear categorization for account removal tracking and improve observability.


149-173: LGTM: Account removal logic is well-structured.

The explicit branching for blacklisted, delegated, and DLP-owned accounts provides clear categorization. The atomic increments with Relaxed ordering are appropriate for these independent statistics counters.


183-194: LGTM: Logging calculation fixed correctly.

The logging now correctly uses the pre-calculated non_empty value, addressing the previous review concern about incorrect fetch_sub usage inline. The detailed breakdown provides excellent observability into account removal categories.

magicblock-processor/src/executor/processing.rs (2)

10-10: LGTM: Import is necessary and correct.

The ReadableAccount trait import is required for the lamports() method used in the filter condition at line 292.


287-294: Account closure handling is correctly delegated to chainlink.

The processor intentionally skips persisting empty accounts (lamports == 0) because chainlink manages account closures independently through direct pubsub subscriptions to the chain. When an account closes, chainlink receives that closure event directly from the chain (see listen_for_account_updates in magicblock-chainlink/src/remote_account_provider/mod.rs), not through the processor's accounts_tx channel. This separation of concerns is documented in the code comment at line 289-291 and is the correct design.

Likely an incorrect or invalid review comment.

Comment on lines +327 to 342
if log::log_enabled!(log::Level::Debug) {
let pubkeys_str = pubkeys
.iter()
.map(|p| p.to_string())
.collect::<Vec<_>>()
.join(", ");
trace!("Fetching accounts: {pubkeys_str}");
let mark_empty_str = mark_empty_if_not_found
.map(|keys| {
keys.iter()
.map(|p| p.to_string())
.collect::<Vec<_>>()
.join(", ")
})
.unwrap_or_default();
trace!("Fetching accounts: {pubkeys_str}, mark_empty_if_not_found: {mark_empty_str}");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix logging level inconsistency.

The code gates on log::Level::Debug (line 327) but then uses the trace!() macro (line 341). This is inconsistent—if Debug is enabled but Trace is not, the expensive string formatting will execute but produce no output.

Apply this diff to align the macro with the gate:

-        if log::log_enabled!(log::Level::Debug) {
+        if log::log_enabled!(log::Level::Trace) {
             let pubkeys_str = pubkeys
                 .iter()
                 .map(|p| p.to_string())

Or, if Debug-level logging is intentional, use the debug!() macro instead:

             let mark_empty_str = mark_empty_if_not_found
                 .map(|keys| {
                     keys.iter()
                         .map(|p| p.to_string())
                         .collect::<Vec<_>>()
                         .join(", ")
                 })
                 .unwrap_or_default();
-            trace!("Fetching accounts: {pubkeys_str}, mark_empty_if_not_found: {mark_empty_str}");
+            debug!("Fetching accounts: {pubkeys_str}, mark_empty_if_not_found: {mark_empty_str}");
         }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if log::log_enabled!(log::Level::Debug) {
let pubkeys_str = pubkeys
.iter()
.map(|p| p.to_string())
.collect::<Vec<_>>()
.join(", ");
trace!("Fetching accounts: {pubkeys_str}");
let mark_empty_str = mark_empty_if_not_found
.map(|keys| {
keys.iter()
.map(|p| p.to_string())
.collect::<Vec<_>>()
.join(", ")
})
.unwrap_or_default();
trace!("Fetching accounts: {pubkeys_str}, mark_empty_if_not_found: {mark_empty_str}");
}
if log::log_enabled!(log::Level::Debug) {
let pubkeys_str = pubkeys
.iter()
.map(|p| p.to_string())
.collect::<Vec<_>>()
.join(", ");
let mark_empty_str = mark_empty_if_not_found
.map(|keys| {
keys.iter()
.map(|p| p.to_string())
.collect::<Vec<_>>()
.join(", ")
})
.unwrap_or_default();
debug!("Fetching accounts: {pubkeys_str}, mark_empty_if_not_found: {mark_empty_str}");
}
🤖 Prompt for AI Agents
In magicblock-chainlink/src/chainlink/mod.rs around lines 327 to 342, the code
checks log::Level::Debug but then calls trace!(), causing expensive string
formatting to run when Debug is enabled but Trace is not; fix by making the gate
and macro consistent: either change the guard to log::Level::Trace (or use
log::log_enabled!(log::Level::Trace)) if you want trace-level logging, or
replace trace!() with debug!() if debug-level output was intended, ensuring the
enabled-level check matches the chosen logging macro.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants