Skip to content

Persist external funding state across restarts#1252

Open
joii2020 wants to merge 6 commits intonervosnetwork:developfrom
joii2020:dev.ext-funding-store
Open

Persist external funding state across restarts#1252
joii2020 wants to merge 6 commits intonervosnetwork:developfrom
joii2020:dev.ext-funding-store

Conversation

@joii2020
Copy link
Copy Markdown
Contributor

@joii2020 joii2020 commented Apr 7, 2026

Summary

This PR fixes a restart-recovery issue in the external funding flow.

In the fiber-wasm scenario, cross-origin restrictions, especially on many mobile browsers, mean JoyID signing often has to be completed through a redirect flow. In practice, this can cause fiber-wasm to restart, which previously caused the runtime state created by open_channel_with_external_funding to be lost. As a result, the subsequent submit_signed_funding_tx call would fail.

Changes

This PR adds persisted recovery state for the external funding flow so the following scenarios can continue correctly after a restart:

  • submit_signed_funding_tx can still be executed after restart
  • the post-submission channel handshake can resume after restart
  • external funding timeout behavior still applies after restart
  • persisted external funding recovery state is cleared once the flow reaches a terminal path

Specifically, this PR:

  • adds a dedicated persisted recovery state for external funding
  • restores external-funding-related runtime state from the store during reload / reestablish
  • allows submit_signed_funding_tx to recover the missing channel state before continuing the flow
  • updates the store schema and related store tests
  • adds restart-focused tests and an e2e restart script for the external funding flow

Test Coverage

This PR adds or updates coverage for:

  • submitting submit_signed_funding_tx after restart
  • resuming the external funding handshake after initiator restart
  • resuming the external funding handshake after acceptor restart
  • ensuring external funding timeout still aborts correctly after restart
  • ensuring external funding recovery state is cleared after the channel reaches a terminal path
  • adding an e2e test script at tests/bruno/e2e/external-funding-open/run-restart-test.sh

The restart behavior is added on top of the existing Bruno e2e flow. Since node restart is difficult to express cleanly in the Bruno workflow itself, this part is covered by a dedicated script.

Notes

This PR is focused on restart recovery for the external funding flow and does not change RPC semantics.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses restart-recovery gaps in the external funding channel-open flow by persisting enough state to allow submit_signed_funding_tx and the subsequent handshake to resume after a node restart (notably important for fiber-wasm redirect-based signing flows).

Changes:

  • Add a persisted ExternalFundingRecoveryState (new store key prefix + store API) and hydrate external-funding runtime state on reload/reestablish.
  • Update channel/network logic to persist/move channel state appropriately and to restore/reestablish channels when submit_signed_funding_tx happens after restart.
  • Add/extend store + restart-focused unit tests and an e2e restart regression script/docs.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
tests/bruno/e2e/external-funding-open/run-restart-test.sh Adds a restart regression script that restarts node1 between signing and submit_signed_funding_tx.
tests/bruno/e2e/external-funding-open/README.md Documents the restart regression script and how to run it.
crates/fiber-types/src/schema.rs Adds a new store key prefix for external funding recovery state.
crates/fiber-types/src/channel.rs Introduces ExternalFundingRecoveryState type for persisted recovery data.
crates/fiber-lib/src/store/tests/store.rs Adds store-level persistence tests for external funding recovery state.
crates/fiber-lib/src/store/store_impl/mod.rs Implements persistence for ExternalFundingRecoveryState and adds move_channel_actor_state.
crates/fiber-lib/src/store/.schema.json Updates schema hash for the new persisted type.
crates/fiber-lib/src/fiber/tests/settle_tlc_set_command_tests.rs Updates mock store to satisfy the extended ChannelActorStateStore trait.
crates/fiber-lib/src/fiber/tests/channel.rs Adds restart-focused tests for external funding submit/handshake/timeout/cleanup behavior.
crates/fiber-lib/src/fiber/network.rs Allows submit_signed_funding_tx to restore/reestablish missing channel state after restart.
crates/fiber-lib/src/fiber/channel.rs Persists/hydrates external funding recovery state and clears it on terminal paths; moves acceptor state under final channel id for restart routing.

Comment thread tests/bruno/e2e/external-funding-open/README.md
Comment thread tests/bruno/e2e/external-funding-open/README.md Outdated
Comment thread tests/bruno/e2e/external-funding-open/run-restart-test.sh
@joii2020 joii2020 force-pushed the dev.ext-funding-store branch 2 times, most recently from c4f0980 to 28e9d89 Compare April 8, 2026 12:44
@quake
Copy link
Copy Markdown
Member

quake commented Apr 10, 2026

Instead of introducing a new store prefix and ExternalFundingRecoveryState with separate get/insert/delete methods, we can add a single Option field to the existing ChannelActorData:

#[serde_as]
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct ExternalFundingPersistState {
    #[serde_as(as = "EntityHex")]
    pub funding_lock_script: Script,
    #[serde_as(as = "Vec<EntityHex>")]
    pub funding_lock_script_cell_deps: Vec<packed::CellDep>,
    #[serde_as(as = "EntityHex")]
    pub unsigned_funding_tx: Transaction,
    pub started_at_ms: u64,
    pub signed_submitted: bool,
}

Append to ChannelActorData:

#[serde(default)]
pub external_funding: Option<ExternalFundingPersistState>,

When an external funding flow begins, populate external_funding = Some(...) and persist via the normal insert_channel_actor_state path. On deserialization (restart/reload), rebuild ephemeral_config.external_funding from the persisted core.external_funding field. When the channel reaches ChannelReady or Closed, set it back to None.

This eliminates: No new store key prefix or schema change, no new ExternalFundingRecoveryState type with separate store operations, no new trait methods on ChannelActorStateStore (no mock/wasm changes needed). And the should_persist_channel_state special case can be removed, channel state is always persisted normally.

Migration is simple: append 1 byte (0u8, the bincode encoding of Option::None) to existing serialized ChannelActorState entries.

@joii2020 joii2020 force-pushed the dev.ext-funding-store branch from 8b0a540 to 27f8785 Compare April 14, 2026 03:29
Comment thread crates/fiber-lib/src/fiber/channel.rs
Comment thread crates/fiber-lib/src/fiber/channel.rs
Joii added 4 commits April 17, 2026 11:13
Store pending external funding state separately from the channel runtime so submit_signed_funding_tx can recover after node restart without any RPC changes.

Hydrate the runtime from store when reloading or reestablishing a channel, clear the persisted state on terminal paths, and teach the network actor to restore a missing channel before forwarding SubmitSignedFundingTx.

Add store coverage and restart-focused external funding tests, and update the store schema for the new persisted record.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.


### Restart regression script

[`run-restart-test.sh`](tests/bruno/e2e/external-funding-open/run-restart-test.sh) is a restart regression test for the external funding flow. It specifically verifies that after node1 has already:
Copy link

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The markdown link to run-restart-test.sh uses a path that is already relative to this README’s directory, so it will resolve incorrectly on GitHub (it effectively duplicates tests/bruno/e2e/external-funding-open/ in the URL). Update it to a path relative to this README (e.g., ./run-restart-test.sh).

Suggested change
[`run-restart-test.sh`](tests/bruno/e2e/external-funding-open/run-restart-test.sh) is a restart regression test for the external funding flow. It specifically verifies that after node1 has already:
[`run-restart-test.sh`](./run-restart-test.sh) is a restart regression test for the external funding flow. It specifically verifies that after node1 has already:

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using this path here would be more intuitive.

Comment thread crates/fiber-lib/src/fiber/channel.rs Outdated
Comment thread migrate/src/migrations/mig_20260303_external_funding_field.rs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants