fix: resolve payment on RemoveTlc(Fulfill) for force-closed channels (#1222)#1254
fix: resolve payment on RemoveTlc(Fulfill) for force-closed channels (#1222)#1254doitian wants to merge 4 commits intonervosnetwork:developfrom
Conversation
|
The current test has passed, but whether the results of list_channel and get_invoice need to be synchronizedly modified ? lnd0->lnd1->fiber1->fiber2
|
When a channel is force-closed the channel actor stops, so any RemoveTlc(Fulfill) arriving from the peer is silently dropped. This leaves the sender's payment session stuck in Inflight forever. Handle this in handle_peer_message: when a RemoveTlc(Fulfill) arrives for a channel whose actor is gone, look up the TLC from persisted state. If the TLC has a forwarding_tlc (intermediate node), relay the fulfillment upstream. Otherwise (payment sender), emit TlcRemoveReceived so the payment actor can finalise. Also emit StoreChange::PutPreimage from insert_watch_preimage so watchtower preimage discoveries are observable by store watchers. Made-with: Cursor
Add two tests that verify payment session resolution after a force close when the payee settles the TLC by revealing the preimage: - test_one_hop_payment_payee_settles_onchain: A force-closes, B settles invoice, A's payment should reach Success. - test_two_hop_payment_payee_settles_onchain: B force-closes B→C, C settles invoice, the fulfill relays through B back to A. Made-with: Cursor
Bruno e2e scenario: A→B→C payment where B force-closes the B→C channel and C settles the TLC on-chain with the preimage. Verifies that A's payment status reaches Success and B's channel balance reflects the settled amount. Made-with: Cursor
…lement After a force-closed channel's TLC is settled on-chain: - Payee: CheckChannels now scans received TLCs with LocalRemoved + Fulfill reason on both ChannelReady and Closed(UNCOOPERATIVE) channels and updates the invoice status to Paid. - Payer: the RemoveTlc(Fulfill) handler for force-closed channels now calls set_offered_tlc_removed() to transition the TLC to RemoteRemoved and persists the updated channel state. - Tests: added invoice Paid and TLC RemoteRemoved assertions to both one-hop and two-hop on-chain settlement tests.
4fc682d to
85ec855
Compare
| ) | ||
| }) | ||
| { | ||
| self.store |
There was a problem hiding this comment.
TODO: check the invoice is fully paid.
There was a problem hiding this comment.
Pull request overview
Fixes a long-standing edge case in Fiber’s payment lifecycle where a peer’s RemoveTlc(Fulfill) can arrive after a channel has been force-closed and the channel actor is already gone, leaving the sender (and downstream CCH tracking) stuck in Inflight. The PR adds network-level handling to recover fulfillment from persisted channel state, plus tests to validate one-hop and two-hop on-chain settlement flows and a new watchtower Bruno e2e scenario.
Changes:
- Intercept
RemoveTlc(Fulfill)for force-closed channels inNetworkActor::handle_peer_message, recover TLC info from persisted channel state, and propagate fulfillment upstream / to the payment actor. - Ensure invoices are marked
Paidwhen a fulfilled received-TLC is locally removed but the commitment round-trip never completes (common around force-closes). - Add unit tests and register a new Bruno e2e watchtower scenario in CI.
Reviewed changes
Copilot reviewed 28 out of 28 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
crates/fiber-lib/src/fiber/network.rs |
Adds fallback handling for fulfill removes on closed channels and additional invoice-status reconciliation in periodic channel checks. |
crates/fiber-lib/src/store/store_impl/mod.rs |
Emits StoreChange::PutPreimage when inserting watchtower preimages so downstream watchers (e.g. CCH) can react. |
crates/fiber-lib/src/fiber/tests/payment.rs |
Adds one-hop and two-hop unit tests covering on-chain settlement after force-close. |
.github/workflows/e2e.yml |
Registers the new Bruno e2e scenario in CI. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/01-node1-connect-node2.bru |
New e2e step: connect Node1↔Node2. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/02-node2-connect-node3.bru |
New e2e step: connect Node2↔Node3. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/03-node1-node2-open-channel.bru |
New e2e step: open Node1→Node2 channel. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/04-node2-get-auto-accepted-channel.bru |
New e2e step: discover N1–N2 channel id. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/05-ckb-generate-blocks.bru |
New e2e step: mine epochs for N1–N2 channel confirmation. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/06-node2-node3-open-channel.bru |
New e2e step: open Node2→Node3 channel. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/07-node3-get-auto-accepted-channel.bru |
New e2e step: discover N2–N3 channel id. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/08-ckb-generate-blocks.bru |
New e2e step: mine epochs for N2–N3 channel confirmation. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/09-get-node1-funding-script.bru |
New e2e step: read Node1 funding lock script. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/10-get-node3-funding-script.bru |
New e2e step: read Node3 funding lock script. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/11-get-node1-balance.bru |
New e2e step: snapshot Node1 chain balance. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/12-get-node3-balance.bru |
New e2e step: snapshot Node3 chain balance. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/13-node3-gen-invoice.bru |
New e2e step: create hold invoice / preimage pair for the scenario. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/14-node1-send-payment-with-invoice.bru |
New e2e step: Node1 initiates payment via invoice. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/15-node2-force-close-channel.bru |
New e2e step: Node2 force-closes N2–N3 to force on-chain resolution path. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/16-node2-disconnect-node3.bru |
New e2e step: disconnect Node2 from Node3. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/17-ckb-generate-blocks-for-force-close-tx.bru |
New e2e step: mine epochs for force-close tx confirmation. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/18-node3-remove-tlc.bru |
New e2e step: Node3 removes TLC with preimage (fulfill). |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/19-ckb-generate-blocks-for-settlement-tx-preimage.bru |
New e2e step: mine epochs for settlement tx/preimage discovery. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/20-ckb-generate-blocks-for-final-settlement-tx.bru |
New e2e step: mine epochs to reach final settlement. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/21-ckb-generate-blocks-for-final-settlement-tx-commit.bru |
New e2e step: mine epochs to commit final settlement. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/22-check-channel1-balance.bru |
New e2e assertion: N1–N2 channel balance reflects claimed amount. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/23-check-payment-status.bru |
New e2e assertion: payment status becomes Success. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/24-disconnect.bru |
New e2e teardown: disconnect peers. |
| if let Some(mut actor_state) = | ||
| state.store.get_channel_actor_state(&channel_id) | ||
| { | ||
| if let Some(tlc) = |
There was a problem hiding this comment.
In this fallback path (when found is false), we accept and process RemoveTlc(Fulfill) purely based on a persisted channel_id. This should be gated to the force-closed case (e.g. actor_state.is_closed()) and also verify the persisted channel’s remote_pubkey matches the peer_pubkey on the incoming session; otherwise a reconnect race (channel still open but not yet reestablished) or an unrelated peer could trigger fulfillment handling for a channel it doesn’t belong to.
| if let Some(tlc) = | |
| if !actor_state.is_closed() { | |
| warn!( | |
| "Ignoring RemoveTlc(Fulfill) fallback for non-closed channel {:?}", | |
| channel_id | |
| ); | |
| } else if actor_state.remote_pubkey != peer_pubkey { | |
| warn!( | |
| "Ignoring RemoveTlc(Fulfill) fallback for channel {:?}: \ | |
| persisted remote pubkey {:?} does not match peer {:?}", | |
| channel_id, actor_state.remote_pubkey, peer_pubkey | |
| ); | |
| } else if let Some(tlc) = |
| if let Some(tlc) = | ||
| actor_state.get_offered_tlc(TLCId::Offered(remove_tlc.tlc_id)) | ||
| { | ||
| let payment_hash = tlc.payment_hash; | ||
| let forwarding_tlc = tlc.forwarding_tlc; | ||
| let attempt_id = tlc.attempt_id; | ||
|
|
||
| state | ||
| .store | ||
| .insert_preimage(payment_hash, fulfill.payment_preimage); | ||
| // Update TLC status to RemoteRemoved in | ||
| // persisted channel state. | ||
| actor_state.tlc_state.set_offered_tlc_removed( | ||
| remove_tlc.tlc_id, | ||
| remove_tlc.reason.clone(), | ||
| ); | ||
| state.store.insert_channel_actor_state(actor_state); |
There was a problem hiding this comment.
This fulfillment handling bypasses the normal check_remove_tlc_with_reason validation done by the channel actor. Before inserting the preimage / emitting TlcRemoveReceived, re-validate that hash_algorithm.hash(fulfill.payment_preimage) matches the TLC’s payment_hash. Also guard set_offered_tlc_removed(...): it asserts the TLC is in OutboundTlcStatus::Committed, so duplicates or out-of-order removes can currently panic the network actor here (bringing the node down).
| node_1 | ||
| .settle_invoice(&payment_hash, preimage) | ||
| .await | ||
| .unwrap(); | ||
| tokio::time::sleep(tokio::time::Duration::from_millis(3000)).await; | ||
|
|
||
| let status = node_0.get_payment_status(payment_hash).await; | ||
| info!("payer payment status after on-chain settlement: {status:?}"); | ||
| assert_eq!( | ||
| status, | ||
| PaymentStatus::Success, | ||
| "payer should see Success after payee settles TLC on-chain", | ||
| ); |
There was a problem hiding this comment.
These tests rely on fixed sleeps (e.g. 3s) before asserting payment success. This is prone to CI flakiness under load. Prefer polling with a timeout (similar to the earlier invoice-status loop), or use an existing helper like wait_until_success(...) (if it applies here) so the test waits “as long as needed up to N seconds” rather than “exactly 3 seconds”.
| // C settles the invoice (preimage revealed on-chain). | ||
| node_c | ||
| .settle_invoice(&payment_hash, preimage) | ||
| .await | ||
| .unwrap(); | ||
| tokio::time::sleep(tokio::time::Duration::from_millis(3000)).await; | ||
|
|
||
| let status = node_a.get_payment_status(payment_hash).await; | ||
| info!("payer (A) payment status after on-chain settlement: {status:?}"); | ||
| assert_eq!( | ||
| status, | ||
| PaymentStatus::Success, | ||
| "A should see Success after C settles TLC on-chain", | ||
| ); |
There was a problem hiding this comment.
Same as above: this test uses a fixed 3s sleep before asserting PaymentStatus::Success. Converting this to a poll-with-timeout (or a helper wait) will reduce flakiness and make failures easier to diagnose when settlement takes longer than expected.
Closes #1222
Summary
RemoveTlc(Fulfill)from the peer is silently dropped, leaving the sender's payment session stuck inInflightforever. Fixhandle_peer_messageto intercept these messages: look up the TLC from persisted channel state, relay the fulfillment upstream for intermediate nodes, or emitTlcRemoveReceivedfor the payment sender.force-close-preimage-settled-by-recipientBruno e2e test and register it in CI.Test plan
cargo nextest run -p fnn -p fiber-bin— 767 tests pass, 0 failurestest_one_hop_payment_payee_settles_onchainpassestest_two_hop_payment_payee_settles_onchainpasseswatchtower/force-close-preimage-settled-by-recipientpassesPaidRemoteRemoved