
Conversation

@TheBlueMatt TheBlueMatt commented Aug 18, 2025

Based on #4007, this does the actual integration. It will probably go in after #3979, so drafting for now.


codecov bot commented Aug 18, 2025

Codecov Report

❌ Patch coverage is 65.15152% with 23 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.76%. Comparing base (a18f362) to head (d9a278c).
⚠️ Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
lightning/src/ln/channel.rs 53.33% 7 Missing ⚠️
lightning/src/ln/channelmanager.rs 61.11% 6 Missing and 1 partial ⚠️
lightning/src/util/ser.rs 0.00% 7 Missing ⚠️
lightning/src/ln/quiescence_tests.rs 92.30% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4019      +/-   ##
==========================================
- Coverage   88.76%   88.76%   -0.01%     
==========================================
  Files         176      176              
  Lines      128139   128185      +46     
==========================================
+ Hits       113743   113777      +34     
- Misses      11822    11828       +6     
- Partials     2574     2580       +6     
Flag Coverage Δ
fuzzing 21.90% <2.50%> (-0.02%) ⬇️
tests 88.59% <65.15%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.

@TheBlueMatt TheBlueMatt force-pushed the 2025-08-splice-quiescent-real branch from be3b93a to ddb3b37 Compare August 20, 2025 17:25
let tx16 = TransactionU16LenLimited::new(tx)
.map_err(|_e| APIError::APIMisuseError { err: format!("Too large transaction") })?;

// TODO(splicing): Check that transactions aren't too big for the splice_init message here.
Contributor

We probably just want to do all input validation here before we even attempt quiescence so that we can fail early and fast. No point in going through the whole stfu dance just to immediately fail back to the user with "insufficient fees" or something similar.
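
A rough sketch of the fail-fast shape this suggests; the function name and the specific checks here are illustrative, not LDK's actual API:

fn validate_splice_params(
    contribution_sats: i64, spendable_sats: u64, feerate_per_kw: u32, min_feerate_per_kw: u32,
) -> Result<(), String> {
    // Run every cheap parameter check before any `stfu` is sent, so a bad
    // request fails immediately instead of after a full quiescence handshake.
    if feerate_per_kw < min_feerate_per_kw {
        return Err(format!("Feerate {} is below minimum {}", feerate_per_kw, min_feerate_per_kw));
    }
    if contribution_sats < 0 && contribution_sats.unsigned_abs() > spendable_sats {
        return Err("Insufficient funds for splice-out".to_owned());
    }
    Ok(())
}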

Collaborator Author

Yep, that's definitely the goal. Kinda leaving this until the parameters are clear and I'll clean it up then.

ldk-reviews-bot commented Aug 21, 2025

👋 Thanks for assigning @jkczyz as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

When doing an outbound splice, we check some of the instructions
first in utility methods, then convert errors to `APIError`s. These
utility methods should thus either return an `APIError` or a more
generic (string or `()`) error type, but they currently return a
`ChannelError`, which is only appropriate when the calling code will
do what the `ChannelError` instructs (including closing the
channel).

Here we fix that by returning `String`s instead.
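
A minimal sketch of the resulting pattern, with an illustrative helper; only the caller that owns the API boundary wraps the error:

fn check_splice_feerate(feerate_per_kw: u32) -> Result<(), String> {
    // Plain String error: no channel-closure semantics implied, the caller
    // decides how to surface the failure.
    if feerate_per_kw < 253 {
        return Err(format!("Feerate {} is below the 253 sat/kw floor", feerate_per_kw));
    }
    Ok(())
}

// At the API boundary:
// check_splice_feerate(feerate_per_kw)
//     .map_err(|err| APIError::APIMisuseError { err })?;
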
Now that we have a `QuiescentAction` to track what we intend to do
once we reach quiescence, we need to use it to initiate splices.

Here we do so, adding a new `SpliceInstructions` to track the
arguments that are currently passed to `splice_channel`. While
these may not be exactly the right arguments to track in the end,
a lot of the splice logic is still in flight, so we can worry about
it later.
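
For orientation, a rough sketch of those shapes; the field set is a guess, and the commit itself notes the exact arguments may change:

/// What to do once both sides are quiescent.
enum QuiescentAction {
    Splice(SpliceInstructions),
}

struct SpliceInstructions {
    /// Mirrors `splice_channel`'s current arguments: positive to splice in,
    /// negative to splice out.
    our_funding_contribution_satoshis: i64,
    funding_feerate_per_kw: u32,
    locktime: u32,
    // Funding inputs and change script elided; their serialized forms would go
    // through the ser.rs changes listed in the coverage report above.
}
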
While we have a test to disconnect a peer if we're waiting on an
`stfu` message, we also disconnect if we've reached quiescence but
we're waiting on a peer to do "something fundamental" and they take
too long to do so. We test that behavior here.
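
Condensed, the tested behavior looks roughly like this; the constant and helper names follow LDK's functional-test style, and the exact assertions are assumptions:

// Both sides are quiescent, but the counterparty never sends its splice_init.
for _ in 0..DISCONNECT_PEER_AWAITING_RESPONSE_TICKS {
    nodes[1].node.timer_tick_occurred();
}
// nodes[1] gave up waiting on "something fundamental" and should now ask to
// disconnect the peer.
let msg_events = nodes[1].node.get_and_clear_pending_msg_events();
assert_eq!(msg_events.len(), 1);
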
@TheBlueMatt TheBlueMatt force-pushed the 2025-08-splice-quiescent-real branch from ddb3b37 to d9a278c Compare August 26, 2025 16:36
@TheBlueMatt TheBlueMatt marked this pull request as ready for review August 26, 2025 16:36
@wpaulino wpaulino requested review from jkczyz and removed request for valentinewallace August 26, 2025 16:47
nodes[1].node.timer_tick_occurred();
}

// nodes[1] didn't receive nodes[0]'s stfu within the timeout so it'll disconnect.
Contributor

nit: use node A and node B if doing so below

Comment on lines +10932 to +10933
self.propose_quiescence(logger, action)
.map_err(|e| APIError::APIMisuseError { err: e.to_owned() })
Contributor

Would it make sense to break the check portion out of propose_quiescence such that it can be checked earlier?

Contributor

Yeah, also if there's a splice already queued up waiting on quiescence, the user will get back a "Channel is already quiescing" error which isn't very clear.
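
One way to carve that out, sketched with illustrative names; the check becomes callable from `splice_channel` before anything is queued, and the already-queued case gets its own message:

fn check_can_propose_quiescence(channel_live: bool, splice_pending: bool) -> Result<(), String> {
    if !channel_live {
        return Err("Channel is not in a live state".to_owned());
    }
    if splice_pending {
        // Clearer than a generic "Channel is already quiescing" error.
        return Err("A splice is already queued on this channel".to_owned());
    }
    Ok(())
}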

@@ -11857,7 +11931,7 @@ where

// Assumes we are either awaiting quiescence or our counterparty has requested quiescence.
#[rustfmt::skip]
-	pub fn send_stfu<L: Deref>(&mut self, logger: &L) -> Result<msgs::Stfu, ChannelError>
+	pub fn send_stfu<L: Deref>(&mut self, logger: &L) -> Result<msgs::Stfu, &'static str>
Contributor

May as well return String if we'll just need to call to_owned.

@@ -11152,6 +11152,11 @@ where
})?;
debug_assert!(interactive_tx_constructor.take_initiator_first_message().is_none());

// TODO(splicing): if post_quiescence_action is set, integrate what the user wants to do
Contributor

s/post_quiescence_action/quiescent_action/

Comment on lines +11155 to +11158
// TODO(splicing): if post_quiescence_action is set, integrate what the user wants to do
// into the counterparty-initiated splice. For always-on nodes this probably isn't a useful
// optimization, but for often-offline nodes it may be, as we may connect and immediately
// go into splicing from both sides.
Contributor

Would this happen when both counterparties initiated quiescence simultaneously? Ties should go to the channel opener. Would we lose the user-initiated quiescent_action in that case? Or rather would we possibly be stuck when the user tries to initiate the splice again since quiescent_action is Some?

Contributor

If we lose the tie, we don't seem to consume the quiescent_action, but we also don't seem to attempt quiescence again once we exit so that we can perform ours. We only seem to retry quiescence with a pending quiescent_action after a reconnection.
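
A sketch of the retry both comments point at, with illustrative names throughout:

struct ChannelSketch {
    quiescent_action: Option<QuiescentAction>,
    awaiting_quiescence: bool,
}

enum QuiescentAction {
    Splice, // would carry SpliceInstructions, as in the commit sketch above
}

impl ChannelSketch {
    fn on_quiescence_exited(&mut self) {
        // We lost the stfu tie and the counterparty's splice ran instead. If
        // our action is still queued, propose quiescence again right away
        // rather than only retrying after a reconnection.
        if self.quiescent_action.is_some() {
            self.awaiting_quiescence = true;
        }
    }
}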

@ldk-reviews-bot

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.

@@ -10876,8 +10911,49 @@ where
}
}

let prev_funding_input = self.funding.to_splice_funding_input();
let original_funding_txo = self.funding.get_funding_txo().ok_or_else(|| {
Contributor

Nit: could just expect this, we checked for liveness above


Comment on lines +10950 to +10953
if self.funding.get_funding_txo() != Some(original_funding_txo) {
// This should be unreachable once we opportunistically merge splices if the
// counterparty initializes a splice.
return Err("Funding changed out from under us".to_owned());
Contributor

So this is preventing a user-initiated splice from happening if the counterparty happens to splice first. Why do we want this? Presumably, if they spliced, they still want to splice-in/out funds from their side of the channel and that hasn't happened yet.

#[cfg(splicing)]
return self.send_splice_init(_instructions)
.map(|splice_init| Some(StfuResponse::SpliceInit(splice_init)))
.map_err(|e| ChannelError::Ignore(e.to_owned()));
Contributor

Might as well disconnect?
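
Sketched against the lines above, with the `WarnAndDisconnect` variant name assumed from LDK's ChannelError kinds:

#[cfg(splicing)]
return self.send_splice_init(_instructions)
    .map(|splice_init| Some(StfuResponse::SpliceInit(splice_init)))
    // Escalate instead of ignoring: tear the connection down so neither side
    // is left sitting in a stuck quiescent session.
    .map_err(|e| ChannelError::WarnAndDisconnect(e.to_owned()));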

@@ -11109,6 +11109,10 @@ where
ES::Target: EntropySource,
L::Target: Logger,
{
if !self.context.channel_state.is_quiescent() {
Contributor

Also check the counterparty is the quiescence initiator? I guess once we can contribute back it doesn't matter; it might just be more relevant when there's another quiescent action that's possible to perform.

