Integrate Splicing with Quiescence #4019
base: main
Conversation
Codecov Report
❌ Patch coverage: additional details and impacted files below.

@@            Coverage Diff             @@
##             main    #4019      +/-   ##
==========================================
- Coverage   88.76%   88.76%   -0.01%
==========================================
  Files         176      176
  Lines      128139   128185      +46
==========================================
+ Hits       113743   113777      +34
- Misses      11822    11828       +6
- Partials     2574     2580       +6
Force-pushed be3b93a to ddb3b37.
lightning/src/ln/channel.rs (outdated)
let tx16 = TransactionU16LenLimited::new(tx)
	.map_err(|_e| APIError::APIMisuseError { err: format!("Too large transaction") })?;

// TODO(splicing): Check that transactions aren't too big for the splice_init message here.
We probably just want to do all input validation here before we even attempt quiescence so that we can fail early and fast. No point in going through the whole stfu dance just to immediately fail back to the user with "insufficient fees" or something similar.
Yep, that's definitely the goal. Kinda leaving this until the parameters are clear and I'll clean it up then.
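For illustration only, a minimal sketch of the ordering being suggested; every name here (SpliceRequest, validate_splice_request, propose_quiescence) is an assumption for the sketch, not the actual channel.rs API. The point is simply that the cheap input checks run before the stfu handshake so bad requests fail immediately.

```rust
// Minimal sketch, assuming hypothetical names; not the channel.rs implementation.
struct SpliceRequest {
    contribution_sats: i64,
    feerate_sat_per_kw: u32,
}

fn validate_splice_request(req: &SpliceRequest) -> Result<(), String> {
    // Example checks only; the real set depends on the final splice parameters.
    if req.contribution_sats == 0 {
        return Err("Splice contribution must be non-zero".to_owned());
    }
    if req.feerate_sat_per_kw < 253 {
        return Err("Feerate below the 253 sat/kW floor".to_owned());
    }
    Ok(())
}

fn initiate_splice(req: SpliceRequest) -> Result<(), String> {
    // Fail early and fast: reject bad input before the stfu dance even starts.
    validate_splice_request(&req)?;
    propose_quiescence(req)
}

fn propose_quiescence(_req: SpliceRequest) -> Result<(), String> {
    // Placeholder for queueing the splice action and sending stfu.
    Ok(())
}
```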
👋 Thanks for assigning @jkczyz as a reviewer!
When doing an outbound splice, we check some of the instructions first in utility methods, then convert errors to `APIError`s. These utility methods should thus either return an `APIError` or a more generic (string or `()`) error type, but they currently return a `ChannelError`, which is only appropriate when the calling code will do what the `ChannelError` instructs (including closing the channel). Here we fix that by returning `String`s instead.
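A rough sketch of the error-type split described above; the enum and helper here are illustrative stand-ins (defined locally to keep the sketch self-contained), not the real LDK `APIError` or channel.rs helpers.

```rust
// Stand-in for LDK's APIError, defined here only so the sketch compiles alone.
#[derive(Debug)]
enum ApiError {
    ApiMisuseError { err: String },
}

// The utility check returns a plain String: it carries no instruction about
// what to do with the channel, unlike a ChannelError.
fn check_splice_contribution(contribution_sats: i64) -> Result<(), String> {
    if contribution_sats < 0 {
        return Err("Splice-out is not supported in this sketch".to_owned());
    }
    Ok(())
}

// Only the API-facing caller decides how to surface the failure to the user.
fn splice_channel_sketch(contribution_sats: i64) -> Result<(), ApiError> {
    check_splice_contribution(contribution_sats)
        .map_err(|err| ApiError::ApiMisuseError { err })
}
```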
Now that we have a `QuiescentAction` to track what we intend to do once we reach quiescence, we need to use it to initiate splices. Here we do so, adding a new `SpliceInstructions` to track the arguments that are currently passed to `splice_channel`. While these may not be exactly the right arguments to track in the end, a lot of the splice logic is still in flight, so we can worry about it later.
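Roughly the shape being described, sketched with assumed field names; the commit itself notes the exact arguments may change as the splice logic settles.

```rust
// Sketch only: the action queued on the channel until quiescence is reached.
enum QuiescentAction {
    Splice(SpliceInstructions),
}

// Sketch of the arguments captured when the user calls splice_channel; the
// real field set is still in flux.
struct SpliceInstructions {
    our_funding_contribution_satoshis: i64,
    funding_feerate_per_kw: u32,
    locktime: u32,
}
```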
While we have a test to disconnect a peer if we're waiting on an `stfu` message, we also disconnect if we've reached quiescence but we're waiting on a peer to do "something fundamental" and they take too long to do so. We test that behavior here.
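As a sketch of the pattern the new test exercises (the constant and struct here are hypothetical, not LDK's): count timer ticks while quiescent and waiting on the counterparty, and disconnect once the wait exceeds the limit.

```rust
// Hypothetical timeout bookkeeping; the real logic lives in the channel state
// machine and is driven by timer_tick_occurred().
const DISCONNECT_AFTER_TICKS: usize = 2;

struct QuiescentWait {
    ticks_waiting_on_counterparty: usize,
}

impl QuiescentWait {
    // Returns true once the counterparty has stalled long enough to disconnect.
    fn on_timer_tick(&mut self) -> bool {
        self.ticks_waiting_on_counterparty += 1;
        self.ticks_waiting_on_counterparty >= DISCONNECT_AFTER_TICKS
    }
}
```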
Force-pushed ddb3b37 to d9a278c.
	nodes[1].node.timer_tick_occurred();
}

// nodes[1] didn't receive nodes[0]'s stfu within the timeout so it'll disconnect.
nit: use node A and node B if doing so below
self.propose_quiescence(logger, action)
	.map_err(|e| APIError::APIMisuseError { err: e.to_owned() })
Would it make sense to break the check portion out of propose_quiescence such that it can be checked earlier?
Yeah, also if there's a splice already queued up waiting on quiescence, the user will get back a "Channel is already quiescing" error which isn't very clear.
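Something like the following split is what's being floated; the struct, fields, and error strings are assumptions made for the sketch, not the PR's code.

```rust
struct ChannelSketch {
    is_quiescent: bool,
    quiescent_action_pending: bool,
}

impl ChannelSketch {
    // Side-effect-free check that splice_channel could run up front, before
    // building SpliceInstructions, so it can fail with a precise error.
    fn check_can_propose_quiescence(&self) -> Result<(), String> {
        if self.quiescent_action_pending {
            // More descriptive than "Channel is already quiescing".
            return Err("A splice is already queued on this channel".to_owned());
        }
        if self.is_quiescent {
            return Err("Channel is already quiescent".to_owned());
        }
        Ok(())
    }

    // propose_quiescence reuses the same check before changing any state.
    fn propose_quiescence(&mut self) -> Result<(), String> {
        self.check_can_propose_quiescence()?;
        self.quiescent_action_pending = true;
        // ...queue the stfu message here...
        Ok(())
    }
}
```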
@@ -11857,7 +11931,7 @@ where

 	// Assumes we are either awaiting quiescence or our counterparty has requested quiescence.
 	#[rustfmt::skip]
-	pub fn send_stfu<L: Deref>(&mut self, logger: &L) -> Result<msgs::Stfu, ChannelError>
+	pub fn send_stfu<L: Deref>(&mut self, logger: &L) -> Result<msgs::Stfu, &'static str>
May as well return String if we'll just need to call to_owned.
@@ -11152,6 +11152,11 @@ where
	})?;
	debug_assert!(interactive_tx_constructor.take_initiator_first_message().is_none());

	// TODO(splicing): if post_quiescence_action is set, integrate what the user wants to do
s/post_quiescence_action/quiescent_action/
// TODO(splicing): if post_quiescence_action is set, integrate what the user wants to do
// into the counterparty-initiated splice. For always-on nodes this probably isn't a useful
// optimization, but for often-offline nodes it may be, as we may connect and immediately
// go into splicing from both sides.
Would this happen when both counterparties initiated quiescence simultaneously? Ties should go to the channel opener. Would we lose the user-initiated quiescent_action in that case? Or rather, would we possibly be stuck when the user tries to initiate the splice again since quiescent_action is Some?
If we lose the tie, we don't seem to consume the quiescent_action, but we also don't seem to attempt quiescence again once we exit so that we can perform ours. We only seem to retry quiescence with a pending quiescent_action after a reconnection.
@@ -10876,8 +10911,49 @@ where
	}
}

let prev_funding_input = self.funding.to_splice_funding_input();
let original_funding_txo = self.funding.get_funding_txo().ok_or_else(|| {
Nit: could just expect this, we checked for liveness above
if self.funding.get_funding_txo() != Some(original_funding_txo) {
	// This should be unreachable once we opportunistically merge splices if the
	// counterparty initializes a splice.
	return Err("Funding changed out from under us".to_owned());
So this is preventing a user-initiated splice from happening if the counterparty happens to splice first. Why do we want this? Presumably, if they spliced, they still want to splice-in/out funds from their side of the channel and that hasn't happened yet.
#[cfg(splicing)]
return self.send_splice_init(_instructions)
	.map(|splice_init| Some(StfuResponse::SpliceInit(splice_init)))
	.map_err(|e| ChannelError::Ignore(e.to_owned()));
Might as well disconnect?
@@ -11109,6 +11109,10 @@ where
	ES::Target: EntropySource,
	L::Target: Logger,
{
	if !self.context.channel_state.is_quiescent() {
Also check the counterparty is the quiescence initiator? I guess once we can contribute back it doesn't matter; it might just be more relevant when there's another quiescent action that's possible to perform.
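For illustration, the extra guard being suggested could look like the sketch below; the flag names are assumptions, not the actual ChannelState accessors.

```rust
struct QuiescenceFlags {
    is_quiescent: bool,
    we_initiated_quiescence: bool,
}

// Sketch: before handling an incoming splice_init, require not only that the
// channel is quiescent but also that the counterparty initiated quiescence
// (or won the tie), since the quiescence initiator is the side expected to
// drive the fundamental update.
fn check_splice_init_allowed(flags: &QuiescenceFlags) -> Result<(), String> {
    if !flags.is_quiescent {
        return Err("Received splice_init while not quiescent".to_owned());
    }
    if flags.we_initiated_quiescence {
        return Err("Received splice_init but we are the quiescence initiator".to_owned());
    }
    Ok(())
}
```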
Based on #4007, this does the actual integration. Will probably go after #3979, so drafting for now.