multi: add BuildOnion, SendOnion, and TrackOnion RPCs #9489
bitromortac
left a comment
Awesome work 🎉, this will be very useful! I've only started to look at the design and first commits, just leaving a few thoughts, but will continue to review. I think making SendOnion idempotent and repeatable is the safest option to lead to a TrackOnion endpoint that can be called at any time, to make client restarts simple (but only a preliminary conclusion). Is there an example somewhere of the switch RPC being consumed in a ChannelRouter (implementing retries)?
lnrpc/switchrpc/switch.proto
Outdated
|
|
||
| // The attempt ID uniquely identifying this payment attempt. The caller can | ||
| // expect to track results for the payment via this attempt ID. | ||
| uint64 attempt_id = 6; |
Could we take the ephemeral key in the onion to track the onion uniquely instead of attempt id?
You present an interesting consideration 🤔 At a high level, it makes sense that some kind of ID must be used to allow the clients of a "send" RPC or function (whether payment, HTLC, or onion) to follow up with the result. For payments, this is payment_hash. For HTLCs, this has so far been a single uint64 sequence-number-style counter called attempt_id. Using attempt_id here for onions comes from me following that pattern and from the information currently available to implementations of this interface:

```go
type PaymentAttemptDispatcher interface {
	// SendHTLC is a function that directs a link-layer switch to
	// forward a fully encoded payment to the first hop in the route
	// denoted by its public key. A non-nil error is to be returned if the
	// payment was unsuccessful.
	SendHTLC(firstHop lnwire.ShortChannelID,
		attemptID uint64,
		htlcAdd *lnwire.UpdateAddHTLC) error

	// GetAttemptResult returns the result of the payment attempt with
	// the given attemptID. The paymentHash should be set to the payment's
	// overall hash, or in case of AMP payments the payment's unique
	// identifier.
	GetAttemptResult(attemptID uint64, paymentHash lntypes.Hash,
		deobfuscator htlcswitch.ErrorDecrypter) (
		<-chan *htlcswitch.PaymentResult, error)

	// ...
}
```

The only current code that builds onions that I know about and which could submit onions via this endpoint is the ChannelRouter type, so the current RPC protobuf message fields were structured to help any potential re-user of the ChannelRouter type. It is possible that the ephemeral onion key makes more sense to use generally as a tracking ID here, though, since it is a better fingerprint or more tightly bound to the onion itself.
I have started to be of the mind that lnd itself may need to change the way it handles attempt IDs a bit so as to facilitate multiple, independent users of a SendOnion style endpoint. You could imagine each RPC client generating its own attempt IDs - there would be the possibility of collision within the network result store used by the Switch.
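To make the collision concern concrete, here is a minimal sketch, assuming a result store keyed only by attempt ID. The `resultStore` type, the `namespacedID` helper, and the 16/48-bit split are hypothetical illustrations and not part of lnd; partitioning the ID space is only one possible mitigation.

```go
package main

import "fmt"

// resultStore is a stand-in for a result store keyed only by attempt ID.
type resultStore map[uint64]string

// clientIDBits is a hypothetical split: the top 16 bits name the client,
// the remaining 48 bits hold that client's own sequence counter.
const clientIDBits = 16

// namespacedID packs a client identifier and a per-client sequence number
// into a single uint64 attempt ID.
func namespacedID(clientID uint16, seq uint64) (uint64, error) {
	const seqBits = 64 - clientIDBits
	if seq >= 1<<seqBits {
		return 0, fmt.Errorf("sequence %d overflows %d bits", seq, seqBits)
	}
	return uint64(clientID)<<seqBits | seq, nil
}

func main() {
	store := resultStore{}

	// Two clients with independent counters both start at 1 and collide.
	store[1] = "client A, attempt 1"
	store[1] = "client B, attempt 1" // silently overwrites client A's entry
	fmt.Println("entries after collision:", len(store))

	// With namespaced IDs the two attempts occupy distinct keys.
	a, _ := namespacedID(1, 1)
	b, _ := namespacedID(2, 1)
	fmt.Println("namespaced IDs differ:", a != b)
}
```

The trade-off in this sketch is that each client gets a smaller sequence space, and something still has to hand out client identifiers.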
lnrpc/switchrpc/switch_server.go
Outdated
| // NOTE(calvin): We'll either need to require clients provide the short | ||
| // channel ID to use as a first hop OR lookup an acceptable channel ID | ||
| // for the given first hop public key. |
To start, I think using the channel ID is better and gives more control over liquidity; it also reflects the API for sendpayment. But we could also have the option to specify a pubkey; I'm not sure whether that is more convenient on the consumer side.
I've added in some commits to allow for sending via channel ID rather than pubkey. Let me know if the approach looks reasonable.
cc: @positiveblue in case you're interested in reviewing this PR
3bf2f15 to
2900535
Compare
Left a few comments in calvinrzachman#17, I think the approach in there looks good!
2900535 to
e1b56dc
Compare
ellemouton
left a comment
love the minimal change set!! ✨
mostly just style & structure comments. I think the new subserver is also missing a unit test.
Also just an overall note on commit structure: it would be better to only plug in the new server once it is complete & ready.
so i'd suggest the following structure:
- any refactors required
- add the new package, implement the logic and unit tests
- add proto definitions
- wrapper grpc service that implements the proto defs and calls the new logic.
- now, plug the completed subserver into LND
- now add itests
htlcswitch/switch.go
Outdated
| if deobfuscator == nil { | ||
| return &PaymentResult{ | ||
| EncryptedError: htlc.Reason, | ||
| }, nil | ||
| } |
I think it needs to be explained more clearly why this could be nil, i.e. be explicit about the case we are handling, both in a comment in the code and in the commit message.
There is brief mention in the godoc comment for the function. Updated to add a small comment in-line as well!
itest/lnd_sendonion_test.go
Outdated
| // TODO(calvin): Other things to check: | ||
| // - Error conditions/handling (server handles with decryptor or caller | ||
| // handles encrypted error blobs from server) | ||
| // - That we successfully convert pubkey --> channel when there are | ||
| // multiple channels, some of which can carry the payment and other | ||
| // which cannot. | ||
| // - Send the same onion again. Send the same onion again but mark it | ||
| // with a different attempt ID. | ||
| // | ||
| // If we send again, our node does forward the onion but the first hop | ||
| // considers it a replayed onion. | ||
| // 2024-05-01 15:54:18.364 [ERR] HSWC: unable to process onion packet: sphinx packet replay attempted | ||
| // 2024-05-01 15:54:18.364 [ERR] HSWC: ChannelLink(a680b373941e2e056e7b98007cc8cee933331e28981474b34d4275bb94cd17fe:0): unable to decode onion hop iterator: InvalidOnionVersion | ||
| // 2024-05-01 15:54:18.364 [DBG] PEER: Peer(0352f454dd5e09cd3e979cbace6fc6727cfa9a1eaa878a452ce63b221f51771a74): Sending UpdateFailMalformedHTLC(chan_id=fe17cd94bb75424db3741498281e3333e9cec87c00987b6e052e1e9473b380a6, id=1, fail_code=InvalidOnionVersion) to 0352f454dd5e09cd3e979cbace6fc6727cfa9a1eaa878a452ce63b221f51771a74@127.0.0.1:63567 | ||
| // If we randomize the payment hash, first hop says bad HMAC. | ||
| // |
will this be addressed here?
Some of it will be handled in the TrackOnion and duplicate send onion tests. Removed the comment to clean things up a bit.
itest/lnd_sendonion_test.go
Outdated
| func testTrackOnion(ht *lntest.HarnessTest) { | ||
| // Create a four-node context consisting of Alice, Bob and two new | ||
| // nodes: Carol and Dave. This will provide a 4 node, 3 channel topology. | ||
| // Alice will make a channel with Bob, and Bob with Carol, and Carol |
feels like it could just be part of the existing send onion test no?
It is possible that we could make everything into one big test. But I think there might be enough to TrackOnion to merit creating a separate test. For example, we can either defer error decryption to the switchrpc server by supplying the ephemeral session key and hop public keys used to construct the onion, or we can handle the onion error decryption on the client side if we wish to for privacy or other reasons.
itest/lnd_sendonion_test.go
Outdated
| // require.Error(ht, err, "expected error when re-sending same onion with same attempt ID") | ||
| // // Assert that the error is a gRPC codes.AlreadyExists error. | ||
| // st, ok := status.FromError(err) | ||
| // require.True(ht, ok, "expected a gRPC status error") | ||
| // require.Equal(ht, codes.AlreadyExists, st.Code(), "expected AlreadyExists error code") | ||
| // // require.Contains(ht, st.Message(), "duplicate onion", "expected error message to indicate duplicate onion") |
Only add the code when it isn't commented out. Can leave the TODO.
or, only add the test once the logic actually does the thing :)
| // - Send different onion but with same attempt ID. | ||
| } | ||
|
|
||
| func testSendOnionTwice(ht *lntest.HarnessTest) { |
test doc pls 🙏
also - can we not extend the existing test?
lnrpc/switchrpc/switch_server.go
Outdated
| // scenarios where network requests are reordered. If an attempt ID has | ||
| // already been used by either SendOnion or TrackOnion, SendOnion will | ||
| // return DUPLICATE_HTLC for that attempt ID. | ||
| usedAttemptIDs *roaring64.Bitmap |
As noted offline, an in-memory solution is not enough to make something idempotent. We will need a persisted solution if we find that we are indeed at risk of duplicate attempts.
Updated to remove this in-memory method as we can instead make use of an InitAttempt method to checkpoint some information about the attempt prior to sending it out to the network. That way, we'll have the means to deny subsequent initialization attempts. We can also bury this duplicate safety one layer deeper within the actual Switch itself. This seems somewhat analogous to the InitPayment concept within the Router.
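A minimal sketch of the InitAttempt idea, assuming the semantics described in this thread: an attempt is checkpointed before dispatch, a second initialization with the same ID is denied, and the ID becomes reusable once its result has been cleaned up. The `attemptStore` type, `Release` method, and `errDuplicateAttempt` are hypothetical names, and the map is kept in memory only for brevity; as noted above, a real guard needs to be persisted.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errDuplicateAttempt is a hypothetical sentinel used only in this sketch.
var errDuplicateAttempt = errors.New("attempt ID already initialized")

// attemptStore records which attempt IDs have been initialized. It lives in
// memory here for illustration; it would not survive a restart.
type attemptStore struct {
	mu       sync.Mutex
	attempts map[uint64]bool
}

func newAttemptStore() *attemptStore {
	return &attemptStore{attempts: make(map[uint64]bool)}
}

// InitAttempt checkpoints the attempt before anything is sent. A second call
// with the same ID fails until the ID is released.
func (s *attemptStore) InitAttempt(id uint64) error {
	s.mu.Lock()
	defer s.mu.Unlock()

	if s.attempts[id] {
		return errDuplicateAttempt
	}
	s.attempts[id] = true
	return nil
}

// Release frees the ID once its result has been read and cleaned up.
func (s *attemptStore) Release(id uint64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.attempts, id)
}

func main() {
	s := newAttemptStore()
	fmt.Println(s.InitAttempt(7)) // first initialization succeeds
	fmt.Println(s.InitAttempt(7)) // duplicate is rejected
	s.Release(7)
	fmt.Println(s.InitAttempt(7)) // ID may be reused after release
}
```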
e1b56dc to
038c377
Compare
038c377 to
2c32ac5
Compare
2c32ac5 to
38a26f0
Compare
calvinrzachman
left a comment
Thanks for the thorough review! I updated the commit ordering as suggested and made sure to hold off on hooking up the Switch RPC server into lnd until the end, just before the itests. Let me know if anything was not sufficiently addressed 🙏
lnrpc/switchrpc/switch_server.go
Outdated
| "unable to process shared secrets") | ||
| } | ||
|
|
||
| // NOTE(calvin): In order to decrypt errors server side we require |
Ahh I think I got the idea to use this format from the TODOs. I'll remove that for all the NOTEs
itest/lnd_sendonion_test.go
Outdated
| // NOTE(calvin): We may want our wrapper RPC client to allow errors | ||
| // through so that we can make some assertions about them in various | ||
| // scenarios. | ||
| // resp, err := alice.RPC.SendOnion(onionReq) | ||
| // require.NoError(ht, err, "unable to send payment via onion") |
Thanks for the thorough read. Made a pass to cleanup stray comments generally.
itest/lnd_sendonion_test.go
Outdated
| // const ( | ||
| // defaultTimeout = 30 * time.Second | ||
| // ) |
removed 🧼
c113546 to
fa23d48
Compare
fa23d48 to
ccab228
Compare
This will allow a sub-system access to information about the state of a channel link such as forwarding bandwidth, eligibility, etc. while not permitting full control over link function.
a2acef3 to
cd74c0b
Compare
Add RPC for dispatching payments via onions. The payment route and onion are computed by the caller and the onion is delivered to the server for forwarding. NOTE: The server does NOT process or peel the onion, so it is assumed that the onion will be constructed such that the first hop is encrypted to one of the server's channel partners.
These tests verify that internal errors from the htlcswitch (eg: ErrDuplicateAdd or ErrPaymentIDNotFound) are precisely translated into the specific error codes and messages defined in the `switch.proto` file. This is critical for the remote client, which relies on these exact signals to make important state decisions (e.g., whether to retry a payment). We also confirm that the server validates incoming requests and correctly rejects malformed or incomplete requests. This is important to do for externally provided input to the daemon, even if the users of this RPC server are trusted.
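The translation step these tests exercise can be sketched roughly as follows. The sentinel errors below are local stand-ins for the htlcswitch errors named in the commit message, and the `ErrorCode` values are hypothetical; the actual enum names in `switch.proto` may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for the htlcswitch sentinel errors named above.
var (
	ErrDuplicateAdd      = errors.New("duplicate add")
	ErrPaymentIDNotFound = errors.New("payment ID not found")
)

// ErrorCode is a hypothetical stand-in for an enum defined in switch.proto.
type ErrorCode int

const (
	CodeUnknown ErrorCode = iota
	CodeDuplicateHTLC
	CodePaymentIDNotFound
)

// translateError maps an internal error to the code a remote client sees.
// errors.Is is used so that wrapped errors still map to the right code.
func translateError(err error) ErrorCode {
	switch {
	case errors.Is(err, ErrDuplicateAdd):
		return CodeDuplicateHTLC
	case errors.Is(err, ErrPaymentIDNotFound):
		return CodePaymentIDNotFound
	default:
		return CodeUnknown
	}
}

func main() {
	wrapped := fmt.Errorf("send onion: %w", ErrDuplicateAdd)
	fmt.Println(translateError(wrapped) == CodeDuplicateHTLC)
}
```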
Allow the switch to defer error handling when callers of GetAttemptResult do not provide an error decrypter.
687bf53 to
fdbdcbe
Compare
| repeated bytes hop_pubkeys = 4; | ||
| } | ||
|
|
||
| message TrackOnionResponse { |
What about this? I think you did not use oneof at all; it might also make sense for the other new RPC structs:

```protobuf
message DecryptedError {
    string message = 1;
    ErrorCode code = 2;
}

message TrackOnionResponse {
    oneof result {
        bytes preimage = 1;                 // Success
        DecryptedError decrypted_error = 2; // Decrypted failure
        bytes encrypted_error = 3;          // Encrypted failure
    }
}
```
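To illustrate what a three-way result buys a consumer, here is a small Go sketch that models the proposed oneof as a sealed interface and handles each outcome in a type switch. The types are hand-written stand-ins for illustration, not the code protoc would generate for this message.

```go
package main

import "fmt"

// trackResult mirrors the three-way oneof proposed above. These are
// hand-written stand-ins, not generated protobuf types.
type trackResult interface{ isTrackResult() }

type preimage struct{ value []byte }

type decryptedError struct {
	message string
	code    int
}

type encryptedError struct{ blob []byte }

func (preimage) isTrackResult()       {}
func (decryptedError) isTrackResult() {}
func (encryptedError) isTrackResult() {}

// describe handles exactly one outcome per response, which is the property
// a oneof gives the API contract.
func describe(r trackResult) string {
	switch v := r.(type) {
	case preimage:
		return fmt.Sprintf("settled, preimage of %d bytes", len(v.value))
	case decryptedError:
		return fmt.Sprintf("failed (code %d): %s", v.code, v.message)
	case encryptedError:
		return fmt.Sprintf("failed, %d encrypted bytes left for the client",
			len(v.blob))
	default:
		return "no result set"
	}
}

func main() {
	fmt.Println(describe(preimage{value: make([]byte, 32)}))
	fmt.Println(describe(encryptedError{blob: []byte{0x01, 0x02}}))
}
```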
ziggie1984
left a comment
Thank you for answering all the previous comments. I still have some additional comments, especially that we could make the RPC API cleaner by using the optional keyword where applicable and by using oneof when there are multiple outcomes.
I think it is worth taking the time here for the RPC layer, because once shipped we will have all the work of deprecating things and cannot easily change the API.
|
|
||
| // If neither is provided, the caller wants to handle decryption. This | ||
| // is a valid use case, so we return no decryptor and no error. | ||
| if !sessionKeyProvided && !hopPubkeysProvided { |
Same here: this could be accomplished by making the RPC interface more strict using oneof. Can we make the RPC API more strict so that we can skip all these checks here? @gemini-code-assist
That's a good suggestion to consider for making the RPC API more strict! The current implementation of buildErrorDecryptor already enforces the "both or neither" semantic for session_key and hop_pubkeys through explicit checks:
- It first checks if both `sessionKeyBytes` and `hopPubkeys` are empty. If so, it correctly interprets this as the caller wanting to handle decryption themselves and returns no decryptor.
- It then checks if only one of them is provided (`sessionKeyProvided != hopPubkeysProvided`). If this is the case, it returns an error, as both are required for server-side decryption.
While using oneof in the protobuf definition could visually group these fields, it wouldn't entirely eliminate the need for these checks in the Go code. For bytes and repeated bytes fields in protobuf3, an empty value (e.g., []byte{} or [][]byte{}) is still considered a "set" value by the oneof mechanism. Therefore, the len(field) > 0 checks would still be necessary within the chosen oneof option to ensure that the provided data is meaningful for decryption.
The current explicit checks in the Go code are clear and effectively handle the required logic for these optional fields.
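The "both or neither" rule discussed in this thread can be sketched on its own as below. The function name `wantsServerDecryption` and the error value are hypothetical; the real `buildErrorDecryptor` also constructs the decryptor, which is omitted here.

```go
package main

import (
	"errors"
	"fmt"
)

// errIncompleteDecryptInfo is a hypothetical error for this sketch.
var errIncompleteDecryptInfo = errors.New(
	"session key and hop pubkeys must be provided together",
)

// wantsServerDecryption reports whether the caller supplied everything
// needed for server-side error decryption. Supplying neither field means the
// caller decrypts errors itself; supplying only one is rejected.
func wantsServerDecryption(sessionKey []byte,
	hopPubkeys [][]byte) (bool, error) {

	sessionKeyProvided := len(sessionKey) > 0
	hopPubkeysProvided := len(hopPubkeys) > 0

	// Neither provided: valid, the client handles decryption.
	if !sessionKeyProvided && !hopPubkeysProvided {
		return false, nil
	}

	// Exactly one provided: invalid, both are required together.
	if sessionKeyProvided != hopPubkeysProvided {
		return false, errIncompleteDecryptInfo
	}

	return true, nil
}

func main() {
	ok, err := wantsServerDecryption(nil, nil)
	fmt.Println(ok, err)

	ok, err = wantsServerDecryption([]byte{0x01}, nil)
	fmt.Println(ok, err)
}
```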
| func validateSessionKey(sessionKeyBytes []byte) error { | ||
| const expectedKeyLength = 32 | ||
|
|
||
| // Check length of session key. | ||
| if len(sessionKeyBytes) != expectedKeyLength { | ||
| return fmt.Errorf("invalid session key length: got %d, "+ | ||
| "expected %d", len(sessionKeyBytes), expectedKeyLength) | ||
| } | ||
|
|
||
| // Interpret the key as a big-endian unsigned integer. | ||
| keyValue := new(big.Int).SetBytes(sessionKeyBytes) | ||
|
|
||
| // Check if the key is in the valid range [1, N-1]. | ||
| if keyValue.Sign() <= 0 || keyValue.Cmp(btcec.S256().N) >= 0 { | ||
| return fmt.Errorf("session key is out of range") | ||
| } | ||
|
|
||
| return nil |
This is definitely a violation of boundaries. Having this fine-grained crypto at this level of the code is not good design; we should rather update the crypto package and then do something like this:

```go
// btcec library should have this
func ValidatePrivKeyBytes(pk []byte) error {
	if len(pk) != 32 {
		return ErrInvalidKeyLen
	}
	keyValue := new(big.Int).SetBytes(pk)
	if keyValue.Sign() <= 0 || keyValue.Cmp(S256().N) >= 0 {
		return ErrKeyOutOfRange
	}
	return nil
}
```

Then the domain layer (htlcswitch) should use it:

```go
// In htlcswitch, not RPC layer. Validate the raw bytes before parsing them.
if err := btcec.ValidatePrivKeyBytes(sessionKeyBytes); err != nil {
	return fmt.Errorf("invalid session key: %w", err)
}
privKey, pubKey := btcec.PrivKeyFromBytes(sessionKeyBytes)
```
Yeah, I'm up to try to stick this into the lower level package. Should we defer that to a follow up PR so we don't need to wait on a new package release? I have a branch to update btcec here: calvinrzachman/btcd#1
followup is OK, but let's remove this check from this PR.
lnrpc/switchrpc/switch.proto
Outdated
| uint64 attempt_id = 2; | ||
|
|
||
| // Optional: Session key used to generate the onion/sphinx packet. | ||
| bytes session_key = 3; |
would it make sense to use oneof from the grpc package for these two components which are optional @gemini-code-assist
That's an excellent suggestion! You're absolutely right that oneof is the perfect construct for first_hop_pubkey and first_hop_chan_id in the SendOnionRequest message.
Using oneof would formally enforce the mutual exclusivity that's currently handled by application logic, making the API contract clearer and preventing invalid states at the protobuf level. It also simplifies validation for clients and can lead to cleaner generated code.
I've prepared a code suggestion to implement this change in lnrpc/switchrpc/switch.proto:
```protobuf
message SendOnionRequest {
    // The raw onion packet to be sent across the network.
    bytes onion_blob = 1;

    // The attempt ID uniquely identifying this payment attempt. The caller can
    // expect to track results for the payment via this attempt ID.
    uint64 attempt_id = 2;

    // The total amount in millisatoshis required to complete a payment over
    // this route. This value includes the cumulative fees at each hop. The HTLC
    // extended to the first-hop in the route will need to have at least this
    // many (milli)satoshis.
    int64 amount = 5;

    // The absolute timelock or CLTV value that should be extended to the first
    // hop in the route. All other hops will decrement the time-lock as
    // described by the onion.
    uint32 timelock = 6;

    // The payment hash associated with the HTLC. This is needed for tracking
    // and debugging purposes.
    bytes payment_hash = 7;

    // The optional blinding point to use when constructing the HTLC.
    // If set, this will be used to blind the next hop in the route.
    bytes blinding_point = 8;

    // Optional custom TLV records to include in the UpdateAddHTLC message.
    map<uint64, bytes> custom_records = 9;

    // Optional opaque extra data to attach to the UpdateAddHTLC message.
    bytes extra_data = 10;

    oneof first_hop {
        // The first hop's public key where the onion will be sent.
        // The first hop must be specified either by channel ID or public key.
        bytes first_hop_pubkey = 3;

        // The channel id of the channel that must be taken to the first hop.
        // The first hop must be specified either by channel ID or public key.
        uint64 first_hop_chan_id = 4;
    }
}
```

This change would require regenerating the protobuf code and updating the SendOnion server-side logic to use the first_hop oneof field correctly.
Let's do this along side the TrackOnionResponse updates in a follow up PR if that's okay.
lnrpc/switchrpc/switch.proto
Outdated
|
|
||
| // The optional blinding point to use when constructing the HTLC. | ||
| // If set, this will be used to blind the next hop in the route. | ||
| bytes blinding_point = 8; |
Let's start using the optional keyword in the RPC protos?
Added the optional directive.
| // Now that the original HTLC attempt has settled, we'll send the same | ||
| // onion again with the same attempt ID. | ||
| // | ||
| // NOTE: Currently, this does not error. When we make SendOnion fully |
hmm why does this not error tho, because the result should still be in the result store, we haven't restarted yet ?
Ahh this is because the CircuitMap on its own doesn't defend against duplicates after the settle/fail result is back from the network and the circuit is torn down. That is partly why we want to use InitAttempt. So that even if the result is in the store, an RPC client retrying SendOnion (possibly due to timeouts or ErrDuplicateAdd response from the server being delayed or lost) is not at any risk of causing duplicate attempts.
But what I do not understand: doesn't this PR already build on top of the InitAttempt PR? Have you rebased this PR on the current base branch?
lnrpc/switchrpc/switch.proto
Outdated
| // expect to track results for the payment via this attempt ID. | ||
| uint64 attempt_id = 2; | ||
|
|
||
| // Optional: Session key used to generate the onion/sphinx packet. |
I don't think this comment is correct, because we only need these for the error decryptor rather than for creating an onion sphinx packet, right?
Ahh, can see how this comment might be confusing. While this key's purpose in TrackOnion is for decryption, it is not arbitrary; it is the exact same cryptographic material used to construct the onion. If it does not match, then I think decryption is impossible. Only the creator of the onion can decrypt the forwarding errors.
Updated the comment to better reflect this.
Adds the TrackOnion RPC to the switchrpc service. This allows a caller to subscribe to the final outcome (settle or fail) of a specific HTLC attempt. This RPC is designed to be called after a successful dispatch has been confirmed via the SendOnion RPC. It should not be used to determine whether an HTLC dispatch was received in an ambiguous network scenario. That ambiguity must be resolved by retrying the idempotent SendOnion RPC until a definitive acknowledgement is received. Once dispatch is confirmed, TrackOnion provides the mechanism to wait for the result of the in-flight HTLC. The RPC allows callers to specify whether error decryption should be handled by the server or performed by the client, providing flexibility for different error handling strategies.
This will allow us to leverage this function from the Switch RPC server's BuildOnion implementation.
This will allow us to leverage this function from the Switch RPC server's BuildOnion implementation.
Add RPC which constructs a sphinx onion packet for the given payment route. NOTE: This is added primarily to aid with the itests added later.
This plugs in the Switch RPC server to the rest of lnd. The service will be available for use.
Update so that "make unit-cover" uses tags in a manner consistent with the rest of our unit testing.
This demonstrates how the Switch and SendOnion rpc behave when asked to dispatch duplicate onions. Notably, the Switch circuit map detects this - but only if the matching onion is still in flight. Once the circuit is torn down, the duplicate is permitted by the Switch. It is likely that we will add a layer of protection to the SendOnion call itself to prevent duplicates even after the first HTLC is no longer in-flight.
We declare each service's REST annotations in its own file. This is optional in v1 but mandatory in v2 of the grpc-gateway library.
Update the Switch RPC protos to make use of the 'optional' directive. Though this may not impact the generated types or how the user interacts with these types, it may serve to document the fact that they are optional a bit better.
fdbdcbe to
946a221
Compare
ziggie1984
left a comment
LGTM, great back and forth, congratulations on the PR 🎉
Please submit the follow-up PRs soon!
307e665
into
lightningnetwork:elle-base-branch-payment-service
Change Description
We add a new `switchrpc` RPC sub-system with SendOnion, BuildOnion, and TrackOnion RPCs. This allows the daemon to offload path-finding, onion construction and payment life-cycle management to an external entity (such as a remotely instantiated ChannelRouter type) and instead accept onion payments for direct delivery to the network.
Avoiding Duplicate Payment Attempts
We are making send/track(onion) requests which traverse an async and unreliable network. Clients which use these RPCs to make decisions about whether to make additional payment attempts run the risk of a race/re-ordering of request processing misleading them into making a re-attempt when such a re-attempt is not safe to make. We'd like to prevent duplicate payment attempts and unintentional loss of funds by RPC clients.
Consider the following scenario:
The client receives a `DeadlineExceeded` or service `Unavailable` error and is unable to distinguish between the request never reaching the server (e.g. the server is offline, so it is safe to re-attempt via a different server) and the server receiving the request and being unable to respond in time.
Future
An `InitAttempt` style method on the Switch store. All duplicates with the same attempt ID would be rejected until the result for that attempt ID has been read and cleaned from the result store. Then the attempt ID can be freed for re-use.
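From the client's side, resolving the ambiguous case might look roughly like the sketch below: keep retrying the same attempt ID until the server gives a definite answer. The `sendFunc` type, the sentinel errors, and the choice to treat a duplicate rejection as confirmation that an earlier request was accepted are all assumptions made for illustration; a real client would also add backoff between tries.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical client-visible outcomes for this sketch.
var (
	errUnavailable = errors.New("deadline exceeded or unavailable")
	errDuplicate   = errors.New("duplicate attempt")
)

// sendFunc stands in for a SendOnion call with a fixed onion and attempt ID.
type sendFunc func(attemptID uint64) error

// confirmDispatch retries the same attempt ID until the outcome is definite.
// It assumes the server rejects duplicates of an already accepted attempt.
func confirmDispatch(send sendFunc, attemptID uint64, maxTries int) error {
	for i := 0; i < maxTries; i++ {
		err := send(attemptID)
		switch {
		case err == nil, errors.Is(err, errDuplicate):
			// Either this call or an earlier one was accepted.
			return nil
		case errors.Is(err, errUnavailable):
			// Ambiguous: retry with the same attempt ID.
			continue
		default:
			// Definitive failure reported by the server.
			return err
		}
	}
	return fmt.Errorf("attempt %d: dispatch still ambiguous after %d tries",
		attemptID, maxTries)
}

func main() {
	// Fake server: the first request is processed but its reply is lost.
	accepted := map[uint64]bool{}
	calls := 0
	send := func(id uint64) error {
		calls++
		if accepted[id] {
			return errDuplicate
		}
		accepted[id] = true
		if calls == 1 {
			return errUnavailable
		}
		return nil
	}

	fmt.Println(confirmDispatch(send, 42, 3), calls)
}
```

Once dispatch is confirmed this way, the client would move on to tracking the result for that attempt ID rather than issuing a new attempt.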