Jeffrey Czyz [Thu, 20 Jun 2024 16:43:57 +0000 (11:43 -0500)]
Elide metadata from Offer with derived keys
When an Offer uses blinded paths, its metadata consists of a nonce used
to derive its signing keys. Now that the blinded paths contain this
nonce, elide the metadata as it is now redundant. This saves space and
also makes it impossible to derive the signing keys if an invoice
request is received with the incorrect nonce. The nonce shouldn't be
revealed in this case either to prevent de-anonymization attacks.
When an InvoiceRequest is handled with an OffersContext, use the
contained nonce to verify that it is for a valid Offer. Otherwise, fall
back to using Offer::metadata, which also contains the nonce. The latter
is useful for supporting offers without blinded paths or those created
prior to including an OffersContext in their blinded paths.
To authenticate that an InvoiceRequest is for a valid Offer, include the
nonce from the Offer::metadata in the Offer::paths. This can be used to
prevent de-anonymization attacks where an attacker sends requests using
self-constructed paths to nodes near the Offer::paths' introduction
nodes.
Metadata is an internal type used within Offer messages. For any
constructed message, Metadata::Bytes is always used. The other variants
are only used at construction or verification time. Document this and
debug_assert!(false) accordingly.
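A minimal sketch of the invariant described above (the variant and
method names here are illustrative assumptions, not the exact internal
Metadata type):

```rust
// Illustrative sketch only; variant and method names are assumptions.
enum Metadata {
    // The only variant expected in a fully constructed message.
    Bytes(Vec<u8>),
    // Transient variant used while deriving keys at construction/verification.
    DerivedSigningPubkey(Vec<u8>),
}

impl Metadata {
    fn as_bytes(&self) -> Option<&Vec<u8>> {
        match self {
            Metadata::Bytes(bytes) => Some(bytes),
            _ => {
                // Hitting another variant in a constructed message is a logic error.
                debug_assert!(false, "Metadata should be Bytes once constructed");
                None
            }
        }
    }
}
```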
Jeffrey Czyz [Thu, 20 Jun 2024 00:18:26 +0000 (19:18 -0500)]
Add InvoiceRequest::verify_using_recipient_data
Invoice requests are authenticated by checking the metadata in the
corresponding offer. For offers using blinded paths, this will simply be
a 128-bit nonce. Allow checking this nonce explicitly instead of via
the metadata. This will be used by an upcoming change that includes
the nonce in the offer's blinded paths instead of the metadata, which
mitigates de-anonymization attacks.
Jeffrey Czyz [Wed, 19 Jun 2024 17:42:16 +0000 (12:42 -0500)]
Pass Nonce directly to OfferBuilder
When using OfferBuilder::deriving_signing_pubkey, the nonce generated
needs to be the same one included in any OfferBuilder::paths. This is
because the nonce is used along with the offer TLVs to derive a signing
pubkey and will soon be elided from the metadata entirely.
Jeffrey Czyz [Wed, 19 Jun 2024 16:51:53 +0000 (11:51 -0500)]
Move Nonce to a separate offers sub-module
Nonce is used when constructing Offer::metadata and will soon be needed
when constructing BlindedPath for use in authentication. Move it to a
separate module now that it is public and will be more widely used.
Jeffrey Czyz [Wed, 19 Jun 2024 16:29:20 +0000 (11:29 -0500)]
Change Nonce visibility to pub
A nonce is generated in OfferBuilder::deriving_signing_pubkey from an
EntropySource for use in Offer::metadata. The same nonce will need to be
included as recipient data in any blinded paths in the Offer. Increase
the visibility to allow for this.
Elias Rohrer [Tue, 2 Jul 2024 10:04:33 +0000 (12:04 +0200)]
Handle fallible events in `OnionMessenger`
Previously, we would just fire-and-forget in `OnionMessenger`'s event
handling. Since we have now introduced the possibility of event
handling failures, we adapt the event handling logic to retain any
events we failed to handle so they are replayed upon the next
invocation of `process_pending_events`/`process_pending_events_async`.
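A self-contained sketch of this retain-and-replay behavior, using a
plain `Vec` and a closure rather than `OnionMessenger`'s actual
internals:

```rust
// Generic sketch, not OnionMessenger's actual code: events whose handler
// returned Err(()) stay queued and are replayed on the next invocation.
fn process_pending_events<E: Clone>(
    pending: &mut Vec<E>,
    mut handler: impl FnMut(E) -> Result<(), ()>,
) {
    pending.retain(|event| handler(event.clone()).is_err());
}
```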
Elias Rohrer [Mon, 15 Apr 2024 08:35:52 +0000 (10:35 +0200)]
Make event handling fallible
Previously, we required our users to handle all events successfully
inline or panic while trying to do so. If they exited the
`EventHandler` in any other way, we'd forget about the events and
wouldn't replay them after restart.
Here, we implement fallible event handling, allowing the user to return
`Err(())` which signals to our event providers they should abort event
processing and replay any unhandled events later (i.e., in the next
invocation).
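From the user's side, the contract looks roughly like the following
sketch (the `Event` variant and persistence check are placeholders,
not LDK's actual API):

```rust
// Placeholder types for illustration; returning Err(()) asks the event
// provider to abort processing and replay this event later.
enum Event {
    PaymentClaimed { amount_msat: u64 },
}

fn handle_event(event: &Event, store_available: bool) -> Result<(), ()> {
    match event {
        Event::PaymentClaimed { amount_msat } => {
            if !store_available {
                // Couldn't durably record the event; request a replay rather
                // than silently dropping it.
                return Err(());
            }
            println!("claimed {} msat", amount_msat);
            Ok(())
        }
    }
}
```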
Elias Rohrer [Tue, 2 Jul 2024 09:11:40 +0000 (11:11 +0200)]
Hold sep. Mutexes for pending `intercepted_msgs`/`peer_connected` events
This is a minor refactor that will allow us to access the individual
event queue Mutexes separately, allowing us to drop the locks earlier
when processing them individually.
Matt Corallo [Mon, 8 Jul 2024 19:21:19 +0000 (19:21 +0000)]
Make `impl_writeable_tlv_based_enum*` actually upgradable
In cc78b77c715d6ef62693d4c1bc7190da990ec0fa it was discovered that
`impl_writeable_tlv_based_enum_upgradable` wasn't actually
upgradable - tuple variants weren't written with length-prefixes,
causing downgrades with new tuple variants to be unreadable by
older clients as they wouldn't know where to stop reading.
This was fixed by simply assuming that any new variants will be
non-tuple variants with a length prefix, but no code write-side
changes were made, allowing new code to freely continue to use the
broken tuple-variant serialization.
Here we address this by defining yet more serialization macros
which aren't broken, and convert existing usages of the old macros
with non-length-prefixed tuple variants to renamed `*_legacy`
macros.
Note that this changes the serialization format of
`impl_writeable_tlv_based_enum[_upgradable]` when tuple fields are
written, and as such deliberately changes the call semantics for
such tuples.
Only the serialization format of `MessageContext` is changed here
which is fine as it has not yet reached a release of LDK.
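The following self-contained sketch (not LDK's actual macros or TLV
layout) illustrates why the length prefix matters: an older reader can
skip an unknown variant only if it knows how many bytes to advance.

```rust
// Assumed layout for illustration: [variant id: u8][len: u8][payload: len bytes].
// Returns the remainder of the buffer after skipping one unknown variant.
fn skip_unknown_variant(buf: &[u8]) -> Option<&[u8]> {
    let (_id, rest) = buf.split_first()?;
    let (&len, rest) = rest.split_first()?;
    // With the length prefix we can advance past the payload without knowing
    // how to parse it; without one, an old reader cannot resynchronize.
    rest.get(len as usize..)
}
```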
Elias Rohrer [Tue, 16 Jul 2024 08:28:58 +0000 (10:28 +0200)]
Clarify maximum length of an invoice description
We previously stated in the docs that the invoice description can be at most `1023`
bytes long, which is wrong. According to BOLT 11 it's at most 1023*5 bits (639 bytes) long.
Alec Chen [Sun, 30 Jun 2024 23:05:48 +0000 (16:05 -0700)]
Test async get per commitment point for revoke_and_ack
Note: this does not test the CS -> RAA resend ordering, because this
requires handling async get_per_commitment_point for channel
reestablishment, which will be addressed in a follow-up PR.
Alec Chen [Wed, 5 Jun 2024 22:40:38 +0000 (17:40 -0500)]
Change get_per_commitment_point to return result type
Includes simple changes to test util signers and tests, as well as
handling the error case for get_per_commitment_point in
HolderCommitmentPoint. This leaves a couple `.expect`s in places
that will be addressed in a separate PR for handling funding.
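As a rough sketch of the shape of the change (approximate; consult the
`ChannelSigner` trait for the authoritative signature), returning a
`Result` lets an asynchronous or remote signer report that the point
isn't available yet:

```rust
use bitcoin::secp256k1::{All, PublicKey, Secp256k1};

// Approximate shape only, not the full ChannelSigner trait.
trait CommitmentPointSource {
    // Err(()) signals that the per-commitment point could not be produced
    // right now (e.g. a remote signer hasn't responded yet).
    fn get_per_commitment_point(
        &self,
        idx: u64,
        secp_ctx: &Secp256k1<All>,
    ) -> Result<PublicKey, ()>;
}
```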
Matt Corallo [Mon, 15 Jul 2024 18:34:04 +0000 (18:34 +0000)]
Drop unnecessary `strict` feature from `lightning-invoice`
99aa6e27f616c96dda2b49d09bafbc0b982251e0 detected that we had an
undefined feature in `lightning-invoice` called `strict`, which was
used to turn on `deny(warnings)`. It resolved that by adding the
feature to the `Cargo.toml`, but we actually don't need it - our CI
already builds with `-Dwarnings`, so any warnings should be
rejected during CI and there's not a lot of value in having a
(public) feature to do the same.
Duncan Dean [Fri, 12 Jul 2024 09:23:50 +0000 (11:23 +0200)]
Use native check-cfg lint in cargo beta
This uses the newly introduced conditional configuration checks that are
now configurable within Cargo (beta).
This allows us to get rid of our custom python script that checks for
expected features and cfgs.
This does introduce a warning regarding the unknown lint in Cargo
versions prior to the current beta, but since these are not rustc errors,
they won't break any builds with the "-D warnings" RUSTFLAG.
Moving to this lint actually exposed the "strict" feature not being
present in the lightning-invoice crate, as our Python script didn't
correctly parse the cfg_attr where it appeared.
shaavan [Mon, 10 Jun 2024 13:45:19 +0000 (19:15 +0530)]
Expand `create_blinded_path` Functionality for Enhanced Path Diversification
- Previously, the `create_blinded_path` function was limited to
returning a single `BlindedPath`, which restricted how blinded
paths could be used.
- This commit extends the `create_blinded_path` function to return
the entire blinded path vector generated by the `MessageRouter`'s
`create_blinded_paths`.
- The updated functionality is integrated across the codebase, enabling
the sending of Offers Response messages, such as `InvoiceRequest`
(in `pay_for_offer`) and `Invoice` (in `request_refund_payment`),
utilizing multiple reply paths.
Matt Corallo [Sat, 9 Dec 2023 21:22:52 +0000 (21:22 +0000)]
Layout channel info to ensure routing uses cache lines well
Because we scan per-channel information in the hot inner loop of
our routefinding immediately after looking a channel up in a
`HashMap`, we end up spending a nontrivial portion of our
routefinding time waiting on memory to be read in.
While there is only so much we can do about that, ensuring the
channel information that we care about is sitting on one or
adjacent cache lines avoids paying that penalty twice. Thus, here
we manually lay out `ChannelInfo` and `ChannelUpdateInfo` and set
them to 128b and 32b alignment, respectively. This wastes some
space in memory in our network graph, but improves routing
performance in return.
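A minimal sketch of the alignment technique (field names are
placeholders, not the actual `ChannelInfo`/`ChannelUpdateInfo` layout):

```rust
// Placeholder fields; only the repr(align(...)) technique mirrors the change.
#[repr(align(128))]
struct ChannelInfoSketch {
    // Hot fields read in the routing inner loop sit at the start, so they land
    // on the first of the two cache lines this alignment guarantees.
    capacity_sats: u64,
    feature_flags: u64,
    // Colder metadata follows.
    last_update_timestamp: u32,
}

#[repr(align(32))]
struct ChannelUpdateInfoSketch {
    htlc_minimum_msat: u64,
    fees_base_msat: u32,
    fees_proportional_millionths: u32,
}
```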
Matt Corallo [Fri, 8 Dec 2023 19:43:17 +0000 (19:43 +0000)]
Consolidate `candidate` access in `add_entry` during routing
Because fetching fields from the `$candidate` often implies an
indirect read, grouping them together may result in one or two
fewer memory loads, so we do so here.
Matt Corallo [Fri, 8 Dec 2023 05:44:32 +0000 (05:44 +0000)]
Somewhat optimize the generic `Features::requires_unknown_bits`
It turns out we spend several percent of our routefinding time just
checking if nodes and channels require unknown features
byte-by-byte. While the cost is almost certainly dominated by the
memory read latency, avoiding doing the checks byte-by-byte should
reduce the branch count slightly, which may reduce the overhead.
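A generic sketch of the idea (not the actual `Features`
implementation): fold the per-byte checks into one accumulator and
branch once, rather than branching on every byte.

```rust
// Illustrative only. Per BOLT 9, even feature bits are "required"; a feature
// is unknown if it isn't covered by the corresponding known-bits mask.
fn requires_unknown_bits(flags: &[u8], known_masks: &[u8]) -> bool {
    let mut unknown_required = 0u8;
    for (i, &byte) in flags.iter().enumerate() {
        let known = known_masks.get(i).copied().unwrap_or(0);
        unknown_required |= byte & !known & 0b0101_0101;
    }
    // Single branch at the end instead of one per byte.
    unknown_required != 0
}
```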
Matt Corallo [Fri, 8 Dec 2023 01:49:20 +0000 (01:49 +0000)]
Store source/target `node_counter`s in `DirectionalChannelInfo`
Because we now have some slack space in `PathBuildingHop`, we can
use it to cache some additional hot values. Here we use it to
cache the source and target `node_counter`s for public channels,
effectively prefetching the values from the channel state.
Matt Corallo [Fri, 8 Dec 2023 01:49:08 +0000 (01:49 +0000)]
DRY redundant calls to `$candidate.htlc_minimum_msat()` in routing
While LLVM should inline and elide the redundant calls, because the
router is rather large LLVM can decide against inlining in some
cases where it would be a nice win.
Thus, it's worth DRY'ing the redundant calls explicitly.
Matt Corallo [Sun, 10 Dec 2023 03:28:37 +0000 (03:28 +0000)]
Cache whether a node is a first-hop target in the per-node state
When processing the main loop during routefinding, for each node,
we check whether it happens to be our peer in one of our channels.
This ensures we never fail to find a route that takes a hop through
a private channel of ours, to a private node, then through
invoice-provided route hints to reach the ultimate payee.
Because this is incredibly hot code, doing a full `HashMap` lookup
to check if each node is a first-hop target ends up eating a good
chunk of time during routing. Luckily, we can trivially avoid this
cost.
Because we're already looking up the per-node state in the `dist`
map, we can store a bool in each first-hop target's state, avoiding
the lookup unless we know it's going to succeed.
This requires storing a dummy entry in `dist`, which feels somewhat
strange, but is ultimately fine as we should never be looking at
per-node state unless we've already found a path to that node,
updating the fields in doing so.
Matt Corallo [Wed, 10 Jul 2024 01:22:32 +0000 (01:22 +0000)]
Move blinded path introduction point resolution to a helper method
This marginally reduces the size of `get_route` by moving the
blinded path introduction point resolution and blinded path checks
into a helper method.
Matt Corallo [Thu, 7 Dec 2023 23:40:26 +0000 (23:40 +0000)]
Align `PathBuildingHop` to 128b, now that we store them in a `Vec`
Now that `PathBuildingHop` is stored in a `Vec` (as `Option`s),
rather than `HashMap` entries, they can grow to fill a full two
cache lines without a memory access performance cost. In the next
commit we'll take advantage of this somewhat, but here we update
the assertions and drop the `repr(C)`, allowing rust to lay the
memory out as it wishes.
Matt Corallo [Mon, 24 Jun 2024 23:50:29 +0000 (23:50 +0000)]
Drop the `dist` `HashMap` in routing, replacing it with a `Vec`.
Now that we have unique, dense, 32-bit identifiers for all the
nodes in our network graph, we can store the per-node information
when routing in a simple `Vec` rather than a `HashMap`. This avoids
the overhead of hashing and table scanning entirely, for a nice
"simple" optimization win.
Matt Corallo [Sat, 1 Jun 2024 22:45:04 +0000 (22:45 +0000)]
Use `NodeCounters` `NodeId`s as the blinded path intro references
The router's `introduction_node_id_cache` is used to cache the
`NodeId`s of blinded path introduction points so that we don't
have to look them up every time we go around the main router loop.
When using it, if the introduction point isn't a public node we
then look up the introduction in our first-hops map.
In either case, we have to end up with a reference to a `NodeId`
that outlives our `dist` map.
Here we consolidate both the initial cache building and the
first-hops map lookup to one place, storing only a reference to a
`NodeId` either in the `NetworkGraph` or in the new `NodeCounters`
to get the required lifetime without needing to reference into the
first-hops map.
We then take this opportunity to avoid `clone`ing the first-hops
map entries as we now no longer reference into it.
Matt Corallo [Sat, 1 Jun 2024 18:19:45 +0000 (18:19 +0000)]
Drop `private_hop_key_cache` in favor of `NodeCounters`
With the new `NodeCounters` we have all the `NodeId`s we'll need
during routing, so there's no need to keep the
`private_hop_key_cache`, which existed to provide references to
`NodeId`s needed during routing.
Matt Corallo [Tue, 19 Mar 2024 19:29:06 +0000 (19:29 +0000)]
Add a new `NodeCounters` utility to track counters when routing
In the next commit we'll stop using `NodeId`s to look up nodes when
routing, instead using the new per-node counters. Here we take the
first step, adding a local struct which tracks temporary counters
for route hints/source/destination nodes.
Because we must ensure we have a 1-to-1 mapping from node ids to
`node_counter`s, even across first-hop and last-hop hints, we have
to be careful to check the network graph first, then a new
`private_node_id_to_node_counter` map to ensure we only ever end up
with one counter per node id.
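A self-contained sketch of that counter-assignment rule (`HashMap`s
stand in for the network graph and the new private-node map; types are
illustrative):

```rust
use std::collections::HashMap;

type NodeIdBytes = [u8; 33];

// Check the public graph first, then assign (or reuse) a private counter so
// each node id always maps to exactly one counter.
fn counter_for(
    graph_counters: &HashMap<NodeIdBytes, u32>,
    private_counters: &mut HashMap<NodeIdBytes, u32>,
    next_private_counter: &mut u32,
    node_id: NodeIdBytes,
) -> u32 {
    if let Some(&counter) = graph_counters.get(&node_id) {
        return counter;
    }
    *private_counters.entry(node_id).or_insert_with(|| {
        let counter = *next_private_counter;
        *next_private_counter += 1;
        counter
    })
}
```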
shaavan [Sat, 15 Jun 2024 13:52:56 +0000 (19:22 +0530)]
Allow create_blinded_paths functions to accept MessageContext as input field
- Enabled `create_blinded_paths` to accept `MessageContext` TLVs as
an input field.
- `MessageContext` is intended to be sent along with the `reply_path`
to the counterparty.
- Added `MessageContext` in the `create_blinded_paths` flow, optionally
appending it within the `reply_path`.
- Updated tests to verify the new feature.
shaavan [Sat, 15 Jun 2024 13:52:28 +0000 (19:22 +0530)]
Update handle_message to accept OffersContext data as an input field
1. Handling Offers Data:
- Updated `handle_message` to accept `OffersContext` data as an input field.
- If it is present, it will be utilized by the handler to
abandon outbound payments that have failed for any reason.
2. Consistency in Custom Message Handling:
- Updated `handle_custom_message` to accept optional custom data
for consistency.
- Note: `custom_data` will remain unused in this PR.
shaavan [Fri, 14 Jun 2024 12:36:05 +0000 (18:06 +0530)]
Introduce MessageContext in ReceiveTlvs
1. New Enum for Enhanced Data Handling:
- Introduced the `MessageContext` enum, which allows the node to include
additional context data along with the `reply_path` sent to the counterparty.
- The node anticipates receiving this data back for further processing.
2. Variants in MessageContext:
- The `MessageContext` enum includes two variants: `Offers` and
`Custom`.
- One of the variants, `Offers`, holds the `payment_id`
of the associated outbound BOLT12 payment.
3. Future Usage:
- This enum will be utilized in a subsequent commit to abandon outbound
payments that have failed to complete for various reasons.
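An approximate shape of the enum being described (illustrative
placeholders, not the authoritative definitions):

```rust
// Placeholder types for illustration only.
struct PaymentId([u8; 32]);

enum OffersContext {
    // Carries the payment_id of the associated outbound BOLT12 payment so a
    // failed payment can later be abandoned.
    OutboundPayment { payment_id: PaymentId },
}

enum MessageContext {
    Offers(OffersContext),
    // Opaque context for custom onion messages.
    Custom(Vec<u8>),
}
```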
Matt Corallo [Thu, 13 Jun 2024 00:29:01 +0000 (00:29 +0000)]
Block monitor updates to ensure preimages are in each MPP part
If we claim an MPP payment and only persist some of the
`ChannelMonitorUpdate`s which include the preimage prior to
shutting down, we may be in a state where some of our
`ChannelMonitor`s have the preimage for a payment while others do
not.
This, it turns out, is actually mostly safe - on startup
`ChannelManager` will detect there's a payment it has as unclaimed
but there's a corresponding payment preimage in a `ChannelMonitor`
and go claim the other MPP parts. This works so long as the
`ChannelManager` has been persisted after the payment has been
received but prior to the `PaymentClaimable` event being processed
(and the claim itself occurring). This is not always true and
certainly not required on our API, but our
`lightning-background-processor` today does persist prior to event
handling, so this is generally true, subject to some race conditions.
In order to address this we need to copy payment preimages
across channels irrespective of the `ChannelManager`'s payment
state, but this introduces another wrinkle - if one channel makes
substantial progress while other channel(s) are still waiting to
get the payment preimage in `ChannelMonitor`(s) while the
`ChannelManager` hasn't been persisted after the payment was
received, we may end up without the preimage on disk.
Here, we address this issue by using the new
`RAAMonitorUpdateBlockingAction` variant for this specific case. We
block persistence of an RAA `ChannelMonitorUpdate` which may remove
the preimage from disk until all channels have had the preimage
added to their `ChannelMonitor`.
We do this only in-memory (and not on disk) as we can recreate this
blocker during the startup re-claim logic.
This will enable us to claim MPP parts without using the
`ChannelManager`'s payment state in later work.
Matt Corallo [Tue, 14 May 2024 00:03:39 +0000 (00:03 +0000)]
Add a `RAAMonitorUpdateBlockingAction::ClaimedMPPPayment`
If we claim an MPP payment and only persist some of the
`ChannelMonitorUpdate`s which include the preimage prior to
shutting down, we may be in a state where some of our
`ChannelMonitor`s have the preimage for a payment while others do
not.
This, it turns out, is actually mostly safe - on startup
`ChannelManager` will detect there's a payment it has as unclaimed
but there's a corresponding payment preimage in a `ChannelMonitor`
and go claim the other MPP parts. This works so long as the
`ChannelManager` has been persisted after the payment has been
received but prior to the `PaymentClaimable` event being processed
(and the claim itself occurring). This is not always true and
certainly not required on our API, but our
`lightning-background-processor` today does persist prior to event
handling, so this is generally true, subject to some race conditions.
In order to address this race we need to copy payment preimages
across channels irrespective of the `ChannelManager`'s payment
state, but this introduces another wrinkle - if one channel makes
substantial progress while other channel(s) are still waiting to
get the payment preimage in `ChannelMonitor`(s) while the
`ChannelManager` hasn't been persisted after the payment was
received, we may end up without the preimage on disk.
Here, we address this issue with a new
`RAAMonitorUpdateBlockingAction` variant for this specific case. We
block persistence of an RAA `ChannelMonitorUpdate` which may remove
the preimage from disk until all channels have had the preimage
added to their `ChannelMonitor`.
We do this only in-memory (and not on disk) as we can recreate this
blocker during the startup re-claim logic.
This will enable us to claim MPP parts without using the
`ChannelManager`'s payment state in later work.
Matt Corallo [Wed, 12 Jun 2024 20:49:07 +0000 (20:49 +0000)]
Track the cp `node_id` which originated an HTLC in the `HTLCSource`
Because we track pending `ChannelMonitorUpdate`s per-peer, we
really need to know what peer an HTLC came from when we go to claim
it on-chain, allowing us to look up any blocked actions in the
`PeerState`. While we could do this by moving the blocked actions
to some global "closed-channel update queue", it's simpler to do it
this way.
While this trades off `ChannelMonitorUpdate` size somewhat (the
`HTLCSource` is included in many of them), which we should be
sensitive to, it will also allow us to (eventually) remove the
SCID -> peer + channel_id lookups we do when claiming or failing
today, which are somewhat annoying to deal with.