Prior to this, we returned PaymentSendFailure from auto retry send payment
methods. This implied that we might return a PartialFailure from them, which
has never been the case. So it makes sense to rework the errors to be a better
fit for the methods.
We're taking error handling in a totally different direction now to make it
more asynchronous, see send_payment_internal for more information.
Matt Corallo [Fri, 3 Feb 2023 23:05:58 +0000 (23:05 +0000)]
Fix (and test) threaded payment retries
The new in-`ChannelManager` retries logic does retries as two
separate steps, under two separate locks - first it calculates
the amount that needs to be retried, then it actually sends it.
Because the first step doesn't udpate the amount, a second thread
may come along and calculate the same amount and end up retrying
duplicatively.
Because we generally shouldn't ever be processing retries at the
same time, the fix is trivial - simply take a lock at the top of
the retry loop and hold it until we're done.
Matt Corallo [Tue, 7 Feb 2023 19:46:08 +0000 (19:46 +0000)]
Test if a given mutex is locked by the current thread in tests
In anticipation of the next commit(s) adding threaded tests, we
need to ensure our lockorder checks work fine with multiple
threads. Sadly, currently we have tests in the form
`assert!(mutex.try_lock().is_ok())` to assert that a given mutex is
not locked by the caller to a function.
The fix is rather simple given we already track mutexes locked by a
thread in our `debug_sync` logic - simply replace the check with a
new extension trait which (for test builds) checks the locked state
by only looking at what was locked by the current thread.
We're no longer supporting manual retries since
ChannelManager::send_payment_with_retry can be parameterized by a retry
strategy
This commit also updates all docs related to retry_payment and abandon_payment.
Since these docs frequently overlap with changes in preceding commits where we
start abandoning payments on behalf of the user, all the docs are updated in
one go.
Jeffrey Czyz [Tue, 31 Jan 2023 16:44:19 +0000 (10:44 -0600)]
Re-write CustomMessageHandler documentation
Documentation for CustomMessageHandler wasn't clear how it is related to
PeerManager and contained some grammatical and factual errors. Re-write
the docs and link to the lightning_custom_message crate.
Jeffrey Czyz [Tue, 3 Jan 2023 17:24:30 +0000 (11:24 -0600)]
Macro for composing custom message handlers
BOLT 1 specifies a custom message type range for use with experimental
or application-specific messages. While a `CustomMessageHandler` can be
defined to support more than one message type, defining such a handler
requires a significant amount of boilerplate and can be error prone.
Add a crate exporting a `composite_custom_message_handler` macro for
easily composing pre-defined custom message handlers. The resulting
handler can be further composed with other custom message handlers using
the same macro.
This requires a separate crate since the macro needs to support "or"
patterns in macro_rules, which is only available in edition 2021.
Otherwise, a crate defining a handler for a set of custom messages could
not easily be reused with another custom message handler. Doing so would
require explicitly duplicating the reused handlers type ids, but those
may change when the crate is updated.
When a peer disconnects but still has channels, the peer's `peer_state`
entry in the `per_peer_state` is not removed by the `peer_disconnected`
function. If the channels with that peer is later closed while still
being disconnected (i.e. force closed), we therefore need to remove the
peer from `peer_state` separately.
To remove the peers separately, we push such peers to a separate HashSet
that holds peers awaiting removal, and remove the peers on a timer to
limit the negative effects on parallelism as much as possible.
Updates multiple instances of the `ChannelManager` docs related to the
previous change that moved the storage of the channels to the
`per_peer_state`. This docs update corrects some grammar errors and
incorrect information, as well as clarifies documentation that was
confusing.
Elias Rohrer [Wed, 23 Nov 2022 08:33:37 +0000 (09:33 +0100)]
Add transaction sync crate
This crate provides utilities for syncing LDK via the transaction-based
`Confirm` interface. The initial implementation facilitates
synchronization with an Esplora backend server.
Matt Corallo [Mon, 30 Jan 2023 17:56:46 +0000 (17:56 +0000)]
Don't apply gossip backpressure to non-channel-announcing peers
When we apply the new gossip-async-check backpressure on peer
connections, if a peer has never sent us a `channel_announcement`
at all, we really shouldn't delay reading their messages.
This does so by tracking, on a per-peer basis, whether they've sent
us a channel_announcement, and resetting that state whenever we're
not backlogged.
Matt Corallo [Sun, 22 Jan 2023 18:08:33 +0000 (18:08 +0000)]
Apply backpressure when we have too many gossip checks in-flight
Now that the `RoutingMessageHandler` can signal that it needs to
apply message backpressure, we implement it here in the
`PeerManager`. There's not much complicated here, aside from noting
that we need to add the ability to call `send_data` with no data
to indicate that reading should resume (and track when we may need
to make such calls when updating the routing-backpressure state).
Matt Corallo [Sun, 22 Jan 2023 05:12:45 +0000 (05:12 +0000)]
Allow `RoutingMessageHandler` to signal backpressure
Now that we allow `handle_channel_announcement` to (indirectly)
spawn async tasks which will complete later, we have to ensure it
can apply backpressure all the way up to the TCP socket to ensure
we don't end up with too many buffers allocated for UTXO
validation.
We do this by adding a new method to `RoutingMessageHandler` which
allows it to signal if there are "many" checks pending and
`channel_announcement` messages should be delayed. The actual
`PeerManager` implementation thereof is done in the next commit.
Matt Corallo [Sun, 22 Jan 2023 04:14:58 +0000 (04:14 +0000)]
Forward gossip messages which were verified asynchronously
Gossip messages which were verified against the chain
asynchronously should still be forwarded to peers, but must now go
out via a new `P2PGossipSync` parameter in the
`AccessResolver::resolve` method, allowing us to wire them up to
the `P2PGossipSync`'s `MessageSendEventsProvider` implementation.
Matt Corallo [Sun, 22 Jan 2023 03:41:28 +0000 (03:41 +0000)]
Add the ability to broadcast gossip msgs via the event pipeline
When we process gossip messages asynchronously we may find that we
want to forward a gossip message to a peer after we've returned
from the existing `handle_*` method. In order to do so, we need to
be able to send arbitrary loose gossip messages back to the
`PeerManager` via `MessageSendEvent`.
This commit modifies `MessageSendEvent` in order to support this.
Matt Corallo [Tue, 7 Feb 2023 20:38:20 +0000 (20:38 +0000)]
Process `channel_update`/`node_announcement` async if needed
If we have a `channel_announcement` which is waiting on a UTXO
lookup before we can process it, and we receive a `channel_update`
or `node_announcement` for the same channel or a node which is a
part of the channel, we have to wait until the lookup completes
until we can decide if we want to accept the new message.
Here, we store the new message in the pending lookup state and
process it asynchronously like the original `channel_announcement`.
Matt Corallo [Wed, 8 Feb 2023 22:11:56 +0000 (22:11 +0000)]
Track in-flight `channel_announcement` lookups and avoid duplicates
If we receive two `channel_announcement`s for the same channel at
the same time, we shouldn't spawn a second UTXO lookup for an
identical message. This likely isn't too rare - if we start syncing
the graph from two peers at the same time, it isn't unlikely that
we'll end up with the same messages around the same time.
In order to avoid this we keep a hash map of all the pending
`channel_announcement` messages by SCID and simply ignore duplicate
message lookups.
Matt Corallo [Wed, 8 Feb 2023 22:06:11 +0000 (22:06 +0000)]
Add an async resolution option to `ChainAccess::get_utxo`
For those operating in an async environment, requiring
`ChainAccess::get_utxo` return information about the requested UTXO
synchronously is incredibly painful. Requesting information about a
random UTXO is likely to go over the network, and likely to be a
rather slow request.
Thus, here, we change the return type of `get_utxo` to have both a
synchronous and asynchronous form. The asynchronous form requires
the user construct a `AccessFuture` which they `clone` and pass
back to us. Internally, an `AccessFuture` has an `Arc` to the
`channel_announcement` message which we need to process. When the
user completes their lookup, they call `resolve` on their
`AccessFuture` which we pull the `channel_announcement` from and
then apply to the network graph.
Matt Corallo [Sat, 21 Jan 2023 03:28:35 +0000 (03:28 +0000)]
Move `chain::Access` to `routing` and rename it `UtxoLookup`
The `chain::Access` trait (and the `chain::AccessError` enum) is a
bit strange - it only really makes sense if users import it via the
`chain` module, otherwise they're left with a trait just called
`Access`. Worse, for bindings users its always just called
`Access`, in part because many downstream languages don't have a
mechanism to import a module and then refer to it.
Further, its stuck dangling in the `chain` top-level mod.rs file,
sitting in a module that doesn't use it at all (it's only used in
`routing::gossip`).
Instead, we give it its full name - `UtxoLookup` (and rename the
error enum `UtxoLookupError`) and put it in the a new
`routing::utxo` module, next to `routing::gossip`.
Alec Chen [Mon, 6 Feb 2023 21:33:02 +0000 (15:33 -0600)]
Swap `PublicKey` for `NodeId` in `UnsignedNodeAnnouncement`
Also swaps `PublicKey` for `NodeId` in `get_next_node_announcement`
and `InitSyncTracker` to avoid unnecessary deserialization that came
from changing `UnsignedNodeAnnouncement`.
Alec Chen [Mon, 6 Feb 2023 20:41:18 +0000 (14:41 -0600)]
Swap `PublicKey` for `NodeId` in `UnsignedChannelAnnouncement`
Adds the macro `get_pubkey_from_node_id`
to parse `PublicKey`s back from `NodeId`s for signature
verification, as well as `make_funding_redeemscript_from_slices`
to avoid parsing back and forth between types.
jurvis [Mon, 5 Dec 2022 01:24:08 +0000 (17:24 -0800)]
Expose pending payments through ChannelManager
Adds a new method, `list_recent_payments ` to `ChannelManager` that
returns an array of `RecentPaymentDetails` containing the payment
status (Fulfilled/Retryable/Abandoned) and its total amount across all
paths.
Matt Corallo [Sat, 28 Jan 2023 02:14:26 +0000 (02:14 +0000)]
Use only the failed amount when retrying payments, not the full amt
When we landed the initial in-`ChannelManager` payment retries, we
stored the `RouteParameters` in the payment info, and then re-use
it as-is for new payments. `RouteParameters` is intended to store
the information specific to the *route*, `PaymentParameters` exists
to store information specific to a payment.
Worse, because we don't recalculate the amount stored in the
`RouteParameters` before fetching a new route with it, we end up
attempting to retry the full payment amount, rather than only the
failed part.
This issue brought to you by having redundant data in
datastructures, part 5,001.
Matt Corallo [Fri, 27 Jan 2023 23:01:39 +0000 (23:01 +0000)]
Move retry-limiting to `retry_payment_with_route`
The documentation for `Retry` is very clear that it counts the
number of failed paths, not discrete retries. When we added
retries internally in `ChannelManager`, we switched to counting
the number if discrete retries, even if multiple paths failed and
were replace with multiple MPP HTLCs.
Because we are now rewriting retries, we take this opportunity to
reduce the places where retries are analyzed, specifically a good
chunk of code is removed from `pay_internal`.
Because we now retry multiple failed paths with one single retry,
we keep the new behavior, updating the docs on `Retry` to describe
the new behavior.
Matt Corallo [Fri, 27 Jan 2023 20:31:10 +0000 (20:31 +0000)]
Test the `RouteParameters` passed to `TestRouter`
`TestRouter` allows us to simply select the `Route` that will be
returned in the next `find_route` call, but it does so without any
checking of what was *requested* for the call. This makes it a
somewhat dubious test utility as it very helpfully ensures we
ignore errors in the routes we're looking for.
Instead, we require users of `TestRouter` pass a `RouteParameters`
to `expect_find_route`, which we compare against the requested
parameters passed to `find_route`.
Matt Corallo [Fri, 27 Jan 2023 20:28:20 +0000 (20:28 +0000)]
Use the `PaymentParameter` final CLTV delta over `RouteParameters`
`PaymentParams` is all about the parameters for a payment, i.e. the
parameters which are static across all the paths of a paymet.
`RouteParameters` is about the information specific to a given
`Route` (i.e. a set of paths, among multiple potential sets of
paths for a payment). The CLTV delta thus doesn't belong in
`RouterParameters` but instead in `PaymentParameters`.
Worse, because `RouteParameters` is built from the information in
the last hops of a `Route`, when we deliberately inflate the CLTV
delta in path-finding, retries of the payment will have the final
CLTV delta double-inflated as it inflates starting from the final
CLTV delta used in the last attempt.
When we calculate the `final_cltv_expiry_delta` to put in the
`RouteParameters` returned via events after a payment failure, we
should re-use the new one in the `PaymentParameters`, rather than
the one that was in the route itself.
Matt Corallo [Fri, 27 Jan 2023 19:24:52 +0000 (19:24 +0000)]
Move the final CLTV delta to `PaymentParameters` from `RouteParams`
`PaymentParams` is all about the parameters for a payment, i.e. the
parameters which are static across all the paths of a paymet.
`RouteParameters` is about the information specific to a given
`Route` (i.e. a set of paths, among multiple potential sets of
paths for a payment). The CLTV delta thus doesn't belong in
`RouterParameters` but instead in `PaymentParameters`.
Worse, because `RouteParameters` is built from the information in
the last hops of a `Route`, when we deliberately inflate the CLTV
delta in path-finding, retries of the payment will have the final
CLTV delta double-inflated as it inflates starting from the final
CLTV delta used in the last attempt.
By moving the CLTV delta to `PaymentParameters` we avoid this
issue, leaving only the sought amount in the `RouteParameters`.