Wilmer Paulino [Sat, 13 May 2023 01:39:18 +0000 (18:39 -0700)]
Disconnect peers on timer ticks to unblock channel state machine
At times, we've noticed that channels with `lnd` counterparties do not
receive messages we expect to in a timely manner (or at all) after
sending them a `ChannelReestablish` upon reconnection, or a
`CommitmentSigned` message. This can block the channel state machine
from making progress, eventually leading to force closes, if any pending
HTLCs are committed and their expiration is met.
It seems common wisdom for `lnd` node operators to periodically restart
their node/reconnect to their peers, allowing them to start from a fresh
state such that the message we expect to receive hopefully gets sent. We
can achieve the same end result by disconnecting peers ourselves
(regardless of whether they're a `lnd` node), which we opt to implement
here by awaiting their response within two timer ticks.
Wilmer Paulino [Thu, 18 May 2023 19:02:24 +0000 (12:02 -0700)]
Remove unreachable warning message send on UnknownRequiredFeature read
`enqueue_message` simply adds the message to the outbound queue, it
still needs to be written to the socket with `do_attempt_write_data`.
However, since we immediately return an error causing the socket to be
closed, the message never actually gets sent.
Matt Corallo [Tue, 9 May 2023 00:30:33 +0000 (00:30 +0000)]
Never block a thread on the `PeerManager` event handling lock
If thre's a thread currently handling `PeerManager` events, the
next thread which attempts to handle events will block on the first
and then handle events after the first completes. (later threads
will return immediately to avoid blocking more than one thread).
This works fine as long as the user has a spare thread to leave
blocked, but if they don't (e.g. are running with a single-threaded
tokio runtime) this can lead to a full deadlock.
Instead, here, we never block waiting on another event processing
thread, returning immediately after signaling that the first thread
should start over once its complete to ensure all events are
handled.
While this could lead to starvation as we cause one thread to go
around and around and around again, the risk of that should be
relatively low as event handling should be pretty quick, and it's
certainly better than deadlocking.
Duncan Dean [Tue, 9 May 2023 09:44:48 +0000 (11:44 +0200)]
Use a total lockorder for `NetworkGraph`'s `PartialEq` impl
`NetworkGraph`'s `PartialEq` impl before this commit was deadlock-prone.
Similarly to `ChannelMonitor`'s, `PartialEq` impl, we use position in
memory for a total lockorder. This uses the assumption that the objects
cannot move within memory while the inner locks are held.
Antoine Riard [Wed, 9 Nov 2022 00:12:22 +0000 (19:12 -0500)]
Anchor: do not aggregate claim of revoked output
See https://github.com/lightning/bolts/pull/803
This protect the justice claim of counterparty revoked output. As
otherwise if the all the revoked outputs claims are batched in a
single transaction, low-feerate HTLCs transactions can delay our
honest justice claim transaction until BREAKDOWN_TIMEOUT expires.
Elias Rohrer [Wed, 5 Apr 2023 15:08:49 +0000 (17:08 +0200)]
Return error when failing to construc onion messages
Previously, we would panic when failing to construct onion messages in
certain circumstances. Here we opt to always rather error out and don't
panic if something goes wrong during OM packet construction.
Matt Corallo [Thu, 11 May 2023 06:03:57 +0000 (06:03 +0000)]
Replace std's unmaintained bench with criterion
Rather than using the std benchmark framework (which isn't
maintained and is unlikely to get any further maintenance), we swap
for criterion, which at least gets us a variable number of test
runs so our benchmarks don't take forever.
We also fix the RGS benchmark to pass now that the file in use is
stale compared to today's date.
Matt Corallo [Thu, 11 May 2023 05:46:38 +0000 (05:46 +0000)]
Add an additional test/bench for routing larger amounts, score more
When benchmarking our router, we previously only ever tested with
amounts under 1,000 sats, which is an incredibly small amount.
While this ensures we have the maximal number of available channels
to consider, it prevents our scorer from getting exercise across
its range. Further, we only score the immediate path we are
expecting to to send over, and not randomly but rather based on the
amount sent.
Here we try to make the benchmarks a bit more realistic by adding
a new benchmark which attempts to send around 100K sats, which is
a reasonable amount to send over a channel today. We also convert
the scoring data to be randomized based on the seed as well as
attempt to (possibly) find a new route for a much larger value and
score based on that. This potentially allows us to score multiple
potential paths between the source and destination as the large
route-find may return an MPP result.
Matt Corallo [Thu, 11 May 2023 05:34:00 +0000 (05:34 +0000)]
Unify route benchmarking with route tests
There's a few route tests which do the same thing as the benchmarks
as they're also a good test. However, they didn't share code, which
is somewhat wasteful, so we fix that here.
henghonglee [Sat, 6 May 2023 18:01:22 +0000 (11:01 -0700)]
Score's FeeParams as passed-in params on Routefinding functions
This PR aims to create a "stateless" scorer. Instead of passing
in fee params at construction-time, we want to parametrize the
scorer with an associated "parameter" type, which is then
passed to the router function itself, and allows passing
different parameters per route-finding call.
Custom message handlers may need to set feature bits that are unknown to
LDK. Provide Features::set_required_custom_bit and
Features::set_optional_custom_bit to allow for this.
Each message handler provides which features it supports. A custom
message handler may support unknown features. Therefore, these features
should be checked against instead of the features known by LDK.
Additionally, fail the connection if the peer requires features unknown
to the handler. The peer should already fail the connection in the
latter case.
The purpose of this payload is to ensure we retry restored packages on a
`ChannelMonitor` that has upgraded from a version that previously did
not have such retry logic. We can verify this works by checking whether
a restored package has a `height_timer` of `None` upon deserializing the
monitor payload.
In the previous commit, we added a helper that constructs blocks
whenever tests demand blocks be connected. This helper moved towards
having all connected blocks have a version of 0x2000_0000 (also known as
NO_SOFT_FORK_SIGNALLING). However, previously, it was possible for some
blocks to be connected with a slighty different version: 0x0200_0000,
resulting in different block hashes.
This block hash divergence prompted a failure in this test when
`ConnectStyle::HighlyRedundantTransactionsFirstSkippingBlocks` is used
for `nodes[0]`, since this block connection style reconfirms
transactions redundantly and the serialized monitor payload kept a
reference to the hash of the block with version 0x0200_0000, when it
should be expecting one with version 0x2000_0000.
`rust-bitcoin v0.30.0` introduces concrete variants for data members of
block `Header`s. To avoid having to update these across every use, we
introduce new helpers to create dummy blocks and headers, such that the
update process is a bit more straight-forward.
Check difficulty transition against `Target` instead of `Work`
`rust-bitcoin v0.30.0` made some changes in this area that no longer
allow us to work with the previously exposed `U256` type. While `Work`
and `Target` (they're inverses of each other) essentially represent the
same concept, it makes more sense from their API's perspective to only
expose difficulty transitions and adjustments on `Target`s.
This makes much clearer at sites generating such events that they
will be lost on restart, to reduce risk of bugs creeping in due to
lost monitor updates.
In d4810087c1 we added logic to apply `ChannelMonitorUpdate`s which
were a part of a channel closure async via a background queue to
address some startup issues. When we did that we persisted those
updates to ensure we replayed them when starting next time.
However, there was no reason to - if we persisted and then
restarted even without those monitor updates we'd find a monitor
without a channel, which we'd tell to broadcast the latest
commitment transaction to force-close.
Since adding that logic, we've used the same background queue for
several purposes.
Elias Rohrer [Fri, 5 May 2023 09:26:50 +0000 (11:26 +0200)]
Make `lightning-transaction-sync` compat notice a bit more explicit
As `lightning-transaction-sync` was introduced with 0.0.114 and depended
on prior changes in the same release cycle we deemed it reasonable to
omit the implicitly limited backwards compatibility.
It however turns out this might be confusing to users copy/pasting the
codebase. Here we therefore spell out the implicit dependency on 0.0.114
and above.
Fix onion messages of size BIG_PACKET_HOP_DATA_LEN
This was previously broken and would result in an invalid HMAC error, because
we had a hardcoded assumption that OM hop data would always be of size 1300.
Jeffrey Czyz [Fri, 5 May 2023 18:38:50 +0000 (13:38 -0500)]
Add Features::requires_unknown_bits_from
When checking features, rather than checking against which features LDK
knows about, it is more useful to check against a peer's features. Add
Features::requires_unknown_bits_from such that the given features are
used instead.
CustomMessageHandler implementations may need to advertise support for
features. Add methods to CustomMessageHandler to provide these and
combine them with features from other message handlers.
The `lightning-custom-message` crate will need access to Features::or in
order combine features of a composite handler. Expose this via a
core::ops::BitOr implementation.
Duncan Dean [Thu, 20 Oct 2022 20:56:37 +0000 (22:56 +0200)]
Add message structs required for dual-funded channels
This is the first of a set of PRs to enable the experimental dual-funded
channels feature using interactive transaction construction. This allows
both the channel initiator and channel acceptor to contribute funds
towards the channel.
Matt Corallo [Fri, 5 May 2023 03:33:54 +0000 (03:33 +0000)]
Document when `PaymentPathSuccessful::payment_hash` is filled in.
The `payment_hash` field in `PaymentPathSuccessful` is always
`Some` as long as the pening payment tracker has a `payment_hash`,
which is true for all `Pending` payments as well as all `Fulfilled`
payments starting with the commit which added
`PaymentPathSuccessful` - 3b5c370b404e2f5a8f3c35093b97406f149a9340c177c05252574083d68df0da.
Matt Corallo [Fri, 5 May 2023 00:13:25 +0000 (00:13 +0000)]
Mention lnd's SCB feature in the corresponding error message
It's a bit confusing when we see only "Peer sent a garbage
channel_reestablish" when a peer uses lnd's SCB feature to ask us
to broadcast the latest state. This updates the error message to be
a bit clearer.
Wilmer Paulino [Thu, 4 May 2023 22:16:17 +0000 (15:16 -0700)]
Prevent ChannelForceClosed monitor update error after detecting spend
If we detected a spend for a channel onchain prior to handling its
`ChannelForceClosed` monitor update, we'd log a concerning error
message and return an error unnecessarily. The channel has already been
closed, so handling the `ChannelForceClosed` monitor update at this
point should be a no-op.
Groundwork for refactoring PaymentParams::Hints to ::Payee
Minor changes in preparation for supporting route blinding in
PaymentParameters. In the next commit, we'll be moving more
unblinded-payee-specific fields from the top level parameters into the clear
enum variant.
`<E as serde::de::Error>::custom()` accepts any `T: Display`, not just
`String`. Therefore it accepts `Arguments<'_>` too so we can use
`format_args!()` instead of `format!()`.
See https://github.com/lightningdevkit/rust-lightning/pull/2187#discussion_r1168781355
Matt Corallo [Sat, 29 Apr 2023 18:45:59 +0000 (18:45 +0000)]
Expose a trait impl'd for all `PeerManager` for use as a bound
A while back, in tests, we added a `AChannelManager` trait, which
is implemented for all `ChannelManager`s, and can be used as a
bound when we need a `ChannelManager`, rather than having to
duplicate all the bounds of `ChannelManager` everywhere.
Here we do the same thing for `PeerManager`, but make it public and
use it to clean up `lightning-net-tokio` and
`lightning-background-processor`.
We should likely do the same for `AChannelManager`, but that's left
as a followup.
Matt Corallo [Fri, 17 Mar 2023 04:55:30 +0000 (04:55 +0000)]
Store + process pending `ChannelMonitorUpdate`s in `Channel`
The previous commits set up the ability for us to hold
`ChannelMonitorUpdate`s which are pending until we're ready to pass
them to users and have them be applied. However, if the
`ChannelManager` is persisted while we're waiting to give the user
a `ChannelMonitorUpdate` we'll be confused on restart - seeing our
latest `ChannelMonitor` state as stale compared to our
`ChannelManager` - a critical error.
Luckily the solution is trivial, we simply need to store the
pending `ChannelMonitorUpdate` state and load it with the
`ChannelManager` data, allowing stale monitors on load as long as
we have the missing pending updates between where we are and the
latest `ChannelMonitor` state.
Matt Corallo [Thu, 16 Mar 2023 03:33:20 +0000 (03:33 +0000)]
Handle `EventCompletionAction`s after events complete
This adds handling of the new `EventCompletionAction`s after
`Event`s are handled, letting `ChannelMonitorUpdate`s which were
blocked fly after a relevant `Event`.
Matt Corallo [Fri, 28 Apr 2023 04:24:25 +0000 (04:24 +0000)]
Track an `EventCompletionAction` for after an `Event` is processed
This will allow us to block `ChannelMonitorUpdate`s on `Event`
processing in the next commit.
Note that this gets dangerously close to breaking forwards
compatibility - if we have an `Event` with an
`EventCompletionAction` tied to it, we persist a new, even, TLV in
the `ChannelManager`. Hopefully this should be uncommon, as it
implies an `Event` was delayed until after a full round-trip to a
peer.
Matt Corallo [Wed, 15 Mar 2023 23:16:06 +0000 (23:16 +0000)]
Allow holding `ChannelMonitorUpdate`s until later, completing one
In the coming commits, we need to delay `ChannelMonitorUpdate`s
until future actions (specifically `Event` handling). However,
because we should only notify users once of a given
`ChannelMonitorUpdate` and they must be provided in-order, we need
to track which ones have or have not been given to users and, once
updating resumes, fly the ones that haven't already made it to
users.
To do this we simply add a `bool` in the `ChannelMonitorUpdate` set
stored in the `Channel` which indicates if an update flew and
decline to provide new updates back to the `ChannelManager` if any
updates have their flown bit unset.
Further, because we'll now by releasing `ChannelMonitorUpdate`s
which were already stored in the pending list, we now need to
support getting a `Completed` result for a monitor which isn't the
only pending monitor (or even out of order), thus we also rewrite
the way monitor updates are marked completed.