Matt Corallo [Wed, 15 Mar 2023 18:08:35 +0000 (18:08 +0000)]
Bump MSRV to 1.48
1.48.0 was released at the end of 2020, nearly 2.5 years ago. It
has been the rustc available on Debian stable since bullseye,
released in 2021. Supporting Debian oldstable for more than a year
seems more than sufficient time to give Debian folks to upgrade,
and bullseye is set to become `oldstable` later this year with the
release of `bookworm`, likely this summer.
This also allows us to clean up our MSRV substantially, having a
single MSRV across our crates rather than a number of separate
ones. Sadly, Windows already requires 1.49.
Matt Corallo [Thu, 9 Mar 2023 19:23:58 +0000 (19:23 +0000)]
Correct `outbound_payment` route-fetch calls to pass the hash + ID
`Route::get_route_with_id` exists to provide users payment-specific
data when fetching a route, however we were failing to call it when
we have such info, opting for the simple `get_route` instead. This
defeats the purpose of the additional-metadata method, which we
swap to using here.
Elias Rohrer [Wed, 8 Mar 2023 11:05:57 +0000 (12:05 +0100)]
Support HTTPS Esplora endpoints via new feature
To support HTTPS endpoints, the async HTTP library `reqwest` needs one of
the `-tls` features enabled. While the users could specify this in their
own cargo dependencies, we here provide a new `esplora-async-https`
feature for convenience.
Elias Rohrer [Tue, 7 Mar 2023 10:19:41 +0000 (11:19 +0100)]
Add `list_channels_by_counterparty` method
While we already provide a `list_channels` method, it could result in
quite a large `Vec<ChannelDetails>`. Here, we provide the means to query
our channels by `counterparty_node_id` and DRY up the code.
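One way to read "DRY up the code" is a single filtering helper that both listing methods share. The sketch below is only an illustration under that assumption: `ChannelDetailsSketch`, `ManagerSketch` and `list_channels_with_filter` are hypothetical stand-ins, not the actual LDK types or methods.

```rust
// Hypothetical, trimmed-down stand-in for `ChannelDetails`.
#[derive(Clone, Debug, PartialEq)]
struct ChannelDetailsSketch {
    channel_id: u32,
    counterparty_node_id: [u8; 33],
}

// Hypothetical stand-in for the object that owns the channels.
struct ManagerSketch {
    channels: Vec<ChannelDetailsSketch>,
}

impl ManagerSketch {
    // Shared helper: the only place that walks the channel list.
    fn list_channels_with_filter<F>(&self, f: F) -> Vec<ChannelDetailsSketch>
    where
        F: Fn(&ChannelDetailsSketch) -> bool,
    {
        self.channels.iter().filter(|c| f(*c)).cloned().collect()
    }

    // Returns every channel, which may be a large `Vec`.
    fn list_channels(&self) -> Vec<ChannelDetailsSketch> {
        self.list_channels_with_filter(|_| true)
    }

    // Returns only the channels with the given counterparty.
    fn list_channels_by_counterparty(&self, node_id: &[u8; 33]) -> Vec<ChannelDetailsSketch> {
        self.list_channels_with_filter(|c| &c.counterparty_node_id == node_id)
    }
}
```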
Matt Corallo [Tue, 7 Mar 2023 18:06:12 +0000 (18:06 +0000)]
Avoid `poll`ing completed futures in the `background-processor`
`poll`ing completed futures has unspecified behavior in Rust
(panics, etc.; not memory corruption, as `poll` is not unsafe).
Sadly, in our futures-based version of
`lightning-background-processor` we have one case where we can
`poll` a completed future - if the timer for the network graph
prune + persist completes without a network graph to prune +
persist we'll happily poll the same future over and over again,
likely panicking in user code.
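The general guard against this class of bug is to remember that a future has completed and never hand it to `poll` again. The following is a minimal sketch of that idea using only `std`; `PollOnce` and `poll_twice_safely` are hypothetical names for illustration, not the code in `lightning-background-processor`, and the `Wake` trait it uses needs a newer rustc than our MSRV.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical wrapper: drops the inner future once it has completed,
// so that it can never be polled a second time.
struct PollOnce<F: Future + Unpin> {
    inner: Option<F>,
}

impl<F: Future + Unpin> PollOnce<F> {
    fn new(fut: F) -> Self {
        Self { inner: Some(fut) }
    }

    // Returns `Some(output)` exactly once. Returns `None` while the future
    // is pending, and also after completion, without touching the inner
    // future in the latter case.
    fn poll_if_pending(&mut self, cx: &mut Context<'_>) -> Option<F::Output> {
        let fut = self.inner.as_mut()?;
        let result = Pin::new(fut).poll(cx);
        match result {
            Poll::Ready(out) => {
                self.inner = None;
                Some(out)
            }
            Poll::Pending => None,
        }
    }
}

// A waker that does nothing, enough to drive a future by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Polls an already-ready future twice. `std::future::Ready` panics if it is
// polled after completion, so the second call only succeeds because the
// wrapper skips it.
fn poll_twice_safely() -> (Option<u32>, Option<u32>) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = PollOnce::new(std::future::ready(7u32));
    (fut.poll_if_pending(&mut cx), fut.poll_if_pending(&mut cx))
}
```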
Wilmer Paulino [Wed, 22 Feb 2023 19:46:21 +0000 (11:46 -0800)]
Update same amount and preimage test vector
The amount for HTLC #6 was updated in the spec's test vectors, but the
"same amount and preimage" test vector itself was not updated, even
though the new HTLC amount resulted in a different commitment
transaction, and thus, different signatures.
Wilmer Paulino [Wed, 22 Feb 2023 19:45:43 +0000 (11:45 -0800)]
Add missing test vector for anchors_zero_fee_htlc_tx
Tests the case where only one anchor output exists for the funder in the
commitment transaction due to the remote having a dust balance (in this
case, 0).
Matt Corallo [Sat, 4 Mar 2023 01:16:57 +0000 (01:16 +0000)]
Make `fuzz_threaded_connections` more robust
In `fuzz_threaded_connections`, if one thread is being run while
another is starved, and the running thread manages to call
`timer_tick_occurred` twice after the starved thread constructs the
inbound connection but before it delivers the first bytes, we'll
receive an immediate error and `unwrap` it, causing failure.
The fix is trivial: simply remove the unwrap and return if we're
already disconnected when we do the initial read.
While we're here, we also reduce the frequency of the
`timer_tick_occurred` calls to give us a chance to occasionally
deliver some additional messages.
Jeffrey Czyz [Fri, 6 Jan 2023 04:00:31 +0000 (22:00 -0600)]
Guard against division by zero in scorer
Since a node may announce that the htlc_maximum_msat of a channel is
zero, adding one to the denominator in the bucket formulas will prevent
the panic from ever happening. While the routing algorithm may never
select such a channel to score, this precaution may still be useful in
case the algorithm changes or if the scorer is used with a different
routing algorithm.
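As a rough illustration of the "add one to the denominator" guard, the sketch below maps an amount into a bucket relative to a channel's `htlc_maximum_msat`. The function name, the bucket count of 8 and the clamping are assumptions made for the example, not the scorer's actual formula.

```rust
// Assumed number of buckets, for illustration only.
const NUM_BUCKETS: u128 = 8;

// Hypothetical bucket formula. Adding one to the denominator means a
// channel announcing `htlc_maximum_msat == 0` can no longer cause a
// division by zero.
fn bucket_index(amount_msat: u64, htlc_maximum_msat: u64) -> u8 {
    // Widen to u128 so neither the multiplication nor the `+ 1` can overflow.
    let numerator = amount_msat as u128 * NUM_BUCKETS;
    let denominator = htlc_maximum_msat as u128 + 1;
    // Clamp so that amounts above the maximum land in the last bucket.
    (numerator / denominator).min(NUM_BUCKETS - 1) as u8
}
```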
Matt Corallo [Fri, 3 Mar 2023 20:03:57 +0000 (20:03 +0000)]
Expose the node secret key in `{Phantom,}KeysManager`
When we removed the private keys from the signing interface we
forgot to re-add them in the public interface of our own
implementations, which users may need.
Matt Corallo [Fri, 3 Mar 2023 05:14:04 +0000 (05:14 +0000)]
Do not auto-select the lightning `std` feature from tx-sync crate
We have some downstream folks who are using LDK in wasm compiled
via the normal rust wasm path. To ensure nothing breaks they want
to use `no-std` on the lightning crate, disabling time calls as
those panic. However, the HTTP logic in
`lightning-transaction-sync` gets automatically stubbed out by the
HTTP client crates when targeting wasm via `wasm_bindgen`, so it
works fine despite the std restrictions.
In order to make both work, `lightning-transaction-sync` can remain
`std`, but needs to not automatically enable the `std` flag on the
`lightning` crate, i.e. by setting `default-features = false`. We do
so here.
Matt Corallo [Fri, 3 Mar 2023 01:24:24 +0000 (01:24 +0000)]
Pass `FailureCode` to `fail_htlc_backwards` by ownership
`FailureCode` is a trivial enum with no body, so we shouldn't be
passing it by reference. It's sufficiently strange that the Java
bindings aren't happy with it, which is fine, we should just fix it
here.
Matt Corallo [Wed, 22 Feb 2023 02:40:59 +0000 (02:40 +0000)]
Track claimed outbound HTLCs in ChannelMonitors
When we receive an update_fulfill_htlc message, we immediately try
to "claim" the HTLC against the HTLCSource. If there is one, this
works great, we immediately generate a `ChannelMonitorUpdate` for
the corresponding inbound HTLC and persist that before we ever get
to processing our counterparty's `commitment_signed` and persisting
the corresponding `ChannelMonitorUpdate`.
However, if there isn't one (and this is the first successful HTLC
for a payment we sent), we immediately generate a `PaymentSent`
event and queue it up for the user. Then, a millisecond later, we
receive the `commitment_signed` from our peer, removing the HTLC
from the latest local commitment transaction as a side-effect of
the `ChannelMonitorUpdate` applied.
If the user has processed the `PaymentSent` event by that point,
great, we're done. However, if they have not, and we crash prior to
persisting the `ChannelManager`, on startup we get confused about
the state of the payment. We'll force-close the channel for being
stale, and see an HTLC which was removed and is no longer present
in the latest commitment transaction (which we're broadcasting).
Because we claim corresponding inbound HTLCs before updating a
`ChannelMonitor`, we assume such HTLCs have failed - attempting to
fail after having claimed should be a noop. However, in the
sent-payment case we now generate a `PaymentFailed` event for the
user, allowing an HTLC to complete without giving the user a
preimage.
Here we address this issue by storing the payment preimages for
claimed outbound HTLCs in the `ChannelMonitor`, in addition to the
existing inbound HTLC preimages already stored there. This allows
us to fix the specific issue described by checking for a preimage
and switching the type of event generated in response. In addition,
it reduces the risk of future confusion by ensuring we don't fail
HTLCs which were claimed but not fully committed to before a crash.
It does not, however, fully fix the issue here - because the
preimages are removed after the HTLC has been fully removed from
available commitment transactions, if we are substantially delayed
in persisting the `ChannelManager` from the time we receive the
`update_fulfill_htlc` until after a full commitment signed dance
completes we may still hit this issue. The full fix for this issue
is to delay the persistence of the `ChannelMonitorUpdate` until
after the `PaymentSent` event has been processed. This avoids the
issue entirely, ensuring we process the event before updating the
`ChannelMonitor`, the same as we ensure the upstream HTLC has been
claimed before updating the `ChannelMonitor` for forwarded
payments.
The full solution will be implemented in a later work, however this
change still makes sense at that point as well - if we were to
delay the initial `commitment_signed` `ChannelMonitorUpdate` until
after the `PaymentSent` event has been processed (which likely
requires a database update on the users' end), we'd hold our
`commitment_signed` + `revoke_and_ack` response for two DB writes
(i.e. `fsync()` calls), making our commitment transaction
processing a full `fsync` slower. By making this change first, we
can instead delay the `ChannelMonitorUpdate` from the
counterparty's final `revoke_and_ack` message until the event has
been processed, giving us a full network roundtrip to do so and
avoiding delaying our response as long as an `fsync` is faster than
a network roundtrip.
Jeffrey Czyz [Thu, 5 Jan 2023 17:50:24 +0000 (11:50 -0600)]
Fix scorer panic when available capacity is zero
ProbabilisticScorer takes a ChannelUsage when computing a penalty for a
channel. The formula for calculating the liquidity penalty reduces the
maximum capacity by the amount of in-flight HTLCs (available capacity)
and adds one to prevent division by zero.
However, since the available capacity is passed to
DirectedChannelLiquidity as the capacity, other penalty formulas may use
the available (i.e., reduced) capacity inadvertently. In practice, this
has two ramifications for the historical liquidity penalty computation:
1. The bucket formula doesn't have a consistent denominator for a given
channel.
2. The bucket formula may divide by zero when the in-flight HTLC amount
equals or exceeds the effective capacity.
Fixing this involves only using the available capacity when appropriate.
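To make the distinction concrete, here is a small sketch that keeps the full capacity and the available capacity apart, and uses each only where it belongs. `UsageSketch`, its method names and the bucket count of 8 are hypothetical; the real `ChannelUsage` and `DirectedChannelLiquidity` types differ.

```rust
// Assumed number of historical buckets, for illustration only.
const NUM_BUCKETS: u128 = 8;

// Hypothetical, simplified stand-in for `ChannelUsage`.
struct UsageSketch {
    amount_msat: u64,
    inflight_htlc_msat: u64,
    capacity_msat: u64,
}

impl UsageSketch {
    // Capacity left once in-flight HTLCs are subtracted; zero if they
    // equal or exceed the capacity.
    fn available_capacity_msat(&self) -> u64 {
        self.capacity_msat.saturating_sub(self.inflight_htlc_msat)
    }

    // Liquidity-style ratio: this is where the available capacity is
    // appropriate, with one added to avoid dividing by zero.
    fn liquidity_ratio_ppm(&self) -> u128 {
        self.amount_msat as u128 * 1_000_000 / (self.available_capacity_msat() as u128 + 1)
    }

    // Historical bucket: uses the full capacity, so the denominator for a
    // given channel does not change with the in-flight amount.
    fn historical_bucket(&self) -> u8 {
        let idx = self.amount_msat as u128 * NUM_BUCKETS / (self.capacity_msat as u128 + 1);
        idx.min(NUM_BUCKETS - 1) as u8
    }
}
```

With this split, two usages of the same channel that differ only in their in-flight amount fall into the same historical bucket, while the liquidity ratio still reflects the reduced capacity.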
Matt Corallo [Tue, 28 Feb 2023 21:38:29 +0000 (21:38 +0000)]
Improve `PeerHandler` debug_assertions and checks
This removes two panics from `PeerHandler` which can trivially be
`debug_assert!(false); return Err;`s, and adds another
`debug_assertion` on internal state consistency during disconnect.
Matt Corallo [Thu, 2 Mar 2023 07:50:16 +0000 (07:50 +0000)]
Make waking after a future completes propagate to the next future
In our `wakers`, if we first `notify` a future, which is then
`poll`ed complete, and then `notify` the same waker again before a
new future is fetched, that new future will be marked as
non-complete initially and wait for a third `notify`.
The fix is luckily rather trivial: when we `notify` a future, if it
is completed immediately, simply wipe the future state so that we
look at the pending-notify flag when we generate the next future.
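The interaction between the pending-notify flag and the stored future state can be modelled without any async machinery. The sketch below is a heavily simplified model, not the actual `wakers` code: `Notifier`, `FutureState` and the tuple layout are assumptions, and "completing" a future is reduced to setting a boolean.

```rust
use std::sync::{Arc, Mutex};

// Simplified future state: only tracks whether it has been completed.
struct FutureState {
    complete: bool,
}

// Hypothetical notifier: (pending-notify flag, state of the outstanding future).
#[derive(Default)]
struct Notifier {
    inner: Mutex<(bool, Option<Arc<Mutex<FutureState>>>)>,
}

impl Notifier {
    fn notify(&self) {
        let mut lock = self.inner.lock().unwrap();
        let outstanding = lock.1.take();
        match outstanding {
            // An outstanding future exists: complete it and wipe the stored
            // state, so the next `get_future` consults the flag instead.
            Some(state) => state.lock().unwrap().complete = true,
            // Nobody is waiting: remember the notification for later.
            None => lock.0 = true,
        }
    }

    fn get_future(&self) -> Arc<Mutex<FutureState>> {
        let mut lock = self.inner.lock().unwrap();
        if let Some(existing) = &lock.1 {
            return Arc::clone(existing);
        }
        // No outstanding future: a pending notification completes the new
        // future immediately and is consumed in the process.
        let complete = std::mem::replace(&mut lock.0, false);
        let state = Arc::new(Mutex::new(FutureState { complete }));
        if !complete {
            lock.1 = Some(Arc::clone(&state));
        }
        state
    }
}
```

In this model a second `notify` that arrives after the first future completed, but before a new one is fetched, is recorded in the flag and completes the next future right away rather than waiting for a third `notify`.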
Matt Corallo [Fri, 10 Feb 2023 20:38:14 +0000 (20:38 +0000)]
Reduce macro contents in `expect_pending_htlcs_forwardable*` macros
The `expect_pending_htlcs_forwardable*` macros don't need to be
macros so here we move much of the logic in them to a function and
leave the macro in place to avoid touching every line of code in
the tests.
This reduces the `--profile=test --lib` `Zpretty=expanded` code
size from 301,915 LoC to 295,294 LoC.
Matt Corallo [Fri, 10 Feb 2023 20:17:16 +0000 (20:17 +0000)]
Replace `check_closed_event` macro with a function
The `check_closed_event!()` macro has no reason to be a macro so
here we move its logic to a function and leave the macro in place
to avoid touching every line of code in the tests.
This reduces the `--profile=test --lib` `Zpretty=expanded` code
size from 309,522 LoC to 301,915 LoC.
Matt Corallo [Fri, 10 Feb 2023 20:07:54 +0000 (20:07 +0000)]
Replace `check_closed_broadcast` macro with a function
The `check_closed_broadcast!()` macro has no reason to be a macro
so here we move its logic to a function and leave the macro in
place to avoid touching every line of code in the tests.
This reduces the `--profile=test --lib` `Zpretty=expanded` code
size from 313,312 LoC to 309,522 LoC.
Matt Corallo [Fri, 10 Feb 2023 19:57:00 +0000 (19:57 +0000)]
Replace `get_htlc_update_msgs` macro with a function
The `get_htlc_update_msgs!()` macro has no reason to be a macro
so here we move its logic to a function and leave the macro in
place to avoid touching every line of code in the tests.
This reduces the `--profile=test --lib` `Zpretty=expanded` code
size from 321,985 LoC to 316,856 LoC.
Matt Corallo [Fri, 10 Feb 2023 19:56:42 +0000 (19:56 +0000)]
Replace `get_err_msg` macro with a function
The `get_err_msg!()` macro has no reason to be a macro so here we
move its logic to a function and leave the macro in place to avoid
touching every line of code in the tests.
This reduces the `--profile=test --lib` `Zpretty=expanded` code
size from 322,183 LoC to 321,985 LoC.
Matt Corallo [Fri, 10 Feb 2023 19:39:09 +0000 (19:39 +0000)]
Replace `get_revoke_commit_msgs` macro with a function
The `get_revoke_commit_msgs!()` macro has no reason to be a macro
so here we move its logic to a function and leave the macro in
place to avoid touching every line of code in the tests.
This reduces the `--profile=test --lib` `Zpretty=expanded` code
size from 324,763 LoC to 322,183 LoC.
Matt Corallo [Fri, 10 Feb 2023 19:29:13 +0000 (19:29 +0000)]
Replace `get_route` macro with a function
The `get_route!()` macro has no reason to be a macro so here we
move its logic to a function and leave the macro in place to
avoid touching every line of code in the tests.
This reduces the `--profile=test --lib` `Zpretty=expanded` code
size from 326,588 LoC to 324,763 LoC.
Matt Corallo [Fri, 10 Feb 2023 19:08:39 +0000 (19:08 +0000)]
Replace `get_payment_preimage_hash` with a function
The `get_payment_preimage_hash!()` macro has no reason to be a
macro so here we move its logic to a function and leave the macro
in place to avoid touching every line of code in the tests.
This reduces the `--profile=test --lib` `Zpretty=expanded` code
size from 329,119 LoC to 326,588 LoC.
Matt Corallo [Fri, 10 Feb 2023 18:54:03 +0000 (18:54 +0000)]
Replace `check_added_monitors` with a function
The `check_added_monitors!()` macro has no reason to be a macro so
here we move its logic to a function and leave the macro in place
to avoid touching every line of code in the tests.
This reduces the `--profile=test --lib` `Zpretty=expanded` code
size from 338,710 LoC to 329,119 LoC.
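The pattern in this commit, a function holding the logic with a thin macro kept for source compatibility, can be sketched as follows. `TestNode` and its field are hypothetical stand-ins for the test harness types; only the shape of the macro-to-function delegation is the point.

```rust
use std::cell::RefCell;

// Hypothetical stand-in for a test node that records added monitors.
struct TestNode {
    added_monitors: RefCell<Vec<u64>>,
}

// All of the logic lives in an ordinary function, which is compiled once
// rather than being expanded at every call site.
fn check_added_monitors(node: &TestNode, count: usize) {
    let mut added = node.added_monitors.borrow_mut();
    assert_eq!(added.len(), count, "unexpected number of monitor updates");
    added.clear();
}

// The macro stays, so existing `check_added_monitors!(..)` call sites keep
// compiling, but it now only forwards to the function. Macros and functions
// live in separate namespaces, so sharing a name is fine.
macro_rules! check_added_monitors {
    ($node: expr, $count: expr) => {
        check_added_monitors(&$node, $count)
    };
}
```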
Matt Corallo [Tue, 28 Feb 2023 19:42:31 +0000 (19:42 +0000)]
Mark `IndexedMap` types as `(C-not exported)`
While we could try to expose the type explicitly, we already have
alternative accessors for bindings, and mapping `Hash`, `Ord` and
the other requirements for `IndexedMap` would be a good chunk of
additional work.
Matt Corallo [Tue, 28 Feb 2023 21:28:13 +0000 (21:28 +0000)]
Remove peers from the `node_id_to_descriptor` even without init
When a peer has finished the noise handshake, but has not yet
completed the lightning `Init`-based handshake, they will be
present in the `node_id_to_descriptor` set, even though
`Peer::handshake_complete()` returns false. Thus, when we go to
disconnect such a peer, we must ensure that we remove it from the
descriptor set as well.
Failing to do so caused an `Inconsistent peers set state!` panic in
the C bindings network handler.
John Cantrell [Tue, 28 Feb 2023 16:39:29 +0000 (11:39 -0500)]
Surface bitcoind rpc error code
Users of the RpcClient had no way to access the error code
returned by bitcoind's rpc. We embed a new RpcError struct
as the inner error for the returned io::Error. Users can access
both the code and the message using this inner struct.
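Embedding a typed error inside an `io::Error` and getting it back out relies on `io::Error::new` and `io::Error::get_ref` plus `downcast_ref`, all from `std`. The sketch below shows the mechanism; the `RpcError` fields and the two helper functions are assumptions for illustration and may not match the struct added in this commit.

```rust
use std::fmt;
use std::io;

// Assumed shape of the inner error: bitcoind's numeric code plus its message.
#[derive(Debug)]
struct RpcError {
    code: i64,
    message: String,
}

impl fmt::Display for RpcError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "RPC error {}: {}", self.code, self.message)
    }
}

impl std::error::Error for RpcError {}

// Hypothetical helper: wraps the typed error as the inner error of an `io::Error`.
fn wrap_rpc_error(code: i64, message: &str) -> io::Error {
    io::Error::new(io::ErrorKind::Other, RpcError { code, message: message.to_string() })
}

// Hypothetical helper: recovers the code if the inner error is an `RpcError`.
fn rpc_error_code(err: &io::Error) -> Option<i64> {
    err.get_ref()
        .and_then(|inner| inner.downcast_ref::<RpcError>())
        .map(|rpc| rpc.code)
}
```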
Matt Corallo [Thu, 23 Feb 2023 19:06:21 +0000 (19:06 +0000)]
Do not fail to apply RGS updates for removed channels
If we receive a Rapid Gossip Sync update for channels where we are
missing the existing channel data, we should ignore the missing
channel. This can happen in a number of cases, whether because we
received updated channel information via an onion error from an
HTLC failure or because we've partially synced the graph from a
peer over the standard lightning P2P protocol.
Matt Corallo [Sun, 26 Feb 2023 20:22:28 +0000 (20:22 +0000)]
Make sure individual mutexes are constructed on different lines
Our lockdep logic (on Windows) identifies a mutex based on which
line it was constructed on. Thus, if we have two mutexes
constructed on the same line it will generate false positives.
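Why same-line construction is a problem can be shown with a toy identifier keyed on file and line. The sketch uses `#[track_caller]` and `std::panic::Location` as a stand-in for however the lockdep code actually finds the construction line, which is not described here; `ConstructionSite` and the helper functions are hypothetical.

```rust
use std::panic::Location;

// Hypothetical identity for a mutex: the file and line it was built on.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct ConstructionSite {
    file: &'static str,
    line: u32,
}

// Records the caller's file and line; the column is deliberately ignored to
// mirror a line-based identifier.
#[track_caller]
fn construction_site() -> ConstructionSite {
    let loc = Location::caller();
    ConstructionSite { file: loc.file(), line: loc.line() }
}

// Two "mutexes" built on one line: indistinguishable to a line-based check.
fn same_line_pair() -> (ConstructionSite, ConstructionSite) {
    (construction_site(), construction_site())
}

// Two "mutexes" built on separate lines: distinct identities.
fn separate_line_pair() -> (ConstructionSite, ConstructionSite) {
    let a = construction_site();
    let b = construction_site();
    (a, b)
}
```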
Matt Corallo [Wed, 22 Feb 2023 22:54:38 +0000 (22:54 +0000)]
Disallow taking two instances of the same mutex at the same time
Taking two instances of the same mutex may be totally fine, but it
requires a total lockorder that we cannot (trivially) check. Thus,
it's generally unsafe to do if we can avoid it.
To discourage doing this, here we default to panicking on such locks
in our lockorder tests, with a separate lock function added that is
clearly labeled "unsafe" to allow doing so when we can guarantee a
total lockorder.
This requires adapting a number of sites to the new API, including
fixing a bug this turned up in `ChannelMonitor`'s `PartialEq` where
no lockorder was guaranteed.
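A total lockorder between two instances of the same mutex can be obtained by always locking the one at the lower memory address first. This is a common technique and is shown here only as a sketch; `lock_pair` is a hypothetical helper, and whether the `ChannelMonitor` `PartialEq` fix uses exactly this ordering is an assumption.

```rust
use std::sync::{Mutex, MutexGuard};

// Hypothetical helper: locks both mutexes in address order, so any two
// threads locking the same pair always acquire them in the same order.
// The guards are returned in argument order regardless.
fn lock_pair<'a, T>(
    a: &'a Mutex<T>,
    b: &'a Mutex<T>,
) -> (MutexGuard<'a, T>, MutexGuard<'a, T>) {
    let addr_a = a as *const Mutex<T> as usize;
    let addr_b = b as *const Mutex<T> as usize;
    // Locking one mutex twice on the same thread would deadlock or panic.
    assert_ne!(addr_a, addr_b, "cannot lock the same mutex twice");
    if addr_a < addr_b {
        let guard_a = a.lock().unwrap();
        let guard_b = b.lock().unwrap();
        (guard_a, guard_b)
    } else {
        let guard_b = b.lock().unwrap();
        let guard_a = a.lock().unwrap();
        (guard_a, guard_b)
    }
}
```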
Matt Corallo [Thu, 2 Feb 2023 22:38:54 +0000 (22:38 +0000)]
Refuse recursive read locks in lockorder testing
Our existing lockorder tests assume that a read lock on a thread
that is already holding the same read lock is totally fine. This
isn't at all true. The `std` `RwLock` behavior is
platform-dependent - on most platforms readers can starve writers
as readers will never block for a pending writer. However, on
platforms where this is not the case, one thread trying to take a
write lock may deadlock with another thread that both already has,
and is attempting to take again, a read lock.
Worse, our in-tree `FairRwLock` exhibits this behavior explicitly
on all platforms to avoid the starvation issue.
Thus, we shouldn't have any special handling for allowing recursive
read locks, so we simply remove it here.
Matt Corallo [Wed, 22 Feb 2023 22:10:46 +0000 (22:10 +0000)]
Don't take `per_peer_state` read locks recursively in monitor updating
When handling a `ChannelMonitor` update via the new
`handle_new_monitor_update` macro, we always call the macro with
the `per_peer_state` read lock held and have the macro drop the
per-peer state lock. Then, when handling the resulting updates, we
may take the `per_peer_state` read lock again in another function.
In a coming commit, recursive read locks will be disallowed, so we
have to drop the `per_peer_state` read lock before calling
additional functions in `handle_new_monitor_update`, which we do
here.
Matt Corallo [Fri, 3 Feb 2023 00:46:50 +0000 (00:46 +0000)]
Expect callers to hold read locks before `channel_monitor_updated`
Our existing lockorder tests assume that a read lock on a thread
that is already holding the same read lock is totally fine. This
isn't at all true. The `std` `RwLock` behavior is
platform-dependent - on most platforms readers can starve writers
as readers will never block for a pending writer. However, on
platforms where this is not the case, one thread trying to take a
write lock may deadlock with another thread that both already has,
and is attempting to take again, a read lock.
Worse, our in-tree `FairRwLock` exhibits this behavior explicitly
on all platforms to avoid the starvation issue.
Sadly, a user ended up hitting this deadlock in production in the
form of a call to `get_and_clear_pending_msg_events` which holds
the `ChannelManager::total_consistency_lock` before calling
`process_pending_monitor_events` and eventually
`channel_monitor_updated`, which tries to take the same read lock
again.
Luckily, the fix is trivial: simply remove the redundant read lock
in `channel_monitor_updated`.
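The shape of the fix, where the inner function stops taking the read lock and relies on its caller already holding it, is sketched below. Passing the guard as a parameter is one hypothetical way to make that expectation visible in the signature; it is not how `ChannelManager` expresses it, and `Manager` and its fields are stand-ins.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{RwLock, RwLockReadGuard};

// Hypothetical stand-in for the manager and its consistency lock.
struct Manager {
    total_consistency_lock: RwLock<()>,
    handled: AtomicUsize,
}

impl Manager {
    fn new() -> Self {
        Self { total_consistency_lock: RwLock::new(()), handled: AtomicUsize::new(0) }
    }

    // Takes no lock itself. The guard parameter documents that the caller
    // must already hold the read lock (it does not prove which lock it is).
    fn channel_monitor_updated(&self, _held: &RwLockReadGuard<'_, ()>) {
        self.handled.fetch_add(1, Ordering::Relaxed);
    }

    // The outer entry point takes the read lock exactly once and passes it down.
    fn process_pending_monitor_events(&self) {
        let guard = self.total_consistency_lock.read().unwrap();
        self.channel_monitor_updated(&guard);
    }
}
```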
Matt Corallo [Fri, 3 Feb 2023 00:33:27 +0000 (00:33 +0000)]
Hold the `total_consistency_lock` while in `outbound_payment` fns
We previously avoided holding the `total_consistency_lock` while
doing crypto operations to build onions. However, now that we've
abstracted out the outbound payment logic into a utility module,
ensuring the state is consistent at all times is now abstracted
away from code authors and reviewers, making it likely to break.
Further, because we now call `send_payment_along_path` both with,
and without, the `total_consistency_lock`, and because recursive
read locks may deadlock, it would now be quite difficult to figure
out which paths through `outbound_payment` need the lock and which
don't.
While it may slow writes somewhat, it's not really worth trying to
figure out this mess; instead we just hold the
`total_consistency_lock` before going into `outbound_payment`
functions.
Matt Corallo [Mon, 6 Feb 2023 22:12:09 +0000 (22:12 +0000)]
Remove the `final_cltv_expiry_delta` in `RouteParameters` entirely
fbc08477e8dcdd8f3f2ada8ca77388b6185febe2 purported to "move" the
`final_cltv_expiry_delta` field to `PaymentParameters` from
`RouteParameters`. However, for naive backwards-compatibility
reasons it left the existing one in place and only added a new,
redundant field in `PaymentParameters`.
It turns out there's really no reason for this - if we take a more
critical eye towards backwards compatibility we can figure out the
correct value in every `PaymentParameters` while deserializing.
We do this here - making `PaymentParameters` a `ReadableArgs`
taking a "default" `cltv_expiry_delta` when it goes to read. This
allows existing `RouteParameters` objects to pass the read
`final_cltv_expiry_delta` field in to be used if the new field
wasn't present.
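The idea of reading with an argument that supplies a default can be sketched with a cut-down trait. Everything below is an assumption made for illustration: `ReadableArgsSketch` is not the real `ReadableArgs` trait, `PaymentParametersSketch` has a single field, and the one-byte presence flag is an invented encoding rather than the TLV format.

```rust
use std::io::{self, Read};

// Simplified stand-in for a "read with arguments" trait.
trait ReadableArgsSketch<A>: Sized {
    fn read<R: Read>(reader: &mut R, args: A) -> io::Result<Self>;
}

// Hypothetical, single-field stand-in for `PaymentParameters`.
struct PaymentParametersSketch {
    final_cltv_expiry_delta: u32,
}

impl ReadableArgsSketch<u32> for PaymentParametersSketch {
    // `default_delta` plays the role of the value the enclosing
    // `RouteParameters` read from its own, older field.
    fn read<R: Read>(reader: &mut R, default_delta: u32) -> io::Result<Self> {
        // Invented encoding: a presence byte, then an optional big-endian u32.
        let mut present = [0u8; 1];
        reader.read_exact(&mut present)?;
        let final_cltv_expiry_delta = if present[0] == 1 {
            let mut buf = [0u8; 4];
            reader.read_exact(&mut buf)?;
            u32::from_be_bytes(buf)
        } else {
            // New field absent (old serialization): fall back to the default.
            default_delta
        };
        Ok(Self { final_cltv_expiry_delta })
    }
}
```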
Matt Corallo [Mon, 6 Feb 2023 21:56:39 +0000 (21:56 +0000)]
Support `ReadableArgs` types in the TLV struct serialization
This adds `required` support for trait-wrapped reading (e.g. for
objects read via `ReadableArgs`) as well as support for the
trait-wrapped reading syntax across the TLV struct/enum
serialization macros.
Matt Corallo [Mon, 6 Feb 2023 21:43:10 +0000 (21:43 +0000)]
Require a non-0 number of non-empty paths when deserializing routes
When we read a `Route` (or a list of `RouteHop`s), we should never
have zero paths or zero `RouteHop`s in a path. As such, it's fine to
simply reject these at deserialization-time. Technically this could
lead to something which we can generate not round-trip'ing
serialization, but that seems okay here.
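The check itself is small. A sketch under simplified assumptions, with hops reduced to `u64` values and a hypothetical `DecodeErrorSketch` standing in for the real decode error type:

```rust
// Hypothetical stand-in for the decode error type.
#[derive(Debug, PartialEq)]
enum DecodeErrorSketch {
    InvalidValue,
}

// Rejects a route with no paths, or with any path that has no hops.
// Hops are plain `u64`s here purely to keep the example self-contained.
fn validate_paths(paths: Vec<Vec<u64>>) -> Result<Vec<Vec<u64>>, DecodeErrorSketch> {
    if paths.is_empty() || paths.iter().any(|path| path.is_empty()) {
        return Err(DecodeErrorSketch::InvalidValue);
    }
    Ok(paths)
}
```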