Matt Corallo [Thu, 21 Jul 2022 21:38:00 +0000 (21:38 +0000)]
Avoid blanket implementing FeeEstimator for Deref<FeeEstimator>
This simplifies things for bindings (and, to some extent,
downstream users) by exploiting the fact that we can always "clone"
a reference to a struct by dereferencing and then creating a new
reference.
Matt Corallo [Wed, 13 Jul 2022 16:52:27 +0000 (16:52 +0000)]
Use a separate (non-trait) fee-estimation fn in LowerBoundedEstimator
This should make it somewhat more difficult to accidentally use a
straight fee estimator when we actually want a
LowerBoundedFeeEstimator by not having the types be exchangeable at
all.
Matt Corallo [Fri, 22 Jul 2022 00:32:17 +0000 (00:32 +0000)]
Remove scary disconenct warnings on PeerManager new connection fns
In 4703d4e72565ddfd150b9368ea036f4973fd7590 we changed
PeerManager::socket_disconnected to no longer require that sockets
which the PeerManager decided to disconnect not be disconnected.
However, we forgot to remove the scary warnings on the
`new_{inbound,outbound}_connection` functions which warned of the
old behavior.
Matt Corallo [Tue, 12 Apr 2022 17:32:35 +0000 (17:32 +0000)]
Expand lockorder testing to look at mutexes, not specific instances
Our existing lockorder inversion checks look at specific instances
of mutexes rather than the general mutex itself. This changes that
behavior to look at the instruction pointer at which a mutex was
created and treat all mutexes which were created at the same
location as equivalent.
This allows us to detect lockorder inversions which occur across
tests, though it does substantially reduce parallelism during test
runs.
Matt Corallo [Fri, 15 Jul 2022 16:18:42 +0000 (16:18 +0000)]
Construct all ChannelMonitor mutexes in the same function
When we add lockorder detection based on mutex construction site
rather than mutex instance in the next commit, ChannelMonitor's
PartialEq implementation causes spurious failures. This is caused
by the lockorder detection logic considering the ChannelMonitor
inner mutex to be two distinct mutexes - one when monitors are
deserialized and one when monitors are created fresh. Instead, we
attempt to tell the lockorder detection logic that they are the
same by ensuring they're constructed in the same place - in this
case a util method.
Matt Corallo [Tue, 12 Jul 2022 21:13:30 +0000 (21:13 +0000)]
Reduce default max_channel_saturation_power_of_half to 2 (max 1/4)
Saturating a channel beyond 1/4 of its capacity seems like a more
reasonable threshold for avoiding a path than 1/2, especially given
we should still be willing to send a payment with a lower
saturation limit if it comes to that.
This requires an (obvious) change to some router tests, but also
requires a change to the `fake_network_test`, opting to simply
remove some over-limit test code there - `fake_network_test` was
our first ever functional test, and while it worked great to ensure
LDK worked at all on day one, we now have a rather large breadth
of functional tests, and a broad "does it work at all" test is no
longer all that useful.
Matt Corallo [Tue, 12 Jul 2022 20:35:36 +0000 (20:35 +0000)]
Make route path selection optimize strictly for `cost / amount`
Currently, after we've selected a number of candidate paths, we
construct a route from a random set of paths repeatedly, and then
select the route with the lowest total cost. In the vast majority
of cases this ends up doing a bunch of additional work in order to
select the path(s) with the total lowest cost, with some vague
attempt at randomization that doesn't actually work.
Instead, here, we simply sort available paths by `cost / amount`
and select the top paths. This ends up in practice having the same
end result with substantially less complexity. In some rare cases
it gets a better result, which also would have been achieved
through more random trials. This implies there may in such cases be
a potential privacy loss, but not a substantial one, given our path
selection is ultimately mostly deterministic in many cases (or, if
it is not, then privacy is achieved through randomization at the
scorer level).
Matt Corallo [Tue, 12 Jul 2022 20:30:00 +0000 (20:30 +0000)]
Fix tracking of collected value across pathfinding iterations
If we end up "paying" for an `htlc_minimum_msat` with fees, we
increment `already_collected_value_msat` by more than the amount
of the path that we collected (who's `value_contribution_msat` is
higher than the total payment amount, despite having been reduced
down to the payment amount).
This throws off our total value collection target, though in the
coming commit(s) it would also throw off our path selection
calculations.
Matt Corallo [Thu, 7 Jul 2022 17:37:22 +0000 (17:37 +0000)]
Change default "impossibility penalty" to one Bitcoin
In general we should avoid taking paths that we are confident will
not work as much possible, but we should be willing to try each
payment at least once, even if its over a channel that failed
recently. A full Bitcoin penalty for such a channel seems
reasonable - lightning fees are unlikely to ever reach that point
so such channels will be scored much worse than any other potential
path, while still being below `u64::max_value()`.
Matt Corallo [Wed, 6 Jul 2022 21:05:54 +0000 (21:05 +0000)]
Make the `ProbabilisticScorer` impossibility penalty configurable
When we consider sending an HTLC over a given channel impossible
due to our current knowledge of the channel's liquidity, we
currently always assign a penalty of `u64::max_value()`. However,
because we now refuse to retry a payment along the same path in
the router itself, we can now make this value configurable. This
allows users to have a relatively high knowledge decay interval
without the side-effect of refusing to try the only available path
in cases where a channel is intermittently available.
Matt Corallo [Wed, 6 Jul 2022 19:17:40 +0000 (19:17 +0000)]
Track channels which a given payment part failed to traverse
When an HTLC fails, we currently rely on the scorer learning the
failed channel and assigning an infinite (`u64::max_value()`)
penalty to the channel so as to avoid retrying over the exact same
path (if there's only one available path). This is common when
trying to pay a mobile client behind an LSP if the mobile client is
currently offline.
This leads to the scorer being overly conservative in some cases -
returning `u64::max_value()` when a given path hasn't been tried
for a given payment may not be the best decision, even if that
channel failed 50 minutes ago.
By tracking channels which failed on a payment part level and
explicitly refusing to route over them we can relax the
requirements on the scorer, allowing it to make different decisions
on how to treat channels that failed relatively recently without
causing payments to retry the same path forever.
This does have the drawback that it could allow two separate part
of a payment to traverse the same path even though that path just
failed, however this should only occur if the payment is going to
fail anyway, at least as long as the scorer is properly learning.
Jeffrey Czyz [Tue, 14 Jun 2022 23:05:25 +0000 (18:05 -0500)]
Look-up functions for ReadOnlyNetworkGraph
ReadOnlyNetworkGraph uses BTreeMap to store its nodes and channels, but
these data structures are not supported by C bindings. Expose look-up
functions on these maps in lieu of such support.
Matt Corallo [Wed, 13 Jul 2022 19:33:25 +0000 (19:33 +0000)]
Rip out dependabot - its worse than useless - its annoying
Dependabot has a ton of issues with its rust integration that makes
it wholly useless, and very annoying:
* It has no concept of MSRV, opening PRs that are not going to pass
CI.
* It has no concept of patch-level - if we depend on tokio 1.X,
that means any version of tokio > 1.X, but dependabot insists on
opening a PR to "update us" to tokio 1.X + 1, even though it
doesn't impact what version of our users use (and often violates
MSRV).
* It has no concept of dependencies that rely on each other,
causing it to open a PR to update us to bitcoin_hashes X + 1,
even though we're still depending on rust-bitcoin Y which
depends on bitcoin_hashes X, causing build failure.
* It hogs CI resources, getting CI run twice, once for the branch
once for the PR.
* It creates branches directly on the rust-lightning repo, making
it look like the work is somehow connected to the
lightningdevkit project, even though it isn't, and spamming the
local clones of project contributors.
At the end of the day, dependabot has never meaningfully
contributed to notifying us of an important dependency, and,
really, we don't have enough dependencies for it to matter.
Matt Corallo [Tue, 12 Apr 2022 17:28:15 +0000 (17:28 +0000)]
Fix some test theoretical lock inversions
In the next commit we add lockorder testing based on the line each
mutex was created on rather than the particular mutex instance.
This causes some additional test failure because of lockorder
inversions for the same mutex across different tests, which is
fixed here.
Matt Corallo [Fri, 8 Jul 2022 19:48:00 +0000 (19:48 +0000)]
Relax the channel saturation limit if we can't find enough paths
In order to avoid failing to find paths due to the new channel
saturation limit, if we fail to find enough paths, we simply
disable the saturation limit for further path finding iterations.
Because we can now increase the maximum sent over a given channel
during routefinding, we may now generate redundant paths for the
same payment. Because this is wasteful in the network, we add an
additional pass during routefinding to merge redundant paths.
Note that two tests which previously attempted to send exactly the
available liquidity over a channel which charged an absolute fee
need updating - in those cases the router will first collect a path
that is saturation-limited, then attempt to collect a second path
without a saturation limit while stil honoring the existing
utilized capacity on the channel, causing failure as the absolute
fee must be included.
Matt Corallo [Fri, 8 Jul 2022 18:26:06 +0000 (18:26 +0000)]
Avoid saturating channels before we split payments
Currently we only opt to split a payment into an MPP if we have
completely and totally used a channel's available capacity (up to
the announced htlc_max or on-chain capacity, whichever is lower).
This is obviously destined to fail as channels are unlikely to have
their full capacity available.
Here we do the minimum viable fix by simply limiting channels to
only using up to a configurable power-of-1/2. We default this new
configuration knob to 1 (1/2 of the channel) so as to avoid a
substantial change but in the future we may consider changing this
to 2 (1/4) or even 3 (1/8).
Duncan Dean [Wed, 29 Jun 2022 12:59:07 +0000 (14:59 +0200)]
Add LowerBoundedFeeEstimator to set FeeEstimator min rates
`LowerBoundedFeeEstimator` is a wrapper for `Deref`s to `FeeEstimator`s
that limits the get_est_sat_per_1000_weight() method to no less than 253
sats/kW.
Matt Corallo [Fri, 8 Jul 2022 01:16:05 +0000 (01:16 +0000)]
Avoid panicking on wallclock time going backwards across restart
Because we serialize `Instant`s using wallclock time in
`ProbabilisticScorer`, if time goes backwards across restarts we
may end up with `Instant`s in the future, which causes rustc prior
to 1.60 to panic when calculating durations. Here we simply avoid
this by setting the time to `now` if we get a time in the future.
As the map values are no longer only `channel_id`s, but also a
`counterparty_node_id`s, the map is renamed to better correspond to
whats actually stored in the map.
Elias Rohrer [Fri, 24 Jun 2022 10:00:20 +0000 (12:00 +0200)]
Add `send_probe` and introduce probing cookies
When we send payment probes, we generate the [`PaymentHash`] based on a
probing cookie secret and a random [`PaymentId`]. This allows us to
discern probes from real payments, without keeping additional state.
Elias Rohrer [Fri, 24 Jun 2022 08:53:49 +0000 (10:53 +0200)]
Refactor `max_mpp_path_count` to `max_path_count`
Using this field just for MPP doesn't make sense when it could
intuitively also be used to indicate single-path payments. We therefore
rename `max_mpp_path_count` to `max_path_count` and make sure that a
value of 1 ensures MPP is not even tried.
Matt Corallo [Fri, 1 Jul 2022 21:14:19 +0000 (21:14 +0000)]
Do not execute the default_value expr until we need it in TLV deser
This fixes an insta-panic in `ChannelMonitor` deserialization where
we always `unwrap` a previous value to determine the default value
of a later field. However, because we always ran the `unwrap`
before the previous field is read, we'd always panic.
The fix is rather simple - use a `OptionDeserWrapper` for
`default_value` fields and only fill in the default value if no
value was read while walking the TLV stream.
The only complexity comes from our desire to support
`read_tlv_field` calls that use an explicit field rather than an
`Option` of some sort, which requires some statement which can
assign both an `OptionDeserWrapper<T>` variable and a `T` variable.
We settle on `x = t.into()` and implement `From<T> for
OptionDeserWrapper<T>` which works, though it requires users to
specify types explicitly due to Rust determining expression types
prior to macro execution, completely guessing with no knowlege for
integer expressions (see
https://github.com/rust-lang/rust/issues/91369).
Elias Rohrer [Mon, 4 Jul 2022 07:36:06 +0000 (09:36 +0200)]
Allow to set manual node penalties
A user might want to explicitly penalize or prioritize a particular
node. We now allow them to do so by specifying a manual penalty
override for a given node that is then returned by the scorer.
Matt Corallo [Tue, 14 Jun 2022 10:54:39 +0000 (10:54 +0000)]
Fix spurious panic on bogus funding txn that confirm and are spent
In c02b6a3807488e1943d79792c5ac0ee52530b971 we moved the
`payment_preimage` copy from inside the macro which only runs if we
are spending an output we know is an HTLC output to doing it for
any script that matches our expected length. This can panic if an
inbound channel is created with a bogus funding transaction that
has a witness program of the HTLC-Success/-Offered length but which
does not have a second-to-last witness element which is 32 bytes.
Luckily this panic is relatively simple for downstream users to
work around - if an invalid-length-copy panic occurs, simply remove
the ChannelMonitor from the bogus channel on startup and run
without it. Because the channel must be funded by a bogus script in
order to reach this panic, the channel will already have closed by
the time the funding transaction is spent, and there can be no
local funds in such a channel, so removing the `ChannelMonitor`
wholesale is completely safe.
In order to test this we have to disable an in-line assertion that
checks that our transactions match expected scripts which we do by
checking for the specific bogus script that we now use in
`test_invalid_funding_tx`.
Matt Corallo [Wed, 29 Jun 2022 17:41:30 +0000 (17:41 +0000)]
Have `find_route` take a `NetworkGraph` instead of a `ReadOnly` one
Because downstream languages are often garbage-collected, having
the user directly allocate a `ReadOnlyNetworkGraph` and pass a
reference to it to `find_route` often results in holding a read
lock long in excess of the `find_route` call. Worse, some languages
(like JavaScript) tend to only garbage collect when other code is
not running, possibly leading to deadlocks.
Elias Rohrer [Fri, 24 Jun 2022 16:33:15 +0000 (18:33 +0200)]
Add anti-probing penalty to `ProbabilisticScorer`
Currently, channel balances may be rather easily discovered through
probing. This however poses a privacy risk, since the analysis of
balance changes over adjacent channels could in the worst case empower an adversary to
mount an end-to-end deanonymization attack, i.e., track who payed whom.
The penalty added here is applied so we prefer nodes with a smaller `htlc_maximum_msat`, which makes
balance discovery attacks harder to execute. As this improves privacy network-wide, we
treat such nodes preferentially and hence create an incentive to restrict
`htlc_maximum_msat`.
Matt Corallo [Thu, 23 Jun 2022 21:25:17 +0000 (21:25 +0000)]
Panic if we're running with outdated state instead of force-closing
When we receive a `channel_reestablish` with a `data_loss_protect`
that proves we're running with a stale state, instead of
force-closing the channel, we immediately panic. This lines up with
our refusal to run if we find a `ChannelMonitor` which is stale
compared to our `ChannelManager` during `ChannelManager`
deserialization. Ultimately both are an indication of the same
thing - that the API requirements on `chain::Watch` were violated.
In the "running with outdated state but ChannelMonitor(s) and
ChannelManager lined up" case specifically its likely we're running
off of an old backup, in which case connecting to peers with
channels still live is explicitly dangerous. That said, because
this could be an operator error that is correctable, panicing
instead of force-closing may allow for normal operation again in
the future (cc #1207).
In any case, we provide instructions in the panic message for how
to force-close channels prior to peer connection, as well as a note
on how to broadcast the latest state if users are willing to take
the risk.
Note that this is still somewhat unsafe until we resolve #1563.
Matt Corallo [Thu, 23 Jun 2022 20:25:58 +0000 (20:25 +0000)]
Add ChannelManager methods to force close without broadcasting
If a user restores from a backup that they know is stale, they'd
like to force-close all of their channels (or at least the ones
they know are stale) *without* broadcasting the latest state,
asking their peers to do so instead. This simply adds methods to do
so, renaming the existing `force_close_channel` and
`force_close_all_channels` methods to disambiguate further.
Elias Rohrer [Sat, 18 Jun 2022 14:24:37 +0000 (16:24 +0200)]
Allow nodes to be avoided during pathfinding
Users may want to - for whatever reasons - prevent payments to be routed
over certain nodes. This change therefore allows to add `NodeId`s to a
list of banned nodes, which then will be avoided during path finding.
Enable simultaneous deserialization+decryption of a ChaChaPoly stream
In the upcoming onion messages PR, this will allow us to avoid decrypting onion
message encrypted data in an intermediate Vec before decoding it. Instead we
decrypt and decode it at the same time using this new ChaChaPolyReadAdapter object.
In doing so, we need to adapt the decode_tlv_stream macro such that it will
decode a LengthReadableArgs, which is a new trait as well. This trait is
necessary because ChaChaPoly needs to know the total length ahead of time to
separate out the tag at the end.
In the upcoming onion messages PR, this will allow us to avoid encoding onion
message encrypted data into an intermediate Vec before encrypting it. Instead
we encode and encrypt at the same time using this new ChaChaPolyWriteAdapter object.
Matt Corallo [Sat, 30 Apr 2022 20:30:00 +0000 (20:30 +0000)]
Don't fail HTLCs in revoked commitment txn until we spend them
When we see a counterparty revoked commitment transaction on-chain
we shouldn't immediately queue up HTLCs present in it for
resolution until we have spent the HTLC outputs in some kind of
claim transaction.
In order to do so, we first have to change the
`fail_unbroadcast_htlcs!()` call to provide it with the HTLCs which
are present in the (revoked) commitment transaction which was
broadcast. However, this is not sufficient - because all of those
HTLCs had their `HTLCSource` removed when the commitment
transaction was revoked, we also have to update
`fail_unbroadcast_htlcs` to check the payment hash and amount when
the `HTLCSource` is `None`.
Somewhat surprisingly, several tests actually explicitly tested for
the old behavior, which required amending to pass with the new
changes.
Finally, this adds a debug assertion when writing `ChannelMonitor`s
to ensure `HTLCSource`s do not leak.
Wilmer Paulino [Wed, 15 Jun 2022 23:33:23 +0000 (16:33 -0700)]
Allow forwarding HTLCs that were constructed for previous config
This is mostly motivated by the fact that payments may happen while the
latest `ChannelUpdate` indicating our new `ChannelConfig` is still
propagating throughout the network. By temporarily allowing the previous
config, we can help reduce payment failures across the network.
Wilmer Paulino [Thu, 16 Jun 2022 23:24:42 +0000 (16:24 -0700)]
Track previous ChannelConfig and expire after enough ticks
We do this to prevent payment failures while the `ChannelUpdate` for the
new `ChannelConfig` still propagates throughout the network. In a follow
up commit, we'll honor forwarding HTLCs that were constructed based on
either the previous or current `ChannelConfig`.
To handle expiration (when we should stop allowing the previous config),
we rely on the ChannelManager's `timer_tick_occurred` method. After
enough ticks, the previous config is cleared from memory, and only the
current config applies moving forward.
Wilmer Paulino [Tue, 14 Jun 2022 23:41:32 +0000 (16:41 -0700)]
Expose API to update a channel's ChannelConfig
A new `update_channel_config` method is exposed on the `ChannelManger`
to update the `ChannelConfig` for a set of channels atomically. New
`ChannelUpdate` events are generated for each eligible channel.
Note that as currently implemented, a buggy and/or
auto-policy-management client could spam the network with updates as
there is no rate-limiting in place. This could already be done with
`broadcast_node_announcement`, though users are less inclined to update
that as frequently as its data is mostly static.
Wilmer Paulino [Tue, 14 Jun 2022 23:40:12 +0000 (16:40 -0700)]
Expose ChannelConfig within ChannelDetails
As we prepare to expose an API to update a channel's ChannelConfig,
we'll also want to expose this struct to consumers such that they have
insights into the current ChannelConfig applied for each channel.