Matt Corallo [Fri, 10 Dec 2021 00:28:24 +0000 (00:28 +0000)]
Expose an event when a payment has failed and retries complete
When a payment fails, a payer needs to know when they can consider
a payment as fully-failed, and when only some of the HTLCs in the
payment have failed. This isn't possible with the current event
scheme, as discovered recently and as described in the previous
commit.
This adds a new event which describes when a payment is fully and
irrevocably failed, generating it only after the payment has
expired or been marked as expired with
`ChannelManager::mark_retries_exceeded` *and* all HTLCs for it
have failed. With this, a payer can more simply deduce when a
payment has failed and use that to remove payment state or
finalize a payment failure.
Matt Corallo [Fri, 3 Dec 2021 19:57:37 +0000 (19:57 +0000)]
Add a variant to `PendingOutboundPayment` for retries-exceeded
When a payer gives up trying to retry a payment, they don't know
for sure what the current state of the event queue is.
Specifically, they cannot be sure that there are not multiple
additional `PaymentPathFailed` or even `PaymentSuccess` events
pending which they will see later. Thus, they have a very hard
time identifying whether a payment has truly failed (and informing
the UI of that fact) or if it is still pending. See [1] for more
information.
In order to avoid this mess, we will resolve it here by having the
payer give `ChannelManager` a bit more information - when they
have given up on a payment - and using that to generate a
`PaymentFailed` event when all paths have failed.
This commit adds the neccessary storage and changes for the new
state inside `ChannelManager` and a public method to mark a payment
as failed, the next few commits will add the new `Event` and use
the new features in our `PaymentRetrier`.
Jeffrey Czyz [Fri, 3 Dec 2021 19:04:58 +0000 (13:04 -0600)]
Ensure ChannelManager methods are idempotent
During event handling, ChannelManager methods may need to be called as
indicated in the Event documentation. Ensure that these calls are
idempotent for the same event rather than panicking. This allows users
to persist events for later handling without needing to worry about
processing the same event twice (e.g., if ChannelManager is not
persisted but the events were, the restarted ChannelManager would return
some of the same events).
Matt Corallo [Mon, 29 Nov 2021 21:36:12 +0000 (21:36 +0000)]
Drop `MonitorUpdateErr` in favor of opaque errors.
We don't expect users to ever change behavior based on the string
contained in a `MonitorUpdateErr`, except log it, so there's little
reason to not just log it ourselves and return a `()` for errors.
We do so here, simplifying the callsite in `ChainMonitor` as well.
Matt Corallo [Sun, 17 Oct 2021 21:28:50 +0000 (21:28 +0000)]
Always return failure in `update_monitor` after funding spend
Previously, monitor updates were allowed freely even after a
funding-spend transaction confirmed. This would allow a race
condition where we could receive a payment (including the
counterparty revoking their broadcasted state!) and accept it
without recourse as long as the ChannelMonitor receives the block
first, the full commitment update dance occurs after the block is
connected, and before the ChannelManager receives the block.
Obviously this is an incredibly contrived race given the
counterparty would be risking their full channel balance for it,
but its worth fixing nonetheless as it makes the potential
ChannelMonitor states simpler to reason about.
The test in this commit also tests the behavior changed in the
previous commit.
Matt Corallo [Mon, 6 Dec 2021 00:12:35 +0000 (00:12 +0000)]
Remove OnionV2 parsing support
OnionV2s don't (really) work on Tor anymore anyway, and the field
is set for removal in the BOLTs [1]. Sadly because of the way
addresses are parsed we have to continue to understand that type 3
addresses are 12 bytes long. Thus, for simplicity we keep the
`OnionV2` enum variant around and just make it an opaque 12 bytes,
with the documentation updated to note the deprecation.
Jeffrey Czyz [Wed, 1 Dec 2021 04:57:32 +0000 (22:57 -0600)]
Fix shift overflow in Scorer::channel_penalty_msat
An unchecked shift of more than 64 bits on u64 values causes a shift
overflow panic. This may happen if a channel is penalized only once and
(1) is not successfully routed through and (2) after 64 or more half
life decays. Use a checked shift to prevent this from happening.
Jeffrey Czyz [Wed, 1 Dec 2021 00:27:08 +0000 (18:27 -0600)]
Decay channel failure penalty upon success
If a payment failed to route through a channel, a penalty is applied to
the channel in the future when finding a route. This penalty decays over
time. Immediately decay the penalty by one half life when a payment is
successfully routed through the channel.
Jeffrey Czyz [Tue, 30 Nov 2021 23:16:05 +0000 (17:16 -0600)]
Score successful payment paths
Expand the Score trait with a payment_path_successful function for
scoring successful payment paths. Called by InvoicePayer's EventHandler
implementation when processing PaymentPathSuccessful events. May be used
by Score implementations to revert any channel penalties that were
applied by calls to payment_path_failed.
Matt Corallo [Mon, 29 Nov 2021 20:05:35 +0000 (20:05 +0000)]
Fix regression when reading `Event::PaymentReceived` in some cases
For some reason rustc was deciding on a type for the `Option` being
deserialized for us as `_user_payment_id`. This really, really,
absolutely should have been a compile failure - the type (with
methods called on it!) was ambiguous! Instead, rustc seems to have
been defaulting to `Option<()>`, causing us to read zero of the
eight bytes in the `user_payment_id` field, which returns an
`Err(InvalidValue)` error as TLVs must always be read fully.
This should likely be reported to rustc as its definitely a bug,
but I cannot seem to cause the same error on any kinda of
vaguely-minimized version of the same code.
Matt Corallo [Tue, 9 Nov 2021 21:25:33 +0000 (21:25 +0000)]
Explicitly support counterparty setting 0 channel reserve
A peer providing a channel_reserve_satoshis of 0 (or less than our
dust limit) is insecure, but only for them. Because some LSPs do it
with some level of trust of the clients (for a substantial UX
improvement), we explicitly allow it. Because its unlikely to
happen often in normal testing, we test it explicitly here.
Matt Corallo [Sun, 17 Oct 2021 21:26:04 +0000 (21:26 +0000)]
Continue after a single failure in `ChannelMonitor::update_monitor`
`ChannelMonitorUpdate`s may contain multiple updates, including, eg
a payment preimage after a commitment transaction update. While
such updates are generally not generated today, we shouldn't return
early out of the update loop, causing us to miss any updates after
an earlier update fails.
Matt Corallo [Mon, 22 Nov 2021 18:00:08 +0000 (18:00 +0000)]
Seal `scoring::Time` and only use `Instant` or `Eternity` publicly
`scoring::Time` exists in part to make testing the passage of time
in `Scorer` practical. To allow no-std users to provide a time
source it was exposed as a trait as well. However, it seems
somewhat unlikely that a no-std user is going to have a use for
providing their own time source (otherwise they wouldn't be a
no-std user), and likely they won't have a graph in memory either.
`scoring::Time` as currently written is also exceptionally hard to
write C bindings for - the C bindings trait mappings relies on the
ability to construct trait implementations at runtime with function
pointers (i.e. `dyn Trait`s). `scoring::Time`, on the other hand,
is a supertrait of `core::ops::Sub` which requires a `sub` method
which takes a type parameter and returns a type parameter. Both of
which aren't practical in bindings, especially given the
`Sub::Output` associated type is not bound by any trait bounds at
all (implying we cannot simply map the `sub` function to return an
opaque trait object).
Thus, for simplicity, we here simply seal `scoring::Time` and make
it effectively-private, ensuring the bindings don't need to bother
with it.
Matt Corallo [Mon, 22 Nov 2021 03:27:17 +0000 (03:27 +0000)]
Make `Score : Writeable` in c_bindings and impl on `LockedScore`
Ultimately we likely need to wrap the locked `Score` in a struct
that exposes writeable somehow, but because all traits have to be
fully concretized for C bindings we'll still need `Writeable` on
all `Score` in order to expose `Writeable` on the locked score.
Otherwise, we'll only have a `LockedScore` with a `Score` visible
that only has the `Score` methods, never the original type.
Matt Corallo [Tue, 9 Nov 2021 21:12:30 +0000 (21:12 +0000)]
Store holder channel reserve and max-htlc-in-flight explicitly
Previously, `holder_selected_channel_reserve_satoshis` and
`holder_max_htlc_value_in_flight_msat` were constant functions
of the channel value satoshis. However, in the future we may allow
allow users to specify it. In order to do so, we'll need to track
them explicitly, including serializing them as appropriate.
We go ahead and do so here, in part as it will make testing
different counterparty-selected channel reserve values easier.
Jeffrey Czyz [Thu, 18 Nov 2021 22:24:14 +0000 (16:24 -0600)]
Generate PaymentPathSuccessful event for each path
A single PaymentSent event is generated when a payment is fulfilled.
This is occurs when the preimage is revealed on the first claimed HTLC.
For subsequent HTLCs, the event is not generated.
In order to score channels involved with a successful payments, the
scorer must be notified of each successful path involved in the payment.
Add a PaymentPathSuccessful event for this purpose. Generate it whenever
a part is removed from a pending outbound payment. This avoids duplicate
events when reconnecting to a peer.
Matt Corallo [Sun, 21 Nov 2021 22:29:48 +0000 (22:29 +0000)]
Support `logger::Record` in C by String-ing the fmt::Arguments
This adds a new (non-feature) cfg argument `c_bindings` which will
be set when building C bindings. With this, we can (slightly) tweak
behavior and API based on whether we are being built for Rust or C
users.
Ideally we'd never need this, but as long as we can keep the API
consistent-enough to avoid material code drift, this gives us a
cheap way of doing the "right" thing for both C and Rust when the
two are in tension.
We also move lightning-background-processor to support the same
MSRV as the main lightning crate, instead of only
lightning-net-tokio's MSRV.
Matt Corallo [Wed, 10 Nov 2021 01:36:26 +0000 (01:36 +0000)]
Correct initial commitment tx fee affordability checks on open
Previously, we would reject inbound channels if the funder wasn't
able to meet our channel reserve on their first commitment
transaction only if they also failed to push enough to us for us
to not meet their initial channel reserve as well.
There's not a lot of reason to care about us meeting their reserve,
however - its largely expected that they may not push enough to us
in the initial open to meet it, and its not actually our problem if
they don't.
Further, we used our own fee, instead of the channel's actual fee,
to calculate fee affordability of the initial commitment
transaction.
We resolve both issues here, rewriting the combined affordability
check conditionals in inbound channel open handling and adding a
fee affordability check for outbound channels as well.
The prior code may have allowed a counterparty to start the channel
with "no punishment" states - violating the reason for the reserve
threshold.
Matt Corallo [Wed, 10 Nov 2021 03:44:04 +0000 (03:44 +0000)]
Rewrite test_update_fee_that_funder_cannot_afford to avoid magic
Instead of magic hard-coded constants, its better for tests to
derive the values used so that they change if constants are changed
and so that it is easier to re-derive constants in the future as
needed.
Matt Corallo [Mon, 15 Nov 2021 23:22:08 +0000 (23:22 +0000)]
Make Channel::commit_tx_fee_msat static and take fee explicitly
This may avoid risk of bugs in the future as it requires the caller
to think about the fee being used, not just blindly use the current
(committed) channel feerate.
In upcoming commits, we'll be making the payment secret and payment hash/preimage
derivable from info about the payment + a node secret. This means we don't
need to store any info about incoming payments and can eventually get rid of the
channelmanager::pending_inbound_payments map.
Matt Corallo [Fri, 12 Nov 2021 15:52:59 +0000 (15:52 +0000)]
Move `Score` into a `scoring` module instead of a top-level module
Traits in top-level modules is somewhat confusing - generally
top-level modules are just organizational modules and don't contain
things themselves, instead placing traits and structs in
sub-modules. Further, its incredibly awkward to have a `scorer`
sub-module, but only have a single struct in it, with the relevant
trait it is the only implementation of somewhere else. Not having
`Score` in the `scorer` sub-module is further confusing because
it's the only module anywhere that references scoring at all.
Matt Corallo [Fri, 12 Nov 2021 04:16:23 +0000 (04:16 +0000)]
Penalize large HTLCs relative to channels in default `Scorer`
Sending HTLCs which are any greater than a very small fraction of the
channel size tend to fail at a much higher rate. Thus, by default
we start applying a penalty at only 1/8th the channel size and
increase it linearly as the amount reaches the channel's capacity,
20 msat per 1024th of the channel capacity.
Matt Corallo [Fri, 12 Nov 2021 03:52:58 +0000 (03:52 +0000)]
Provide `Score` the HTLC amount and channel capacity
This should allow `Score` implementations to make substantially
better decisions, including of the form "willing to pay X to avoid
routing over this channel which may have a high failure rate".
Jeffrey Czyz [Fri, 12 Nov 2021 03:47:59 +0000 (21:47 -0600)]
Add PaymentHash parameter to Router::find_route
Implementations of Router may need the payment hash in order to look up
pre-computed routes from a probe for a given payment. Add a PaymentHash
parameter to Router::find_route to allow for this.
Jeffrey Czyz [Wed, 10 Nov 2021 00:07:23 +0000 (18:07 -0600)]
Replace expect_value_msat with expect_send
Modify all InvoicePayer unit tests to use expect_send instead of
expect_value_msat, since the former can discern whether the send was for
an invoice, spontaneous payment, or a retry. Updates tests to set payer
expectations if they weren't already and assert these before returning a
failure.
Jeffrey Czyz [Tue, 9 Nov 2021 23:32:37 +0000 (17:32 -0600)]
Support spontaneous payments in InvoicePayer
InvoicePayer handles retries not only when handling PaymentPathFailed
events but also for some types of PaymentSendFailure on the initial
send. Expand InvoicePayer's interface with a pay_pubkey function for
spontaneous (keysend) payments. Add a send_spontaneous_payment function
to the Payer trait to support this and implement it for ChannelManager.
Jeffrey Czyz [Tue, 9 Nov 2021 15:37:34 +0000 (09:37 -0600)]
Refactor InvoicePayer for spontaneous payments
To support spontaneous payments, InvoicePayer's sending logic must be
invoice-agnostic. Refactor InvoicePayer::pay_invoice_internal such that
invoice-specific code is in pay_invoice_using_amount and the remaining
logic is in pay_internal.
Further refactor the code's payment_cache locking such that it is
accessed consistently when needed, and tidy up the code a bit.
Matt Corallo [Tue, 9 Nov 2021 21:22:47 +0000 (21:22 +0000)]
Correct Channel type serialization logic
Currently, we write out the Channel's `ChannelTypeFeatures` as an
odd type, implying clients which don't understand the
`ChannelTypeFeatures` field can simply ignore it. This is obviously
nonsense if the channel type is some future version - the client
needs to fail to deserialize as it doesn't understand the channel's
type.
We adapt the serialization logic here to only write out the
`ChannelTypeFeatures` field if it is something other than
only-static-remote-key, and simply consider that "default" (as it
is the only supported type today). Then, we write out the channel
type as an even TLV, implying clients which do not understand it
must fail to read the `Channel`.
Note that we do not need to bother reserving the TLV type no longer
written as it never appeared in a release (merged post-0.0.103).
Matt Corallo [Wed, 13 Oct 2021 04:19:13 +0000 (04:19 +0000)]
Be less aggressive in outbound HTLC CLTV timeout checks
We currently assume our counterparty is naive and misconfigured and
may force-close a channel to get an HTLC we just forwarded them.
There shouldn't be any reason to do this - we don't have any such
bug, and we shouldn't start by assuming our counterparties are
buggy. Worse, this results in refusing to forward payments today,
failing HTLCs for largely no reason.
Instead, we keep a fairly conservative check, but not one which
will fail HTLC forwarding spuriously - testing only that the HTLC
doesn't expire for a few blocks from now.
Matt Corallo [Tue, 26 Oct 2021 21:40:14 +0000 (21:40 +0000)]
Fix a minor memory leak on PermanentFailure mon errs when sending
If we send a payment and fail to update the first-hop channel state
with a `PermanentFailure` ChannelMonitorUpdateErr, we would have an
entry in our pending payments map, but possibly not return the
PaymentId back to the user to retry the payment, leading to a (rare
and relatively minor) memory leak.
Matt Corallo [Mon, 4 Oct 2021 03:11:36 +0000 (03:11 +0000)]
Log before+after ChannelMonitor/Manager updates for visibility
I realized on my own node that I don't have any visibility into how
long a monitor or manager persistence call takes, potentially
blocking other operations. This makes it much more clear by adding
a relevant log_trace!() print immediately before and immediately
after persistence.