Matt Corallo [Wed, 5 May 2021 22:56:42 +0000 (22:56 +0000)]
Append backwards-compat TLVs to serialization of larger structs
Currently our serialization is very compact, and contains version
numbers to indicate which versions the code can read a given
serialized struct. However, if you want to add a new field without
needlessly breaking the ability of previous versions of the code to
read the struct, there is not a good way to do so.
This adds dummy, currently empty, TLVs to the major structs we
serialize out for users, providing an easy place to put new
optional fields without breaking previous versions.
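A minimal sketch of the idea, assuming a toy encoding (one-byte type,
one-byte length, and a one-byte total stream length) that is NOT the
project's actual wire format. `Record`, `MEMO_TYPE`, `write_record` and
`read_record` are hypothetical names used only for illustration. A reader
that does not know a TLV type skips it, so a field added later does not
break older readers:

```rust
/// Hypothetical struct; `memo` stands in for a field added in a later version.
#[derive(Debug, PartialEq)]
pub struct Record {
    pub value: u32,
    pub memo: Option<u8>,
}

/// Hypothetical TLV type number for the optional `memo` field.
const MEMO_TYPE: u8 = 1;

/// Writes the fixed fields, then a length-prefixed (possibly empty) TLV stream.
pub fn write_record(r: &Record) -> Vec<u8> {
    let mut out = r.value.to_be_bytes().to_vec();
    let mut tlvs = Vec::new();
    if let Some(m) = r.memo {
        tlvs.extend_from_slice(&[MEMO_TYPE, 1, m]);
    }
    out.push(tlvs.len() as u8); // total length of the TLV stream
    out.extend_from_slice(&tlvs);
    out
}

/// Reads the fixed fields, then walks the TLV stream, skipping unknown types.
pub fn read_record(buf: &[u8]) -> Option<Record> {
    if buf.len() < 5 {
        return None;
    }
    let value = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]);
    let tlv_len = buf[4] as usize;
    let tlvs = buf.get(5..5 + tlv_len)?;
    let mut memo = None;
    let mut i = 0;
    while i + 2 <= tlvs.len() {
        let (ty, len) = (tlvs[i], tlvs[i + 1] as usize);
        let body = tlvs.get(i + 2..i + 2 + len)?;
        if ty == MEMO_TYPE && len == 1 {
            memo = Some(body[0]);
        }
        // Any other type is ignored, which is what keeps old readers working.
        i += 2 + len;
    }
    Some(Record { value, memo })
}
```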
Matt Corallo [Tue, 9 Feb 2021 20:22:44 +0000 (15:22 -0500)]
Read monitors from our KeysInterface in chanmon_consistency_fuzz
If the fuzz target is failing due to a channel force-close, the
immediately-visible error is that we're signing a stale state. This
is because the ChannelMonitorUpdateStep::ChannelForceClosed event
results in a signature in the test clone which was deserialized
using a OnlyReadsKeysInterface. Instead, we need to deserialize
using the full KeysInterface instance.
Matt Corallo [Fri, 20 Nov 2020 20:49:53 +0000 (15:49 -0500)]
Stop failing back HTLCs on peer disconnection
Previously, if we got disconnected from a peer while there were
HTLCs pending forwarding in the holding cell, we'd clear them and
fail them all backwards. This is largely fine, but since we now
have support for handling such HTLCs on reconnect, we might as
well not, instead relying on our timeout logic to fail them
backwards if it takes too long to forward them.
Matt Corallo [Wed, 21 Apr 2021 02:37:02 +0000 (02:37 +0000)]
[fuzz] Handle monitor updates during get_and_clear_pending_msg_events
Because we may now generate a monitor update during
get_and_clear_pending_msg_events calls, we need to ensure we
re-serialize the relevant ChannelManager before attempting to
reload it, if such a monitor update occurred.
Matt Corallo [Thu, 18 Mar 2021 22:03:30 +0000 (18:03 -0400)]
Free holding cell on monitor-updating-restored when there's no upd
If there are no pending channel update messages when monitor updating
is restored (though there may be an RAA to send), and we're
connected to our peer and not awaiting a remote RAA, we need to
free anything in our holding cell.
However, we don't want to immediately free the holding cell during
channel_monitor_updated as it presents a somewhat bug-prone case of
reentrancy:
a) it would re-enter user code around a monitor update while being
called from user code notifying us of the same monitor being
updated, making deadlocks very likely (in fact, our fuzzers
would have a bug here!),
b) the re-entrancy only occurs in a very rare case, making it
likely users will not hit it in testing, only deadlocking in
production.
Thus, we add a holding-cell-free pass over each channel in
get_and_clear_pending_msg_events. This fits up nicely with the
anticipated bug - users almost certainly need to process new
network messages immediately after monitor updating has been
restored to send messages which were not sent originally when the
monitor updating was paused.
Without this, chanmon_fail_consistency was able to find a stuck
condition where we sit on an HTLC failure in our holding cell and
don't ever handle it (at least until we have other actions to take
which empty the holding cell).
Matt Corallo [Thu, 18 Mar 2021 22:23:05 +0000 (18:23 -0400)]
DRY ChannelError conversion macros
Both break_chan_entry and try_chan_entry do almost identical work,
differing only in whether they `break` or `return` in response to an
error. Because we will now also need an option to do neither, we
break out the common code into a shared `convert_chan_err` macro.
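The shape of this refactor might look roughly like the sketch below. The
macro and type names (`convert_err`, `try_entry`, `break_entry`, `ChanErr`)
are simplified stand-ins, not the real macros, and the conversion here just
formats a string instead of doing real channel error handling:

```rust
/// Hypothetical, simplified error type for illustration only.
#[derive(Debug, PartialEq)]
pub enum ChanErr {
    Ignore(String),
    Close(String),
}

// Shared conversion logic. It can also be used directly when the caller
// wants to neither `break` nor `return`.
macro_rules! convert_err {
    ($err: expr) => {
        match $err {
            ChanErr::Ignore(msg) => format!("ignored: {}", msg),
            ChanErr::Close(msg) => format!("closed: {}", msg),
        }
    };
}

// Thin wrapper that returns from the enclosing function on error.
macro_rules! try_entry {
    ($res: expr) => {
        match $res {
            Ok(v) => v,
            Err(e) => return Err(convert_err!(e)),
        }
    };
}

// Thin wrapper that breaks out of the enclosing loop on error.
macro_rules! break_entry {
    ($res: expr) => {
        match $res {
            Ok(v) => v,
            Err(e) => break Err(convert_err!(e)),
        }
    };
}

pub fn with_return(res: Result<u32, ChanErr>) -> Result<u32, String> {
    let v = try_entry!(res);
    Ok(v + 1)
}

pub fn with_break(res: Result<u32, ChanErr>) -> Result<u32, String> {
    loop {
        let v = break_entry!(res);
        break Ok(v + 1);
    }
}
```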
Matt Corallo [Tue, 24 Nov 2020 00:12:31 +0000 (19:12 -0500)]
[fuzz] Allow SendAnnouncementSigs events in chanmon_consistency
Because of the merge between peer reconnection and channel monitor
updating channel restoration code, we now sometimes generate
(somewhat spurious) announcement signatures when restoring channel
monitor updating. This should not result in a fuzzing failure.
Matt Corallo [Tue, 24 Nov 2020 00:12:19 +0000 (19:12 -0500)]
[fuzz] Be more strict about msg events in chanmon_consistency
This fails chanmon_consistency on IgnoreError error events and on
messages left over to be sent to a just-disconnected peer, which
should have been drained.
These should never appear, so consider them a fuzzer fail case.
Matt Corallo [Fri, 20 Nov 2020 19:29:33 +0000 (14:29 -0500)]
Move channel restoration after monitor update to a two-part macro
The channel restoration code in channel monitor updating and peer
reconnection both do incredibly similar things, and there is
little reason to have them be separate. Sadly because they require
holding a lock with a reference to elements in the lock, it's not
practical to make them utility functions, so instead we introduce
a two-step macro here which will eventually be used for both.
Because we still support pre-NLL Rust, the macro has to be in two
parts - one which runs with the channel_state lock, and one which
does not.
Matt Corallo [Mon, 23 Nov 2020 23:22:29 +0000 (18:22 -0500)]
[fuzz] Print the output of all failed test cases, not one test.
Our fuzz tests previously only printed the log output of the first
fuzz test case to fail. This commit changes that (with lots of
auto-generated updates) to ensure we print all log outputs.
Matt Corallo [Sun, 9 May 2021 19:19:11 +0000 (19:19 +0000)]
Make payments not duplicatively fail/succeed on reload/reconnect
We currently generate duplicative PaymentFailed/PaymentSent events
in two cases:
a) If we receive an update_fulfill_htlc message, followed by a
disconnect, then a resend of the same update_fulfill_htlc
message, we will generate a PaymentSent event for each message.
b) When a Channel is closed, any outbound HTLCs which were relayed
through it are simply dropped when the Channel is. From there,
the ChannelManager relies on the ChannelMonitor having a copy of
the relevant fail-/claim-back data and processes the HTLC
fail/claim when the ChannelMonitor tells it to.
If, due to an on-chain event, an HTLC is failed/claimed, and
then we serialize the ChannelManager, but do not re-serialize
the relevant ChannelMonitor, we may end up getting a duplicative
event.
In order to provide the expected consistency, we add explicit
tracking of pending outbound payments using their unique
session_priv field which is generated when the payment is sent.
Then, before generating PaymentFailed/PaymentSent events, we check
that the session_priv for the payment is still pending.
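The tracking described above can be sketched as a set keyed by
session_priv. This is an illustrative outline only: `PendingPayments` and
its methods are hypothetical names, and the key is modeled as a plain
32-byte array rather than a real secret key type:

```rust
use std::collections::HashSet;

/// Hypothetical tracker for outbound payments that have not yet resolved.
pub struct PendingPayments {
    pending: HashSet<[u8; 32]>,
}

impl PendingPayments {
    pub fn new() -> Self {
        PendingPayments { pending: HashSet::new() }
    }

    /// Records a payment as pending when it is sent.
    pub fn payment_sent(&mut self, session_priv: [u8; 32]) {
        self.pending.insert(session_priv);
    }

    /// Marks a payment as resolved (failed or fulfilled). Returns true only
    /// the first time, i.e. only when an event should be generated.
    pub fn resolve(&mut self, session_priv: &[u8; 32]) -> bool {
        self.pending.remove(session_priv)
    }
}
```

A second update_fulfill_htlc for the same payment, or a replayed
ChannelMonitor event after reload, finds nothing to remove and so
produces no second event.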
Antoine Riard [Sat, 15 May 2021 21:20:10 +0000 (17:20 -0400)]
Split `sign_justice_transaction` in two halves
To avoid the caller's data struct storing HTLC-related information when
a revokeable output is claimed on top of a commitment or second-stage
HTLC transaction, we split `keysinterface::sign_justice_transaction`
into two new halves, `keysinterface::sign_justice_revoked_output` and
`keysinterface::sign_justice_revoked_htlc`.
Further, this split offers more flexibility to signer policy as a
commitment revokeable output might be of a value far more significant
than HTLC ones.
Matt Corallo [Fri, 7 May 2021 22:17:29 +0000 (22:17 +0000)]
Do not wait in PersistenceNotifier when the persist flag is set
When an event caused us to set the persist flag in a
PersistenceNotifier in between wait calls, we would still wait,
potentially not persisting a ChannelManager when we should.
Worse, for wait_timeout, this caused us to always wait up to the
timeout, but then always return true that a persistence is needed.
Instead, we simply check the persist flag before waiting, returning
immediately if it is set.
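A rough sketch of the check-before-wait pattern, using only the standard
library. `SketchNotifier` is a hypothetical stand-in and not the real
PersistenceNotifier; it relies on `Condvar::wait_timeout_while`, which
evaluates the condition before blocking:

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// Hypothetical notifier: a "persist needed" flag guarded by a mutex.
pub struct SketchNotifier {
    persist_needed: Mutex<bool>,
    condvar: Condvar,
}

impl SketchNotifier {
    pub fn new() -> Self {
        SketchNotifier { persist_needed: Mutex::new(false), condvar: Condvar::new() }
    }

    /// Sets the flag and wakes any waiter.
    pub fn notify(&self) {
        *self.persist_needed.lock().unwrap() = true;
        self.condvar.notify_all();
    }

    /// Returns true if persistence is needed. If the flag was already set
    /// before the call, this returns immediately without blocking.
    pub fn wait_timeout(&self, max_wait: Duration) -> bool {
        let guard = self.persist_needed.lock().unwrap();
        // Blocks only while the flag is still false.
        let (mut guard, _timeout_result) = self
            .condvar
            .wait_timeout_while(guard, max_wait, |needed| !*needed)
            .unwrap();
        let was_set = *guard;
        *guard = false; // consume the flag
        was_set
    }
}
```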
Matt Corallo [Fri, 7 May 2021 22:16:47 +0000 (22:16 +0000)]
Avoid persisting a ChannelManager update after each timer tick
Currently, when a user calls `ChannelManager::timer_tick_occurred`
we always set the persister's update flag to true. This results in
a ChannelManager persistence after each timer tick, even when
nothing happened.
Instead, we add a new flag to `PersistenceNotifierGuard` to
indicate if we should skip setting the update flag.
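One way to picture such a guard, as a sketch under assumptions:
`SketchPersistGuard` and `timer_tick` are hypothetical, and the update
flag is modeled as a bare `AtomicBool` rather than the real notifier:

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical guard that sets the update flag on drop unless told to skip.
pub struct SketchPersistGuard<'a> {
    update_flag: &'a AtomicBool,
    should_persist: bool,
}

impl<'a> Drop for SketchPersistGuard<'a> {
    fn drop(&mut self) {
        if self.should_persist {
            self.update_flag.store(true, Ordering::Release);
        }
    }
}

/// Stand-in for a timer tick: only requests persistence if something changed.
pub fn timer_tick(update_flag: &AtomicBool, something_changed: bool) {
    let _guard = SketchPersistGuard { update_flag, should_persist: something_changed };
    // ... periodic work would happen here while the guard is alive ...
}
```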
Matt Corallo [Fri, 7 May 2021 20:56:10 +0000 (20:56 +0000)]
Send update_channel messages to re-enable a disabled channel
Currently, we only send an update_channel message after
disconnecting a peer and waiting some time. We do not send a
followup when the peer has been reconnected for some time.
This changes that behavior to make the disconnect and reconnect
channel updates symmetric, and also simplifies the state machine
somewhat to make it more clear.
Finally, it serializes the current announcement state so that we
usually know when we need to send a new update_channel.
Matt Corallo [Thu, 6 May 2021 20:42:02 +0000 (20:42 +0000)]
Increase the timeout for RPC responses from Bitcoin Core
Early sample testing showed multiple users hitting
EWOULDBLOCK/EAGAIN waiting for an initial response from Bitcoin
Core while it was doing some long operation (eg UTXO cache
flushing). Instead of only waiting 5 seconds for each attempt, we
now wait a full two minutes, but only for the first header
response, not each byte.
Matt Corallo [Wed, 5 May 2021 02:17:02 +0000 (02:17 +0000)]
Correct MIN_FINAL_CLTV_EXPIRY to match our enforced requirements
Our enforced requirement for HTLC acceptance is that we have at
least HTLC_FAIL_BACK_BUFFER blocks before the HTLC expires. When we
receive an HTLC, the HTLC would be "already expired" if its
`cltv_expiry` is current-block + 1 (ie the next block could
broadcast the commitment transaction and time out the HTLC). From
there, we want an extra HTLC_FAIL_BACK_BUFFER in blocks, plus an
extra block or two to account for any differences in the view of
the current height before send or while the HTLC is transiting the
network.
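The arithmetic above can be sketched as follows. This is one reading of
the description, not the actual constants or acceptance check: both
function names are hypothetical, the buffer and slack are left as
parameters, and whether the comparison is strict is an assumption:

```rust
/// Minimum final CLTV delta under the reasoning above: one block because
/// current-block + 1 is "already expired", plus the fail-back buffer, plus
/// a little slack for differing views of the chain height.
pub const fn min_final_cltv_expiry(fail_back_buffer: u32, slack_blocks: u32) -> u32 {
    1 + fail_back_buffer + slack_blocks
}

/// Whether an incoming HTLC still leaves at least `fail_back_buffer` blocks
/// beyond the "already expired" height (assumed non-strict comparison).
pub const fn htlc_has_enough_time(cltv_expiry: u32, current_height: u32, fail_back_buffer: u32) -> bool {
    cltv_expiry >= current_height + 1 + fail_back_buffer
}
```

With an illustrative buffer of 21 blocks and 2 blocks of slack, an HTLC
sent at height 100 with the minimum delta is still acceptable to a
receiver that sees height 102, but not to one that sees height 103.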
Matt Corallo [Wed, 5 May 2021 02:04:58 +0000 (02:04 +0000)]
Increase the CLTV delay required on payments and forwards
This increases the CLTV_CLAIM_BUFFER constant to 18, much better
capturing how long it takes to go on chain to claim payments.
This is also more in line with other clients, and the spec, which
sets the default CLTV delay in invoices to 18.
As a side effect, we have to increase MIN_CLTV_EXPIRY_DELTA as
otherwise we are subject to an attack where someone can hold an
HTLC being forwarded long enough that we *also* close the channel
on which we received the HTLC.
Matt Corallo [Wed, 5 May 2021 00:19:11 +0000 (00:19 +0000)]
By default sort network addrs before inclusion in node_announcements
In #797, we stopped enforcing that read/sent node_announcements
had their addresses sorted. While this is fine in practice, we
should still make a best-effort to sort them to comply with the
spec's forward-compatibility requirements, which we do here in the
ChannelManager.
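A best-effort sort of this kind could look like the sketch below.
`SketchAddr` is a hypothetical simplified address type, and the ordering
key assumes the BOLT 7 address descriptor type numbers (1 for IPv4, 2 for
IPv6, 3 and 4 for the onion variants):

```rust
/// Hypothetical simplified network address type.
#[derive(Debug, Clone, PartialEq)]
pub enum SketchAddr {
    IPv4 { addr: [u8; 4], port: u16 },
    IPv6 { addr: [u8; 16], port: u16 },
    OnionV2 { addr: [u8; 10], port: u16 },
    OnionV3 { pubkey: [u8; 32], port: u16 },
}

impl SketchAddr {
    /// Address descriptor type number used as the sort key.
    fn descriptor_type(&self) -> u8 {
        match self {
            SketchAddr::IPv4 { .. } => 1,
            SketchAddr::IPv6 { .. } => 2,
            SketchAddr::OnionV2 { .. } => 3,
            SketchAddr::OnionV3 { .. } => 4,
        }
    }
}

/// Sorts addresses by descriptor type. The sort is stable, so addresses of
/// the same type keep the order the caller supplied.
pub fn sort_addresses(mut addrs: Vec<SketchAddr>) -> Vec<SketchAddr> {
    addrs.sort_by_key(|a| a.descriptor_type());
    addrs
}
```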
Since InvoiceFeatures are an implementation detail of InvoiceBuilder, an
explicit call is needed to support the basic_mpp feature. Since it is
dependent on the payment_secret feature, conditionally define the
builder's method only when payment_secret has been set.
Instead of relying on users to set an invoice's features correctly,
enforce the semantics inside InvoiceBuilder. For instance, if the user
sets a PaymentSecret then InvoiceBuilder should ensure the appropriate
feature bits are set. Thus, for this example, the TaggedField
abstraction can be retained while still ensuring BOLT 11 semantics at
the builder abstraction.
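The "only define the method once payment_secret has been set" idea is a
typestate pattern. A stripped-down sketch, assuming hypothetical names
(`SketchInvoiceBuilder`, `NoSecret`, `HasSecret`) and modeling feature
bits as two booleans rather than the real InvoiceFeatures:

```rust
use std::marker::PhantomData;

/// Marker types for the builder's state.
pub struct NoSecret;
pub struct HasSecret;

/// Hypothetical builder, parameterized by whether a secret has been set.
pub struct SketchInvoiceBuilder<S> {
    payment_secret: Option<[u8; 32]>,
    basic_mpp: bool,
    _state: PhantomData<S>,
}

impl SketchInvoiceBuilder<NoSecret> {
    pub fn new() -> Self {
        SketchInvoiceBuilder { payment_secret: None, basic_mpp: false, _state: PhantomData }
    }

    /// Setting the secret moves the builder into the `HasSecret` state.
    pub fn payment_secret(self, secret: [u8; 32]) -> SketchInvoiceBuilder<HasSecret> {
        SketchInvoiceBuilder {
            payment_secret: Some(secret),
            basic_mpp: self.basic_mpp,
            _state: PhantomData,
        }
    }
}

impl SketchInvoiceBuilder<HasSecret> {
    /// Only exists once a payment secret is present.
    pub fn basic_mpp(mut self) -> Self {
        self.basic_mpp = true;
        self
    }
}

impl<S> SketchInvoiceBuilder<S> {
    /// Feature bits derived by the builder: (payment_secret, basic_mpp).
    pub fn feature_bits(&self) -> (bool, bool) {
        (self.payment_secret.is_some(), self.basic_mpp)
    }
}
```

Calling `basic_mpp()` on a builder without a secret fails to compile,
so the dependency between the two features is enforced by the type
system rather than left to the user.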
Antoine Riard [Tue, 16 Mar 2021 22:07:22 +0000 (18:07 -0400)]
Replace config max counterparty `dust_limit_satoshis` by a constant.
Current Bitcoin Core policy will reject a p2wsh output as dust if it
is under 330 satoshis. A typical p2wsh output is 43 bytes, to which
Core's `GetDustThreshold()` adds a minimal spend of 67 bytes (even
if a p2wsh witnessScript might be smaller). `dustRelayFee` is set
to 3000 sat/kb, thus 110 * 3000 / 1000 = 330. As all time-sensitive
outputs are p2wsh, a value of 330 sat is the lower bound desired
to ensure good propagation of transactions. We give a bit of margin
to our counterparty and pick 660 satoshis as the accepted
`dust_limit_satoshis` upper bound.
As this reasoning is tricky and error-prone, we hardcode it instead of
letting the user pick a nonsensical value.
Further, this lower bound of 330 sats is also hardcoded as another constant
(MIN_DUST_LIMIT_SATOSHIS) instead of being dynamically computed from the
feerate (`derive_holder_dust_limit_satoshis`), reducing the risk of
non-propagating transactions in case of failing fee estimation.
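The arithmetic behind the two bounds, as a small worked example. The
function and constant names here are hypothetical; only the numbers (43,
67, 3000, 330, 660) come from the message above:

```rust
/// Dust threshold in satoshis for an output of `output_bytes` whose minimal
/// spend is `spend_bytes`, at a relay fee given in satoshis per 1000 bytes.
pub const fn dust_threshold_sat(output_bytes: u64, spend_bytes: u64, dust_relay_fee_sat_per_kb: u64) -> u64 {
    (output_bytes + spend_bytes) * dust_relay_fee_sat_per_kb / 1000
}

/// Lower bound: (43 + 67) * 3000 / 1000 = 330 satoshis.
pub const SKETCH_MIN_DUST_LIMIT_SAT: u64 = dust_threshold_sat(43, 67, 3000);

/// Upper bound accepted from a counterparty: twice the lower bound.
pub const SKETCH_MAX_DUST_LIMIT_SAT: u64 = 2 * SKETCH_MIN_DUST_LIMIT_SAT;
```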
Matt Corallo [Fri, 30 Apr 2021 04:19:51 +0000 (04:19 +0000)]
Use explicit import lists instead of glob imports in invoice
While this is less readable, I spent way too long trying to adapt
the bindings generation code to handle glob imports and concluded
it would take refactoring almost the entire import-resolution
logic. While this may be a good refactor to do eventually, it's
probably not worth it today.
Matt Corallo [Fri, 30 Apr 2021 18:45:51 +0000 (18:45 +0000)]
Set default error type for SignOrCreationError for bindings
The C bindings generator now looks to default generic types as the
way to map a struct or enum parameter. Because SignOrCreationError
is only used directly with an error type of `()`, we set that to
the default and assume no other error types are needed.