Delay RAA-after-next processing until PaymentSent is are handled 2023-02-sent-persist-order
authorMatt Corallo <git@bluematt.me>
Fri, 28 Jul 2023 05:30:24 +0000 (05:30 +0000)
committerMatt Corallo <git@bluematt.me>
Thu, 17 Aug 2023 22:19:48 +0000 (22:19 +0000)
commit31049ed96114e5ac610ef2e0349fac33d191434a
treed01b62d9cc7fadcdae0e4d6fff6dc92dc7592275
parent495c38c0d9fd6eb7e52ad83c5634d91af33fe446
Delay RAA-after-next processing until PaymentSent is are handled

In 0ad1f4c943bdc9037d0c43d1b74c745befa065f0 we fixed a nasty bug
where a failure to persist a `ChannelManager` faster than a
`ChannelMonitor` could result in the loss of a `PaymentSent` event,
eventually resulting in a `PaymentFailed` instead!

As noted in that commit, there's still some risk, though its been
substantially reduced - if we receive an `update_fulfill_htlc`
message for an outbound payment, and persist the initial removal
`ChannelMonitorUpdate`, then respond with our own
`commitment_signed` + `revoke_and_ack`, followed by receiving our
peer's final `revoke_and_ack`, and then persist the
`ChannelMonitorUpdate` generated from that, all prior to completing
a `ChannelManager` persistence, we'll still forget the HTLC and
eventually trigger a `PaymentFailed` rather than the correct
`PaymentSent`.

Here we fully fix the issue by delaying the final
`ChannelMonitorUpdate` persistence until the `PaymentSent` event
has been processed and document the fact that a spurious
`PaymentFailed` event can still be generated for a sent payment.

The original fix in 0ad1f4c943bdc9037d0c43d1b74c745befa065f0 is
still incredibly useful here, allowing us to avoid blocking the
first `ChannelMonitorUpdate` until the event processing completes,
as this would cause us to add event-processing delay in our general
commitment update latency. Instead, we ultimately race the user
handling the `PaymentSent` event with how long it takes our
`revoke_and_ack` + `commitment_signed` to make it to our
counterparty and receive the response `revoke_and_ack`. This should
give the user plenty of time to handle the event before we need to
make progress.

Sadly, because we change our `ChannelMonitorUpdate` semantics, this
change requires a number of test changes, avoiding checking for a
post-RAA `ChannelMonitorUpdate` until after we process a
`PaymentSent` event. Note that this does not apply to payments we
learned the preimage for on-chain - ensuring `PaymentSent` events
from such resolutions will be addressed in a future PR. Thus, tests
which resolve payments on-chain switch to a direct call to the
`expect_payment_sent` function with the claim-expected flag unset.
13 files changed:
lightning-invoice/src/utils.rs
lightning/src/chain/chainmonitor.rs
lightning/src/events/mod.rs
lightning/src/ln/chanmon_update_fail_tests.rs
lightning/src/ln/channel.rs
lightning/src/ln/channelmanager.rs
lightning/src/ln/functional_test_utils.rs
lightning/src/ln/functional_tests.rs
lightning/src/ln/monitor_tests.rs
lightning/src/ln/outbound_payment.rs
lightning/src/ln/payment_tests.rs
lightning/src/ln/reorg_tests.rs
lightning/src/ln/shutdown_tests.rs