Matt Corallo [Sun, 16 Jul 2023 17:20:56 +0000 (17:20 +0000)]
Drop overly optimistic index
The `channel_updates_id_with_scid_dir_blob` index allows the
intermediate-row-fetching logic to be index-only, but there's very
little reason to do so - we now use subqueries to build the exact
set of rows we want, by id, and then fetch various columns. Having
an index that lets us look up those columns without hitting the
regular table is fine, but there's not a ton of cost to hitting the
table by primary key and maintaining yet another index isn't free.
Matt Corallo [Sun, 16 Jul 2023 03:20:52 +0000 (03:20 +0000)]
Don't hold the `NetworkGraph` read lock across an await point
Holding the `NetworkGraph` read lock across a query await point
can cause a deadlock if another task tries to handle a gossip
message at the same time.
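The pattern can be sketched with the standard library alone. This is a minimal illustration, not the server's code: `Graph`, `slow_query`, and `count_known_channels` are hypothetical stand-ins, and a plain function call takes the place of the awaited query.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Hypothetical stand-in for the real `NetworkGraph`: a map from SCID to a
// channel capacity, guarded by a read-write lock.
struct Graph {
    channels: RwLock<HashMap<u64, u64>>,
}

// Hypothetical stand-in for the awaited database query.
fn slow_query(scids: &[u64]) -> usize {
    scids.len()
}

fn count_known_channels(graph: &Graph) -> usize {
    // Copy what we need inside a narrow scope so the read guard is dropped
    // at the closing brace, before the slow call starts.
    let scids: Vec<u64> = {
        let read_guard = graph.channels.read().unwrap();
        read_guard.keys().copied().collect()
    };
    // No lock is held here, so a writer handling gossip can make progress.
    slow_query(&scids)
}

fn main() {
    let graph = Graph { channels: RwLock::new(HashMap::new()) };
    graph.channels.write().unwrap().insert(42, 100_000);
    let n = count_known_channels(&graph);
    // Taking the write lock afterwards succeeds because nothing is held.
    graph.channels.write().unwrap().insert(43, 50_000);
    println!("{n} channel(s) queried");
}
```

The cost is one copy of the data needed by the query, in exchange for the guard never being alive while the task is suspended.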
Matt Corallo [Sun, 16 Jul 2023 00:37:05 +0000 (00:37 +0000)]
Switch to streaming queries
In order to use streaming queries we have to use `tokio-postgres`'s
`query_raw` method, rather than `query`. This should reduce our
memory footprint from 10+GB to well under 1GB.
The `consider_intermediate_updates` flag is always set, and must be
set for correctness, so we remove it. Further, we optimize the
query that hung on it somewhat by removing an unnecessary
`ORDER BY` clause which was only necessary if
`consider_intermediate_updates` were unset.
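Why streaming lowers the memory footprint can be shown without a database. The sketch below is an assumption-laden analogy: `Row`, `stream_rows`, and `BLOB_LEN` are hypothetical, and a lazy iterator stands in for the row stream that `query_raw` yields.

```rust
// Arbitrary payload size chosen for illustration only.
const BLOB_LEN: usize = 128;

// Hypothetical row type standing in for a database row.
struct Row {
    blob: Vec<u8>,
}

// Stand-in for a streaming query: rows are produced lazily, one at a time.
fn stream_rows(count: usize) -> impl Iterator<Item = Row> {
    (0..count).map(|_| Row { blob: vec![0u8; BLOB_LEN] })
}

// Buffered style: every row is held in memory before processing begins.
fn total_bytes_buffered(count: usize) -> usize {
    let all_rows: Vec<Row> = stream_rows(count).collect();
    all_rows.iter().map(|row| row.blob.len()).sum()
}

// Streaming style: each row is dropped as soon as it has been handled, so
// peak memory is one row rather than the whole result set.
fn total_bytes_streamed(count: usize) -> usize {
    stream_rows(count).map(|row| row.blob.len()).sum()
}

fn main() {
    let buffered = total_bytes_buffered(1_000);
    let streamed = total_bytes_streamed(1_000);
    println!("buffered = {buffered}, streamed = {streamed}");
}
```

Both functions compute the same total; only the number of rows alive at once differs.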
Matt Corallo [Sat, 15 Jul 2023 06:42:33 +0000 (06:42 +0000)]
Substantially optimize reference-row-fetching
By first fetching the rows we need from a smaller index, we avoid
walking a large index which contained the full `blob_signed`. This
reduces reference-row-fetching from 680 seconds to 152 seconds when
searching today for reference rows against 7 days ago.
Old:
```
ln-gossip=# EXPLAIN ANALYZE SELECT DISTINCT ON (short_channel_id, direction) id, blob_signed, direction
FROM channel_updates
WHERE seen < '2023-07-07 00:00:00' AND short_channel_id IN (
SELECT DISTINCT ON (short_channel_id) short_channel_id
FROM channel_updates
WHERE seen >= '2023-07-07 00:00:00'
)
ORDER BY short_channel_id ASC, direction ASC, seen DESC;
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Unique (cost=186279.46..11921204.82 rows=168910 width=161) (actual time=732.365..680504.173 rows=129985 loops=1)
-> Merge Join (cost=186279.46..11632998.93 rows=57641177 width=161) (actual time=732.364..679193.755 rows=31714061 loops=1)
Merge Cond: (channel_updates.short_channel_id = channel_updates_1.short_channel_id)
-> Index Only Scan using channel_updates_scid_dir_seen on channel_updates (cost=0.56..10718853.69 rows=57641177 width=161) (actual time=0.638..673675.749 rows=57408667 loops=1)
Index Cond: (seen < '2023-07-07 00:00:00'::timestamp without time zone)
Heap Fetches: 0
-> Unique (cost=186278.90..192574.84 rows=84455 width=8) (actual time=478.881..750.241 rows=68210 loops=1)
-> Sort (cost=186278.90..189426.87 rows=1259188 width=8) (actual time=478.878..653.035 rows=1452661 loops=1)
Sort Key: channel_updates_1.short_channel_id
Sort Method: external merge Disk: 17680kB
-> Index Only Scan using channel_updates_seen_scid on channel_updates channel_updates_1 (cost=0.56..41481.08 rows=1259188 width=8) (actual time=0.885..264.333 rows=1504495 loops=1)
Index Cond: (seen >= '2023-07-07 00:00:00'::timestamp without time zone)
Heap Fetches: 2273
Planning Time: 0.164 ms
JIT:
Functions: 9
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 21.265 ms, Inlining 37.914 ms, Optimization 113.040 ms, Emission 101.901 ms, Total 274.121 ms
Execution Time: 680601.155 ms
(19 rows)
```
New:
```
ln-gossip=# EXPLAIN ANALYZE SELECT id, direction, blob_signed FROM channel_updates
WHERE id IN (
SELECT DISTINCT ON (short_channel_id, direction) id
FROM channel_updates
WHERE seen < '2023-07-07 00:00:00'
ORDER BY short_channel_id ASC, direction ASC, seen DESC
) AND short_channel_id IN (
SELECT DISTINCT ON (short_channel_id) short_channel_id
FROM channel_updates
WHERE seen >= '2023-07-07 00:00:00'
);
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=2942503.92..2943867.77 rows=169870 width=145) (actual time=22862.627..152436.685 rows=130116 loops=1)
Hash Cond: (channel_updates.short_channel_id = channel_updates_2.short_channel_id)
-> Nested Loop (cost=2738282.26..2739200.18 rows=169870 width=153) (actual time=22141.452..151504.140 rows=393250 loops=1)
-> HashAggregate (cost=2738281.69..2738283.69 rows=200 width=4) (actual time=22139.440..22339.035 rows=393250 loops=1)
Group Key: channel_updates_1.id
Batches: 1 Memory Usage: 45089kB
-> Result (cost=0.56..2736158.32 rows=169870 width=21) (actual time=0.102..21984.409 rows=393250 loops=1)
-> Unique (cost=0.56..2736158.32 rows=169870 width=21) (actual time=0.074..21943.089 rows=393250 loops=1)
-> Index Only Scan using channel_updates_scid_dir_seen_desc_with_id on channel_updates channel_updates_1 (cost=0.56..2448011.03 rows=57629457 width=21) (actual time=0.073..19776.181 rows=57408667 loops=1)
Index Cond: (seen < '2023-07-07 00:00:00'::timestamp without time zone)
Heap Fetches: 0
-> Index Only Scan using channel_updates_id_with_scid_dir_blob on channel_updates (cost=0.56..4.60 rows=1 width=153) (actual time=0.328..0.328 rows=1 loops=393250)
Index Cond: (id = channel_updates_1.id)
Heap Fetches: 0
-> Hash (cost=203159.97..203159.97 rows=84935 width=8) (actual time=721.105..721.107 rows=70731 loops=1)
Buckets: 131072 Batches: 1 Memory Usage: 3787kB
-> Unique (cost=195708.67..202310.62 rows=84935 width=8) (actual time=552.965..713.465 rows=70731 loops=1)
-> Sort (cost=195708.67..199009.65 rows=1320391 width=8) (actual time=552.962..650.323 rows=1537141 loops=1)
Sort Key: channel_updates_2.short_channel_id
Sort Method: external merge Disk: 18064kB
-> Index Only Scan using channel_updates_seen_scid on channel_updates channel_updates_2 (cost=0.56..43421.19 rows=1320391 width=8) (actual time=66.736..324.130 rows=1537141 loops=1)
Index Cond: (seen >= '2023-07-07 00:00:00'::timestamp without time zone)
Heap Fetches: 68
Planning Time: 0.520 ms
JIT:
Functions: 21
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 0.643 ms, Inlining 7.055 ms, Optimization 33.167 ms, Emission 25.782 ms, Total 66.648 ms
Execution Time: 152458.777 ms
(29 rows)
```
Matt Corallo [Thu, 6 Jul 2023 16:43:05 +0000 (16:43 +0000)]
Require DB insertions to complete in fifteen seconds
For some reason the mainnet server hung, seemingly on the DB
insertion task. This will improve debugging by simply crashing if
an insertion takes longer than fifteen seconds.
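A deadline of this kind can be sketched with the standard library. This is only an illustration under stated assumptions: `run_with_deadline` is a hypothetical helper built on threads and a channel, whereas the async server would more plausibly rely on its runtime's own timeout facility.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Runs `job` on a helper thread and waits at most `limit` for its result.
// Returns `None` if the deadline passes first (or if the job panics).
fn run_with_deadline<T, F>(limit: Duration, job: F) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (sender, receiver) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone if we timed out; ignore that.
        let _ = sender.send(job());
    });
    receiver.recv_timeout(limit).ok()
}

fn main() {
    // Hypothetical stand-in for an insertion that completes quickly.
    let fast = run_with_deadline(Duration::from_secs(1), || 7);
    println!("fast insertion: {fast:?}");

    // Stand-in for a hung insertion; rather than waiting forever, the
    // caller learns that the deadline was missed and can crash loudly.
    let hung = run_with_deadline(Duration::from_millis(50), || {
        thread::sleep(Duration::from_secs(2));
        7
    });
    if hung.is_none() {
        println!("insertion exceeded the deadline; the server would panic here");
    }
}
```

Crashing on a missed deadline turns a silent hang into a visible failure with a known location, which is the debugging benefit the commit is after.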
Matt Corallo [Sun, 2 Jul 2023 17:17:07 +0000 (17:17 +0000)]
Build reminder updates with correct SCID field
When the reminder updates were added, a dummy `ChannelUpdate` with
a number of zero'd fields was created under the assumption that
the zero'd fields would be ignored downstream when building
serialized updates. However, the SCID field was `assert`'ed on (and
serialized in the update), causing any reminder updates to cause an
assertion panic.
Instead, we do it the Right Way (tm) here and move the
only-sometimes-available fields into the update type enum, ensuring
we can't access "poison" fields downstream.
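The idea of moving sometimes-available fields into the enum can be sketched as follows. The types are hypothetical and heavily simplified (`FullUpdate`, `UpdateKind`, and the field names are assumptions for illustration, not the server's actual definitions).

```rust
// Hypothetical, simplified payload; the real update carries more fields.
struct FullUpdate {
    short_channel_id: u64,
    fee_base_msat: u32,
}

enum UpdateKind {
    // A complete update: every field is real and may be serialized.
    Full(FullUpdate),
    // A reminder carries only what it actually knows, so there are no
    // zeroed placeholder fields to read by mistake.
    Reminder { short_channel_id: u64 },
}

// The SCID exists in both variants, so it is always available.
fn scid_of(update: &UpdateKind) -> u64 {
    match update {
        UpdateKind::Full(full) => full.short_channel_id,
        UpdateKind::Reminder { short_channel_id } => *short_channel_id,
    }
}

// Fee data only exists for full updates; the type forces callers to
// handle its absence instead of reading a dummy zero.
fn fee_of(update: &UpdateKind) -> Option<u32> {
    match update {
        UpdateKind::Full(full) => Some(full.fee_base_msat),
        UpdateKind::Reminder { .. } => None,
    }
}

fn main() {
    let reminder = UpdateKind::Reminder { short_channel_id: 5 };
    println!("scid = {}, fee = {:?}", scid_of(&reminder), fee_of(&reminder));
}
```

With this shape, code that tries to read a fee from a reminder fails to compile rather than tripping an assertion at runtime.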
Matt Corallo [Mon, 5 Jun 2023 23:26:38 +0000 (23:26 +0000)]
Remove `unused_mut` rejection and fix some unused `mut`s
Making warnings a hard failure is generally bad practice as it can
result in new compiler versions failing to compile
otherwise-totally-acceptable code. That is what is happening here
on rustc beta, which now warns about new cases of unused `mut`.
Andrei [Tue, 16 May 2023 00:00:00 +0000 (00:00 +0000)]
Fix dummy symlink
This commit changes the symlink for the current date's snapshot
from `./res/snapshots_pending/empty_delta.lngossip` to
`../snapshots/empty_delta.lngossip` so that nginx returns 200 with
the dummy snapshot rather than a 404.
Matt Corallo [Wed, 14 Sep 2022 20:12:17 +0000 (20:12 +0000)]
Add one additional index which postgres prefers as the DB fills
If postgres decides walking the full `channel_updates_scid_dir_seen`
index and removing old `seen` values is slower than just walking
the full table (or this new index) it does so. Sadly this causes
re-sorting (usually on-disk), but there doesn't seem to be a way to
avoid this.
Matt Corallo [Mon, 22 Aug 2022 04:14:24 +0000 (04:14 +0000)]
Don't hold the counter lock while verifying gossip/waiting on DB
This resolves a deadlock if we block on the DB where we have one
thread blocked waiting on DB in a blocking thread, and the tokio
reactor blocked waiting on the counter lock which the blocking
thread holds.