Avoid a `tokio::mpsc::Sender` clone for each P2P send operation
Whenever we go to send bytes to a peer, we need to construct a
waker for tokio to call back into if we need to finish sending
later. That waker needs some reference to the peer's read task to
wake it up, hidden behind a single `*const ()`. To do this, we'd
previously simply stored a `Box<tokio::mpsc::Sender>` in that
pointer, which requires a `clone` for each waker construction. This
leads to substantial malloc traffic.
Instead, here, we replace this box with an `Arc`, leaving a single
`tokio::mpsc::Sender` floating around and simply change the
refcounts whenever we construct a new waker, which we can do
without allocations.