fix(send): restart the fast-lane worker on reconnect so the opid poll runs

A completed send could spin forever on "Waiting for operation (N)". Root
cause: onDisconnected() stopped fast_worker_ but kept the unique_ptr, so the
`if (!fast_worker_)` guard in onConnected() never restarted it. After the
first reconnect (daemon warmup, restart, any RPC blip) the fast lane stayed
dead for the rest of the session.
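
The guard failure can be reproduced in isolation. The following is a minimal
sketch, not the application's code: `Worker`, `Session`, `onDisconnectedBuggy`
and `fastLaneSurvivesReconnect` are hypothetical stand-ins that model only the
running flag and the null check.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the real worker; only the running flag is modeled.
class Worker {
public:
    void start() { running_ = true; }
    void stop() { running_ = false; }
    bool isRunning() const { return running_; }
private:
    bool running_ = false;
};

struct Session {
    std::unique_ptr<Worker> fast_worker;

    // Creates and starts the worker only when no object exists yet.
    void onConnected() {
        if (!fast_worker) {
            fast_worker = std::make_unique<Worker>();
            fast_worker->start();
        }
    }

    // Pre-fix behavior: stop the worker but keep the pointer.
    void onDisconnectedBuggy() {
        if (fast_worker) fast_worker->stop();
    }
};

// Returns true if the fast lane is running after connect, disconnect, reconnect.
bool fastLaneSurvivesReconnect() {
    Session s;
    s.onConnected();
    assert(s.fast_worker->isRunning());  // first connect starts the worker
    s.onDisconnectedBuggy();
    s.onConnected();  // guard sees a non-null pointer and skips the restart
    return s.fast_worker && s.fast_worker->isRunning();
}
```

In this model `fastLaneSurvivesReconnect()` returns false: the pointer is
non-null, so the guard never runs, yet the worker it points to is stopped.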

The opid poll was the only fast_worker_ user that posted to it directly with
no fallback, so it alone broke: its post() landed on a stopped thread, the
result MainCb never ran, opid_poll_in_progress_ stayed true, and the poll
never fired again. The operation (already "success" on the daemon, with a
txid) was left untracked.
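
The stuck flag can be illustrated with a single-threaded model. This is a
sketch under assumptions: the real worker runs on its own thread, while the
hypothetical `Worker::drain()` here simply runs queued tasks inline. `Poller`
and `pollsStarted` are illustrative names as well.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

using MainCb = std::function<void()>;
using Task = std::function<MainCb()>;

// Hypothetical model of a worker: queued tasks execute only while it is running.
class Worker {
public:
    explicit Worker(bool running) : running_(running) {}
    void post(Task t) { queue_.push(std::move(t)); }
    // Runs each queued task, then its main-thread callback; a stopped worker does nothing.
    void drain() {
        while (running_ && !queue_.empty()) {
            MainCb cb = queue_.front()();
            queue_.pop();
            if (cb) cb();
        }
    }
private:
    bool running_;
    std::queue<Task> queue_;
};

struct Poller {
    bool in_progress = false;
    int polls_started = 0;

    // Starts a poll unless one is already outstanding.
    void tick(Worker& w) {
        if (in_progress) return;
        in_progress = true;
        ++polls_started;
        w.post([this]() -> MainCb {
            return [this]() { in_progress = false; };  // the only place the flag clears
        });
    }
};

// Ticks the poller `ticks` times against one worker and reports how many polls started.
int pollsStarted(bool workerRunning, int ticks) {
    assert(ticks >= 0);
    Worker w(workerRunning);
    Poller p;
    for (int i = 0; i < ticks; ++i) {
        p.tick(w);
        w.drain();
    }
    return p.polls_started;
}
```

With a running worker every tick starts a new poll. With a stopped worker only
the first tick does, because the callback that clears the flag is never
delivered.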

Two fixes:
- onDisconnected() now reset()s fast_worker_ after stop(), so onConnected()
  recreates and starts a fresh one (restores the fast lane for all its users,
  not just the poll).
- the opid poll now falls back to worker_ when the fast lane isn't running,
  matching every other fast_worker_ call site (defense in depth).
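
Both fixes together can be sketched as follows. This is a simplified model
rather than the application's code: `Worker`, `Session`, `pickWorker` and
`reconnectRestoresFastLane` are hypothetical names, and only the pointer
lifetime and the lane selection are represented.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the real worker; only the running flag is modeled.
class Worker {
public:
    void start() { running_ = true; }
    void stop() { running_ = false; }
    bool isRunning() const { return running_; }
private:
    bool running_ = false;
};

struct Session {
    std::unique_ptr<Worker> worker;       // main lane, assumed to live for the whole session
    std::unique_ptr<Worker> fast_worker;  // fast lane, rebuilt on every connect

    Session() : worker(std::make_unique<Worker>()) { worker->start(); }

    void onConnected() {
        if (!fast_worker) {
            fast_worker = std::make_unique<Worker>();
            fast_worker->start();
        }
    }

    // Fix 1: drop the stopped worker so the guard in onConnected() fires again.
    void onDisconnected() {
        if (fast_worker) {
            fast_worker->stop();
            fast_worker.reset();
        }
    }

    // Fix 2: prefer the fast lane, fall back to the main worker when it is not running.
    Worker* pickWorker() const {
        return (fast_worker && fast_worker->isRunning()) ? fast_worker.get()
                                                         : worker.get();
    }
};

// Returns true if the fast lane is selected again after a disconnect and reconnect.
bool reconnectRestoresFastLane() {
    Session s;
    s.onConnected();
    s.onDisconnected();
    assert(s.pickWorker() == s.worker.get());  // fallback while disconnected
    s.onConnected();
    return s.pickWorker() == s.fast_worker.get();
}
```

Either fix alone would unblock the poll in this model. Resetting the pointer
is the one that brings the fast lane back for its other users.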

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-12 01:53:30 -05:00
parent bf91c4eb6c
commit 9ee8f9a43b
2 changed files with 13 additions and 3 deletions

@@ -760,13 +760,18 @@ void App::update()
         }
     }
-    // Poll pending z_sendmany operations for completion (full-node opid flow; lite has none)
+    // Poll pending z_sendmany operations for completion (full-node opid flow; lite has none).
+    // Prefer the fast lane but fall back to the main worker (mirrors every other fast_worker_ user)
+    // so a torn-down/not-yet-restarted fast lane can't silently strand the poll on "Waiting for
+    // operation" — the symptom when fast_worker_ was stopped on reconnect and never came back.
+    rpc::RPCWorker* opidWorker = (fast_worker_ && fast_worker_->isRunning())
+        ? fast_worker_.get() : worker_.get();
     if (network_refresh_.isDue(RefreshTimer::Opid) && !pending_opids_.empty()
-        && rpcConnected && fast_worker_ && !opid_poll_in_progress_) {
+        && rpcConnected && opidWorker && !opid_poll_in_progress_) {
         network_refresh_.reset(RefreshTimer::Opid);
         auto opids = pending_opids_; // copy for worker thread
         opid_poll_in_progress_ = true;
-        fast_worker_->post([this, opids]() -> rpc::RPCWorker::MainCb {
+        opidWorker->post([this, opids]() -> rpc::RPCWorker::MainCb {
             auto* rpc = (fast_rpc_ && fast_rpc_->isConnected()) ? fast_rpc_.get() : rpc_.get();
             if (!rpc) return [this](){ opid_poll_in_progress_ = false; };
             json ids = json::array();

@@ -582,6 +582,11 @@ void App::onDisconnected(const std::string& reason)
     if (fast_rpc_) fast_rpc_->requestAbort();
     if (fast_worker_) {
         fast_worker_->stop();
+        // Drop the stopped worker so onConnected recreates and starts a fresh one. Keeping a
+        // stopped-but-present worker would defeat onConnected's `if (!fast_worker_)` guard, leaving
+        // the fast lane dead for the rest of the session — which silently stalls the opid poll
+        // (its post() never runs, so a completed send spins on "Waiting for operation").
+        fast_worker_.reset();
     }
     if (fast_rpc_) {
         fast_rpc_->disconnect();