ObsidianDragon/docs/lite-wallet-implementation-plan-v2-2026-06-04.md
DanS 3119440cd9 fix(lite): non-blocking, non-hanging sync (Finding B)
The backend `sync` command is a blocking, uninterruptible full chain scan (do_sync(true);
does not honor the shutdown flag), and balance/list block until synced. Previously
startSync() ran on the main thread (would freeze wallet creation) and the worker could
block, making the destructor join() hang at shutdown.

Redesign:
- bridge is now std::shared_ptr<LiteClientBridge>, shared with a detached sync thread so
  detaching is safe and litelib_shutdown isn't called while a running sync still holds the
  bridge; the controller's own ref prevents premature shutdown during normal operation.
- startSync() launches the blocking `sync` on a detached thread (non-blocking; never joined).
- refreshModel() gates on syncDone_: while syncing it publishes syncstatus progress only;
  once synced it does the full balance/addresses/list refresh (now fast).
- destructor joins only the fast poll worker and detaches the sync thread -> no hang.
- syncComplete() accessor added.

Tests (deterministic, via a blocking-sync fake; counters made atomic for the detached
thread): testLiteWalletControllerShutdownDoesNotHangDuringSync (destructor returns <1.5s
with sync blocked); refresh/worker tests wait for syncComplete()/a balance-bearing model.
Stable across repeated runs; lite+backend and full-node apps build clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 06:35:26 -05:00


Lite Wallet Implementation Plan v2 — 2026-06-04

Status: Active. Supersedes docs/full-lite-wallet-implementation-plan-2026-05-18.md (archived).

Context — why this plan replaces the v1 plan

The v1 plan ("Full Lite Wallet Implementation Plan", 2026-05-18) drove the lite effort for ~6 weeks and produced a large amount of code, but zero end-to-end wallet functionality. Its method (build every layer of every phase in a typed-disabled form and "promote one disabled scaffold at a time" through readiness/custody/handoff governance) generated ~160 dead lite_wallet_*_plan/*_batch* files (filenames up to 250 chars) and a 33k-line test file that exercised only disabled scaffolding. Those files were deleted on 2026-06-04 (branch cleanup/lite-plan-churn); a scripts/check-source-hygiene.sh pre-commit guard now blocks their regrowth.

A verified audit (8-agent review, 2026-06-04) established the real current state below. This plan keeps the v1 plan's dependency ordering and ground rules (which were sound) but discards its method: it switches from horizontal disabled scaffolding to vertical working slices — each milestone makes one capability work end-to-end and demoable, gated by a deterministic fake-backend test, before the next begins.

Verified current state (2026-06-04)

Real and working when enabled:

  • C-ABI bridge lite_client_bridge.{h,cpp} — makes real litelib_* calls via function pointers with copy-before-free Rust string ownership (lite_client_bridge.cpp:120-183). Gated by #if DRAGONX_ENABLE_LITE_BACKEND (default 0).
  • CMake import contract — validates build coupling, imported-only link mode, ABI sdxl-c-v1, all 8 required symbols, optional signature (CMakeLists.txt:62-145).
  • Capability gating — wallet_capabilities.h correctly gates lite features at compile + runtime.
  • Result parsers + state mapper — lite_result_parsers.{h,cpp}, lite_wallet_state_mapper.{h,cpp} convert JSON → typed state (real, but never called).
  • Backend artifact script — scripts/build-lite-backend-artifact.sh (reproducible; Rust source vendored at external/SilentDragonXLite/). Built Linux/Windows artifacts during dev, but build/lite-backend/ is absent and CMake never invokes it.

Real but unreachable (the core gap is integration):

  • LiteWalletLifecycleService (real bridge_.initialize*), LiteSyncService, LiteConnectionService, LiteWalletGateway (real bridge_.execute) — all contain real bridge calls but are never constructed anywhere, default to allowBridgeCalls=false, and the only wired UI path is a validation-only adapter that returns RuntimeExecutionDisabled for every real action.

Not started / disabled: sync loop (startSync returns "not implemented"), WalletState application (explicit StateMutationDisabled), send/import/export, wallet-file persistence, lite data UI, dynamic loading (only an unbuilt scaffold; imported-link only), macOS (operator-deferred).

Known defects to fix while wiring: (1) seed/passphrase flow through request structs but are only flagged redacted, never zeroed in memory; (2) the lifecycle bridge response JSON is discarded, never parsed, so walletReady is permanently false.

Ground rules (carried over from v1 — still binding)

  • Keep full-node runtime behavior unchanged; share only through narrow abstractions.
  • Never fake balances, sync state, transactions, send results, wallet existence, or server connectivity. A disabled feature must look disabled, not fake-succeed.
  • Never log seed phrases, passphrases, private keys, decrypted memos, or raw bridge responses. Additionally: zero secret buffers (sodium_memzero) after use — fixing the v1 gap.
  • Copy Rust-returned strings before free; free exactly once; no raw pointer escape (already done in lite_client_bridge).
  • Each milestone ships a deterministic test against an injected fake backend before any real-backend smoke test.
  • Keep release packaging honest: lite artifacts bundle no daemon/Sapling/asmap assets; full-node artifacts stay intact.
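The copy-before-free rule above can be sketched as follows. This is a minimal illustration, not the code in lite_client_bridge: `sketch_backend_execute`, `sketch_backend_free`, and `executeCopied` are hypothetical names, and the malloc-based backend only stands in for the Rust side.

```cpp
#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical stand-ins for a backend that hands out heap-allocated C strings.
static int g_liveBackendStrings = 0;  // accounting so a test can detect leaks

extern "C" char* sketch_backend_execute(const char* command) {
    const std::string reply = std::string("{\"command\":\"") + command + "\"}";
    char* out = static_cast<char*>(std::malloc(reply.size() + 1));
    if (out == nullptr) return nullptr;
    std::memcpy(out, reply.c_str(), reply.size() + 1);
    ++g_liveBackendStrings;
    return out;
}

extern "C" void sketch_backend_free(char* s) {
    if (s == nullptr) return;
    --g_liveBackendStrings;
    std::free(s);
}

// Deleter that returns the string to the allocator that produced it.
struct BackendStringDeleter {
    void operator()(char* s) const { sketch_backend_free(s); }
};

// Copy the backend-owned bytes into a std::string, then free exactly once.
// The unique_ptr frees on every path (including a throwing copy), and the
// raw pointer never escapes this function.
std::string executeCopied(const char* command) {
    std::unique_ptr<char, BackendStringDeleter> raw(sketch_backend_execute(command));
    if (!raw) return {};
    return std::string(raw.get());
}
```

Callers only ever see the `std::string` copy, so a leak check reduces to asserting that the live-string counter is back at zero after each call.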

Architectural decisions (resolve up front)

  1. Canonical bridge seam = lite_client_bridge. It already makes real litelib_* calls and is what the services use. The elaborate lite_bridge_runtime (~1.7 MB cpp, fake-only dry-dispatch + disabled phase-N scaffolds) is not the execution path. Keep only its genuinely-useful primitives (LiteBridgeOwnedString copy-before-free, idempotent teardown — already referenced by lite_client_bridge.cpp:182) and retire the disabled dispatch/promotion scaffolding (subject to the hygiene guard, same as the earlier churn cleanup). Do not invest further in lite_bridge_runtime dry-dispatch.

  2. App-owned LiteWalletController, mirroring the full-node ownership pattern. The full-node path owns WalletState state_ (app.h:421) as the single source of truth, driven by NetworkRefreshService + RefreshScheduler via enqueue → RPCWorker → callback → apply-to-state_ (app_network.cpp:912-944, refreshCoreData ~:1001-1043), and every tab reads app->state() (app.h:157-159). The lite controller constructs LiteClientBridge::linkedSdxl() + the lite services with allowBridgeCalls=true, runs the same enqueue/worker/apply loop, and feeds the same WalletState — so the existing Balance/Send/Receive/Transactions tabs light up with no per-tab changes.

  3. Implement the existing LiteWalletBackend abstraction (wallet_backend.h:59-82: setServer/openWallet/startSync/cancelSync/sendTransaction). It exists and is unused; the controller is its first implementation. Branch in App::init() on supportsLiteBackend() (app.h:134-140).

  4. WalletState is backend-agnostic and stays the single source of truth. Lite-sourced data lands in the same fields the tabs already read: balances, addresses (AddressInfo), transactions (TransactionInfo), sync (SyncInfo) in data/wallet_state.h.

  5. Deterministic fake SDXL .so. Build a tiny fake backend implementing the 8 litelib_* symbols with canned JSON, used to link a test target and drive every milestone's acceptance test without a real backend or network (satisfies the "injected fake bridge before real smoke test" rule). Real-backend smoke tests follow per milestone.
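Decision 5 can be illustrated with an injectable function-pointer table. The sketch below is reduced and hypothetical: `SketchLiteApi`, `SketchBridge`, and `makeSketchFakeApi` are invented names, only two entry points are modelled rather than the 8 symbols of the real contract, and the `info` value is made up for illustration (the `syncstatus` idle shape follows the one this plan reports from the real backend).

```cpp
#include <cstdlib>
#include <cstring>
#include <map>
#include <string>

// Hypothetical, reduced symbol table; the real ABI has 8 litelib_* symbols.
struct SketchLiteApi {
    char* (*execute)(const char* command, const char* args);
    void (*freeString)(char* s);
};

// Owned-string accounting so tests can assert that nothing leaked.
struct FakeAccounting { int allocated = 0; int freed = 0; };
static FakeAccounting g_fake;

static char* fakeDup(const std::string& s) {
    char* out = static_cast<char*>(std::malloc(s.size() + 1));
    if (out == nullptr) return nullptr;
    std::memcpy(out, s.c_str(), s.size() + 1);
    ++g_fake.allocated;
    return out;
}

static void fakeFree(char* s) {
    if (s == nullptr) return;
    ++g_fake.freed;
    std::free(s);
}

// Canned, deterministic replies keyed by command name (no network, no Rust).
static char* fakeExecute(const char* command, const char* /*args*/) {
    static const std::map<std::string, std::string> canned = {
        {"syncstatus", R"({"syncing":"false"})"},
        {"info", R"({"latest_block_height":100})"},  // illustrative value
    };
    const auto it = canned.find(command ? command : "");
    if (it == canned.end()) return fakeDup(R"({"error":"unknown command"})");
    return fakeDup(it->second);
}

SketchLiteApi makeSketchFakeApi() { return SketchLiteApi{&fakeExecute, &fakeFree}; }

// The bridge only talks to the table, so tests inject the fake and production
// code would inject the linked symbols.
class SketchBridge {
public:
    explicit SketchBridge(SketchLiteApi api) : api_(api) {}
    bool available() const { return api_.execute != nullptr && api_.freeString != nullptr; }
    std::string execute(const std::string& command) const {
        if (!available()) return {};
        char* raw = api_.execute(command.c_str(), "");
        if (raw == nullptr) return {};
        std::string copy(raw);  // copy first...
        api_.freeString(raw);   // ...then free exactly once
        return copy;
    }
private:
    SketchLiteApi api_;
};
```

A table with null entries makes `available()` return false, which is one way to keep an unlinked build failing closed instead of fake-succeeding.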

Milestones (vertical slices)

Each milestone is independently demoable and gated by a fake-backend test. Order respects the v1 dependency chain (bridge → lifecycle → sync → state → UI → send → ship).

M0 — Foundation: real artifact, linked build, fake-backend harness

Goal: A lite build that links a backend and a test harness that can drive the bridge deterministically.

Status (2026-06-04): mostly complete.

  • Real Linux artifact built — scripts/build-lite-backend-artifact.sh --platform linux → build/lite-backend/linux/libsilentdragonxlite.a (126 MB) with all 8 litelib_* symbols + manifest (sha256 c06f5679…). build/ is gitignored.
  • build.sh --lite-backend flag added (auto-discovers the artifact; DRAGONX_LITE_BACKEND_DIR override).
  • CMake import contract validated against the real artifact; ObsidianDragonLite (93 MB) links it (nm shows litelib_execute/initialize_new/shutdown as defined T).
  • Deterministic injectable fake harness — tests/fake_lite_backend.h (makeFakeLiteApi(), owned-string alloc/free accounting) + testLiteBackendInjectableFakeBridge() in the suite (ctest green). This is the harness M1 service tests reuse — it needs neither the real .so nor Rust.
  • Deferred to a focused follow-up: (a) a standalone fake .so link-target for Rust-less CI (the real artifact covers the link path locally); (b) retiring the lite_bridge_runtime disabled-dispatch scaffolding — large/risky surgical cleanup, not required to unblock M1; do it with the same care as the churn cleanup.
  • Run scripts/build-lite-backend-artifact.sh against external/SilentDragonXLite to produce a Linux litelib artifact + symbols file + manifest.
  • Add a build.sh --lite-backend path that sets -DDRAGONX_ENABLE_LITE_BACKEND=ON with the library/symbols/manifest paths (today --lite only sets DRAGONX_BUILD_LITE=ON).
  • Add a small fake SDXL .so (8 symbols, canned responses) + a CMake test target that links it; port the existing dry-dispatch tests toward real-symbol-table calls against the fake.
  • Retire the lite_bridge_runtime disabled-dispatch scaffolding per decision (1).
  • Exit demo / test: ObsidianDragonLite links the real artifact and currentWalletCapabilities() reports lite available; a test links the fake .so and the bridge reports available() and round-trips one execute() call.

M1 — Wallet lifecycle end-to-end (create / open / restore) ← highest-leverage spike

Goal: A user creates/opens/restores a real wallet from Settings.

Status (2026-06-04): implemented.

  • New src/wallet/lite_wallet_controller.{h,cpp} — App-owned, constructs LiteWalletLifecycleService with allowBridgeCalls=true, executes real create/open/restore, sodium_memzero-wipes seed/passphrase after each call, tracks walletOpen(), fires a persist callback on success. createLinked() uses LiteClientBridge::linkedSdxl(); an injecting ctor takes a fake bridge for tests.
  • Response-discard defect fixed — lite_wallet_lifecycle_service.cpp bridgeResult() now sets walletReady = nlohmann::json::accept(response) instead of hardcoded false.
  • App wiring — app.cpp::init() constructs the controller when supportsLiteBackend() with persist → settings_->save(); App::liteWallet() accessor (null in full-node / unlinked-lite).
  • UI reroute — settings_page.cpp "Validate" handler calls the controller for real when present (shows "Wallet ready"; wipes the UI seed/passphrase buffers after submit), else falls back to the validation-only adapter.
  • Tests — testLiteWalletControllerLifecycle() on the M0 fake harness: create/restore → ready + walletOpen() + persist fires + no owned-string leak; empty-seed rejected pre-backend; allowBridgeCalls=false and full-node caps fail closed; secret-wipe helper verified. ctest green. Builds verified: lite+backend (build/lite), lite-no-backend (build/linux), and full-node (build/fullnode, clean, 0 litelib symbols linked — no regression).
  • Deferred refinement: the controller is built at startup from saved connection settings, so a live in-session server change isn't honored until restart. Fold into M2/M3 (set the request's serverUrl from the field, or rebuild the controller on server-selection save).

Real-backend smoke (2026-06-04): PASS. Added tools/lite_smoke.cpp + a backend-guarded lite_smoke CMake target (built in build/lite). Against the live https://lite.dragonx.is: available()=true, walletExists(main)=false, checkServerOnline()=true, and initializeNew() created a real wallet (silentdragonxlite-wallet.dat, "Starting Mempool"), seed never logged, isolated HOME, exit 0. The create path works end-to-end on the real litelib, not just the fake.

Two findings from the smoke run:

  1. FIXED — chain-name bug. Our default chain name was the ticker "DRAGONX", but the SDXL backend hard-panic!s on anything outside {main,test,regtest} (lightclient.rs:166). Changed kDragonXLiteChainName and Settings::lite_chain_name_ to "main". Migration landed (settings.cpp load): any saved chain_name outside {main,test,regtest} is rewritten to "main" + flagged for re-save, so existing users with "DRAGONX" don't hit the panic. Covered by testLiteChainNameMigration().
  2. OPEN — panic-across-FFI aborts the app (hardening). The Rust backend uses panic! for error conditions; a panic across the C FFI boundary is UB and SIGABRTs the whole process (we saw a core dump). The C++ bridge cannot catch it. Before production: wrap the litelib FFI exports in std::panic::catch_unwind (rebuild the vendored lib) and/or validate all inputs before calling. Tracked in M5 (production enablement).
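The chain-name migration from finding 1 amounts to a small whitelist check. The sketch below assumes only what is stated above (accepted names are main, test, and regtest; anything else becomes "main" and is flagged for re-save); `ChainNameMigration` and `migrateChainName` are hypothetical names, not the code in settings.cpp.

```cpp
#include <array>
#include <string>
#include <string_view>

// Result of normalising a saved chain name.
struct ChainNameMigration {
    std::string chainName;  // value to use from now on
    bool needsResave;       // true when the saved value had to be rewritten
};

// The only names the backend accepts, per the finding above.
constexpr std::array<std::string_view, 3> kAcceptedChainNames = {"main", "test", "regtest"};

// Keep accepted names untouched; rewrite everything else to "main" and flag it.
ChainNameMigration migrateChainName(std::string_view saved) {
    for (std::string_view accepted : kAcceptedChainNames) {
        if (saved == accepted) return {std::string(saved), false};
    }
    return {"main", true};
}
```

With this shape, a legacy value such as "DRAGONX" never reaches the FFI call, which also serves the validate-before-calling mitigation mentioned in finding 2.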
  • Add LiteWalletController owned by App (construct in App::init() when supportsLiteBackend()), owning LiteClientBridge::linkedSdxl() + LiteConnectionService + LiteWalletLifecycleService with allowBridgeCalls=true.
  • Reroute the Settings "Validate" button (settings_page.cpp:1661-1663 → evaluateLiteLifecycleRequestFromPageState → validation-only executeLiteWalletLifecycleUiRequest, :237) to the controller's real createWallet/openWallet/restoreWallet. The request structs are already populated correctly (settings_page.cpp:219-235).
  • Parse the lifecycle response JSON via lite_result_parsers and set wallet-ready state (fixes the discarded-response defect).
  • Persist server selection + wallet path on success (wire to config::Settings::save(); today settingsWriteRequested is blocked).
  • Zero seed/passphrase buffers after the bridge call (sodium_memzero) — fixes the secret-handling defect.
  • Exit demo / test: Against the fake backend (then a real one), create a wallet → status shows ready; reopen after restart; restore-from-seed succeeds. Fake-backend test asserts the lifecycle path calls the bridge and produces a ready state.
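The secret-zeroing step can be sketched without libsodium as shown below. This is a portable stand-in under stated assumptions, not the controller's helper: `wipeBytes`, `wipeSecret`, and `SecretWipeGuard` are hypothetical names, and the volatile-pointer loop is a common way to discourage the compiler from dropping the stores, whereas the plan itself calls for sodium_memzero.

```cpp
#include <cstddef>
#include <string>

// Stand-in for sodium_memzero: store zeros through a volatile pointer so the
// writes are not treated as removable dead stores.
inline void wipeBytes(void* data, std::size_t len) {
    volatile unsigned char* p = static_cast<volatile unsigned char*>(data);
    for (std::size_t i = 0; i < len; ++i) p[i] = 0;
}

// Zero the string's current contents, then empty it (non-const data() is C++17).
inline void wipeSecret(std::string& secret) {
    if (!secret.empty()) wipeBytes(secret.data(), secret.size());
    secret.clear();
}

// RAII guard: the referenced secret is wiped when the scope ends, including
// early returns and exceptions thrown after the bridge call.
class SecretWipeGuard {
public:
    explicit SecretWipeGuard(std::string& secret) : secret_(secret) {}
    ~SecretWipeGuard() { wipeSecret(secret_); }
    SecretWipeGuard(const SecretWipeGuard&) = delete;
    SecretWipeGuard& operator=(const SecretWipeGuard&) = delete;
private:
    std::string& secret_;
};
```

Note that this only clears the buffer the string currently owns; copies made earlier (for example by reallocation or pass-by-value) are not reached, so secrets are best passed by reference and wiped at a single owner.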

M2 — Sync loop + WalletState population

Goal: After open, balances/addresses/history populate the existing tabs.

Status (2026-06-04): data pipeline landed; live wiring (M2b) remains.

  • Last hop implemented + tested: applyLiteRefreshModelToWalletState(model, WalletState&) in lite_wallet_controller.{h,cpp}: zatoshi→DRGX balances, z/t address split, transaction typing + confirmations (chainHeight - blockHeight + 1), sync progress. Mutates WalletState in place (it's non-copyable). testLiteRefreshModelAppliesToWalletState() drives a bundle through the existing mapLiteWalletRefreshBundle → apply → asserts the populated WalletState. ctest green.
  • The fetch/parse/assemble pipeline already exists and works: LiteWalletGateway::refresh() → LiteWalletRefreshBundle → mapLiteWalletRefreshBundle() → LiteWalletAppRefreshModel. M2 just needed the final → WalletState hop (above) plus live wiring.
  • M2b-1 — shared-bridge refactor (done). litelib is a global singleton and every LiteClientBridge calls litelib_shutdown() on destruction, so services must not each own one. LiteWalletLifecycleService, LiteWalletGateway, and LiteSyncService now take a non-owning LiteClientBridge*; LiteWalletController owns the single bridge and passes &bridge_. Builds clean in all configs; existing tests stay green.
  • M2b-2 — sync + controller refresh (done + tested). LiteSyncService::startSync now executes the sync command (was a stub). LiteWalletController gained startSync() (auto-invoked when a wallet becomes ready) and refreshWalletState(WalletState&) which polls syncstatus, runs gateway.refresh(), maps the bundle, and applies it into WalletState. testLiteWalletControllerRefreshPopulatesState() drives the full path against the real-shape fake (balances/addresses/transactions/sync populated; no-op when no wallet open). The fake harness now returns command-shaped JSON per tests/fixtures/lite/result_parsers.json. (Surfaced a real bug: info requires latest_block_height, and the gateway aborts the whole refresh on the first command's parse failure — fixed in the fake; worth noting the gateway's abort-on-first-failure is fragile against partial backend responses.)
  • M2b-3 — threaded App hook (done + tested). LiteWalletController owns a background worker (std::thread) that, once a wallet is ready, refreshes every ~4s and publishes a copyable LiteWalletAppRefreshModel under a mutex; App::update() calls takeRefreshedModel() and applies it into state_ on the main thread (WalletState is non-copyable, so the model crosses the thread boundary, not the state). Worker auto-starts on lifecycle-ready and is stopped+joined in the controller destructor. status_ is written only on the main thread to avoid races; walletOpen_/syncStarted_ are atomic. testLiteWalletControllerWorkerProducesModel() opens a wallet and asserts the worker publishes a populated model (stable across repeated runs). Builds clean in all configs.
  • Real-backend refresh smoke (2026-06-04): ran lite_smoke --create --refresh against the live backend and found two real bugs that the fake/fixture could not surface (the smoke tool now links lite_result_parsers and runs each command's real output through the parser):
  1. FIXED — syncstatus parser mismatch. parseLiteSyncStatusResponse hard-required synced_blocks/total_blocks, but the real backend (per commands.rs:83-87) returns idle = {"syncing":"false"} (string!) and only in-progress = {"syncing":"true","synced_blocks":N,"total_blocks":M}. The parser now reads syncing as a string and treats the block fields as in-progress-only (idle → complete, synced/total 0). Covered by testLiteSyncStatusParserRealShapes() and verified against the live backend (syncstatus parse_ok=1). (info/balance/addresses parsers also verified OK against real output.)
  2. ADDRESSED — blocking, uninterruptible sync. The backend sync command runs do_sync(true), a blocking full scan that does not honor the shutdown flag (lightclient.rs), and balance/list block until synced. Redesign: the controller runs sync on a detached thread (never joined), the bridge is a std::shared_ptr shared with that thread (so detaching is safe and the bridge isn't litelib_shutdown'd while a sync still holds it), and startSync() is now non-blocking (was called on the main thread → would have frozen wallet creation). The joinable poll worker only issues fast syncstatus calls while syncing (publishing progress) and fetches balance/addresses/list once syncDone_ is set. Shutdown joins only the fast poll worker and detaches the sync thread → no hang. Verified deterministically by testLiteWalletControllerShutdownDoesNotHangDuringSync() (blocking-sync fake; destructor returns <1.5s) and the worker/refresh tests (stable across repeated runs).
  • Remaining for M2 polish: fix the syncstatus parser (above), address the blocking-sync/worker-shutdown issue (above), per-address balances (notes-correlation; currently aggregate-only), and harden the gateway's abort-on-first-failure (skip-and-continue per command).
  • Implement LiteSyncService::startSync (replace the "not implemented" stub) + a background worker polling syncstatus, mirroring NetworkRefreshService/RefreshScheduler (enqueue → worker → apply on main thread).
  • Drive LiteWalletGateway refresh (info/height/balance/addresses/notes/list/transactions) through lite_result_parsers → lite_wallet_state_mapper → App WalletState (privateBalance, transparentBalance, addresses, transactions, sync).
  • Hook the controller into App::update()'s refresh dispatch alongside (not inside) the full-node path.
  • Exit demo / test: After open, Balance/Receive/Transactions tabs show real lite data with no per-tab code changes (they already read app->state()); sync progress advances. Fake-backend test asserts a canned balance/tx set lands in WalletState.
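The threading design described in this milestone (a detached blocking sync that shares the bridge, a joinable fast poll worker, and a copyable model handed to the main thread) can be sketched as below. All names are hypothetical stand-ins, the poll interval is shortened, and the completion flag is held in a `shared_ptr` here so the detached thread never touches the controller after destruction; the real controller may organise this differently.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <optional>
#include <thread>

// Stand-in bridge whose sync() blocks until release() is called.
class SketchBlockingBridge {
public:
    void sync() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return released_; });
    }
    void release() {
        { std::lock_guard<std::mutex> lock(m_); released_ = true; }
        cv_.notify_all();
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    bool released_ = false;
};

// Copyable snapshot that crosses the thread boundary instead of the state.
struct SketchModel { bool synced = false; int polls = 0; };

class SketchController {
public:
    explicit SketchController(std::shared_ptr<SketchBlockingBridge> bridge)
        : bridge_(std::move(bridge)),
          syncDone_(std::make_shared<std::atomic<bool>>(false)) {}

    ~SketchController() {
        stop_ = true;
        if (poll_.joinable()) poll_.join();  // only the fast worker is joined
    }

    void startSync() {
        if (poll_.joinable()) return;  // already started
        // The detached thread captures its own references and never uses `this`.
        std::thread([bridge = bridge_, done = syncDone_] {
            bridge->sync();
            done->store(true);
        }).detach();
        poll_ = std::thread([this] { pollLoop(); });
    }

    bool syncComplete() const { return syncDone_->load(); }

    // Main thread: take the latest published model, if any.
    std::optional<SketchModel> takeRefreshedModel() {
        std::lock_guard<std::mutex> lock(modelMutex_);
        std::optional<SketchModel> out;
        out.swap(latest_);
        return out;
    }

private:
    void pollLoop() {
        int polls = 0;
        while (!stop_) {
            const SketchModel m{syncDone_->load(), ++polls};
            { std::lock_guard<std::mutex> lock(modelMutex_); latest_ = m; }
            std::this_thread::sleep_for(std::chrono::milliseconds(10));
        }
    }

    std::shared_ptr<SketchBlockingBridge> bridge_;
    std::shared_ptr<std::atomic<bool>> syncDone_;
    std::atomic<bool> stop_{false};
    std::mutex modelMutex_;
    std::optional<SketchModel> latest_;
    std::thread poll_;
};
```

Because the destructor waits only on the poll loop, it returns within roughly one poll interval even while `sync()` is still blocked, and the bridge stays alive until the detached thread drops its reference.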

M3 — Wallet data UI completeness

Goal: A complete read-only wallet UX.

  • Sync status/progress indicator; loading/empty states; new shielded/transparent address generation via the gateway (receive_tab); capability-gated surfaces verified (isUiSurfaceAvailable, settings_page.cpp:1494 lite/full branch).
  • Exit demo / test: Full read-only experience — balances, address book, history, live sync progress — against fake then real backend.

M4 — Send / import / export / shield

Goal: A user can spend and back up.

  • Wire send_tab (:780-787 sendTransaction) to litelib_execute (send/z_sendmany) via the gateway, with fee + confirmation UI, result parsing, and tx-status polling that updates WalletState.
  • Import (keys), export (wallet backup), shield (t→z).
  • Exit demo / test: Send a transaction and watch it confirm; export a backup. Fake-backend test drives a send and asserts the result/tx-status flow.

M5 — Persistence, recovery, packaging, production enablement

Goal: Shippable.

  • Wallet-file durability + crash/recovery + error/retry UX.
  • Lite release packaging (zip/AppImage/exe, no daemon assets — build.sh:285); build + (read-only-verify) sign the backend artifact in CI per lite-wallet-backend-signing-policy.
  • Lift macOS deferral; optionally implement the dynamic-loader sublane if shared-library distribution is required (today imported-link only).
  • Runtime kill-switch / feature flag / staged rollout.
  • Exit demo / test: A downloadable ObsidianDragonLite that creates, syncs, sends, and persists against a real backend.

What we explicitly drop from the v1 plan

  • The "promote one disabled scaffold at a time" methodology and all promotion → activation → post-closure → custody → handoff → stewardship → receipt governance layers.
  • Building disabled/typed-only versions of every stack layer ahead of real execution. (Vertical slices replace this.)
  • Further investment in lite_bridge_runtime dry-dispatch as an execution path.
  • "Readiness ceiling" / "Batch N" framing. Progress is measured by demoable capabilities, not batches.

We retain v1's ground rules, dependency ordering, and the artifact/ABI/signing reference docs (lite-wallet-backend-artifact-link-contract, -production, -signing-policy, -source-signature-plan).

Verification

  • Per milestone: a deterministic test linking the fake SDXL .so (no network) that asserts the new capability end-to-end, then a manual real-backend smoke test. Use /verify or /run to launch ObsidianDragonLite and confirm the exit demo.
  • Regression: cmake --build build/linux && (cd build/linux && ctest) stays green; full-node build + behavior unchanged.
  • Hygiene: scripts/check-source-hygiene.sh (pre-commit) keeps the churn from regrowing.

References

  • v1 (superseded/archived): docs/full-lite-wallet-implementation-plan-2026-05-18.md
  • ABI / artifact / signing: docs/lite-wallet-backend-artifact-link-contract-2026-05-18.md, docs/lite-wallet-backend-artifact-production-2026-05-18.md, docs/lite-wallet-backend-signing-policy-2026-05-22.md, docs/lite-wallet-backend-source-signature-plan-2026-05-20.md
  • Deferred runtime dynamic-loader design (only if M5 needs it): docs/lite-wallet-phase2-runtime-bridge-dynamic-loader-sublane-plan-2026-05-23.md, docs/lite-wallet-phase2-runtime-bridge-loading-symbol-resolution-plan-2026-05-22.md