rpcserver: resolve root cause of premature wallet rescanning #10280

mohamedawnallah · 2025-10-08T16:57:54Z

Change Description

The original regression caused by this changeset 16a8b623b.

Closes #10180.
Closes #9081.

Steps to Test

Steps for reviewers to follow to test the change.

Pull Request Checklist

Testing

Your PR passes all CI checks.
Tests covering the positive and negative (error paths) are included.
Bug fixes contain tests triggering the bug to prevent regressions.

Code Style and Documentation

The change is not insubstantial. Typo fixes are not accepted to fight bot spam.
The change obeys the Code Documentation and Commenting guidelines, and lines wrap at 80.
Commits follow the Ideal Git Commit Structure.
Any new logging statements use an appropriate subsystem and logging level.
Any new lncli commands have appropriate tags in the comments for the rpc in the proto file.
There is a change description in the release notes, or [skip ci] in the commit message for small changes.

📝 Please see our Contribution Guidelines for further guidance.

gemini-code-assist · 2025-10-08T16:58:10Z

Summary of Changes

Hello @mohamedawnallah, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This PR optimizes neutrino rescans by integrating the wallet's birthday into the rescan logic, significantly speeding up blockchain synchronization for newly created wallets. It modifies neutrinonotify and chainview components to accept and utilize a walletBirthday parameter, which is then passed as a neutrino.StartTime option to limit the rescan range. Associated test files are updated to reflect these changes.

Highlights

Intent: This pull request aims to optimize the neutrino rescan process by leveraging the wallet's creation date (birthday). By passing the wallet birthday to the rescan, the system can avoid scanning blocks that predate the wallet's existence, thereby fixing slow blockchain sync issues, especially when a wallet is created before the neutrino client has completed its header synchronization. This addresses the problem described in issue [bug]: wallet creation slows sync if run before neutrino headers complete #10180.
Changes: The core change involves propagating the wallet's birthday (time.Time) through various components that initiate neutrino rescans. This includes modifications to the neutrinonotify package for chain notifications and the chainview package for filtered chain views. The neutrino.StartTime rescan option is now utilized to limit the scan range based on the wallet's birthday. Test files have also been updated to accommodate the new walletBirthday parameter, typically setting it to the Unix epoch for full historical scanning in tests.
Reviewer Activity: No specific reviewer activity has been recorded or provided in the context for this pull request.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces an optimization for neutrino rescan by using the wallet's birthday. The changes are well-contained and the logic is sound. I've identified a few areas for improvement, mainly related to code formatting and documentation, to align with the repository's style guide.

chainntnfs/neutrinonotify/driver.go

chainntnfs/neutrinonotify/neutrino.go

chainntnfs/test/test_interface.go

chainreg/chainregistry.go

routing/chainview/interface_test.go

routing/chainview/neutrino.go

Roasbeef · 2025-10-08T20:05:55Z

Can we get a brief recap in the OP re: what happened, was it a regression, why this fixes it?

Another follow up: if the regression was introduced in btcwallet, is that a better way to fix it?

How can we develop a regression test for this issue?

mohamedawnallah · 2025-10-09T14:46:53Z

Hi @shocknet-justin, Would it be possible to try this potential patch if that resolves your issue (#10180)? I can try to write a reproducible test (admittedly not the easiest thing to do) but having +1/-1 that this changeset works/not works can be helpful

hsjoberg · 2025-10-09T22:18:12Z

Thank you for looking into this one @mohamedawnallah. I believe I'm experiencing the same issue as @shocknet-justin.

I tried running your patches, but it seems like there's still weird behavior on first run.

2025-10-10 01:47:08.475 [INF] BTCN blockmanager.go:1163: Successfully got filter headers for 918 checkpoints
2025-10-10 01:47:08.475 [INF] BTCN blockmanager.go:643: Fully caught up with cfheaders at height 918000, waiting at tip for new blocks
2025-10-10 01:47:08.476 [INF] BTCN blockmanager.go:733: Attempting to fetch set of un-checkpointed filters at height=918354, hash=00000000000000000001ac0a0f9ad634af778a391b799b80ff644c851d378aa1
2025-10-10 01:47:18.483 [INF] BTCN headerlogger.go:67: Verified 96000 filter headers in the last 12.72s (height 918001, 2025-10-07 10:32:03 +0400 +04)
2025-10-10 01:47:18.483 [INF] BTCN manager.go:327: Canceling block subscription: id=2
2025-10-10 01:47:18.572 [INF] NTFN neutrino.go:693: New block: height=14001, sha=00000000b44efc0402c4841e13c29e2843776abd8ca1a728b4d12a092966e87a
2025-10-10 01:47:18.661 [INF] NTFN neutrino.go:693: New block: height=14002, sha=000000001e4c88a360df5e1f5b4f0b91676dc552acb2343f71068d4b464801f7
2025-10-10 01:47:18.989 [INF] NTFN neutrino.go:693: New block: height=14003, sha=0000000095f738799810fbf31566faac20087ffb00598a35a02ce2166e387117
2025-10-10 01:47:19.321 [INF] NTFN neutrino.go:693: New block: height=14004, sha=0000000080863585423586350af531c858597cb5b9e59f0ea3ff6d63678f62e6
[Continues like this...]

The issue can be seen with:

./lnd-debug --lnddir="./lnddir" --noseedbackup --nolisten --bitcoin.active --bitcoin.mainnet --bitcoin.node=neutrino --feeurl="https://nodes.lightning.computer/fees/v1/btc-fee-estimates.json" --routing.assumechanvalid --tlsdisableautofill --neutrino.addpeer=asia.blixtwallet.com --neutrino.addpeer=europe.blixtwallet.com --neutrino.addpeer=btcd.lnolymp.us --neutrino.addpeer=btcd-mainnet.lightning.computer

Let me know if I can help you or perhaps test something.

shocknet-justin · 2025-10-09T22:29:47Z

I've got some time sensitive materials I need to finish today but will test in the next day or two. It looks like Hampus is still reproducing it however, the --noseedbackup flag I believe should be triggering the wallet create as quickly as my deployment scripts would.

mohamedawnallah · 2025-10-12T01:27:46Z

Benchmarking Report

With the changeset in this PR

We bring LND v0.20.0 latest release candidate to normal initial sync times that was previously taking days to complete under ideal environmental conditions now takes just over a minute.

_{Execution Time (seconds)}	_{Peak Memory (MB)}	_{CPU Usage (%)}

Benchmarks in Raw Format

dev@dev:~/lnd$ sudo ./compare_lnd_versions.sh
==========================================
LND Version Comparison Benchmark
==========================================
Comparing 4 versions:
  - v0.18.3-beta
  - v0.18.4-beta
  - v0.18.5-beta
  - v0.20.0-beta.rc1-after

Each version will run in isolated namespaces
for fair comparison.

>>> Running v0.18.3-beta (1/4)...
==========================================
Benchmarking LND v0.18.3-beta
Using temporary storage: /tmp/lnd-bench-v0.18.3-beta-tQ9SWS
Resource Limits:
  - CPU: 4 cores
  - Memory: 8G
==========================================
Starting LND in isolated namespace with tmpfs...
Started LND (PID: 851327)
Monitoring sync status...

To manually check status, run:
  sudo nsenter --target 851327 --pid --mount --uts --ipc \
    /home/dev/lnd/lncli-debug-v0.18.3-beta --lnddir=/tmp/lnd_data_ns/lnd getinfo

[Sample 0] Block: unknown | synced: false | Memory: 348.76 MB | CPU: 0%
[Sample 1] Block: unknown | synced: false | Memory: 437.65 MB | CPU: 223.00%
[Sample 2] Block: 150000 | synced: false | Memory: 119.67 MB | CPU: 190.00%
[Sample 3] Block: 243000 | synced: false | Memory: 131.54 MB | CPU: 158.00%
[Sample 4] Block: 337000 | synced: false | Memory: 150.61 MB | CPU: 171.00%
[Sample 5] Block: 412000 | synced: false | Memory: 160.77 MB | CPU: 151.00%
[Sample 6] Block: 464000 | synced: false | Memory: 174.42 MB | CPU: 143.00%
[Sample 7] Block: 536000 | synced: false | Memory: 193.35 MB | CPU: 157.00%
[Sample 8] Block: 635000 | synced: false | Memory: 207.39 MB | CPU: 167.00%
[Sample 9] Block: 700000 | synced: false | Memory: 217.50 MB | CPU: 153.00%
[Sample 10] Block: 764000 | synced: false | Memory: 242.05 MB | CPU: 154.00%
[Sample 11] Block: 822000 | synced: false | Memory: 253.53 MB | CPU: 152.00%
[Sample 12] Block: 914000 | synced: false | Memory: 278.78 MB | CPU: 144.00%
synced_to_chain
✓ synced_to_chain: true (block height: 918649)
Stopping LND...

========================================
BENCHMARK RESULTS - v0.18.3-beta
========================================
Execution Time: 70.434696628 seconds
Peak Memory Usage: 437.65 MB
Average Memory Usage: 231.76 MB
Average CPU Usage: 157.76%
Samples Collected: 14

Results saved to: benchmark_v0.18.3-beta.json
Temporary directory will be cleaned up automatically
Cleaning up temporary directory: /tmp/lnd-bench-v0.18.3-beta-tQ9SWS

Waiting 10 seconds before next benchmark...

>>> Running v0.18.4-beta (2/4)...
==========================================
Benchmarking LND v0.18.4-beta
Using temporary storage: /tmp/lnd-bench-v0.18.4-beta-2Z0bCV
Resource Limits:
  - CPU: 4 cores
  - Memory: 8G
==========================================
Starting LND in isolated namespace with tmpfs...
Started LND (PID: 852086)
Monitoring sync status...

To manually check status, run:
  sudo nsenter --target 852086 --pid --mount --uts --ipc \
    /home/dev/lnd/lncli-debug-v0.18.4-beta --lnddir=/tmp/lnd_data_ns/lnd getinfo

[Sample 0] Block: unknown | synced: false | Memory: 363.35 MB | CPU: 0%
[Sample 1] Block: unknown | synced: false | Memory: 454.38 MB | CPU: 240.00%
[Sample 2] Block: 197000 | synced: false | Memory: 118.01 MB | CPU: 183.00%
[Sample 3] Block: 275000 | synced: false | Memory: 132.96 MB | CPU: 158.00%
[Sample 4] Block: 351000 | synced: false | Memory: 150.52 MB | CPU: 153.00%
[Sample 5] Block: 432000 | synced: false | Memory: 163.67 MB | CPU: 158.00%
[Sample 6] Block: 508000 | synced: false | Memory: 178.56 MB | CPU: 164.00%
[Sample 7] Block: 583000 | synced: false | Memory: 190.03 MB | CPU: 156.00%
[Sample 8] Block: 635000 | synced: false | Memory: 209.26 MB | CPU: 148.00%
[Sample 9] Block: 705000 | synced: false | Memory: 226.96 MB | CPU: 151.00%
[Sample 10] Block: 791000 | synced: false | Memory: 248.11 MB | CPU: 161.00%
[Sample 11] Block: 816000 | synced: false | Memory: 255.58 MB | CPU: 132.00%
[Sample 12] Block: 918000 | synced: false | Memory: 251.60 MB | CPU: 132.00%
synced_to_chain
✓ synced_to_chain: true (block height: 918649)
Stopping LND...

========================================
BENCHMARK RESULTS - v0.18.4-beta
========================================
Execution Time: 70.358434022 seconds
Peak Memory Usage: 454.38 MB
Average Memory Usage: 234.59 MB
Average CPU Usage: 157.61%
Samples Collected: 14

Results saved to: benchmark_v0.18.4-beta.json
Temporary directory will be cleaned up automatically
Cleaning up temporary directory: /tmp/lnd-bench-v0.18.4-beta-2Z0bCV

Waiting 10 seconds before next benchmark...

>>> Running v0.18.5-beta (3/4)...
==========================================
Benchmarking LND v0.18.5-beta
Using temporary storage: /tmp/lnd-bench-v0.18.5-beta-XHFt2i
Resource Limits:
  - CPU: 4 cores
  - Memory: 8G
==========================================
Starting LND in isolated namespace with tmpfs...
Started LND (PID: 852853)
Monitoring sync status...

To manually check status, run:
  sudo nsenter --target 852853 --pid --mount --uts --ipc \
    /home/dev/lnd/lncli-debug-v0.18.5-beta --lnddir=/tmp/lnd_data_ns/lnd getinfo

[Sample 0] Block: unknown | synced: false | Memory: 352.79 MB | CPU: 0%
[Sample 1] Block: unknown | synced: false | Memory: 475.08 MB | CPU: 244.00%
[Sample 2] Block: 207000 | synced: false | Memory: 116.33 MB | CPU: 180.00%
[Sample 3] Block: 256000 | synced: false | Memory: 138.87 MB | CPU: 144.00%
[Sample 4] Block: 347000 | synced: false | Memory: 156.42 MB | CPU: 163.00%
[Sample 5] Block: 434000 | synced: false | Memory: 166.09 MB | CPU: 163.00%
[Sample 6] Block: 514000 | synced: false | Memory: 179.01 MB | CPU: 161.00%
[Sample 7] Block: 571000 | synced: false | Memory: 190.89 MB | CPU: 150.00%
[Sample 8] Block: 635000 | synced: false | Memory: 208.35 MB | CPU: 143.00%
[Sample 9] Block: 721000 | synced: false | Memory: 221.60 MB | CPU: 163.00%
[Sample 10] Block: 795000 | synced: false | Memory: 237.88 MB | CPU: 162.00%
[Sample 11] Block: 854000 | synced: false | Memory: 262.40 MB | CPU: 156.00%
synced_to_chain
✓ synced_to_chain: true (block height: 918649)
Stopping LND...

========================================
BENCHMARK RESULTS - v0.18.5-beta
========================================
Execution Time: 65.119188935 seconds
Peak Memory Usage: 475.08 MB
Average Memory Usage: 230.24 MB
Average CPU Usage: 162.00%
Samples Collected: 13

Results saved to: benchmark_v0.18.5-beta.json
Temporary directory will be cleaned up automatically
Cleaning up temporary directory: /tmp/lnd-bench-v0.18.5-beta-XHFt2i

Waiting 10 seconds before next benchmark...

>>> Running v0.20.0-beta.rc1-after (4/4)...
==========================================
Benchmarking LND v0.20.0-beta.rc1-after
Using temporary storage: /tmp/lnd-bench-v0.20.0-beta.rc1-after-LeHuC4
Resource Limits:
  - CPU: 4 cores
  - Memory: 8G
==========================================
Starting LND in isolated namespace with tmpfs...
Started LND (PID: 854003)
Monitoring sync status...

To manually check status, run:
  sudo nsenter --target 854003 --pid --mount --uts --ipc \
    /home/dev/lnd/lncli-debug-v0.20.0-beta.rc1-after --lnddir=/tmp/lnd_data_ns/lnd getinfo

[Sample 0] Block: unknown | synced: false | Memory: 356.35 MB | CPU: 0%
[Sample 1] Block: unknown | synced: false | Memory: 489.78 MB | CPU: 249.00%
[Sample 2] Block: 207000 | synced: false | Memory: 144.97 MB | CPU: 222.00%
[Sample 3] Block: 278000 | synced: false | Memory: 172.72 MB | CPU: 203.00%
[Sample 4] Block: 355000 | synced: false | Memory: 207.43 MB | CPU: 221.00%
[Sample 5] Block: 436000 | synced: false | Memory: 242.13 MB | CPU: 224.00%
[Sample 6] Block: 510000 | synced: false | Memory: 269.44 MB | CPU: 216.00%
[Sample 7] Block: 577000 | synced: false | Memory: 303.05 MB | CPU: 222.00%
[Sample 8] Block: 657000 | synced: false | Memory: 324.72 MB | CPU: 228.00%
[Sample 9] Block: 721000 | synced: false | Memory: 354.95 MB | CPU: 214.00%
[Sample 10] Block: 771000 | synced: false | Memory: 378.89 MB | CPU: 208.00%
[Sample 11] Block: 795000 | synced: false | Memory: 405.87 MB | CPU: 183.00%
[Sample 12] Block: 872000 | synced: false | Memory: 392.42 MB | CPU: 227.00%
synced_to_chain
✓ synced_to_chain: true (block height: 918649)
Stopping LND...

========================================
BENCHMARK RESULTS - v0.20.0-beta.rc1-after
========================================
Execution Time: 70.416534861 seconds
Peak Memory Usage: 489.78 MB
Average Memory Usage: 308.40 MB
Average CPU Usage: 208.15%
Samples Collected: 14

Results saved to: benchmark_v0.20.0-beta.rc1-after.json
Temporary directory will be cleaned up automatically
Cleaning up temporary directory: /tmp/lnd-bench-v0.20.0-beta.rc1-after-LeHuC4

==========================================
COMPARISON RESULTS
==========================================

Metric                    | v0.18.3-beta      | v0.18.4-beta      | v0.18.5-beta      | v0.20.0-beta.rc1-after
------------------------- | ----------------- | ----------------- | ----------------- | -----------------
Execution Time (s)        | 70.434696628      | 70.358434022      | 65.119188935      | 70.416534861
Peak Memory (MB)          | 437.65            | 454.38            | 475.08            | 489.78
Average Memory (MB)       | 231.76            | 234.59            | 230.24            | 308.4
Average CPU (%%)          | 157.76            | 157.61            | 162               | 208.15

Summary:
--------
✓ Fastest: v0.18.5-beta (65.119188935s)
✓ Lowest Peak Memory: v0.18.3-beta (437.65 MB)
✓ Lowest Average CPU: v0.18.4-beta (157.61%)

Detailed logs available in:
  lnd_v0.18.3-beta.log
  lnd_v0.18.4-beta.log
  lnd_v0.18.5-beta.log
  lnd_v0.20.0-beta.rc1-after.log

Conclusions

Execution time for v0.20.0-beta.rc1 with the changeset in this PR remains competitive at 70.42 seconds, matching the non-regression performance of v0.18.3-beta and v0.18.4-beta (70.43s and 70.36s respectively) while being only 8.1% slower than v0.18.5-beta (65.12s). Most importantly, this represents a dramatic recovery from the multi-day sync times that plagued v0.19.0-beta, v0.19.1-beta, v0.19.2-beta, v0.19.3-beta, and v0.20.0 current release candidates - what previously took days now completes in just over a minute, making LND v0.20.0 fully operational again.

Constraints

Each LND Benchmark runs its own isolated linux namespace with all same upper bound for cgroup resources for fair and representative results

Reproducibility

Choosing the target release branches to benchmark against, build LND in debug mode using make build, and then tweak compare_lnd_versions.sh script to match generated artifacts names. Then you would be able to run it and get those benchmarking results. Seeing the benchmarks in raw format section above for perhaps a more verbose helpful output

Resources

compare_lnd_versions.sh: https://gist.github.com/mohamedawnallah/7f77cfe3e34a729c549e4d7c9627ce87#file-compare_lnd_versions-sh
benchmark_lnd_namespace.sh: https://gist.github.com/mohamedawnallah/7f77cfe3e34a729c549e4d7c9627ce87#file-benchmark_lnd_namespace-sh
visualize_benchmarks.py: https://gist.github.com/mohamedawnallah/7f77cfe3e34a729c549e4d7c9627ce87#file-visualize_benchmarks-py

mohamedawnallah · 2025-10-12T02:18:17Z

cc @hsjoberg, @shocknet-justin Pushed a new changeset and it will solve the wallet sync issues you experienced and this one as well #9081

shocknet-justin · 2025-10-12T19:15:03Z

@mohamedawnallah I think this partially solved the issue, in that lnd reports syncd_to_chain=true after headers complete, which should unblock our deployments, but what I understand to be the rescan is still taking place after headers sync:

mohamedawnallah · 2025-10-12T23:43:06Z

@mohamedawnallah I think this partially solved the issue, in that lnd reports syncd_to_chain=true after headers complete, which should unblock our deployments, but what I understand to be the rescan is still taking place after headers sync:

It took me two hours to identify the underlying root cause. The changeset I submitted was addressing the symptom rather than the core issue. I'll submit a new changeset tomorrow that fully resolves the root cause. Thanks for being responsive during the resolution process!

yyforyongyu · 2025-10-13T10:27:14Z

rpcserver.go

+	// Sync state transitions as dispatcher catches up:
+	//   Gap >10 blocks:  synced (ignore dispatcher lag during catchup)
+	//   Gap 1-10 blocks: not synced (strict: must be at exact tip)
+	//   Gap 0 blocks:    synced (dispatcher caught up)


blockbeatDispatcher is receiving blocks and dispatching them to blockbeat subscribers, I don't think it's safe to assume we've synced. It's likely that one of the subsystems is still busy processing the block, causing the blockbeat to stall. We should instead identify what's causing the stall fix properly fix it, other than giving user the false impression that lnd is synced.

mohamedawnallah · 2025-10-13T23:30:46Z

Okay today's experiments haven't come to a meaningful conclusion fix. I am 100% confident that is the main changeset that introduced this regression is this one: 16a8b623b. I guess it is more of race condition/not up-to-date height value after the chain notifier moved from its place to there.

Building ANY LND before that commit will NOT TRIGGER any premature wallet rescanning.
Building ANY LND during/after that commit will TRIGGER the premature wallet rescanning.

Will try tomorrow with different experiments :)

mohamedawnallah · 2025-10-15T20:45:16Z

Alrighty, I have submitted a new changeset and It will solve the premature rescanning issue. Something you will notice for now when trying this is that the LND RPC server will not start till the headers will be fully synced to chain that means issuing any RPC calls/commands against LND won't be responsive while headers syncing. That can be noticed on first run, on subsequent runs is negligble unless there is a big timing difference between those runs.

I will put later today a root cause analysis report that will help answering the following genuine questions:

What triggers this premature rescanning?
Why when I restart LND that premature rescanning not happening?
Is really syncing headers before starting RPC server necessary? if not what needs to happen to allow this (if any input)? If it is indeed necessary how can the gap between neutrino p2p headers sync and starting LND RPC server be significantly reduced (on the first LND run)?
How using other chain backends e.g btcd, bitcoind would affect the gap between initial sync and LND RPC server startup?
What are the LND releases impacted by this?

For now you can experiment with this changeset if that solves your wallet sync issues

cc @shocknet-justin, @hsjoberg

shocknet-justin · 2025-10-15T21:31:27Z

Our workaround is already to no not create until its done, but that's not ideal because its slowing the on-boarding flow relative to prior versions, though its not the end of the world either since its usually only by a minute or two

The lack of rpc at all is a bigger problem since we have to parse logs to know how things are progressing, otherwise we're polling into the void... and both options can be flakey

An rpc for the state that is available immediately would solve the latter problem

mohamedawnallah · 2025-10-15T23:17:35Z

I understand this is not the ideal solution. We are iterating on what the ideal solution could look like. Our process has been:

First, we verified whether this is an existing issue for current users
Then, we identified and analyzed whether this is a regression, and if so, what the root cause might be
Now, we are evaluating intentional changes to resolve that root cause

Why does the current changeset move headers syncing before server start?

Let me explain by comparing the old behavior (non-regression) with the new behavior.

Old Behavior (Non-Regression)

We can use commit 1dfb5a0 as a representative example of the LND version where premature rescanning was not happening.

In that version, the sequence was:

RPC server starts first
Full syncing happens as part of the daemon server
After headers are fully synced, the server calls Chain Notifier, which starts the Neutrino rescanning process
Rescanning begins from the current height (which is confident to be recent since we just synced)

References:

lnd/lnd.go

Lines 694 to 751 in 1dfb5a0

    
           type syncResult struct { 
        
           	synced        bool 
        
           	bestBlockTime int64 
        
           	err           error 
        
           } 
        
           var syncedResChan = make(chan syncResult, 1) 
        
           for { 
        
           	// We check if the wallet is synced in a separate goroutine as 
        
           	// the call is blocking, and we want to be able to interrupt it 
        
           	// if the daemon is shutting down. 
        
           	go func() { 
        
           		synced, bestBlockTime, err := activeChainControl.Wallet. 
        
           			IsSynced() 
        
           		syncedResChan <- syncResult{synced, bestBlockTime, err} 
        
           	}() 
        
           	select { 
        
           	case <-interceptor.ShutdownChannel(): 
        
           		return nil 
        
           	case res := <-syncedResChan: 
        
           		if res.err != nil { 
        
           			return mkErr("unable to determine if wallet "+ 
        
           				"is synced", res.err) 
        
           		} 
        
           		ltndLog.DebugS(ctx, "Syncing to block chain", 
        
           			"best_block_time", time.Unix(res.bestBlockTime, 0), 
        
           			"is_synced", res.synced) 
        
           		if res.synced { 
        
           			break 
        
           		} 
        
           		// If we're not yet synced, we'll wait for a second 
        
           		// before checking again. 
        
           		select { 
        
           		case <-interceptor.ShutdownChannel(): 
        
           			return nil 
        
           		case <-time.After(time.Second): 
        
           			continue 
        
           		} 
        
           	} 
        
           	break 
        
           } 
        
           _, bestHeight, err = activeChainControl.ChainIO.GetBestBlock() 
        
           if err != nil { 
        
           	return mkErr("unable to determine chain tip", err) 
        
           } 
        
           ltndLog.InfoS(ctx, "Chain backend is fully synced!", 
        
           	"end_height", bestHeight)

lnd/server.go

Lines 2129 to 2134 in 1dfb5a0

    
           cleanup = cleanup.add(s.cc.ChainNotifier.Stop) 
        
           if err := s.cc.ChainNotifier.Start(); err != nil { 
        
           	startErr = err 
        
           	return 
        
           }

New Behavior (Regression)

With commit 16a8b62, the order of operations is FLIPPED:

Chain Notifier (which triggers Neutrino rescanning) is called first in newServer, before starting the RPC server
Wallet syncing happens after the RPC server starts

This means the height that Chain Notifier accesses comes from BuildChainControl, NOT from the last synced height. That's why we're seeing rescanning start from ~100,000 headers.

References:

lnd/lnd.go

Lines 496 to 502 in 6ade31d

    
           activeChainControl, cleanUp, err := implCfg.BuildChainControl( 
        
           	partialChainControl, walletConfig, 
        
           ) 
        
           if err != nil { 
        
           	return mkErr("error loading chain control", err) 
        
           }

lnd/server.go

Lines 736 to 747 in 6ade31d

    
           // Start the low-level services once they are initialized. 
        
           // 
        
           // TODO(yy): break the server startup into four steps, 
        
           // 1. init the low-level services. 
        
           // 2. start the low-level services. 
        
           // 3. init the high-level services. 
        
           // 4. start the high-level services. 
        
           if err := s.startLowLevelServices(); err != nil { 
        
           	return nil, err 
        
           } 
        
           currentHash, currentHeight, err := s.cc.ChainIO.GetBestBlock()

Why Not Return to the Old Order?

A natural question: Why not sync first, then call Chain Notifier rescanning in a separate goroutine so it has access to recent height (i.e., move it out of newServer like the earlier variant)? That way, premature rescanning would be avoided and the RPC server wouldn't be blocked.

That's a good question. I don't really know the answer, and I may need @yyforyongyu's input on this. If it turns out that s.cc.ChainNotifier.Start() must be called before the server starts (like seen down below as currently in the codebase), then perhaps there is no other way than syncing headers first (meaning the RPC server wouldn't start yet), then starting the server so Neutrino rescanning gets the recent synced height. Otherwise, this premature rescanning will continue to happen.

Reference:

lnd/server.go

Lines 2128 to 2150 in 6ade31d

    
           // startLowLevelServices starts the low-level services of the server. These 
        
           // services must be started successfully before running the main server. The 
        
           // services are, 
        
           // 1. the chain notifier. 
        
           // 
        
           // TODO(yy): identify and add more low-level services here. 
        
           func (s *server) startLowLevelServices() error { 
        
           	var startErr error 
        
           	cleanup := cleaner{} 
        
           	cleanup = cleanup.add(s.cc.ChainNotifier.Stop) 
        
           	if err := s.cc.ChainNotifier.Start(); err != nil { 
        
           		startErr = err 
        
           	} 
        
           	if startErr != nil { 
        
           		cleanup.run() 
        
           	} 
        
           	return startErr 
        
           }

gemini-code-assist bot reviewed Oct 8, 2025

View reviewed changes

mohamedawnallah force-pushed the optimize-neutrino-rescan branch from 0f26e5b to dae9086 Compare October 8, 2025 17:12

saubyk assigned mohamedawnallah Oct 8, 2025

saubyk added this to the v0.20.0 milestone Oct 8, 2025

saubyk added this to lnd v0.20 Oct 8, 2025

saubyk moved this to In progress in lnd v0.20 Oct 8, 2025

saubyk moved this from In progress to In review in lnd v0.20 Oct 9, 2025

saubyk moved this from In review to In progress in lnd v0.20 Oct 9, 2025

mohamedawnallah force-pushed the optimize-neutrino-rescan branch from dae9086 to 16e24f1 Compare October 12, 2025 01:27

mohamedawnallah changed the title ~~neutrinonotify+chainview: optimize neutrino rescan with wallet birthday~~ rpcserver: avoid false negatives in dispatcher during the initial sync Oct 12, 2025

yyforyongyu requested changes Oct 13, 2025

View reviewed changes

mohamedawnallah changed the title ~~rpcserver: avoid false negatives in dispatcher during the initial sync~~ rpcserver: resolve root cause of premature wallet rescanning Oct 14, 2025

lnd.go: make sure the chain are synced before starting the server

a93c2d3

mohamedawnallah force-pushed the optimize-neutrino-rescan branch from 16e24f1 to a93c2d3 Compare October 15, 2025 20:26

mohamedawnallah requested a review from yyforyongyu October 15, 2025 23:26

rpcserver: resolve root cause of premature wallet rescanning #10280

Are you sure you want to change the base?

rpcserver: resolve root cause of premature wallet rescanning #10280

Conversation

mohamedawnallah commented Oct 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Change Description

Steps to Test

Pull Request Checklist

Testing

Code Style and Documentation

Uh oh!

gemini-code-assist bot commented Oct 8, 2025

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Roasbeef commented Oct 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mohamedawnallah commented Oct 9, 2025 • edited by saubyk Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

hsjoberg commented Oct 9, 2025

Uh oh!

shocknet-justin commented Oct 9, 2025

Uh oh!

mohamedawnallah commented Oct 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Benchmarking Report

With the changeset in this PR

Constraints

Reproducibility

Resources

Uh oh!

mohamedawnallah commented Oct 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

shocknet-justin commented Oct 12, 2025

Uh oh!

mohamedawnallah commented Oct 12, 2025

Uh oh!

yyforyongyu Oct 13, 2025

Choose a reason for hiding this comment

Uh oh!

mohamedawnallah commented Oct 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mohamedawnallah commented Oct 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

shocknet-justin commented Oct 15, 2025

Uh oh!

mohamedawnallah commented Oct 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why does the current changeset move headers syncing before server start?

Old Behavior (Non-Regression)

New Behavior (Regression)

Why Not Return to the Old Order?

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

mohamedawnallah commented Oct 8, 2025 •

edited

Loading

Roasbeef commented Oct 8, 2025 •

edited

Loading

mohamedawnallah commented Oct 9, 2025 •

edited by saubyk

Loading

mohamedawnallah commented Oct 12, 2025 •

edited

Loading

mohamedawnallah commented Oct 12, 2025 •

edited

Loading

mohamedawnallah commented Oct 13, 2025 •

edited

Loading

mohamedawnallah commented Oct 15, 2025 •

edited

Loading

mohamedawnallah commented Oct 15, 2025 •

edited

Loading