rosslagerwall
Contributor

When emu-manager decides that xenguest has done enough in the live phase, it sets the state to live_done and then proceeds to suspend the VM. While the VM is executing its suspend process, xenguest continues to copy dirtied pages and this can slow down the suspend process.

In cases where vgpu is not used, it is preferable to send the pause command as soon as the live_done state is reached. This prevents excessive iterations from xenguest, lets the VM's suspend process run quickly, and allows the overall migration to finish faster.
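
A rough sketch of the ordering change, written in Python for illustration only; it is not the emu-manager code. The class, method, and command names (`MigrationSketch`, `on_live_done`, `on_vm_suspended`, `"pause"`, `"suspend_vm"`) are hypothetical, and deferring the pause until the VM has suspended in the vgpu case is an assumption about the existing behaviour.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MigrationSketch:
    """Toy model of when the pause command is issued (hypothetical names)."""

    uses_vgpu: bool
    state: str = "live"
    paused: bool = False
    events: List[str] = field(default_factory=list)

    def _pause_xenguest(self) -> None:
        # Issue the pause at most once.
        if not self.paused:
            self.paused = True
            self.events.append("pause")

    def on_live_done(self) -> None:
        # xenguest has done enough in the live phase.
        self.state = "live_done"
        if not self.uses_vgpu:
            # Proposed behaviour: pause immediately so xenguest stops
            # iterating while the VM runs its suspend process.
            self._pause_xenguest()
        self.events.append("suspend_vm")

    def on_vm_suspended(self) -> None:
        # Assumption: with vgpu, the pause is only sent once the VM
        # has finished suspending.
        self._pause_xenguest()


if __name__ == "__main__":
    for vgpu in (False, True):
        m = MigrationSketch(uses_vgpu=vgpu)
        m.on_live_done()
        m.on_vm_suspended()
        print(f"uses_vgpu={vgpu}: {m.events}")
```

Without vgpu the pause precedes the suspend request; with vgpu the order in this model is unchanged.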

Measurements of the time from "xenguest live stage is done" to "Finished, send complete":

Before: 1020 ms, 1284 ms, 1263 ms
After: 439 ms, 470 ms, 431 ms
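
For reference, the averages of these samples can be worked out with a few lines of Python. This is plain arithmetic on the six numbers above; the variable names are only for illustration, and three runs each is a small sample.

```python
from statistics import mean

# Samples copied from the measurements above, in milliseconds.
before_ms = [1020, 1284, 1263]
after_ms = [439, 470, 431]

mean_before = mean(before_ms)
mean_after = mean(after_ms)

# Fraction of the "live done" to "send complete" time that was saved.
reduction = 1 - mean_after / mean_before

print(f"before: {mean_before:.0f} ms")    # before: 1189 ms
print(f"after:  {mean_after:.0f} ms")     # after:  447 ms
print(f"reduction: {reduction:.1%}")      # reduction: 62.4%
```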

Signed-off-by: Ross Lagerwall <[email protected]>
@TSnake41

TSnake41 commented Oct 2, 2025

That seems contradictory to xenguest expectations
https://github.com/xenserver/xen.pg/blob/XS-8.4/patches/xenguest.patch#L125-L127

I am not sure what is actually correct, but I suppose xenopsd needs to acknowledge the live migration before it actually starts, for instance to make PV backends consistent (i.e. to prevent them from writing to guest memory while we live migrate).

@rosslagerwall
Contributor Author

That seems contradictory to xenguest expectations https://github.com/xenserver/xen.pg/blob/XS-8.4/patches/xenguest.patch#L125-L127

I think that text is out-of-date. In recent times xenguest is controlled by emu-manager using a JSON-based protocol rather than talking to XAPI using the "strange" protocol described in the link.

I am not sure what is actually correct, but I suppose xenopsd needs to acknowledge the live migration before it actually starts, for instance to make PV backends consistent (i.e. to prevent them from writing to guest memory while we live migrate).

Memory changing during the live phase of the live migration is expected and not something to be concerned about. Generally, to do a live migration you would:

  • Turn on log dirty to keep track of changed pages
  • Copy all the memory in multiple rounds (until there are no more pages or the guest is dirtying them too quickly)
  • Ask the guest politely to suspend
  • At this point we have the quiesced, non-live part of the live migration which should generally be minimized since the user would view it as downtime. During this time the remaining dirty pages are copied to the destination domain.
  • Finally, all the toolstack and storage stuff happens and the destination domain is resumed.
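
To make the steps above concrete, here is a toy simulation of the pre-copy loop in Python. It is only a sketch under simplifying assumptions: the function name `precopy_migrate`, its parameters, the random dirtying model, and the stopping thresholds are all made up for illustration and do not reflect how xenguest actually decides when the live phase is done.

```python
import random


def precopy_migrate(num_pages, dirty_per_round, max_rounds=5,
                    threshold=50, seed=0):
    """Simulate pre-copy rounds; return (rounds, live_sent, final_sent)."""
    rng = random.Random(seed)

    # Log-dirty is on; the first round has to send every page.
    to_send = set(range(num_pages))
    live_sent = 0
    rounds = 0

    # Live phase: keep copying while there is a lot left, but give up
    # after max_rounds (the guest is dirtying pages too quickly).
    while rounds < max_rounds and len(to_send) > threshold:
        live_sent += len(to_send)
        rounds += 1
        # Pages the guest dirtied while this round was being copied.
        to_send = {rng.randrange(num_pages) for _ in range(dirty_per_round)}

    # The guest is now suspended, so nothing new gets dirtied. What is
    # left is copied during the non-live part, which the user sees as
    # downtime.
    final_sent = len(to_send)
    return rounds, live_sent, final_sent


if __name__ == "__main__":
    print(precopy_migrate(10_000, dirty_per_round=10))     # quiet guest
    print(precopy_migrate(10_000, dirty_per_round=1_000))  # busy guest
```

In this model a quiet guest converges after one round with only a handful of pages left for the non-live part, while a busy guest hits the round limit and leaves more to copy once it is suspended.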
