Message ID: 20201002162747.3123597-1-berrange@redhat.com
Series: migration: bring improved savevm/loadvm/delvm to QMP
On Mon, Oct 05, 2020 at 09:26:54AM +0200, Markus Armbruster wrote:
> Eric Blake <eblake@redhat.com> writes:
>
> > On 10/2/20 11:27 AM, Daniel P. Berrangé wrote:
> >
> > Do we have a query- command handy to easily learn which snapshot names
> > are even available to attempt deletion on? If not, that's worth a
> > separate patch.
>
> Oh, I missed that one. It's the QMP equivalent to "info snapshots", and
> it is required to finish the job. Since we're at v5, I'd be okay with a
> follow-up patch, as long as it is done for 5.2.

"query-named-block-nodes" returns BlockDeviceInfo struct, which
contains ImageInfo which contains an array of SnapshotInfo. So
we don't need any new query command.

Regards,
Daniel
Daniel P. Berrangé <berrange@redhat.com> writes:

> On Mon, Oct 05, 2020 at 09:26:54AM +0200, Markus Armbruster wrote:
>> Eric Blake <eblake@redhat.com> writes:
>>
>> > On 10/2/20 11:27 AM, Daniel P. Berrangé wrote:
>> >
>> > Do we have a query- command handy to easily learn which snapshot names
>> > are even available to attempt deletion on? If not, that's worth a
>> > separate patch.
>>
>> Oh, I missed that one. It's the QMP equivalent to "info snapshots", and
>> it is required to finish the job. Since we're at v5, I'd be okay with a
>> follow-up patch, as long as it is done for 5.2.
>
> "query-named-block-nodes" returns BlockDeviceInfo struct, which
> contains ImageInfo which contains an array of SnapshotInfo. So
> we don't need any new query command.

My Acked-by stands without a new query then.
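The lookup Daniel describes can be sketched on the client side. A minimal Python sketch, assuming a query-named-block-nodes reply already parsed from JSON; the field names follow the QAPI schema (BlockDeviceInfo.image -> ImageInfo.snapshots -> SnapshotInfo.name), but the sample reply below is fabricated for illustration, not captured from a real QEMU:

```python
def snapshot_tags(named_block_nodes):
    """Collect internal snapshot names per node from a parsed
    query-named-block-nodes reply (a list of BlockDeviceInfo dicts)."""
    tags = {}
    for node in named_block_nodes:
        image = node.get("image", {})       # ImageInfo
        snaps = image.get("snapshots", [])  # optional [SnapshotInfo]
        tags[node["node-name"]] = [s["name"] for s in snaps]
    return tags

# Illustrative reply shape only; a real reply carries many more fields.
reply = [
    {"node-name": "disk0",
     "image": {"filename": "disk0.qcow2",
               "snapshots": [{"id": "1", "name": "my-snap"}]}},
    {"node-name": "disk1",
     "image": {"filename": "disk1.qcow2"}},
]

print(snapshot_tags(reply))  # {'disk0': ['my-snap'], 'disk1': []}
```

A client could run this over the reply to decide which tags are valid targets for snapshot-delete on each node.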
* Eric Blake (eblake@redhat.com) wrote:
> On 10/2/20 11:27 AM, Daniel P. Berrangé wrote:
> > savevm, loadvm and delvm are some of the few HMP commands that have never
> > been converted to use QMP. The reasons for the lack of conversion are
> > that they blocked execution of the event thread, and the semantics
> > around choice of disks were ill-defined.
> >
> > Despite this downside, however, libvirt and applications using libvirt
> > have used these commands for as long as QMP has existed, via the
> > "human-monitor-command" passthrough command. IOW, while it is clearly
> > desirable to be able to fix the problems, they are not a blocker to
> > all real world usage.
> >
> > Meanwhile there is a need for other features which involve adding new
> > parameters to the commands. This is possible with HMP passthrough, but
> > it provides no reliable way for apps to introspect features, so using
> > QAPI modelling is highly desirable.
> >
> > This patch thus introduces new snapshot-{load,save,delete} commands to
> > QMP that are intended to replace the old HMP counterparts. The new
> > commands are given different names, because they will be using the new
> > QEMU job framework and thus will have diverging behaviour from the HMP
> > originals. It would thus be misleading to keep the same name.
> >
> > While this design uses the generic job framework, the current impl is
> > still blocking. The intention is that the blocking problem is fixed
> > later. None the less applications using these new commands should
> > assume that they are asynchronous and thus wait for the job status
> > change event to indicate completion.
> >
> > In addition to using the job framework, the new commands require the
> > caller to be explicit about all the block device nodes used in the
> > snapshot operations, with no built-in default heuristics in use.
> >
> > Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> > ---
> >
> > +++ b/qapi/job.json
> > @@ -22,10 +22,17 @@
> >  #
> >  # @amend: image options amend job type, see "x-blockdev-amend" (since 5.1)
> >  #
> > +# @snapshot-load: snapshot load job type, see "snapshot-load" (since 5.2)
> > +#
> > +# @snapshot-save: snapshot save job type, see "snapshot-save" (since 5.2)
> > +#
> > +# @snapshot-delete: snapshot delete job type, see "snapshot-delete" (since 5.2)
> > +#
> >  # Since: 1.7
> >  ##
> >  { 'enum': 'JobType',
> > -  'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend'] }
> > +  'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend',
> > +           'snapshot-load', 'snapshot-save', 'snapshot-delete'] }
> >
> >  ##
> >  # @JobStatus:
> > diff --git a/qapi/migration.json b/qapi/migration.json
> > index 7f5e6fd681..d2bd551ad9 100644
> > --- a/qapi/migration.json
> > +++ b/qapi/migration.json
> > @@ -1787,3 +1787,123 @@
> >  # Since: 5.2
> >  ##
> >  { 'command': 'query-dirty-rate', 'returns': 'DirtyRateInfo' }
> > +
> > +##
> > +# @snapshot-save:
> > +#
> > +# Save a VM snapshot
> > +#
> > +# @job-id: identifier for the newly created job
> > +# @tag: name of the snapshot to create
> > +# @devices: list of block device node names to save a snapshot to
> > +# @vmstate: block device node name to save vmstate to
>
> Here, you document vmstate last,...
>
> > +#
> > +# Applications should not assume that the snapshot save is complete
> > +# when this command returns. The job commands / events must be used
> > +# to determine completion and to fetch details of any errors that arise.
> > +#
> > +# Note that the VM CPUs will be paused during the time it takes to
> > +# save the snapshot
>
> "will be", or "may be"? As you stated above, we may be able to lift the
> synchronous limitations down the road, while still maintaining the
> present interface of using this command to start the job and waiting on
> the job id until it is finished, at which point the CPUs might not need
> to be paused as much.
>
> > +#
> > +# It is strongly recommended that @devices contain all writable
> > +# block device nodes if a consistent snapshot is required.
> > +#
> > +# If @tag already exists, an error will be reported
> > +#
> > +# Returns: nothing
> > +#
> > +# Example:
> > +#
> > +# -> { "execute": "snapshot-save",
> > +#      "data": {
> > +#         "job-id": "snapsave0",
> > +#         "tag": "my-snap",
> > +#         "vmstate": "disk0",
> > +#         "devices": ["disk0", "disk1"]
>
> ...here vmstate occurs before devices. I don't know if our doc
> generator cares about inconsistent ordering.
>
> > +#      }
> > +#    }
> > +# <- { "return": { } }
> > +#
> > +# Since: 5.2
> > +##
> > +{ 'command': 'snapshot-save',
> > +  'data': { 'job-id': 'str',
> > +            'tag': 'str',
> > +            'vmstate': 'str',
> > +            'devices': ['str'] } }
> > +
> > +##
> > +# @snapshot-load:
> > +#
> > +# Load a VM snapshot
> > +#
> > +# @job-id: identifier for the newly created job
> > +# @tag: name of the snapshot to load.
> > +# @devices: list of block device node names to load a snapshot from
> > +# @vmstate: block device node name to load vmstate from
> > +#
> > +# Applications should not assume that the snapshot save is complete
> > +# when this command returns. The job commands / events must be used
> > +# to determine completion and to fetch details of any errors that arise.
>
> s/save/load/
>
> > +#
> > +# Note that the VM CPUs will be paused during the time it takes to
> > +# save the snapshot
>
> s/save/load/
>
> But while pausing CPUs during save is annoying, pausing CPUs during
> restore makes sense (after all, executing on stale data that will still
> be updated during the restore is just wasted execution).
Note that there are other snapshotting schemes that can do this more
dynamically and page/load the state on demand - a rapid resume from
snapshot like that is quite attractive.

Dave

> > +#
> > +# It is strongly recommended that @devices contain all writable
> > +# block device nodes that can have changed since the original
> > +# @snapshot-save command execution.
> > +#
> > +# Returns: nothing
> > +#
> > +# Example:
> > +#
> > +# -> { "execute": "snapshot-load",
> > +#      "data": {
> > +#         "job-id": "snapload0",
> > +#         "tag": "my-snap",
> > +#         "vmstate": "disk0",
> > +#         "devices": ["disk0", "disk1"]
> > +#      }
> > +#    }
> > +# <- { "return": { } }
> > +#
> > +# Since: 5.2
> > +##
> > +{ 'command': 'snapshot-load',
> > +  'data': { 'job-id': 'str',
> > +            'tag': 'str',
> > +            'vmstate': 'str',
> > +            'devices': ['str'] } }
> > +
> > +##
> > +# @snapshot-delete:
> > +#
> > +# Delete a VM snapshot
> > +#
> > +# @job-id: identifier for the newly created job
> > +# @tag: name of the snapshot to delete.
> > +# @devices: list of block device node names to delete a snapshot from
> > +#
> > +# Applications should not assume that the snapshot save is complete
> > +# when this command returns. The job commands / events must be used
> > +# to determine completion and to fetch details of any errors that arise.
>
> Do we have a query- command handy to easily learn which snapshot names
> are even available to attempt deletion on? If not, that's worth a
> separate patch.
> > +#
> > +# Returns: nothing
> > +#
> > +# Example:
> > +#
> > +# -> { "execute": "snapshot-delete",
> > +#      "data": {
> > +#         "job-id": "snapdelete0",
> > +#         "tag": "my-snap",
> > +#         "devices": ["disk0", "disk1"]
> > +#      }
> > +#    }
> > +# <- { "return": { } }
> > +#
> > +# Since: 5.2
> > +##
> > +++ b/tests/qemu-iotests/group
> > @@ -291,6 +291,7 @@
> >  277 rw quick
> >  279 rw backing quick
> >  280 rw migration quick
> > +310 rw quick
> >  281 rw quick
> >  282 rw img quick
> >  283 auto quick
>
> What's wrong with sorted order? I get the renumbering to appease a merge
> conflict, but it also requires rearrangement ;)
>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc. +1-919-301-3226
> Virtualization: qemu.org | libvirt.org
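As the commit message stresses, clients should treat these commands as asynchronous and wait for the job status change event rather than trust the empty return. A minimal Python sketch of that wait loop, driven here by a canned event list standing in for reads from a live QMP socket; the event name and status values follow the QMP jobs interface, but the stream itself is fabricated for illustration:

```python
def wait_for_job(events, job_id):
    """Scan a stream of QMP events and return the status of job_id
    once it reaches a state where the outcome can be inspected."""
    terminal = {"concluded", "aborting", "null"}
    for ev in events:
        if ev.get("event") != "JOB_STATUS_CHANGE":
            continue  # ignore unrelated events (SHUTDOWN, etc.)
        data = ev["data"]
        if data["id"] == job_id and data["status"] in terminal:
            return data["status"]
    return None  # stream ended without the job finishing

# Canned event stream; a real client would read these off the socket.
events = [
    {"event": "JOB_STATUS_CHANGE",
     "data": {"id": "snapsave0", "status": "created"}},
    {"event": "JOB_STATUS_CHANGE",
     "data": {"id": "snapsave0", "status": "running"}},
    {"event": "JOB_STATUS_CHANGE",
     "data": {"id": "snapsave0", "status": "concluded"}},
]

print(wait_for_job(events, "snapsave0"))  # concluded
```

After the job concludes, a client would still issue query-jobs to check whether the entry for this job-id carries an error before dismissing it, since "concluded" alone does not distinguish success from failure.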
Didn't make 5.2. Pity. Try again for 6.0, please!