Message ID | 20200925172604.2142227-7-pbonzini@redhat.com |
---|---|
State | Superseded |
Headers | show |
Series | Fix scsi devices plug/unplug races w.r.t virtio-scsi iothread | expand |
On Fri, 2020-09-25 at 13:26 -0400, Paolo Bonzini wrote: > From: Maxim Levitsky <mlevitsk@redhat.com> > > Some code might race with placement of new devices on a bus. > We currently first place a (unrealized) device on the bus > and then realize it. > > As a workaround, users that scan the child device list, can > check the realized property to see if it is safe to access such a device. > Use an atomic write here too to aid with this. > > A separate discussion is what to do with devices that are unrealized: > It looks like for this case we only call the hotplug handler's unplug > callback and its up to it to unrealize the device. > An atomic operation doesn't cause harm for this code path though. > > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> > Message-Id: <20200913160259.32145-6-mlevitsk@redhat.com> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > --- > hw/core/qdev.c | 19 ++++++++++++++++++- > include/hw/qdev-core.h | 3 +++ > 2 files changed, 21 insertions(+), 1 deletion(-) > > diff --git a/hw/core/qdev.c b/hw/core/qdev.c > index 59e5e710b7..fc4daa36fa 100644 > --- a/hw/core/qdev.c > +++ b/hw/core/qdev.c > @@ -946,7 +946,25 @@ static void device_set_realized(Object *obj, bool value, Error **errp) > } > } > > + qatomic_store_release(&dev->realized, value); > + > } else if (!value && dev->realized) { > + > + /* > + * Change the value so that any concurrent users are aware > + * that the device is going to be unrealized > + * > + * TODO: change .realized property to enum that states > + * each phase of the device realization/unrealization > + */ > + > + qatomic_set(&dev->realized, value); > + /* > + * Ensure that concurrent users see this update prior to > + * any other changes done by unrealize. > + */ > + smp_wmb(); I''l probably never fully understand where to use read/write/full barrier. If I understand corrctly, read barrier prevents reads done by this thread to be reordered, by the CPU and write barrier prevents writes done by this CPU to be re-ordered. Both (depending on the macro) usually imply compiler barrier (to avoid compilier re-ordering stuff...) > + > QLIST_FOREACH(bus, &dev->child_bus, sibling) { > qbus_unrealize(bus); > } > @@ -961,7 +979,6 @@ static void device_set_realized(Object *obj, bool value, Error **errp) > } > > assert(local_err == NULL); > - dev->realized = value; > return; > > child_realize_fail: > diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h > index 8067497074..39490e76ee 100644 > --- a/include/hw/qdev-core.h > +++ b/include/hw/qdev-core.h > @@ -163,6 +163,9 @@ struct NamedClockList { > /** > * DeviceState: > * @realized: Indicates whether the device has been fully constructed. > + * When accessed without the iothread mutex, consider using > + * qatomic_load_acquire() before accessing any other field in > + * the device. This sounds way better that what I wrote. As you probably noticed recently (but not this patchset yet) I started to review the spelling/comments more throughfully, so hopefully we see less of stuff that I myself read and wonder why I did that many spelling/grammar mistakes.... :-) > * @reset: ResettableState for the device; handled by Resettable interface. > * > * This structure should not be accessed directly. We declare it here Best regards, Maxim Levitsky
On 30/09/20 16:31, Maxim Levitsky wrote: >> + >> + qatomic_set(&dev->realized, value); >> + /* >> + * Ensure that concurrent users see this update prior to >> + * any other changes done by unrealize. >> + */ >> + smp_wmb(); > > I''l probably never fully understand where to use read/write/full barrier. > If I understand corrctly, read barrier prevents reads done by this thread to be reordered, > by the CPU and write barrier prevents writes done by this CPU to be re-ordered. I must say that the above is not really satisfactory. The right thing to do would be to say which changes are done by unrealize; then you should make sure that *after* reading something that unrealize could undo you check if dev->realized is still true. scsi_device_find is one such case, but I'm not convinced it is enough. Paolo > Both (depending on the macro) usually imply compiler barrier (to avoid compilier re-ordering > stuff...)
On Wed, 2020-09-30 at 19:44 +0200, Paolo Bonzini wrote: > On 30/09/20 16:31, Maxim Levitsky wrote: > > > + > > > + qatomic_set(&dev->realized, value); > > > + /* > > > + * Ensure that concurrent users see this update prior to > > > + * any other changes done by unrealize. > > > + */ > > > + smp_wmb(); > > > > I''l probably never fully understand where to use read/write/full barrier. > > If I understand corrctly, read barrier prevents reads done by this thread to be reordered, > > by the CPU and write barrier prevents writes done by this CPU to be re-ordered. > > I must say that the above is not really satisfactory. The right thing > to do would be to say which changes are done by unrealize; then you > should make sure that *after* reading something that unrealize could > undo you check if dev->realized is still true. I didn't fully understand this to be honest. I just wanted to explain what I know and what I don't know about hardware barriers. I know that read barriers should be paired with write barriers, like if one CPU has a write barrier, which ensures the orders of writes to two memory locations, the other CPU can then use a read barrier to ensure that it sees those writes in this order. I thus think that reads of dev->realized should be paired with read barrier, but here a full memory barrier isn't really needed. Best regards, Maxim Levitsky > > scsi_device_find is one such case, but I'm not convinced it is enough. > > Paolo > > > Both (depending on the macro) usually imply compiler barrier (to avoid compilier re-ordering > > stuff...)
diff --git a/hw/core/qdev.c b/hw/core/qdev.c index 59e5e710b7..fc4daa36fa 100644 --- a/hw/core/qdev.c +++ b/hw/core/qdev.c @@ -946,7 +946,25 @@ static void device_set_realized(Object *obj, bool value, Error **errp) } } + qatomic_store_release(&dev->realized, value); + } else if (!value && dev->realized) { + + /* + * Change the value so that any concurrent users are aware + * that the device is going to be unrealized + * + * TODO: change .realized property to enum that states + * each phase of the device realization/unrealization + */ + + qatomic_set(&dev->realized, value); + /* + * Ensure that concurrent users see this update prior to + * any other changes done by unrealize. + */ + smp_wmb(); + QLIST_FOREACH(bus, &dev->child_bus, sibling) { qbus_unrealize(bus); } @@ -961,7 +979,6 @@ static void device_set_realized(Object *obj, bool value, Error **errp) } assert(local_err == NULL); - dev->realized = value; return; child_realize_fail: diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h index 8067497074..39490e76ee 100644 --- a/include/hw/qdev-core.h +++ b/include/hw/qdev-core.h @@ -163,6 +163,9 @@ struct NamedClockList { /** * DeviceState: * @realized: Indicates whether the device has been fully constructed. + * When accessed without the iothread mutex, consider using + * qatomic_load_acquire() before accessing any other field in + * the device. * @reset: ResettableState for the device; handled by Resettable interface. * * This structure should not be accessed directly. We declare it here