Message ID | 20230316164257.42590-1-roger.pau@citrix.com |
---|---|
State | Superseded |
Headers | show |
Series | [v4] acpi/processor: fix evaluating _PDC method when running as Xen dom0 | expand |
On 16.03.23 17:42, Roger Pau Monne wrote: > In ACPI systems, the OS can direct power management, as opposed to the > firmware. This OS-directed Power Management is called OSPM. Part of > telling the firmware that the OS going to direct power management is > making ACPI "_PDC" (Processor Driver Capabilities) calls. These _PDC > methods must be evaluated for every processor object. If these _PDC > calls are not completed for every processor it can lead to > inconsistency and later failures in things like the CPU frequency > driver. > > In a Xen system, the dom0 kernel is responsible for system-wide power > management. The dom0 kernel is in charge of OSPM. However, the > number of CPUs available to dom0 can be different than the number of > CPUs physically present on the system. > > This leads to a problem: the dom0 kernel needs to evaluate _PDC for > all the processors, but it can't always see them. > > In dom0 kernels, ignore the existing ACPI method for determining if a > processor is physically present because it might not be accurate. > Instead, ask the hypervisor for this information. > > Fix this by introducing a custom function to use when running as Xen > dom0 in order to check whether a processor object matches a CPU that's > online. Such checking is done using the existing information fetched > by the Xen pCPU subsystem, extending it to also store the ACPI ID. > > This ensures that _PDC method gets evaluated for all physically online > CPUs, regardless of the number of CPUs made available to dom0. > > Fixes: 5d554a7bb064 ('ACPI: processor: add internal processor_physically_present()') > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Juergen
On Fri, Mar 17, 2023 at 1:42 PM Juergen Gross <jgross@suse.com> wrote: > > On 16.03.23 17:42, Roger Pau Monne wrote: > > In ACPI systems, the OS can direct power management, as opposed to the > > firmware. This OS-directed Power Management is called OSPM. Part of > > telling the firmware that the OS going to direct power management is > > making ACPI "_PDC" (Processor Driver Capabilities) calls. These _PDC > > methods must be evaluated for every processor object. If these _PDC > > calls are not completed for every processor it can lead to > > inconsistency and later failures in things like the CPU frequency > > driver. > > > > In a Xen system, the dom0 kernel is responsible for system-wide power > > management. The dom0 kernel is in charge of OSPM. However, the > > number of CPUs available to dom0 can be different than the number of > > CPUs physically present on the system. > > > > This leads to a problem: the dom0 kernel needs to evaluate _PDC for > > all the processors, but it can't always see them. > > > > In dom0 kernels, ignore the existing ACPI method for determining if a > > processor is physically present because it might not be accurate. > > Instead, ask the hypervisor for this information. > > > > Fix this by introducing a custom function to use when running as Xen > > dom0 in order to check whether a processor object matches a CPU that's > > online. Such checking is done using the existing information fetched > > by the Xen pCPU subsystem, extending it to also store the ACPI ID. > > > > This ensures that _PDC method gets evaluated for all physically online > > CPUs, regardless of the number of CPUs made available to dom0. > > > > Fixes: 5d554a7bb064 ('ACPI: processor: add internal processor_physically_present()') > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > > Reviewed-by: Juergen Gross <jgross@suse.com> Applied as 6.4 material under a slightly edited subject, thanks!
On Thu, Mar 16, 2023 at 5:43 PM Roger Pau Monne <roger.pau@citrix.com> wrote: > > In ACPI systems, the OS can direct power management, as opposed to the > firmware. This OS-directed Power Management is called OSPM. Part of > telling the firmware that the OS going to direct power management is > making ACPI "_PDC" (Processor Driver Capabilities) calls. These _PDC > methods must be evaluated for every processor object. If these _PDC > calls are not completed for every processor it can lead to > inconsistency and later failures in things like the CPU frequency > driver. > > In a Xen system, the dom0 kernel is responsible for system-wide power > management. The dom0 kernel is in charge of OSPM. However, the > number of CPUs available to dom0 can be different than the number of > CPUs physically present on the system. > > This leads to a problem: the dom0 kernel needs to evaluate _PDC for > all the processors, but it can't always see them. > > In dom0 kernels, ignore the existing ACPI method for determining if a > processor is physically present because it might not be accurate. > Instead, ask the hypervisor for this information. > > Fix this by introducing a custom function to use when running as Xen > dom0 in order to check whether a processor object matches a CPU that's > online. Such checking is done using the existing information fetched > by the Xen pCPU subsystem, extending it to also store the ACPI ID. > > This ensures that _PDC method gets evaluated for all physically online > CPUs, regardless of the number of CPUs made available to dom0. > > Fixes: 5d554a7bb064 ('ACPI: processor: add internal processor_physically_present()') > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > --- > Changes since v3: > - Protect xen_processor_present() definition with CONFIG_ACPI. > > Changes since v2: > - Extend and use the existing pcpu functionality. > > Changes since v1: > - Reword commit message. > --- > arch/x86/include/asm/xen/hypervisor.h | 10 ++++++++++ > drivers/acpi/processor_pdc.c | 11 +++++++++++ > drivers/xen/pcpu.c | 21 +++++++++++++++++++++ > 3 files changed, 42 insertions(+) > > diff --git a/arch/x86/include/asm/xen/hypervisor.h b/arch/x86/include/asm/xen/hypervisor.h > index 5fc35f889cd1..990a1609677e 100644 > --- a/arch/x86/include/asm/xen/hypervisor.h > +++ b/arch/x86/include/asm/xen/hypervisor.h > @@ -63,4 +63,14 @@ void __init xen_pvh_init(struct boot_params *boot_params); > void __init mem_map_via_hcall(struct boot_params *boot_params_p); > #endif > > +#if defined(CONFIG_XEN_DOM0) && defined(CONFIG_ACPI) > +bool __init xen_processor_present(uint32_t acpi_id); > +#else > +static inline bool xen_processor_present(uint32_t acpi_id) > +{ > + BUG(); > + return false; > +} > +#endif > + > #endif /* _ASM_X86_XEN_HYPERVISOR_H */ > diff --git a/drivers/acpi/processor_pdc.c b/drivers/acpi/processor_pdc.c > index 8c3f82c9fff3..18fb04523f93 100644 > --- a/drivers/acpi/processor_pdc.c > +++ b/drivers/acpi/processor_pdc.c > @@ -14,6 +14,8 @@ > #include <linux/acpi.h> > #include <acpi/processor.h> > > +#include <xen/xen.h> This along with the definition above is evidently insufficient for xen_processor_present() to always be defined. See https://lore.kernel.org/linux-acpi/64198b60.bO+m9o5w+Hd8hcF3%25lkp@intel.com/T/#u for example. I'm dropping the patch now, please fix and resend. > + > #include "internal.h" > > static bool __init processor_physically_present(acpi_handle handle) > @@ -47,6 +49,15 @@ static bool __init processor_physically_present(acpi_handle handle) > return false; > } > > + if (xen_initial_domain()) > + /* > + * When running as a Xen dom0 the number of processors Linux > + * sees can be different from the real number of processors on > + * the system, and we still need to execute _PDC for all of > + * them. > + */ > + return xen_processor_present(acpi_id); > + > type = (acpi_type == ACPI_TYPE_DEVICE) ? 1 : 0; > cpuid = acpi_get_cpuid(handle, type, acpi_id); > > diff --git a/drivers/xen/pcpu.c b/drivers/xen/pcpu.c > index fd3a644b0855..034d05e56507 100644 > --- a/drivers/xen/pcpu.c > +++ b/drivers/xen/pcpu.c > @@ -58,6 +58,7 @@ struct pcpu { > struct list_head list; > struct device dev; > uint32_t cpu_id; > + uint32_t acpi_id; > uint32_t flags; > }; > > @@ -249,6 +250,7 @@ static struct pcpu *create_and_register_pcpu(struct xenpf_pcpuinfo *info) > > INIT_LIST_HEAD(&pcpu->list); > pcpu->cpu_id = info->xen_cpuid; > + pcpu->acpi_id = info->acpi_id; > pcpu->flags = info->flags; > > /* Need hold on xen_pcpu_lock before pcpu list manipulations */ > @@ -381,3 +383,22 @@ static int __init xen_pcpu_init(void) > return ret; > } > arch_initcall(xen_pcpu_init); > + > +#ifdef CONFIG_ACPI > +bool __init xen_processor_present(uint32_t acpi_id) > +{ > + struct pcpu *pcpu; > + bool online = false; > + > + mutex_lock(&xen_pcpu_lock); > + list_for_each_entry(pcpu, &xen_pcpus, list) > + if (pcpu->acpi_id == acpi_id) { > + online = pcpu->flags & XEN_PCPU_FLAGS_ONLINE; > + break; > + } > + > + mutex_unlock(&xen_pcpu_lock); > + > + return online; > +} > +#endif > -- > 2.39.0 >
On Tue, Mar 21, 2023 at 02:47:46PM +0100, Rafael J. Wysocki wrote: > On Thu, Mar 16, 2023 at 5:43 PM Roger Pau Monne <roger.pau@citrix.com> wrote: > > > > In ACPI systems, the OS can direct power management, as opposed to the > > firmware. This OS-directed Power Management is called OSPM. Part of > > telling the firmware that the OS going to direct power management is > > making ACPI "_PDC" (Processor Driver Capabilities) calls. These _PDC > > methods must be evaluated for every processor object. If these _PDC > > calls are not completed for every processor it can lead to > > inconsistency and later failures in things like the CPU frequency > > driver. > > > > In a Xen system, the dom0 kernel is responsible for system-wide power > > management. The dom0 kernel is in charge of OSPM. However, the > > number of CPUs available to dom0 can be different than the number of > > CPUs physically present on the system. > > > > This leads to a problem: the dom0 kernel needs to evaluate _PDC for > > all the processors, but it can't always see them. > > > > In dom0 kernels, ignore the existing ACPI method for determining if a > > processor is physically present because it might not be accurate. > > Instead, ask the hypervisor for this information. > > > > Fix this by introducing a custom function to use when running as Xen > > dom0 in order to check whether a processor object matches a CPU that's > > online. Such checking is done using the existing information fetched > > by the Xen pCPU subsystem, extending it to also store the ACPI ID. > > > > This ensures that _PDC method gets evaluated for all physically online > > CPUs, regardless of the number of CPUs made available to dom0. > > > > Fixes: 5d554a7bb064 ('ACPI: processor: add internal processor_physically_present()') > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > > --- > > Changes since v3: > > - Protect xen_processor_present() definition with CONFIG_ACPI. > > > > Changes since v2: > > - Extend and use the existing pcpu functionality. > > > > Changes since v1: > > - Reword commit message. > > --- > > arch/x86/include/asm/xen/hypervisor.h | 10 ++++++++++ > > drivers/acpi/processor_pdc.c | 11 +++++++++++ > > drivers/xen/pcpu.c | 21 +++++++++++++++++++++ > > 3 files changed, 42 insertions(+) > > > > diff --git a/arch/x86/include/asm/xen/hypervisor.h b/arch/x86/include/asm/xen/hypervisor.h > > index 5fc35f889cd1..990a1609677e 100644 > > --- a/arch/x86/include/asm/xen/hypervisor.h > > +++ b/arch/x86/include/asm/xen/hypervisor.h > > @@ -63,4 +63,14 @@ void __init xen_pvh_init(struct boot_params *boot_params); > > void __init mem_map_via_hcall(struct boot_params *boot_params_p); > > #endif > > > > +#if defined(CONFIG_XEN_DOM0) && defined(CONFIG_ACPI) > > +bool __init xen_processor_present(uint32_t acpi_id); > > +#else > > +static inline bool xen_processor_present(uint32_t acpi_id) > > +{ > > + BUG(); > > + return false; > > +} > > +#endif > > + > > #endif /* _ASM_X86_XEN_HYPERVISOR_H */ > > diff --git a/drivers/acpi/processor_pdc.c b/drivers/acpi/processor_pdc.c > > index 8c3f82c9fff3..18fb04523f93 100644 > > --- a/drivers/acpi/processor_pdc.c > > +++ b/drivers/acpi/processor_pdc.c > > @@ -14,6 +14,8 @@ > > #include <linux/acpi.h> > > #include <acpi/processor.h> > > > > +#include <xen/xen.h> > > This along with the definition above is evidently insufficient for > xen_processor_present() to always be defined. See > https://lore.kernel.org/linux-acpi/64198b60.bO+m9o5w+Hd8hcF3%25lkp@intel.com/T/#u > for example. > > I'm dropping the patch now, please fix and resend. Hello, Sorry. I've sent a followup fix: https://lore.kernel.org/xen-devel/20230321112522.46806-1-roger.pau@citrix.com/T/#u Would you be fine with taking such followup, or would rather prefer for me to send the original fixed patch as v5? Thanks, Roger.
On Tue, Mar 21, 2023 at 3:02 PM Roger Pau Monné <roger.pau@citrix.com> wrote: > > On Tue, Mar 21, 2023 at 02:47:46PM +0100, Rafael J. Wysocki wrote: > > On Thu, Mar 16, 2023 at 5:43 PM Roger Pau Monne <roger.pau@citrix.com> wrote: > > > > > > In ACPI systems, the OS can direct power management, as opposed to the > > > firmware. This OS-directed Power Management is called OSPM. Part of > > > telling the firmware that the OS going to direct power management is > > > making ACPI "_PDC" (Processor Driver Capabilities) calls. These _PDC > > > methods must be evaluated for every processor object. If these _PDC > > > calls are not completed for every processor it can lead to > > > inconsistency and later failures in things like the CPU frequency > > > driver. > > > > > > In a Xen system, the dom0 kernel is responsible for system-wide power > > > management. The dom0 kernel is in charge of OSPM. However, the > > > number of CPUs available to dom0 can be different than the number of > > > CPUs physically present on the system. > > > > > > This leads to a problem: the dom0 kernel needs to evaluate _PDC for > > > all the processors, but it can't always see them. > > > > > > In dom0 kernels, ignore the existing ACPI method for determining if a > > > processor is physically present because it might not be accurate. > > > Instead, ask the hypervisor for this information. > > > > > > Fix this by introducing a custom function to use when running as Xen > > > dom0 in order to check whether a processor object matches a CPU that's > > > online. Such checking is done using the existing information fetched > > > by the Xen pCPU subsystem, extending it to also store the ACPI ID. > > > > > > This ensures that _PDC method gets evaluated for all physically online > > > CPUs, regardless of the number of CPUs made available to dom0. > > > > > > Fixes: 5d554a7bb064 ('ACPI: processor: add internal processor_physically_present()') > > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > > > --- > > > Changes since v3: > > > - Protect xen_processor_present() definition with CONFIG_ACPI. > > > > > > Changes since v2: > > > - Extend and use the existing pcpu functionality. > > > > > > Changes since v1: > > > - Reword commit message. > > > --- > > > arch/x86/include/asm/xen/hypervisor.h | 10 ++++++++++ > > > drivers/acpi/processor_pdc.c | 11 +++++++++++ > > > drivers/xen/pcpu.c | 21 +++++++++++++++++++++ > > > 3 files changed, 42 insertions(+) > > > > > > diff --git a/arch/x86/include/asm/xen/hypervisor.h b/arch/x86/include/asm/xen/hypervisor.h > > > index 5fc35f889cd1..990a1609677e 100644 > > > --- a/arch/x86/include/asm/xen/hypervisor.h > > > +++ b/arch/x86/include/asm/xen/hypervisor.h > > > @@ -63,4 +63,14 @@ void __init xen_pvh_init(struct boot_params *boot_params); > > > void __init mem_map_via_hcall(struct boot_params *boot_params_p); > > > #endif > > > > > > +#if defined(CONFIG_XEN_DOM0) && defined(CONFIG_ACPI) > > > +bool __init xen_processor_present(uint32_t acpi_id); > > > +#else > > > +static inline bool xen_processor_present(uint32_t acpi_id) > > > +{ > > > + BUG(); > > > + return false; > > > +} > > > +#endif > > > + > > > #endif /* _ASM_X86_XEN_HYPERVISOR_H */ > > > diff --git a/drivers/acpi/processor_pdc.c b/drivers/acpi/processor_pdc.c > > > index 8c3f82c9fff3..18fb04523f93 100644 > > > --- a/drivers/acpi/processor_pdc.c > > > +++ b/drivers/acpi/processor_pdc.c > > > @@ -14,6 +14,8 @@ > > > #include <linux/acpi.h> > > > #include <acpi/processor.h> > > > > > > +#include <xen/xen.h> > > > > This along with the definition above is evidently insufficient for > > xen_processor_present() to always be defined. See > > https://lore.kernel.org/linux-acpi/64198b60.bO+m9o5w+Hd8hcF3%25lkp@intel.com/T/#u > > for example. > > > > I'm dropping the patch now, please fix and resend. > > Hello, > > Sorry. I've sent a followup fix: > > https://lore.kernel.org/xen-devel/20230321112522.46806-1-roger.pau@citrix.com/T/#u > > Would you be fine with taking such followup, or would rather prefer > for me to send the original fixed patch as v5? Please fold the fix into the original patch and resend.
diff --git a/arch/x86/include/asm/xen/hypervisor.h b/arch/x86/include/asm/xen/hypervisor.h index 5fc35f889cd1..990a1609677e 100644 --- a/arch/x86/include/asm/xen/hypervisor.h +++ b/arch/x86/include/asm/xen/hypervisor.h @@ -63,4 +63,14 @@ void __init xen_pvh_init(struct boot_params *boot_params); void __init mem_map_via_hcall(struct boot_params *boot_params_p); #endif +#if defined(CONFIG_XEN_DOM0) && defined(CONFIG_ACPI) +bool __init xen_processor_present(uint32_t acpi_id); +#else +static inline bool xen_processor_present(uint32_t acpi_id) +{ + BUG(); + return false; +} +#endif + #endif /* _ASM_X86_XEN_HYPERVISOR_H */ diff --git a/drivers/acpi/processor_pdc.c b/drivers/acpi/processor_pdc.c index 8c3f82c9fff3..18fb04523f93 100644 --- a/drivers/acpi/processor_pdc.c +++ b/drivers/acpi/processor_pdc.c @@ -14,6 +14,8 @@ #include <linux/acpi.h> #include <acpi/processor.h> +#include <xen/xen.h> + #include "internal.h" static bool __init processor_physically_present(acpi_handle handle) @@ -47,6 +49,15 @@ static bool __init processor_physically_present(acpi_handle handle) return false; } + if (xen_initial_domain()) + /* + * When running as a Xen dom0 the number of processors Linux + * sees can be different from the real number of processors on + * the system, and we still need to execute _PDC for all of + * them. + */ + return xen_processor_present(acpi_id); + type = (acpi_type == ACPI_TYPE_DEVICE) ? 1 : 0; cpuid = acpi_get_cpuid(handle, type, acpi_id); diff --git a/drivers/xen/pcpu.c b/drivers/xen/pcpu.c index fd3a644b0855..034d05e56507 100644 --- a/drivers/xen/pcpu.c +++ b/drivers/xen/pcpu.c @@ -58,6 +58,7 @@ struct pcpu { struct list_head list; struct device dev; uint32_t cpu_id; + uint32_t acpi_id; uint32_t flags; }; @@ -249,6 +250,7 @@ static struct pcpu *create_and_register_pcpu(struct xenpf_pcpuinfo *info) INIT_LIST_HEAD(&pcpu->list); pcpu->cpu_id = info->xen_cpuid; + pcpu->acpi_id = info->acpi_id; pcpu->flags = info->flags; /* Need hold on xen_pcpu_lock before pcpu list manipulations */ @@ -381,3 +383,22 @@ static int __init xen_pcpu_init(void) return ret; } arch_initcall(xen_pcpu_init); + +#ifdef CONFIG_ACPI +bool __init xen_processor_present(uint32_t acpi_id) +{ + struct pcpu *pcpu; + bool online = false; + + mutex_lock(&xen_pcpu_lock); + list_for_each_entry(pcpu, &xen_pcpus, list) + if (pcpu->acpi_id == acpi_id) { + online = pcpu->flags & XEN_PCPU_FLAGS_ONLINE; + break; + } + + mutex_unlock(&xen_pcpu_lock); + + return online; +} +#endif
In ACPI systems, the OS can direct power management, as opposed to the firmware. This OS-directed Power Management is called OSPM. Part of telling the firmware that the OS going to direct power management is making ACPI "_PDC" (Processor Driver Capabilities) calls. These _PDC methods must be evaluated for every processor object. If these _PDC calls are not completed for every processor it can lead to inconsistency and later failures in things like the CPU frequency driver. In a Xen system, the dom0 kernel is responsible for system-wide power management. The dom0 kernel is in charge of OSPM. However, the number of CPUs available to dom0 can be different than the number of CPUs physically present on the system. This leads to a problem: the dom0 kernel needs to evaluate _PDC for all the processors, but it can't always see them. In dom0 kernels, ignore the existing ACPI method for determining if a processor is physically present because it might not be accurate. Instead, ask the hypervisor for this information. Fix this by introducing a custom function to use when running as Xen dom0 in order to check whether a processor object matches a CPU that's online. Such checking is done using the existing information fetched by the Xen pCPU subsystem, extending it to also store the ACPI ID. This ensures that _PDC method gets evaluated for all physically online CPUs, regardless of the number of CPUs made available to dom0. Fixes: 5d554a7bb064 ('ACPI: processor: add internal processor_physically_present()') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> --- Changes since v3: - Protect xen_processor_present() definition with CONFIG_ACPI. Changes since v2: - Extend and use the existing pcpu functionality. Changes since v1: - Reword commit message. --- arch/x86/include/asm/xen/hypervisor.h | 10 ++++++++++ drivers/acpi/processor_pdc.c | 11 +++++++++++ drivers/xen/pcpu.c | 21 +++++++++++++++++++++ 3 files changed, 42 insertions(+)