[0/4,v2] Add support for running in VM guests to intel_idle

Message ID 20230605154716.840930-1-arjan@linux.intel.com

Arjan van de Ven June 5, 2023, 3:47 p.m. UTC
From: Arjan van de Ven <arjan@linux.intel.com>

intel_idle provides the CPU idle states (for power saving in idle) to the
cpuidle framework, based on per-CPU tables combined with limited hardware
enumeration. This combination of cpuidle and intel_idle provides dynamic
behavior where power saving and performance impact are balanced at run time,
and where a set of generic knobs is provided in sysfs for users to tune
the heuristics (and get statistics etc.).
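
As an illustration of the sysfs statistics mentioned above, here is a minimal
user-space sketch that walks the per-state cpuidle attributes of one CPU. It
assumes the standard cpuidle sysfs layout
(/sys/devices/system/cpu/cpuN/cpuidle/stateM/ with the "name", "usage" and
"time" attributes); the helper names are made up for this example and are not
part of intel_idle.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the sysfs path of one cpuidle attribute, for example
 * .../cpu0/cpuidle/state1/usage. Returns 0 on success, -1 if the
 * path does not fit in the buffer. (Hypothetical helper.) */
static int cpuidle_attr_path(char *out, size_t len, int cpu, int state,
                             const char *attr)
{
    assert(out != NULL && attr != NULL);
    int n = snprintf(out, len,
                     "/sys/devices/system/cpu/cpu%d/cpuidle/state%d/%s",
                     cpu, state, attr);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read the first line of a file into buf, without the trailing newline.
 * Returns 0 on success, -1 if the file cannot be opened or read. */
static int read_first_line(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Print name, entry count and total residency (microseconds) for every
 * idle state of the given CPU. Returns the number of states found, which
 * is 0 when no cpuidle driver is registered. */
static int print_cpuidle_stats(int cpu)
{
    char path[128], name[64], usage[32], time_us[32];
    int state;

    for (state = 0; ; state++) {
        if (cpuidle_attr_path(path, sizeof(path), cpu, state, "name") ||
            read_first_line(path, name, sizeof(name)))
            break;  /* no such state directory: stop */
        if (cpuidle_attr_path(path, sizeof(path), cpu, state, "usage") ||
            read_first_line(path, usage, sizeof(usage)))
            break;
        if (cpuidle_attr_path(path, sizeof(path), cpu, state, "time") ||
            read_first_line(path, time_us, sizeof(time_us)))
            break;
        printf("cpu%d state%d %-10s entries=%s time_us=%s\n",
               cpu, state, name, usage, time_us);
    }
    return state;
}
```

Calling print_cpuidle_stats(0) from a small main() lists the states of CPU 0.
On a system where no cpuidle driver is registered, the cpuidle directory may
be absent and the function simply returns 0.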

However, intel_idle currently does not support running inside VM guests, so
the Linux kernel falls back to either ACPI-based idle (if supported by the
hypervisor/virtual BIOS) or just the default x86 fallback, the "hlt"-based
idle method that was introduced in the 1.2 kernel series and lacks all the
dynamic behavior, user control and statistics that cpuidle brings.

While this is obviously functional, it's not great, and we can do better
for the user by hooking up intel_idle into the cpuidle framework also
for the "in a guest" case.
Beyond the user-visible gap, the fallback is also not optimal, as it lacks
two key capabilities that are supported in the bare metal case:

1) The ability to flush the TLB for very long idle periods, to avoid
   a costly (and high latency) IPI wakeup of the idle vCPU later, when a
   process that used to run on that vCPU does an munmap or similar
   operation. Avoiding high latency IPIs helps avoid performance jitter.
2) The ability to use the new Intel C0.2 idle state instead of polling
   for very short duration idle periods, to save power (and carbon footprint).
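
To make the "very short" versus "very long" distinction concrete, here is a
rough user-space sketch of how an idle governor might choose between a polling
state, a plain hlt state and a long-idle hlt state that also flushes the TLB.
This is a simplified illustration, not the code from this series: the struct,
the state names and all latency/residency numbers are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified idle-state descriptor. This is NOT the
 * kernel's struct cpuidle_state; it only keeps what the example needs. */
struct idle_state {
    const char *name;
    unsigned int exit_latency_us;     /* worst-case wakeup cost */
    unsigned int target_residency_us; /* minimum idle time to be worth it */
    int flush_tlb;                    /* flush the TLB before idling? */
};

/* Illustrative table, ordered from shallowest to deepest.
 * All numbers are invented for this sketch. */
static const struct idle_state guest_states[] = {
    { "poll",     0, 0,     0 },
    { "hlt",      5, 10,    0 },
    { "long-hlt", 5, 20000, 1 },
};

/* Pick the deepest state whose target residency fits the predicted idle
 * duration and whose exit latency respects the latency limit. Falls back
 * to state 0 (polling) when nothing else qualifies. */
static int pick_state(const struct idle_state *states, size_t n,
                      unsigned int predicted_idle_us,
                      unsigned int latency_limit_us)
{
    int best = 0;

    assert(states != NULL && n > 0);
    for (size_t i = 1; i < n; i++) {
        if (states[i].target_residency_us > predicted_idle_us)
            continue;  /* idle period too short to pay off */
        if (states[i].exit_latency_us > latency_limit_us)
            continue;  /* wakeup would be too slow */
        best = (int)i;
    }
    return best;
}
```

With this table, a predicted idle period of 500 us selects "hlt", while a
predicted 50 ms selects "long-hlt", the only entry with flush_tlb set. The
point of such a state is that a vCPU that has already flushed its TLB need not
be woken by an IPI when another CPU later changes the address space.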

This patch series adds the basic support to run in a VM guest
to the intel_idle driver, and then addresses the first of these shortfalls.
The C0.2 gap will be fixed with a small additional patch after the
C0.2 support is merged separately.


Arjan van de Ven (4):
  intel_idle: refactor state->enter manipulation into its own function
  intel_idle: clean up the (new) state_update_enter_method function
  intel_idle: Add support for using intel_idle in a VM guest using just
    hlt
  intel_idle: Add a "Long HLT" C1 state for the VM guest mode

 drivers/idle/intel_idle.c | 238 ++++++++++++++++++++++++++++++++++----
 1 file changed, 215 insertions(+), 23 deletions(-)