Message ID | 20240502213936.27619-1-zide.chen@intel.com
---|---
State | New
Series | [V5] KVM: selftests: Add a new option to rseq_test
On Thu, 02 May 2024 14:39:36 -0700, Zide Chen wrote:
> Currently, the migration worker delays 1-10 us, assuming that one
> KVM_RUN iteration only takes a few microseconds. But if the CPU low
> power wakeup latency is large enough, for example, hundreds or even
> thousands of microseconds deep C-state exit latencies on x86 server
> CPUs, it may happen that it's not able to wakeup the target CPU before
> the migration worker starts to migrate the vCPU thread to the next CPU.
>
> [...]

Applied to kvm-x86 selftests, thanks!  I tweaked the changelog slightly
to call out the new comment and assert message.  I also added an extra
newline so that the "help" part of the assert message is isolated from
the primary explanation of why the assert fired.  E.g. the output looks
like:

  ==== Test Assertion Failure ====
    rseq_test.c:290: skip_sanity_check || i > (NR_TASK_MIGRATIONS / 2)
    pid=20283 tid=20283 errno=4 - Interrupted system call
       1	0x000000000040210a: main at rseq_test.c:286
       2	0x00007f07fa821c86: ?? ??:0
       3	0x0000000000402209: _start at ??:?
    Only performed 11162 KVM_RUNs, task stalled too much?

    Try disabling deep sleep states to reduce CPU wakeup latency,
    e.g. via cpuidle.off=1 or setting /dev/cpu_dma_latency to '0',
    or run with -u to disable this sanity check.

[1/1] KVM: selftests: Add a new option to rseq_test
      https://github.com/kvm-x86/linux/commit/20ecf595b513

--
https://github.com/kvm-x86/linux/tree/next
diff --git a/tools/testing/selftests/kvm/rseq_test.c b/tools/testing/selftests/kvm/rseq_test.c
index 28f97fb52044..ad418a5c59dd 100644
--- a/tools/testing/selftests/kvm/rseq_test.c
+++ b/tools/testing/selftests/kvm/rseq_test.c
@@ -186,12 +186,35 @@ static void calc_min_max_cpu(void)
 		    "Only one usable CPU, task migration not possible");
 }
 
+static void help(const char *name)
+{
+	puts("");
+	printf("usage: %s [-h] [-u]\n", name);
+	printf(" -u: Don't sanity check the number of successful KVM_RUNs\n");
+	puts("");
+	exit(0);
+}
+
 int main(int argc, char *argv[])
 {
 	int r, i, snapshot;
 	struct kvm_vm *vm;
 	struct kvm_vcpu *vcpu;
 	u32 cpu, rseq_cpu;
+	bool skip_sanity_check = false;
+	int opt;
+
+	while ((opt = getopt(argc, argv, "hu")) != -1) {
+		switch (opt) {
+		case 'u':
+			skip_sanity_check = true;
+			break;
+		case 'h':
+		default:
+			help(argv[0]);
+			break;
+		}
+	}
 
 	r = sched_getaffinity(0, sizeof(possible_mask), &possible_mask);
 	TEST_ASSERT(!r, "sched_getaffinity failed, errno = %d (%s)", errno,
@@ -254,9 +277,17 @@ int main(int argc, char *argv[])
 	 * getcpu() to stabilize.  A 2:1 migration:KVM_RUN ratio is a fairly
 	 * conservative ratio on x86-64, which can do _more_ KVM_RUNs than
 	 * migrations given the 1us+ delay in the migration task.
+	 *
+	 * Another reason why it may have small migration:KVM_RUN ratio is that,
+	 * on systems with large low power mode wakeup latency, it may happen
+	 * quite often that the scheduler is not able to wake up the target CPU
+	 * before the vCPU thread is scheduled to another CPU.
 	 */
-	TEST_ASSERT(i > (NR_TASK_MIGRATIONS / 2),
-		    "Only performed %d KVM_RUNs, task stalled too much?", i);
+	TEST_ASSERT(skip_sanity_check || i > (NR_TASK_MIGRATIONS / 2),
+		    "Only performed %d KVM_RUNs, task stalled too much?\n\n"
+		    "  Try disabling deep sleep states to reduce CPU wakeup latency,\n"
+		    "  e.g. via cpuidle.off=1 or setting /dev/cpu_dma_latency to '0',\n"
+		    "  or run with -u to disable this sanity check.", i);
 
 	pthread_join(migration_thread, NULL);