Message ID | YerAW90QwEg9yXAb@fuller.cnet |
---|---|
State | New |
Series | rt-numa: optionally ignore runtime cpumask |
On 2022-01-21 11:16:59 [-0300], Marcelo Tosatti wrote:
>
> use_current_cpuset() function does:
>
> /*
>  * After this function is called, affinity_mask is the intersection of
>  * the user supplied affinity mask and the affinity mask from the run
>  * time environment
>  */
> static void use_current_cpuset(int max_cpus, struct bitmask *cpumask)
>
> However, when using isolcpus kernel command line option, the CPUs
> specified at isolcpus= are not part of the run time environment
> cpumask.
>
> This causes "cyclictest -a isolatedcpus" to fail with:
>
> WARN: Couldn't setaffinity in main thread: Invalid argument
> FATAL: No allowable cpus to run on
> # /dev/cpu_dma_latency set to 0us
>
> To fix this, add an environment variable IGNORE_RUNTIME_CPU_AFFINITY_MASK
> that when set to a value other than 0, will override the runtime cpu
> affinity mask (retrieved with numa_sched_getaffinity) with a bit set
> for each CPU in numa_num_configured_cpus:

This looks hacky and not documented. What about using all CPUs which
part of current affinity mask by default. And then either specify the
requested CPU mask or use explicitly all CPUs.

Sebastian
On Fri, Jan 21, 2022 at 07:16:48PM +0100, Sebastian Andrzej Siewior wrote:
> On 2022-01-21 11:16:59 [-0300], Marcelo Tosatti wrote:
> >
> > use_current_cpuset() function does:
> >
> > /*
> >  * After this function is called, affinity_mask is the intersection of
> >  * the user supplied affinity mask and the affinity mask from the run
> >  * time environment
> >  */
> > static void use_current_cpuset(int max_cpus, struct bitmask *cpumask)
> >
> > However, when using isolcpus kernel command line option, the CPUs
> > specified at isolcpus= are not part of the run time environment
> > cpumask.
> >
> > This causes "cyclictest -a isolatedcpus" to fail with:
> >
> > WARN: Couldn't setaffinity in main thread: Invalid argument
> > FATAL: No allowable cpus to run on
> > # /dev/cpu_dma_latency set to 0us
> >
> > To fix this, add an environment variable IGNORE_RUNTIME_CPU_AFFINITY_MASK
> > that when set to a value other than 0, will override the runtime cpu
> > affinity mask (retrieved with numa_sched_getaffinity) with a bit set
> > for each CPU in numa_num_configured_cpus:
>
> This looks hacky and not documented. What about using all CPUs which
> part of current affinity mask by default.

You mean "using all CPUs which are part of the current affinity mask by
default" ? (where current affinity mask would mean user specified CPU
mask).

> And then either specify the
> requested CPU mask or use explicitly all CPUs.

Do you mean to drop

	/*
	 * Clear bits that are not set in both the cpuset from the
	 * environment, and in the user specified affinity.

And just attempt to use the user specified mask? (which will then return
failure to the user in which case he can correct it).
On 2022-01-24 09:58:31 [-0300], Marcelo Tosatti wrote:
> You mean "using all CPUs which are part of the current affinity mask by
> default" ? (where current affinity mask would mean user specified CPU
> mask).
>
> > And then either specify the
> > requested CPU mask or use explicitly all CPUs.
>
> Do you mean to drop
>
> 	/*
> 	 * Clear bits that are not set in both the cpuset from the
> 	 * environment, and in the user specified affinity.
>
> And just attempt to use the user specified mask? (which will then return
> failure to the user in which case he can correct it).
>

After reading it again, I don't get it.

	cyclictest -a

Uses all CPUs in the system.

	cyclictest -a $CPU

Uses the $CPU (mask) specified. If $CPU is not part of the current CPU
mask, why shouldn't it work?

Sebastian
On Mon, Jan 24, 2022 at 05:26:26PM +0100, Sebastian Andrzej Siewior wrote:
> On 2022-01-24 09:58:31 [-0300], Marcelo Tosatti wrote:
> > You mean "using all CPUs which are part of the current affinity mask by
> > default" ? (where current affinity mask would mean user specified CPU
> > mask).
> >
> > > And then either specify the
> > > requested CPU mask or use explicitly all CPUs.
> >
> > Do you mean to drop
> >
> > 	/*
> > 	 * Clear bits that are not set in both the cpuset from the
> > 	 * environment, and in the user specified affinity.
> >
> > And just attempt to use the user specified mask? (which will then return
> > failure to the user in which case he can correct it).
> >
> After reading it again, I don't get it.
> 	cyclictest -a
>
> Uses all CPUs in the system.
>
> 	cyclictest -a $CPU
>
> Uses the $CPU (mask) specified. If $CPU is not part of the current CPU
> mask, why shouldn't it work?

-a, --affinity[=PROC-SET]
    Run threads on the set of processors given by PROC-SET. If PROC-SET
    is not specified, all processors will be used. Threads will be
    assigned to processors in the set in numeric order, in a round-robin
    fashion.
    The set of processors can be specified as A,B,C, or A-C, or A-B,D-F,
    and so on*. The ! character can be used to negate a set. For example,
    !B-D means to use all available CPUs except B through D. The cpu
    numbers are the same as shown in the processor field in
    /proc/cpuinfo. See numa(3) for more information on specifying CPU
    sets.
    * Support for CPU sets requires libnuma version >= 2. For libnuma
    v1, PROC-SET, if specified, must be a single CPU number.
/*
 * After this function is called, affinity_mask is the intersection of
 * the user supplied affinity mask and the affinity mask from the run
 * time environment
 */
static void use_current_cpuset(int max_cpus, struct bitmask *cpumask)
{
	struct bitmask *curmask;
	int i;

	curmask = numa_allocate_cpumask();
	numa_sched_getaffinity(getpid(), curmask);

	/*
	 * Clear bits that are not set in both the cpuset from the
	 * environment, and in the user specified affinity.
	 */
	for (i = 0; i < max_cpus; i++) {
		if ((!numa_bitmask_isbitset(cpumask, i)) ||
		    (!numa_bitmask_isbitset(curmask, i)))
			numa_bitmask_clearbit(cpumask, i);
	}

	numa_bitmask_free(curmask);
}

Consider 8 CPU system booted with isolcpus=3-7, and execution of
"cyclictest -a 3-7".

sched_getaffinity() returns mask with bits set for CPUs 0 and 1.
The user supplied mask has bits 3-7 set.

The intersection between the user supplied mask and the affinity mask
from the run time environment has no bits set.
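The empty intersection in this example can be reproduced without libnuma.
What follows is a minimal sketch, not the rt-tests code: it assumes every
CPU number fits in a 64-bit word, and cpu_range() and intersect_masks()
are hypothetical helpers introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mask with bits first..last set (inclusive). */
static uint64_t cpu_range(int first, int last)
{
	uint64_t mask = 0;
	int i;

	/* Assumption of this sketch: all CPU numbers fit in 64 bits. */
	assert(first >= 0 && last < 64 && first <= last);
	for (i = first; i <= last; i++)
		mask |= UINT64_C(1) << i;
	return mask;
}

/*
 * Hypothetical stand-in for the loop in use_current_cpuset(): keep only
 * the CPUs present in both the user mask and the runtime mask.
 */
static uint64_t intersect_masks(uint64_t user_mask, uint64_t runtime_mask)
{
	return user_mask & runtime_mask;
}
```

With the user mask cpu_range(3, 7) and a runtime mask that holds only
non-isolated CPUs, such as cpu_range(0, 1), intersect_masks() returns 0,
which is the situation that ends in "No allowable cpus to run on".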
On Mon, Jan 24, 2022 at 01:40:49PM -0300, Marcelo Tosatti wrote:
> On Mon, Jan 24, 2022 at 05:26:26PM +0100, Sebastian Andrzej Siewior wrote:
> > On 2022-01-24 09:58:31 [-0300], Marcelo Tosatti wrote:
> > > You mean "using all CPUs which are part of the current affinity mask by
> > > default" ? (where current affinity mask would mean user specified CPU
> > > mask).
> > >
> > > > And then either specify the
> > > > requested CPU mask or use explicitly all CPUs.
> > >
> > > Do you mean to drop
> > >
> > > 	/*
> > > 	 * Clear bits that are not set in both the cpuset from the
> > > 	 * environment, and in the user specified affinity.
> > >
> > > And just attempt to use the user specified mask? (which will then return
> > > failure to the user in which case he can correct it).
> > >
> > After reading it again, I don't get it.
> > 	cyclictest -a
> >
> > Uses all CPUs in the system.
> >
> > 	cyclictest -a $CPU
> >
> > Uses the $CPU (mask) specified. If $CPU is not part of the current CPU
> > mask, why shouldn't it work?
>
> -a, --affinity[=PROC-SET]
>     Run threads on the set of processors given by PROC-SET. If PROC-SET
>     is not specified, all processors will be used. Threads will be
>     assigned to processors in the set in numeric order, in a round-robin
>     fashion.
>     The set of processors can be specified as A,B,C, or A-C, or A-B,D-F,
>     and so on*. The ! character can be used to negate a set. For example,
>     !B-D means to use all available CPUs except B through D. The cpu
>     numbers are the same as shown in the processor field in
>     /proc/cpuinfo. See numa(3) for more information on specifying CPU
>     sets.
>     * Support for CPU sets requires libnuma version >= 2. For libnuma
>     v1, PROC-SET, if specified, must be a single CPU number.
>
>
> /*
>  * After this function is called, affinity_mask is the intersection of
>  * the user supplied affinity mask and the affinity mask from the run
>  * time environment
>  */
> static void use_current_cpuset(int max_cpus, struct bitmask *cpumask)
> {
> 	struct bitmask *curmask;
> 	int i;
>
> 	curmask = numa_allocate_cpumask();
> 	numa_sched_getaffinity(getpid(), curmask);
>
> 	/*
> 	 * Clear bits that are not set in both the cpuset from the
> 	 * environment, and in the user specified affinity.
> 	 */
> 	for (i = 0; i < max_cpus; i++) {
> 		if ((!numa_bitmask_isbitset(cpumask, i)) ||
> 		    (!numa_bitmask_isbitset(curmask, i)))
> 			numa_bitmask_clearbit(cpumask, i);
> 	}
>
> 	numa_bitmask_free(curmask);
> }
>
> Consider 8 CPU system booted with isolcpus=3-7, and execution of
> "cyclictest -a 3-7".
>
> sched_getaffinity() returns mask with bits set for CPUs 0, 1, and 2.
> The user supplied mask has bits 3-7 set.
>
> The intersection between the user supplied mask and the affinity mask
> from the run time environment has no bits set.
On 2022-01-24 13:40:49 [-0300], Marcelo Tosatti wrote:
> > Uses the $CPU (mask) specified. If $CPU is not part of the current CPU
> > mask, why shouldn't it work?
>
> -a, --affinity[=PROC-SET]
>     Run threads on the set of processors given by PROC-SET. If PROC-SET
>     is not specified, all processors will be used. Threads will be
>     assigned to processors in the set in numeric order, in a round-robin
>     fashion.
>     The set of processors can be specified as A,B,C, or A-C, or A-B,D-F,
>     and so on*. The ! character can be used to negate a set. For example,
>     !B-D means to use all available CPUs except B through D. The cpu
>     numbers are the same as shown in the processor field in
>     /proc/cpuinfo. See numa(3) for more information on specifying CPU
>     sets.
>     * Support for CPU sets requires libnuma version >= 2. For libnuma
>     v1, PROC-SET, if specified, must be a single CPU number.
>
>
> /*
>  * After this function is called, affinity_mask is the intersection of
>  * the user supplied affinity mask and the affinity mask from the run
>  * time environment
>  */
> static void use_current_cpuset(int max_cpus, struct bitmask *cpumask)
> {
> 	struct bitmask *curmask;
> 	int i;
>
> 	curmask = numa_allocate_cpumask();
> 	numa_sched_getaffinity(getpid(), curmask);
>
> 	/*
> 	 * Clear bits that are not set in both the cpuset from the
> 	 * environment, and in the user specified affinity.
> 	 */
> 	for (i = 0; i < max_cpus; i++) {
> 		if ((!numa_bitmask_isbitset(cpumask, i)) ||
> 		    (!numa_bitmask_isbitset(curmask, i)))
> 			numa_bitmask_clearbit(cpumask, i);
> 	}
>
> 	numa_bitmask_free(curmask);
> }
>
> Consider 8 CPU system booted with isolcpus=3-7, and execution of
> "cyclictest -a 3-7".
>
> sched_getaffinity() returns mask with bits set for CPUs 0 and 1.
> The user supplied mask has bits 3-7 set.
>
> The intersection between the user supplied mask and the affinity mask
> from the run time environment has no bits set.

Okay. But does this make sense to keep? I understand that the current
CPU-mask needs to be kept for masks like !B or !B-D. But is there a need
to use the current CPU-mask when a specific mask has been specified by
the user?

Sebastian
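The distinction drawn here can be sketched as a small decision function.
This is only an illustration under stated assumptions: masks fit in a
64-bit word, and enum affinity_request and effective_mask() are
hypothetical names that do not exist in rt-tests.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical classification of how -a was given on the command line. */
enum affinity_request {
	AFFINITY_ALL,		/* "-a" with no PROC-SET */
	AFFINITY_EXPLICIT,	/* "-a CPULIST" */
	AFFINITY_NEGATED,	/* "-a !CPULIST" */
};

/*
 * Hypothetical helper: an explicit list is used as given, while the
 * "all CPUs" and negated forms stay limited to the runtime mask.
 */
static uint64_t effective_mask(enum affinity_request req,
			       uint64_t user_mask, uint64_t runtime_mask)
{
	switch (req) {
	case AFFINITY_EXPLICIT:
		return user_mask;
	case AFFINITY_ALL:
	case AFFINITY_NEGATED:
		return user_mask & runtime_mask;
	}

	assert(!"unknown affinity request");
	return 0;
}
```

Under this sketch, an explicit request for CPUs 3-7 (mask 0xF8) is passed
through unchanged even when the runtime mask is 0x07, and an invalid
choice is left for the later setaffinity call to reject.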
On Mon, Jan 24, 2022 at 06:07:37PM +0100, Sebastian Andrzej Siewior wrote:
> On 2022-01-24 13:40:49 [-0300], Marcelo Tosatti wrote:
> > > Uses the $CPU (mask) specified. If $CPU is not part of the current CPU
> > > mask, why shouldn't it work?
> >
> > -a, --affinity[=PROC-SET]
> >     Run threads on the set of processors given by PROC-SET. If PROC-SET
> >     is not specified, all processors will be used. Threads will be
> >     assigned to processors in the set in numeric order, in a round-robin
> >     fashion.
> >     The set of processors can be specified as A,B,C, or A-C, or A-B,D-F,
> >     and so on*. The ! character can be used to negate a set. For example,
> >     !B-D means to use all available CPUs except B through D. The cpu
> >     numbers are the same as shown in the processor field in
> >     /proc/cpuinfo. See numa(3) for more information on specifying CPU
> >     sets.
> >     * Support for CPU sets requires libnuma version >= 2. For libnuma
> >     v1, PROC-SET, if specified, must be a single CPU number.
> >
> >
> > /*
> >  * After this function is called, affinity_mask is the intersection of
> >  * the user supplied affinity mask and the affinity mask from the run
> >  * time environment
> >  */
> > static void use_current_cpuset(int max_cpus, struct bitmask *cpumask)
> > {
> > 	struct bitmask *curmask;
> > 	int i;
> >
> > 	curmask = numa_allocate_cpumask();
> > 	numa_sched_getaffinity(getpid(), curmask);
> >
> > 	/*
> > 	 * Clear bits that are not set in both the cpuset from the
> > 	 * environment, and in the user specified affinity.
> > 	 */
> > 	for (i = 0; i < max_cpus; i++) {
> > 		if ((!numa_bitmask_isbitset(cpumask, i)) ||
> > 		    (!numa_bitmask_isbitset(curmask, i)))
> > 			numa_bitmask_clearbit(cpumask, i);
> > 	}
> >
> > 	numa_bitmask_free(curmask);
> > }
> >
> > Consider 8 CPU system booted with isolcpus=3-7, and execution of
> > "cyclictest -a 3-7".
> >
> > sched_getaffinity() returns mask with bits set for CPUs 0 and 1.
> > The user supplied mask has bits 3-7 set.
> >
> > The intersection between the user supplied mask and the affinity mask
> > from the run time environment has no bits set.
>
> Okay. But does this make sense to keep? I understand that the current
> CPU-mask needs to be kept for masks like !B or !B-D. But is there a need
> to use the current CPU-mask when a specific mask has been specified by
> the user?

Ok, then:

1) If user specifies -a CPULIST, ignore sched_getaffinity().
2) If user specifies -a, or -a !CPULIST, use sched_getaffinity().

Will send a patch.
diff --git a/src/lib/rt-numa.c b/src/lib/rt-numa.c
index ee5ab99..3106f1e 100644
--- a/src/lib/rt-numa.c
+++ b/src/lib/rt-numa.c
@@ -9,6 +9,7 @@
 #include <errno.h>
 #include <sched.h>
 #include <pthread.h>
+#include <stdlib.h>
 
 #include "rt-error.h"
 #include "rt-numa.h"
@@ -99,11 +100,20 @@ int cpu_for_thread_ua(int thread_num, int max_cpus)
 static void use_current_cpuset(int max_cpus, struct bitmask *cpumask)
 {
 	struct bitmask *curmask;
+	char *ignore_affinity_mask;
 	int i;
 
 	curmask = numa_allocate_cpumask();
 	numa_sched_getaffinity(getpid(), curmask);
 
+	ignore_affinity_mask = getenv("IGNORE_RUNTIME_CPU_AFFINITY_MASK");
+	if (ignore_affinity_mask && *ignore_affinity_mask != '0') {
+		int conf_cpus = numa_num_configured_cpus();
+
+		for (i = 0; i < conf_cpus; i++)
+			numa_bitmask_setbit(curmask, i);
+	}
+
 	/*
 	 * Clear bits that are not set in both the cpuset from the
 	 * environment, and in the user specified affinity.
use_current_cpuset() function does:

/*
 * After this function is called, affinity_mask is the intersection of
 * the user supplied affinity mask and the affinity mask from the run
 * time environment
 */
static void use_current_cpuset(int max_cpus, struct bitmask *cpumask)

However, when using isolcpus kernel command line option, the CPUs
specified at isolcpus= are not part of the run time environment
cpumask.

This causes "cyclictest -a isolatedcpus" to fail with:

WARN: Couldn't setaffinity in main thread: Invalid argument
FATAL: No allowable cpus to run on
# /dev/cpu_dma_latency set to 0us

To fix this, add an environment variable IGNORE_RUNTIME_CPU_AFFINITY_MASK
that when set to a value other than 0, will override the runtime cpu
affinity mask (retrieved with numa_sched_getaffinity) with a bit set
for each CPU in numa_num_configured_cpus:

    numa_num_configured_cpus() returns the number of cpus in the system.
    This count includes any cpus that are currently disabled. This count
    is derived from the cpu numbers in /sys/devices/system/cpu. If the
    kernel is configured without /sys (CONFIG_SYSFS=n) then it falls back
    to using the number of online cpus.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
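The "value other than 0" test described above comes down to a
first-character check. A minimal sketch of that check follows;
flag_enabled() and ignore_runtime_mask() are hypothetical names, and only
the getenv() call and the comparison are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper mirroring the condition in the patch: the value
 * counts as enabled when it exists and its first character is not '0'.
 */
static bool flag_enabled(const char *val)
{
	return val && *val != '0';
}

/* Hypothetical wrapper: reads the variable named in the patch. */
static bool ignore_runtime_mask(void)
{
	return flag_enabled(getenv("IGNORE_RUNTIME_CPU_AFFINITY_MASK"));
}
```

Because only the first character is inspected, an empty value enables the
override while a value such as "01" does not.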