Message ID | 20240823201317.156379-1-joshua.hahn6@gmail.com |
---|---|
Series | Exposing nice CPU usage to userspace |
Hello.

On Fri, Aug 23, 2024 at 01:05:16PM GMT, Joshua Hahn <joshua.hahn6@gmail.com> wrote:
> Niced CPU usage is a metric reported in host-level /proc/stat, but is
> not reported in cgroup-level statistics in cpu.stat. However, when a
> host contains multiple tasks across different workloads, it becomes
> difficult to gauge how much of the task is being spent on niced
> processes based on /proc/stat alone, since host-level metrics do not
> provide this cgroup-level granularity.

The difference between the two metrics is in cputime.c:
        index = (task_nice(p) > 0) ? CPUTIME_NICE : CPUTIME_USER;

> Exposing this metric will allow load balancers to correctly probe the
> niced CPU metric for each workload, and make more informed decisions
> when directing higher priority tasks.

How would this work? (E.g. if too little nice time -> reduce priority
of high prio tasks?)

Thanks,
Michal
Hello, thank you for reviewing the patch.

On Mon, Aug 26, 2024 at 10:43 AM Michal Koutný <mkoutny@suse.com> wrote:
> The difference between the two metrics is in cputime.c:
>         index = (task_nice(p) > 0) ? CPUTIME_NICE : CPUTIME_USER;
>
> > Exposing this metric will allow load balancers to correctly probe the
> > niced CPU metric for each workload, and make more informed decisions
> > when directing higher priority tasks.
>
> How would this work? (E.g. if too little nice time -> reduce priority
> of high prio tasks?)

We can find what fraction of a workload's CPU time is spent in niced
processes by dividing the two metrics (nice / user). When a high prio
task arrives and the load balancer must decide where to place it, the
balancer can use each workload's nice fraction as one factor in making
the decision.

The reverse is also true; host-level information in /proc/stat may
indicate that a high percentage of CPU time is being used by niced
processes, giving the impression that every workload on the host is
running niced processes, when in reality a single workload is using
most of the niced CPU time and the other workloads are running
non-niced processes. By including cgroup-level nice statistics, we can
get a clearer picture and avoid overloading a host with too many high
prio tasks.

As you suggested, this information can also help in re-prioritizing
the processes, which may help high prio tasks run sooner.

Thanks,
Joshua
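The nice / user division described above could be done by a userspace consumer roughly as follows. This is only a sketch under stated assumptions: the thread does not name the new cpu.stat field, so `nice_usec` is a hypothetical key chosen to match the existing `user_usec` naming, the helper names are made up for illustration, and the sample values are invented.

```python
def parse_cpu_stat(text):
    """Parse the 'key value' lines of a cgroup cpu.stat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def nice_fraction(stats, nice_key="nice_usec", user_key="user_usec"):
    """Return nice / user for one cgroup, or 0.0 if no user time is recorded.

    nice_key is an assumed field name; the thread does not state it.
    """
    user = stats.get(user_key, 0)
    if user == 0:
        return 0.0
    return stats.get(nice_key, 0) / user


# Invented sample contents, shaped like a cgroup v2 cpu.stat file.
SAMPLE = """usage_usec 1500
user_usec 1000
system_usec 500
nice_usec 250
"""

if __name__ == "__main__":
    print(nice_fraction(parse_cpu_stat(SAMPLE)))
```

A balancer would read each workload's cpu.stat from its own cgroup directory, compute the fraction per cgroup, and compare the results across hosts, rather than relying on the single host-wide figure in /proc/stat.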
Hello,

On Fri, Aug 23, 2024 at 01:05:17PM -0700, Joshua Hahn <joshua.hahn6@gmail.com> wrote:
> From: Joshua Hahn <joshua.hahn6@gmail.com>
>
> Cgroup-level CPU statistics currently include time spent on
> user/system processes, but do not include niced CPU time (despite
> already being tracked). This patch exposes niced CPU time to the
> userspace, allowing users to get a better understanding of their
> hardware limits and can facilitate better load-balancing.

You aren't talking about the in-kernel scheduler's load balancer, right? If
so, can you please update the description? This is a bit too confusing for a
commit message for a kernel commit.

Thanks.
Hello, thank you for reviewing the patch.

> > Cgroup-level CPU statistics currently include time spent on
> > user/system processes, but do not include niced CPU time (despite
> > already being tracked). This patch exposes niced CPU time to the
> > userspace, allowing users to get a better understanding of their
> > hardware limits and can facilitate better load-balancing.
>
> You aren't talking about the in-kernel scheduler's load balancer, right? If
> so, can you please update the description? This is a bit too confusing for a
> commit message for a kernel commit.

Thank you for pointing this out. I'll edit the commit message to the
following in a v2:

Cgroup-level CPU statistics currently include time spent on
user/system processes, but do not include niced CPU time (despite
already being tracked). This patch exposes niced CPU time to
userspace, giving users a better understanding of their hardware
limits and facilitating more informed workload distribution.

Thanks,
Joshua
From: Joshua Hahn <joshua.hahn6@gmail.com>

Niced CPU usage is a metric reported in host-level /proc/stat, but is
not reported in cgroup-level statistics in cpu.stat. However, when a
host contains multiple tasks across different workloads, it becomes
difficult to gauge how much of the task is being spent on niced
processes based on /proc/stat alone, since host-level metrics do not
provide this cgroup-level granularity.

Exposing this metric will allow load balancers to correctly probe the
niced CPU metric for each workload, and make more informed decisions
when directing higher priority tasks.

Joshua Hahn (2):
  Tracking cgroup-level niced CPU time
  Selftests for niced CPU statistics

 include/linux/cgroup-defs.h               |  1 +
 kernel/cgroup/rstat.c                     | 16 ++++-
 tools/testing/selftests/cgroup/test_cpu.c | 72 +++++++++++++++++++++++
 3 files changed, 86 insertions(+), 3 deletions(-)