Message ID | cover.1561523542.git.viresh.kumar@linaro.org |
---|---|
Series | sched/fair: Fallback to sched-idle CPU in absence of idle CPUs |
On Wed, Jun 26, 2019 at 10:36:28AM +0530, Viresh Kumar wrote:
> Hi,
>
> We try to find an idle CPU to run the next task, but in case we don't
> find an idle CPU it is better to pick a CPU which will run the task the
> soonest, for performance reason.
>
> A CPU which isn't idle but has only SCHED_IDLE activity queued on it
> should be a good target based on this criteria as any normal fair task
> will most likely preempt the currently running SCHED_IDLE task
> immediately. In fact, choosing a SCHED_IDLE CPU over a fully idle one
> shall give better results as it should be able to run the task sooner
> than an idle CPU (which requires to be woken up from an idle state).
>
> This patchset updates both fast and slow paths with this optimization.

So this basically does the trivial SCHED_IDLE<-* wakeup preemption test;
one could consider doing the full wakeup preemption test instead.

Now; the obvious argument against doing this is cost; esp. the cgroup
case is very expensive I suppose. But it might be a fun experiment to
try.

That said; I'm tempted to apply these patches..

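The fallback described in the quoted cover letter can be illustrated with a toy model. This is only a sketch, not the kernel code: `Cpu`, `pick_cpu` and the policy strings are hypothetical names, and the ordering (idle CPU first, sched-idle CPU as fallback) follows the series title, although the cover letter argues a sched-idle CPU may even beat an idle one.

```python
from dataclasses import dataclass, field
from typing import List, Optional

SCHED_NORMAL = "SCHED_NORMAL"
SCHED_IDLE = "SCHED_IDLE"


@dataclass
class Cpu:
    """Toy run queue: just the scheduling policy of each queued task."""
    cpu_id: int
    queued: List[str] = field(default_factory=list)

    def is_idle(self) -> bool:
        # Nothing is queued at all.
        return not self.queued

    def is_sched_idle(self) -> bool:
        # At least one task is queued and every queued task is SCHED_IDLE.
        return bool(self.queued) and all(p == SCHED_IDLE for p in self.queued)


def pick_cpu(cpus: List[Cpu]) -> Optional[Cpu]:
    """Prefer an idle CPU; otherwise fall back to a sched-idle CPU."""
    for cpu in cpus:
        if cpu.is_idle():
            return cpu
    for cpu in cpus:
        if cpu.is_sched_idle():
            return cpu
    return None  # no idle or sched-idle CPU available
```

A CPU that mixes normal and SCHED_IDLE tasks is deliberately not treated as a candidate here, since a newly woken fair task would still have to compete with the normal tasks on it.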
On 01-07-19, 15:43, Peter Zijlstra wrote:
> On Wed, Jun 26, 2019 at 10:36:28AM +0530, Viresh Kumar wrote:
> > Hi,
> >
> > We try to find an idle CPU to run the next task, but in case we don't
> > find an idle CPU it is better to pick a CPU which will run the task the
> > soonest, for performance reason.
> >
> > A CPU which isn't idle but has only SCHED_IDLE activity queued on it
> > should be a good target based on this criteria as any normal fair task
> > will most likely preempt the currently running SCHED_IDLE task
> > immediately. In fact, choosing a SCHED_IDLE CPU over a fully idle one
> > shall give better results as it should be able to run the task sooner
> > than an idle CPU (which requires to be woken up from an idle state).
> >
> > This patchset updates both fast and slow paths with this optimization.
>
> So this basically does the trivial SCHED_IDLE<-* wakeup preemption test;

Right.

> one could consider doing the full wakeup preemption test instead.

I am not sure what you meant by "full wakeup preemption test" :(

> Now; the obvious argument against doing this is cost; esp. the cgroup
> case is very expensive I suppose. But it might be a fun experiment to
> try.
> That said; I'm tempted to apply these patches..

Please do, who is stopping you :)

--
viresh

On Wed, 26 Jun 2019 at 13:07, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> Hi,
>
> We try to find an idle CPU to run the next task, but in case we don't
> find an idle CPU it is better to pick a CPU which will run the task the
> soonest, for performance reason.
>
> A CPU which isn't idle but has only SCHED_IDLE activity queued on it
> should be a good target based on this criteria as any normal fair task
> will most likely preempt the currently running SCHED_IDLE task
> immediately. In fact, choosing a SCHED_IDLE CPU over a fully idle one
> shall give better results as it should be able to run the task sooner
> than an idle CPU (which requires to be woken up from an idle state).
>
> This patchset updates both fast and slow paths with this optimization.
>
> Testing is done with the help of rt-app currently and here are the
> details:
>
> - Tested on Octacore Hikey platform (all CPUs change frequency
>   together).
>
> - rt-app json [1] creates few tasks and we monitor the scheduling
>   latency for them by looking at "wu_lat" field (usec).
>
> - The histograms are created using
>   https://github.com/adkein/textogram: textogram -a 0 -z 1000 -n 10
>
> - the stats are accumulated using: https://github.com/nferraz/st

Hi Viresh,

Thanks for the great work! Could you give the whole command line for testing?

Wanpeng

On 09-12-19, 11:50, Wanpeng Li wrote:
> On Wed, 26 Jun 2019 at 13:07, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> >
> > Hi,
> >
> > We try to find an idle CPU to run the next task, but in case we don't
> > find an idle CPU it is better to pick a CPU which will run the task the
> > soonest, for performance reason.
> >
> > A CPU which isn't idle but has only SCHED_IDLE activity queued on it
> > should be a good target based on this criteria as any normal fair task
> > will most likely preempt the currently running SCHED_IDLE task
> > immediately. In fact, choosing a SCHED_IDLE CPU over a fully idle one
> > shall give better results as it should be able to run the task sooner
> > than an idle CPU (which requires to be woken up from an idle state).
> >
> > This patchset updates both fast and slow paths with this optimization.
> >
> > Testing is done with the help of rt-app currently and here are the
> > details:
> >
> > - Tested on Octacore Hikey platform (all CPUs change frequency
> >   together).
> >
> > - rt-app json [1] creates few tasks and we monitor the scheduling
> >   latency for them by looking at "wu_lat" field (usec).
> >
> > - The histograms are created using
> >   https://github.com/adkein/textogram: textogram -a 0 -z 1000 -n 10
> >
> > - the stats are accumulated using: https://github.com/nferraz/st
>
> Hi Viresh,
>
> Thanks for the great work! Could you give the whole command line for testing?

The rt-app json [1] can be run using:

$ rt-app sched-idle.json

This will create a couple of files named rt-app-cfs_thread-*.log and
rt-app-idle_thread-*.log. I looked mostly at the cfs files here as that's what
we were looking for. We will be interested only in the last column of these
files, which says "wu_lat". Remove everything apart from that column (and remove
the first row as well, which has field names) from all cfs files (or write a
sed/awk command to do it for you).

After that you can generate the numbers (mean/max/min/etc) using:

$ st rt-app-cfs_thread-*.log

--
viresh

[1] https://pastebin.com/TMHGGBxD

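The column-stripping step described above could be scripted roughly as follows. This is a minimal sketch, assuming the rt-app logs are whitespace-separated text with the field names on the first row and "wu_lat" as the last field; the helper names and the `.wu_lat` output suffix are hypothetical and not part of rt-app or st.

```python
import glob


def extract_wu_lat(lines):
    """Return the last whitespace-separated field of every data row.

    The first row is assumed to hold the field names and is skipped;
    blank lines are ignored.
    """
    values = []
    for line in lines[1:]:
        fields = line.split()
        if fields:
            values.append(fields[-1])
    return values


def strip_log(path):
    """Write the last column of one log to <path>.wu_lat and return that path."""
    with open(path) as src:
        values = extract_wu_lat(src.readlines())
    out_path = path + ".wu_lat"
    with open(out_path, "w") as dst:
        dst.writelines(value + "\n" for value in values)
    return out_path


if __name__ == "__main__":
    # Only the cfs logs are processed, as in the instructions above.
    for log in glob.glob("rt-app-cfs_thread-*.log"):
        print(strip_log(log))
```

Writing to separate files keeps the original logs intact; the stripped files could then be handed to st in place of the logs, for example `st rt-app-cfs_thread-*.log.wu_lat`.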
On Tue, 10 Dec 2019 at 14:33, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 09-12-19, 11:50, Wanpeng Li wrote:
> > On Wed, 26 Jun 2019 at 13:07, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > >
> > > Hi,
> > >
> > > We try to find an idle CPU to run the next task, but in case we don't
> > > find an idle CPU it is better to pick a CPU which will run the task the
> > > soonest, for performance reason.
> > >
> > > A CPU which isn't idle but has only SCHED_IDLE activity queued on it
> > > should be a good target based on this criteria as any normal fair task
> > > will most likely preempt the currently running SCHED_IDLE task
> > > immediately. In fact, choosing a SCHED_IDLE CPU over a fully idle one
> > > shall give better results as it should be able to run the task sooner
> > > than an idle CPU (which requires to be woken up from an idle state).
> > >
> > > This patchset updates both fast and slow paths with this optimization.
> > >
> > > Testing is done with the help of rt-app currently and here are the
> > > details:
> > >
> > > - Tested on Octacore Hikey platform (all CPUs change frequency
> > >   together).
> > >
> > > - rt-app json [1] creates few tasks and we monitor the scheduling
> > >   latency for them by looking at "wu_lat" field (usec).
> > >
> > > - The histograms are created using
> > >   https://github.com/adkein/textogram: textogram -a 0 -z 1000 -n 10
> > >
> > > - the stats are accumulated using: https://github.com/nferraz/st
> >
> > Hi Viresh,
> >
> > Thanks for the great work! Could you give the whole command line for testing?
>
> The rt-app json [1] can be run using:
>
> $ rt-app sched-idle.json
>
> This will create a couple of files named rt-app-cfs_thread-*.log and
> rt-app-idle_thread-*.log. I looked mostly at the cfs files here as that's what
> we were looking for. We will be interested only in the last column of these
> files, which says "wu_lat". Remove everything apart from that column (and remove
> the first row as well, which has field names) from all cfs files (or write a
> sed/awk command to do it for you).
>
> After that you can generate the numbers (mean/max/min/etc) using:
>
> $ st rt-app-cfs_thread-*.log

Thanks for pointing this out.

Wanpeng