Message ID | 20240410045417.3048209-1-quic_atulpant@quicinc.com |
---|---|
Headers | show |
Series | Disable RT-throttling for idle-inject threads | expand |
On 4/10/24 06:54, Atul Pant wrote: > To prevent the throttling of RT idle threads, like the idle-inject > threads, skip accounting of runtime for these threads. > > Signed-off-by: Atul Pant <quic_atulpant@quicinc.com> > --- > kernel/sched/rt.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c > index 4ac36eb4cdee..d20999270e75 100644 > --- a/kernel/sched/rt.c > +++ b/kernel/sched/rt.c > @@ -1075,7 +1075,9 @@ static void update_curr_rt(struct rq *rq) > struct rt_rq *rt_rq = rt_rq_of_se(rt_se); > int exceeded; > > - if (sched_rt_runtime(rt_rq) != RUNTIME_INF) { > + if (sched_rt_runtime(rt_rq) != RUNTIME_INF && > + !(curr->policy == SCHED_FIFO && > + curr->flags & PF_IDLE)) { FYI, this will not be a problem with dl server because play_idle_precise() disables preemption, so the dl server will not be scheduled until preempt_enable(). with the DL server, the time consumed as an rt task will not change the DL server behavior because the logic inverts: it provides bandwidth for the fair scheduler (instead of throttling the RT sched). -- Daniel > raw_spin_lock(&rt_rq->rt_runtime_lock); > rt_rq->rt_time += delta_exec; > exceeded = sched_rt_runtime_exceeded(rt_rq);
On Wed, Apr 10, 2024 at 10:24:15AM +0530, Atul Pant wrote: > We are trying to implement a solution for thermal mitigation by using > idle injection on CPUs. However we face some limitations with the > current idle-inject framework. As per our need, we want to start > injecting idle cycles on a cpu for indefinite time (until the > temperature/power of the CPU falls below a threshold). This will allow > to keep the hot CPUs in the sleep state until we see improvement in > temperature/power. If we set idle duration to a large value or have an > idle-injection ratio of 100%, then the idle-inject RT thread suffers > from RT throttling. This results in the CPU exiting from the sleep state > and consume some power. > > To solve this limitation, we propose a solution to disable RT-throttling > whenever idle-inject threads run. We achieve this by not accounting the > runtime for the idle-inject threads. Running RT tasks for indefinite amounts of time will wreck the system. Things like workqueues and other per-cpu threads expect service or things will pile up and run to ground. Idle injection, just like every other RT user must not be able to starve the system of service. If your system design requires this (I would argue it is broken), look at other means, like CPU-hotplug (which I also really detest) -- which takes down the CPU in a controlled manner and avoids the resource issues.
On Wed, Apr 10, 2024 at 04:59:33PM +0530, Atul Kumar Pant wrote: > Hi Peter, > We are trying to add support for true 100% idle-injection ratio from > idle-injection framework. Yeah, I got that. I'm saying that is broken. Both from a requirement POV and an implementation POV. If your hardware needs 100% idle injection that means CPU availability is unreliable and everything that relies on CPU-masks will be broken. Furthermore, since idle-injection is build on top of FIFO, anything with a higher priority (DL, stop) will simply preempt it anyway. And, as already mentioned, hogging the system with FIFO will break things. So I would *very* strongly urge you to push back on whoever thought this is 'needed'. This is *bad* hardware. > It might happen that we want to run idle cycles for > slightly more time than permitted by RT-bandwidth control. The thing is configurable for a reason. > We understand the > concern about it hogging the cpu. Will it be better if we make it a choice for > the user who uses idle-inject framework, whether to have true 100% > idle-injection support or not? None of the above mentioned issues will magically go away. Run a 100% FIFO task for an indeterminate amount of time and you get to keep the pieces. Also, we'll be replacing the throttling code with DL servers 'soonish' at which point all this will stop working anyway, since DL will preempt anything FIFO, including your idle injection crud.
On 4/10/24 13:46, Peter Zijlstra wrote: > Also, we'll be replacing the throttling code with DL servers 'soonish' > at which point all this will stop working anyway, since DL will preempt > anything FIFO, including your idle injection crud. +1 also, given that the code spins with preempt disabled, with dl server it could even become a non-rt thread... FIFO RUNNING DL_SERVER activates their loop disables preemption() run() enable preemption() DL_SERVE throttled FIFO BACK So, there will be no need for this busy loop to be RT. Anyways, it breaks RT and DL if it keeps running for too long... It can also cause complaints like RCU stalls and loong wait on locks, e.g., on kworkers... -- Daniel
On Wed, 10 Apr 2024 10:24:15 +0530 Atul Pant <quic_atulpant@quicinc.com> wrote: > We are trying to implement a solution for thermal mitigation by using > idle injection on CPUs. However we face some limitations with the > current idle-inject framework. As per our need, we want to start > injecting idle cycles on a cpu for indefinite time (until the > temperature/power of the CPU falls below a threshold). This will allow > to keep the hot CPUs in the sleep state until we see improvement in > temperature/power. If we set idle duration to a large value or have an > idle-injection ratio of 100%, then the idle-inject RT thread suffers > from RT throttling. This results in the CPU exiting from the sleep state > and consume some power. > > To solve this limitation, we propose a solution to disable RT-throttling > whenever idle-inject threads run. We achieve this by not accounting the > runtime for the idle-inject threads. I'm going to assume that when dl-server is finally accepted, this will no longer be an issue for you? https://lore.kernel.org/all/cover.1699095159.git.bristot@kernel.org/ -- Steve