Message ID: 20160204163049.GE29586@e106622-lin
State: New
On 04/02/16 12:31, Steven Rostedt wrote:
> On Thu, 4 Feb 2016 16:30:49 +0000
> Juri Lelli <juri.lelli@arm.com> wrote:
>
> > I've actually changed this approach a bit, and things seem better
> > here. Could you please give this a try? (You can also fetch the same
> > branch).
>
> It appears to fix the one issue I pointed out, but it doesn't fix the
> issue with cpusets.
>
>  # burn&
>  # TASK=$!
>  # schedtool -E -t 2000000:20000000 $TASK
>  # grep dl /proc/sched_debug
>  dl_rq[0]:
>    .dl_nr_running   : 0
>    .dl_bw->bw       : 996147
>    .dl_bw->total_bw : 104857
>  dl_rq[1]:
>    .dl_nr_running   : 0
>    .dl_bw->bw       : 996147
>    .dl_bw->total_bw : 104857
>  dl_rq[2]:
>    .dl_nr_running   : 0
>    .dl_bw->bw       : 996147
>    .dl_bw->total_bw : 104857
>  dl_rq[3]:
>    .dl_nr_running   : 0
>    .dl_bw->bw       : 996147
>    .dl_bw->total_bw : 104857
>  dl_rq[4]:
>    .dl_nr_running   : 0
>    .dl_bw->bw       : 996147
>    .dl_bw->total_bw : 104857
>  dl_rq[5]:
>    .dl_nr_running   : 0
>    .dl_bw->bw       : 996147
>    .dl_bw->total_bw : 104857
>  dl_rq[6]:
>    .dl_nr_running   : 0
>    .dl_bw->bw       : 996147
>    .dl_bw->total_bw : 104857
>  dl_rq[7]:
>    .dl_nr_running   : 0
>    .dl_bw->bw       : 996147
>    .dl_bw->total_bw : 104857
>
>  # mkdir /sys/fs/cgroup/cpuset/my_cpuset
>  # echo 1 > /sys/fs/cgroup/cpuset/my_cpuset/cpuset.cpus
>  # grep dl /proc/sched_debug
>  dl_rq[0]:
>    .dl_nr_running   : 0
>    .dl_bw->bw       : 996147
>    .dl_bw->total_bw : 209714
>  dl_rq[1]:
>    .dl_nr_running   : 0
>    .dl_bw->bw       : 996147
>    .dl_bw->total_bw : 209714
>  dl_rq[2]:
>    .dl_nr_running   : 0
>    .dl_bw->bw       : 996147
>    .dl_bw->total_bw : 209714
>  dl_rq[3]:
>    .dl_nr_running   : 0
>    .dl_bw->bw       : 996147
>    .dl_bw->total_bw : 209714
>  dl_rq[4]:
>    .dl_nr_running   : 0
>    .dl_bw->bw       : 996147
>    .dl_bw->total_bw : 209714
>  dl_rq[5]:
>    .dl_nr_running   : 0
>    .dl_bw->bw       : 996147
>    .dl_bw->total_bw : 209714
>  dl_rq[6]:
>    .dl_nr_running   : 0
>    .dl_bw->bw       : 996147
>    .dl_bw->total_bw : 209714
>  dl_rq[7]:
>    .dl_nr_running   : 0
>    .dl_bw->bw       : 996147
>    .dl_bw->total_bw : 209714
>
> It appears to add double the bandwidth.
>

Mmm.. IIUC that's because we don't destroy any root_domain in this case,
as sched_load_balance of the parent is still set, so we add the bandwidth
again to the existing one. I could fix that with a flag indicating when we
actually destroy root_domain(s), but I fear it would make this solution
even uglier than it already is :/. More thinking required.

Thanks for testing.

Best,

- Juri
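For reference, the figures in the dumps above follow the scheduler's Q20
fixed-point bandwidth encoding: .dl_bw->bw = 996147 is the default 95%
admission cap (950000/1000000, shifted left by 20), a single
2000000:20000000 reservation accounts for 104857 (a 0.1 ratio), and
209714 is exactly that one task counted twice. Below is a minimal
userspace sketch mirroring the kernel's to_ratio() arithmetic; the
standalone program itself is illustrative, only the helper's logic and
the default rt runtime/period values come from mainline.

#include <stdio.h>
#include <stdint.h>

/* Bandwidth is a Q20 fixed-point fraction of one CPU, as computed by
 * to_ratio() in kernel/sched/core.c. */
static uint64_t to_ratio(uint64_t period, uint64_t runtime)
{
	if (period == 0)
		return 0;
	return (runtime << 20) / period;
}

int main(void)
{
	/* Default admission cap: 950000us runtime per 1000000us period. */
	printf("dl_bw->bw      : %llu\n",
	       (unsigned long long)to_ratio(1000000, 950000));  /* 996147 */

	/* The test task: 2000000 runtime per 20000000 period. */
	uint64_t task_bw = to_ratio(20000000, 2000000);
	printf("one task       : %llu\n",
	       (unsigned long long)task_bw);                    /* 104857 */

	/* The broken case: the same task accounted twice. */
	printf("double-counted : %llu\n",
	       (unsigned long long)(2 * task_bw));              /* 209714 */
	return 0;
}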
diff --git a/include/linux/sched.h b/include/linux/sched.h
index a10494a..5f9eeb4 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2241,6 +2241,8 @@ extern int cpuset_cpumask_can_shrink(const struct cpumask *cur,
 				     const struct cpumask *trial);
 extern int task_can_attach(struct task_struct *p,
 			   const struct cpumask *cs_cpus_allowed);
+void sched_restore_dl_bw(struct task_struct *task,
+			 const struct cpumask *new_mask);
 #ifdef CONFIG_SMP
 extern void do_set_cpus_allowed(struct task_struct *p,
 				const struct cpumask *new_mask);
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 3e945fc..57078f0 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -785,6 +785,44 @@ done:
 	return ndoms;
 }
 
+/**
+ * update_tasks_rd - Update tasks' root_domains status.
+ * @cs: the cpuset to which each task's root_domain belongs
+ *
+ * Iterate through each task of @cs updating state of its related
+ * root_domain.
+ */
+static void update_tasks_rd(struct cpuset *cs)
+{
+	struct css_task_iter it;
+	struct task_struct *task;
+
+	css_task_iter_start(&cs->css, &it);
+	while ((task = css_task_iter_next(&it)))
+		sched_restore_dl_bw(task, cs->effective_cpus);
+	css_task_iter_end(&it);
+}
+
+static void cpuset_update_rd(void)
+{
+	struct cpuset *cs;
+	struct cgroup_subsys_state *pos_css;
+
+	lockdep_assert_held(&cpuset_mutex);
+	rcu_read_lock();
+	cpuset_for_each_descendant_pre(cs, pos_css, &top_cpuset) {
+		if (!css_tryget_online(&cs->css))
+			continue;
+		rcu_read_unlock();
+
+		update_tasks_rd(cs);
+
+		rcu_read_lock();
+		css_put(&cs->css);
+	}
+	rcu_read_unlock();
+}
+
 /*
  * Rebuild scheduler domains.
  *
@@ -818,6 +856,7 @@ static void rebuild_sched_domains_locked(void)
 
 	/* Have scheduler rebuild the domains */
 	partition_sched_domains(ndoms, doms, attr);
+	cpuset_update_rd();
 out:
 	put_online_cpus();
 }
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f1ce7a8..f9558f0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2277,6 +2277,23 @@ static inline int dl_bw_cpus(int i)
 }
 #endif
 
+void sched_restore_dl_bw(struct task_struct *task,
+			 const struct cpumask *new_mask)
+{
+	struct dl_bw *dl_b;
+	unsigned long flags;
+
+	if (!task_has_dl_policy(task))
+		return;
+
+	rcu_read_lock_sched();
+	dl_b = dl_bw_of(cpumask_any(new_mask));
+	raw_spin_lock_irqsave(&dl_b->lock, flags);
+	dl_b->total_bw += task->dl.dl_bw;
+	raw_spin_unlock_irqrestore(&dl_b->lock, flags);
+	rcu_read_unlock_sched();
+}
+
 /*
  * We must be sure that accepting a new task (or allowing changing the
  * parameters of an existing one) is consistent with the bandwidth
@@ -5636,6 +5653,17 @@ static void rq_attach_root(struct rq *rq, struct root_domain *rd)
 
 		cpumask_clear_cpu(rq->cpu, old_rd->span);
 
+		if (old_rd == &def_root_domain &&
+		    cpumask_empty(old_rd->span)) {
+			/*
+			 * def_root_domain is never freed, so we have to clean
+			 * it when it becomes empty.
+			 */
+			raw_spin_lock(&old_rd->dl_bw.lock);
+			old_rd->dl_bw.total_bw = 0;
+			raw_spin_unlock(&old_rd->dl_bw.lock);
+		}
+
 		/*
 		 * If we dont want to free the old_rd yet then
 		 * set old_rd to NULL to skip the freeing later
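To make the flag idea mentioned above concrete, here is a toy userspace
model of the accounting (not kernel code; rebuild(), restore() and the
destroyed/use_flag parameters are invented for illustration): re-running
the restore pass over a root_domain that survived the rebuild
double-counts the task, while gating the restore on "was the old
root_domain actually destroyed?" keeps total_bw stable.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct root_domain { uint64_t total_bw; };

/* One DL task worth 104857 in Q20 (runtime/period = 0.1). */
static const uint64_t task_bw = 104857;

/* The restore pass re-adds each task's bandwidth after a rebuild. */
static void restore(struct root_domain *rd)
{
	rd->total_bw += task_bw;
}

static void rebuild(struct root_domain *rd, bool destroyed, bool use_flag)
{
	if (destroyed)
		rd->total_bw = 0;	/* a recreated root_domain starts empty */
	if (!use_flag || destroyed)
		restore(rd);		/* with the flag, restore only what was lost */
}

int main(void)
{
	struct root_domain rd = { .total_bw = task_bw };

	/* Unguarded: the root_domain survives, so the task is added again. */
	rebuild(&rd, false, false);
	printf("unguarded: %llu\n", (unsigned long long)rd.total_bw); /* 209714 */

	/* Guarded: nothing was destroyed, so there is nothing to restore. */
	rd.total_bw = task_bw;
	rebuild(&rd, false, true);
	printf("guarded:   %llu\n", (unsigned long long)rd.total_bw); /* 104857 */
	return 0;
}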