From patchwork Wed Sep 21 10:14:48 2016
X-Patchwork-Submitter: Dietmar Eggemann
X-Patchwork-Id: 76680
Subject: Re: [PATCH 2/7 v3] sched: fix hierarchical order in rq->leaf_cfs_rq_list
To: Vincent Guittot , peterz@infradead.org, mingo@kernel.org,
 linux-kernel@vger.kernel.org, yuyang.du@intel.com, Morten.Rasmussen@arm.com
Cc: linaro-kernel@lists.linaro.org, pjt@google.com, bsegall@google.com
From: Dietmar Eggemann
Date: Wed, 21 Sep 2016 11:14:48 +0100
References: <1473666472-13749-1-git-send-email-vincent.guittot@linaro.org>
 <1473666472-13749-3-git-send-email-vincent.guittot@linaro.org>
In-Reply-To: <1473666472-13749-3-git-send-email-vincent.guittot@linaro.org>

Hi Vincent,

On 12/09/16 08:47, Vincent Guittot wrote:
> Fix the insertion of cfs_rq in rq->leaf_cfs_rq_list to ensure that
> a child will always be called before its parent.
>
> The hierarchical order in the shares update list was introduced by
> commit 67e86250f8ea ("sched: Introduce hierarchal order on shares update list").
>
> With the current implementation a child can still be put after its parent.
>
> Let's take the example of
>        root
>         \
>          b
>          /\
>         c  d*
>            |
>            e*
>
> with root -> b -> c already enqueued but not d -> e, so the leaf_cfs_rq_list
> looks like: head -> c -> b -> root -> tail
>
> The branch d -> e will be added the first time they are enqueued,
> starting with e then d.
>
> When e is added, its parent is not yet on the list, so e is put at the
> tail: head -> c -> b -> root -> e -> tail
>
> Then d is added at the head because its parent is already on the list:
> head -> d -> c -> b -> root -> e -> tail
>
> e is not placed at the right position and will be called last, whereas
> it should be called first.
>
> Because enqueueing follows the bottom-up sequence, we are sure that we
> will finish by adding either a cfs_rq without a parent or a cfs_rq whose
> parent is already on the list. We can use this event to detect when we
> have finished adding a new branch. For the others, whose parents are not
> yet on the list, we have to ensure that they are added after the children
> inserted in the previous steps, and after any potential parents that are
> already on the list. The easiest way is to put the cfs_rq just after the
> last inserted one and to keep track of it until the branch is fully added.

[...]
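For reference (this is not part of the quoted patch, just a rough sketch of the
pre-patch code in kernel/sched/fair.c): so far list_add_leaf_cfs_rq() only
distinguishes "parent already on the list" from "everything else", and the
everything-else case is what lets e end up behind root in the example above:

static inline void list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
{
	if (!cfs_rq->on_list) {
		/*
		 * If our parent is already on the per-rq leaf list, insert
		 * at the head so we stay in front of it; otherwise fall
		 * back to the tail, which is the case that breaks the
		 * child-before-parent ordering described above.
		 */
		if (cfs_rq->tg->parent &&
		    cfs_rq->tg->parent->cfs_rq[cpu_of(rq_of(cfs_rq))]->on_list) {
			list_add_rcu(&cfs_rq->leaf_cfs_rq_list,
				     &rq_of(cfs_rq)->leaf_cfs_rq_list);
		} else {
			list_add_tail_rcu(&cfs_rq->leaf_cfs_rq_list,
					  &rq_of(cfs_rq)->leaf_cfs_rq_list);
		}

		cfs_rq->on_list = 1;
	}
}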
I tested it with a multi-level task group hierarchy (tg_css_id=2,4,6) and it
does the right thing for this test case (task t runs alternately on CPU0 and
CPU1 in the tg with tg_css_id=6).

I wonder if this patch is related to the comment "TODO: fix up out-of-order
children on enqueue." in update_shares_cpu() of commit 82958366cfea ("sched:
Replace update_shares weight distribution with per-entity computation")?

I guess in the meantime we lost the functionality to remove a cfs_rq from the
leaf_cfs_rq list once there are no se's enqueued on it anymore. If e.g. t
migrates away from CPU1, all the cfs_rq's of the task group hierarchy (for
tg_css_id=2,4,6) owned by CPU1 stay in the leaf_cfs_rq list.

Shouldn't we reintegrate it? The following patch goes on top of this patch:

-- >8 --

From: Dietmar Eggemann
Date: Tue, 20 Sep 2016 17:30:09 +0100
Subject: [PATCH] Re-integrate list_del_leaf_cfs_rq() into
 update_blocked_averages()

There is no functionality anymore to delete a cfs_rq from the leaf_cfs_rq
list in case there are no se's enqueued on it. The functionality had
initially been introduced by commit 82958366cfea ("sched: Replace
update_shares weight distribution with per-entity computation") but was
taken out by commit 9d89c257dfb9 ("sched/fair: Rewrite runnable load and
utilization average tracking").

Signed-off-by: Dietmar Eggemann
---
 kernel/sched/fair.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

-- 
1.9.1

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index dd530359ef84..951c83337e2b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6584,8 +6584,12 @@ static void update_blocked_averages(int cpu)
 			update_tg_load_avg(cfs_rq, 0);
 
 		/* Propagate pending load changes to the parent */
-		if (cfs_rq->tg->se[cpu])
+		if (cfs_rq->tg->se[cpu]) {
 			update_load_avg(cfs_rq->tg->se[cpu], 0, 0);
+
+			if (!cfs_rq->nr_running)
+				list_del_leaf_cfs_rq(cfs_rq);
+		}
 	}
 	raw_spin_unlock_irqrestore(&rq->lock, flags);
 }
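For what it's worth, list_del_leaf_cfs_rq() is already guarded by
cfs_rq->on_list (roughly the following, from kernel/sched/fair.c), so calling
it from update_blocked_averages() for a cfs_rq that is not (or no longer) on
the leaf list should be harmless:

static inline void list_del_leaf_cfs_rq(struct cfs_rq *cfs_rq)
{
	if (cfs_rq->on_list) {
		/* Only unlink cfs_rqs that are actually on the per-rq leaf list */
		list_del_rcu(&cfs_rq->leaf_cfs_rq_list);
		cfs_rq->on_list = 0;
	}
}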