From patchwork Thu Feb 8 11:55:37 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lukasz Luba X-Patchwork-Id: 771152 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 5115A5427E; Thu, 8 Feb 2024 11:56:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707393377; cv=none; b=K93qfj+r6H7SBJgXTxc0j32FTr2wXNqZAvD6G3gan2x7arVMo4+ois6nixR61/nOvorDR651qaZ1C+utbkwPE53uAIEH+NNoA4c9MRTNOUsug3SAL1dzke3fGK+t1L1GopiU/c+AOkc1cubfMoQ8Cp2nkE+siz9n5wXAIqAbBG0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707393377; c=relaxed/simple; bh=vFMblhCLavHFIcqb3d9HPoVvxX4CxDbS2BZT1wAG2Rs=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=rJz5/EC2Os2br+fyerKhK26+1d5H93XJSwyt6lGOCj2UToxN2WHFa6dgtY2fZ0SjFC0kLXnhu3Cz2cKO7lTxFWWeMhkZlprgQkcsVxjy7JwOYEwFe/2YSPBVdNZD7Kg8yGSrkWfh7WlMhHTT6VCC5QPi0WbXcA1RBCwIIb2FOho= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id D9E641FB; Thu, 8 Feb 2024 03:56:56 -0800 (PST) Received: from e129166.arm.com (unknown [10.57.8.23]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id D51B93F5A1; Thu, 8 Feb 2024 03:56:11 -0800 (PST) From: Lukasz Luba To: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, rafael@kernel.org Cc: lukasz.luba@arm.com, dietmar.eggemann@arm.com, rui.zhang@intel.com, amit.kucheria@verdurent.com, amit.kachhap@gmail.com, daniel.lezcano@linaro.org, viresh.kumar@linaro.org, len.brown@intel.com, pavel@ucw.cz, mhiramat@kernel.org, qyousef@layalina.io, wvw@google.com, xuewen.yan94@gmail.com, Hongyan Xia Subject: [PATCH v8 03/23] PM: EM: Find first CPU active while updating OPP efficiency Date: Thu, 8 Feb 2024 11:55:37 +0000 Message-Id: <20240208115557.1273962-4-lukasz.luba@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20240208115557.1273962-1-lukasz.luba@arm.com> References: <20240208115557.1273962-1-lukasz.luba@arm.com> Precedence: bulk X-Mailing-List: linux-pm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 The Energy Model might be updated at runtime and the energy efficiency for each OPP may change. Thus, there is a need to update also the cpufreq framework and make it aligned to the new values. In order to do that, use a first active CPU from the Performance Domain. This is needed since the first CPU in the cpumask might be offline when we run this code path. Reviewed-by: Hongyan Xia Reviewed-by: Dietmar Eggemann Tested-by: Dietmar Eggemann Signed-off-by: Lukasz Luba --- kernel/power/energy_model.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c index 8c373b151875..0c3220ff54f7 100644 --- a/kernel/power/energy_model.c +++ b/kernel/power/energy_model.c @@ -243,12 +243,19 @@ em_cpufreq_update_efficiencies(struct device *dev, struct em_perf_state *table) struct em_perf_domain *pd = dev->em_pd; struct cpufreq_policy *policy; int found = 0; - int i; + int i, cpu; if (!_is_cpu_device(dev)) return; - policy = cpufreq_cpu_get(cpumask_first(em_span_cpus(pd))); + /* Try to get a CPU which is active and in this PD */ + cpu = cpumask_first_and(em_span_cpus(pd), cpu_active_mask); + if (cpu >= nr_cpu_ids) { + dev_warn(dev, "EM: No online CPU for CPUFreq policy\n"); + return; + } + + policy = cpufreq_cpu_get(cpu); if (!policy) { dev_warn(dev, "EM: Access to CPUFreq policy failed\n"); return; From patchwork Thu Feb 8 11:55:39 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lukasz Luba X-Patchwork-Id: 771151 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 5388A6F51E; Thu, 8 Feb 2024 11:56:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707393383; cv=none; b=m/gV6ZvxTClVTM5pY260QR0RyQImfyDZjnqAhAoHDQC3DMKjAhsHuB+GPqcHeOD3kkOQMb74q8tCTlKva5S9/wULcvcsrDx9ShSO+ohBpW1B1rn4O0ZT+KcyyJBic7RsYbbCEgS+M0rtrICaV5kpboE0GRYEv5bT+DYIb/ZuIKY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707393383; c=relaxed/simple; bh=bJjuu1EuWo74cFfQaOzHPNk7ZlfyPGiQ3J7BsnvcTcQ=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=eZ86kaXJxcZX402AsiOH0enAawl/CQhPSrgFJid671aHSFLFa9XR1tc+MJ0UQ1GZusso8MroHUguWWUVO+aKoKz+maAqZ10yC+2d/V/Ey4ZDOKPN/yMhl5DTcuwPLBlQ0nyj3OE7VqAjzlS1O++23Lb2r4lHt+sdS7PLflRTN1M= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id E9504DA7; Thu, 8 Feb 2024 03:57:02 -0800 (PST) Received: from e129166.arm.com (unknown [10.57.8.23]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 143B13F5A1; Thu, 8 Feb 2024 03:56:17 -0800 (PST) From: Lukasz Luba To: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, rafael@kernel.org Cc: lukasz.luba@arm.com, dietmar.eggemann@arm.com, rui.zhang@intel.com, amit.kucheria@verdurent.com, amit.kachhap@gmail.com, daniel.lezcano@linaro.org, viresh.kumar@linaro.org, len.brown@intel.com, pavel@ucw.cz, mhiramat@kernel.org, qyousef@layalina.io, wvw@google.com, xuewen.yan94@gmail.com Subject: [PATCH v8 05/23] PM: EM: Introduce em_compute_costs() Date: Thu, 8 Feb 2024 11:55:39 +0000 Message-Id: <20240208115557.1273962-6-lukasz.luba@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20240208115557.1273962-1-lukasz.luba@arm.com> References: <20240208115557.1273962-1-lukasz.luba@arm.com> Precedence: bulk X-Mailing-List: linux-pm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Move the EM costs computation code into a new dedicated function, em_compute_costs(), that can be reused in other places in the future. This change is not expected to alter the general functionality. Reviewed-by: Dietmar Eggemann Tested-by: Dietmar Eggemann Signed-off-by: Lukasz Luba --- kernel/power/energy_model.c | 72 ++++++++++++++++++++++--------------- 1 file changed, 43 insertions(+), 29 deletions(-) diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c index 0c3220ff54f7..5c47caaf270e 100644 --- a/kernel/power/energy_model.c +++ b/kernel/power/energy_model.c @@ -103,14 +103,52 @@ static void em_debug_create_pd(struct device *dev) {} static void em_debug_remove_pd(struct device *dev) {} #endif +static int em_compute_costs(struct device *dev, struct em_perf_state *table, + struct em_data_callback *cb, int nr_states, + unsigned long flags) +{ + unsigned long prev_cost = ULONG_MAX; + u64 fmax; + int i, ret; + + /* Compute the cost of each performance state. */ + fmax = (u64) table[nr_states - 1].frequency; + for (i = nr_states - 1; i >= 0; i--) { + unsigned long power_res, cost; + + if (flags & EM_PERF_DOMAIN_ARTIFICIAL) { + ret = cb->get_cost(dev, table[i].frequency, &cost); + if (ret || !cost || cost > EM_MAX_POWER) { + dev_err(dev, "EM: invalid cost %lu %d\n", + cost, ret); + return -EINVAL; + } + } else { + power_res = table[i].power; + cost = div64_u64(fmax * power_res, table[i].frequency); + } + + table[i].cost = cost; + + if (table[i].cost >= prev_cost) { + table[i].flags = EM_PERF_STATE_INEFFICIENT; + dev_dbg(dev, "EM: OPP:%lu is inefficient\n", + table[i].frequency); + } else { + prev_cost = table[i].cost; + } + } + + return 0; +} + static int em_create_perf_table(struct device *dev, struct em_perf_domain *pd, int nr_states, struct em_data_callback *cb, unsigned long flags) { - unsigned long power, freq, prev_freq = 0, prev_cost = ULONG_MAX; + unsigned long power, freq, prev_freq = 0; struct em_perf_state *table; int i, ret; - u64 fmax; table = kcalloc(nr_states, sizeof(*table), GFP_KERNEL); if (!table) @@ -154,33 +192,9 @@ static int em_create_perf_table(struct device *dev, struct em_perf_domain *pd, table[i].frequency = prev_freq = freq; } - /* Compute the cost of each performance state. */ - fmax = (u64) table[nr_states - 1].frequency; - for (i = nr_states - 1; i >= 0; i--) { - unsigned long power_res, cost; - - if (flags & EM_PERF_DOMAIN_ARTIFICIAL) { - ret = cb->get_cost(dev, table[i].frequency, &cost); - if (ret || !cost || cost > EM_MAX_POWER) { - dev_err(dev, "EM: invalid cost %lu %d\n", - cost, ret); - goto free_ps_table; - } - } else { - power_res = table[i].power; - cost = div64_u64(fmax * power_res, table[i].frequency); - } - - table[i].cost = cost; - - if (table[i].cost >= prev_cost) { - table[i].flags = EM_PERF_STATE_INEFFICIENT; - dev_dbg(dev, "EM: OPP:%lu is inefficient\n", - table[i].frequency); - } else { - prev_cost = table[i].cost; - } - } + ret = em_compute_costs(dev, table, cb, nr_states, flags); + if (ret) + goto free_ps_table; pd->table = table; pd->nr_perf_states = nr_states; From patchwork Thu Feb 8 11:55:41 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lukasz Luba X-Patchwork-Id: 771150 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 73DCE7640C; Thu, 8 Feb 2024 11:56:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707393389; cv=none; b=OumosfupNZKX472XKvzGp96Iodm/aMk+ZfqLwWBAoNx8YW/6uhnAeAgmoURp5mbftcs3R1RnYZkGEeMHjWQgiKOUt54UmHyQSfdVQz3nFBs+1vCehezs41ybjcQCMAWgwUvNRipj54oRUf6l+KD1nGTGBByy+RuU5cYmNkc9mlU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707393389; c=relaxed/simple; bh=uOfszU8onKghXMWUp5uW3OMPxsyNzrUd6JfObHueJo0=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=iE+BK5K2pVSUbOBhLD21F2BYaRLmg+5ayMrhV6z14K80pPyp9FsabPDVIjRpChLIlTgC9zUf4bKSXU8B2sYlui+as+KDBtNNRgyYi/drjl/AccGLTCpQdfhvjLrRnumTdTtESH4DaZXI8RNN3Zj4CVkdxWu8qldiNT47FdDjixA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 21E031FB; Thu, 8 Feb 2024 03:57:09 -0800 (PST) Received: from e129166.arm.com (unknown [10.57.8.23]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 403393F5A1; Thu, 8 Feb 2024 03:56:24 -0800 (PST) From: Lukasz Luba To: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, rafael@kernel.org Cc: lukasz.luba@arm.com, dietmar.eggemann@arm.com, rui.zhang@intel.com, amit.kucheria@verdurent.com, amit.kachhap@gmail.com, daniel.lezcano@linaro.org, viresh.kumar@linaro.org, len.brown@intel.com, pavel@ucw.cz, mhiramat@kernel.org, qyousef@layalina.io, wvw@google.com, xuewen.yan94@gmail.com Subject: [PATCH v8 07/23] PM: EM: Split the allocation and initialization of the EM table Date: Thu, 8 Feb 2024 11:55:41 +0000 Message-Id: <20240208115557.1273962-8-lukasz.luba@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20240208115557.1273962-1-lukasz.luba@arm.com> References: <20240208115557.1273962-1-lukasz.luba@arm.com> Precedence: bulk X-Mailing-List: linux-pm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Split the process of allocation and data initialization for the EM table. The upcoming changes for modifiable EM will use it. This change is not expected to alter the general functionality. Reviewed-by: Dietmar Eggemann Tested-by: Dietmar Eggemann Signed-off-by: Lukasz Luba --- kernel/power/energy_model.c | 55 ++++++++++++++++++++++--------------- 1 file changed, 33 insertions(+), 22 deletions(-) diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c index 21d761223255..7468fa92134b 100644 --- a/kernel/power/energy_model.c +++ b/kernel/power/energy_model.c @@ -142,18 +142,26 @@ static int em_compute_costs(struct device *dev, struct em_perf_state *table, return 0; } +static int em_allocate_perf_table(struct em_perf_domain *pd, + int nr_states) +{ + pd->table = kcalloc(nr_states, sizeof(struct em_perf_state), + GFP_KERNEL); + if (!pd->table) + return -ENOMEM; + + return 0; +} + static int em_create_perf_table(struct device *dev, struct em_perf_domain *pd, - int nr_states, struct em_data_callback *cb, + struct em_perf_state *table, + struct em_data_callback *cb, unsigned long flags) { unsigned long power, freq, prev_freq = 0; - struct em_perf_state *table; + int nr_states = pd->nr_perf_states; int i, ret; - table = kcalloc(nr_states, sizeof(*table), GFP_KERNEL); - if (!table) - return -ENOMEM; - /* Build the list of performance states for this performance domain */ for (i = 0, freq = 0; i < nr_states; i++, freq++) { /* @@ -165,7 +173,7 @@ static int em_create_perf_table(struct device *dev, struct em_perf_domain *pd, if (ret) { dev_err(dev, "EM: invalid perf. state: %d\n", ret); - goto free_ps_table; + return -EINVAL; } /* @@ -175,7 +183,7 @@ static int em_create_perf_table(struct device *dev, struct em_perf_domain *pd, if (freq <= prev_freq) { dev_err(dev, "EM: non-increasing freq: %lu\n", freq); - goto free_ps_table; + return -EINVAL; } /* @@ -185,7 +193,7 @@ static int em_create_perf_table(struct device *dev, struct em_perf_domain *pd, if (!power || power > EM_MAX_POWER) { dev_err(dev, "EM: invalid power: %lu\n", power); - goto free_ps_table; + return -EINVAL; } table[i].power = power; @@ -194,16 +202,9 @@ static int em_create_perf_table(struct device *dev, struct em_perf_domain *pd, ret = em_compute_costs(dev, table, cb, nr_states, flags); if (ret) - goto free_ps_table; - - pd->table = table; - pd->nr_perf_states = nr_states; + return -EINVAL; return 0; - -free_ps_table: - kfree(table); - return -EINVAL; } static int em_create_pd(struct device *dev, int nr_states, @@ -234,11 +235,15 @@ static int em_create_pd(struct device *dev, int nr_states, return -ENOMEM; } - ret = em_create_perf_table(dev, pd, nr_states, cb, flags); - if (ret) { - kfree(pd); - return ret; - } + pd->nr_perf_states = nr_states; + + ret = em_allocate_perf_table(pd, nr_states); + if (ret) + goto free_pd; + + ret = em_create_perf_table(dev, pd, pd->table, cb, flags); + if (ret) + goto free_pd_table; if (_is_cpu_device(dev)) for_each_cpu(cpu, cpus) { @@ -249,6 +254,12 @@ static int em_create_pd(struct device *dev, int nr_states, dev->em_pd = pd; return 0; + +free_pd_table: + kfree(pd->table); +free_pd: + kfree(pd); + return -EINVAL; } static void From patchwork Thu Feb 8 11:55:43 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lukasz Luba X-Patchwork-Id: 771149 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 502D276C8C; Thu, 8 Feb 2024 11:56:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707393395; cv=none; b=EHPF3sn6H6Tm5Oj+djv0oSMzvK3o5G/YbsRd8xIQgRracFad4RShO/PwpoW0h8q0dbW8xtqx70716cl9ztLaha07auqhbdiaFvPyqTHzaCiq3E2xLq4iJaTAx2Ye3WVAFw2/g+DeZrm8adeymspOMOZ5XLxTOj/FfDZsZ+xVkBI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707393395; c=relaxed/simple; bh=jSDNw/gdJcgEyLMwynWhjBDaJb3Vfg+ndN6jLvTWrnk=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=jsDiU52HOnBD2TH1ARYvj99ykGkU3sxuCKb3mdeZZjdW9DAe+lLnUFcObwUNvL0tj/REtwVfPsK9SL9rzaN/khhr6fimeGgxYcmkEdYdWL11FPoPAiqI62WZOea23at/scH5J3ANugeJ04znIWqHL4p+Bw56+Gmt5t8CvRFad3o= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 06C7B1FB; Thu, 8 Feb 2024 03:57:15 -0800 (PST) Received: from e129166.arm.com (unknown [10.57.8.23]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 274A53F5A1; Thu, 8 Feb 2024 03:56:30 -0800 (PST) From: Lukasz Luba To: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, rafael@kernel.org Cc: lukasz.luba@arm.com, dietmar.eggemann@arm.com, rui.zhang@intel.com, amit.kucheria@verdurent.com, amit.kachhap@gmail.com, daniel.lezcano@linaro.org, viresh.kumar@linaro.org, len.brown@intel.com, pavel@ucw.cz, mhiramat@kernel.org, qyousef@layalina.io, wvw@google.com, xuewen.yan94@gmail.com Subject: [PATCH v8 09/23] PM: EM: Use runtime modified EM for CPUs energy estimation in EAS Date: Thu, 8 Feb 2024 11:55:43 +0000 Message-Id: <20240208115557.1273962-10-lukasz.luba@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20240208115557.1273962-1-lukasz.luba@arm.com> References: <20240208115557.1273962-1-lukasz.luba@arm.com> Precedence: bulk X-Mailing-List: linux-pm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 The new Energy Model (EM) supports runtime modification of the performance state table to better model the power used by the SoC. Use this new feature to improve energy estimation and therefore task placement in Energy Aware Scheduler (EAS). Reviewed-by: Dietmar Eggemann Tested-by: Dietmar Eggemann Signed-off-by: Lukasz Luba --- include/linux/energy_model.h | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h index 8ddf1d8a9581..5f842da3bb0c 100644 --- a/include/linux/energy_model.h +++ b/include/linux/energy_model.h @@ -239,9 +239,14 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd, unsigned long allowed_cpu_cap) { unsigned long freq, ref_freq, scale_cpu; + struct em_perf_table *em_table; struct em_perf_state *ps; int cpu, i; +#ifdef CONFIG_SCHED_DEBUG + WARN_ONCE(!rcu_read_lock_held(), "EM: rcu read lock needed\n"); +#endif + if (!sum_util) return 0; @@ -264,9 +269,10 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd, * Find the lowest performance state of the Energy Model above the * requested frequency. */ - i = em_pd_get_efficient_state(pd->table, pd->nr_perf_states, freq, - pd->flags); - ps = &pd->table[i]; + em_table = rcu_dereference(pd->em_table); + i = em_pd_get_efficient_state(em_table->state, pd->nr_perf_states, + freq, pd->flags); + ps = &em_table->state[i]; /* * The capacity of a CPU in the domain at the performance state (ps) From patchwork Thu Feb 8 11:55:45 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lukasz Luba X-Patchwork-Id: 771148 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 0F5FD79DA1; Thu, 8 Feb 2024 11:56:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707393400; cv=none; b=Dv1sxkBgJEmfImUZ0dDGCEsse6OlgCbyb1BhwiB6TFH2WrklUoZYLz+WHjFF7rF1FfXWHcqc9a9oDx/YfNxUd69wyHV+8ntrwo5mfoRZDbibP7ZBh5OdB92m+L+nftu1Umeu9qNi7FuB4BIQElXCBrq0R3t2dxZJPUkEkj291ck= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707393400; c=relaxed/simple; bh=OoFrzVwdfR8MuA4B3/NSqIrR20qh/M0U8am5P3jts3o=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=gZ89Zw/TrzrjqAYHntSSVxAUjpQzvOEVvsK0iTIznzBhVqdoNOwdKbFSi0lgoX8h/Wede2GwYSOlIowuZ2ra2tg5AH0T+PCYso5o3sxBrT/Yfvg1fvroVSF5sdbesD/K2PBDUP4hjcytCBjCqhMTgsTvXf9wZ8dptuVvAUjCwp8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id E565A1FB; Thu, 8 Feb 2024 03:57:20 -0800 (PST) Received: from e129166.arm.com (unknown [10.57.8.23]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 1013B3F5A1; Thu, 8 Feb 2024 03:56:35 -0800 (PST) From: Lukasz Luba To: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, rafael@kernel.org Cc: lukasz.luba@arm.com, dietmar.eggemann@arm.com, rui.zhang@intel.com, amit.kucheria@verdurent.com, amit.kachhap@gmail.com, daniel.lezcano@linaro.org, viresh.kumar@linaro.org, len.brown@intel.com, pavel@ucw.cz, mhiramat@kernel.org, qyousef@layalina.io, wvw@google.com, xuewen.yan94@gmail.com Subject: [PATCH v8 11/23] PM: EM: Introduce em_dev_update_perf_domain() for EM updates Date: Thu, 8 Feb 2024 11:55:45 +0000 Message-Id: <20240208115557.1273962-12-lukasz.luba@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20240208115557.1273962-1-lukasz.luba@arm.com> References: <20240208115557.1273962-1-lukasz.luba@arm.com> Precedence: bulk X-Mailing-List: linux-pm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Add API function em_dev_update_perf_domain() which allows the EM to be changed safely. Concurrent updaters are serialized with a mutex and the removal of memory that will not be used any more is carried out with the help of RCU. Reviewed-by: Dietmar Eggemann Tested-by: Dietmar Eggemann Signed-off-by: Lukasz Luba --- include/linux/energy_model.h | 8 +++++++ kernel/power/energy_model.c | 44 ++++++++++++++++++++++++++++++++++++ 2 files changed, 52 insertions(+) diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h index 27911dc1887e..324a3a8e0a2d 100644 --- a/include/linux/energy_model.h +++ b/include/linux/energy_model.h @@ -183,6 +183,8 @@ struct em_data_callback { struct em_perf_domain *em_cpu_get(int cpu); struct em_perf_domain *em_pd_get(struct device *dev); +int em_dev_update_perf_domain(struct device *dev, + struct em_perf_table __rcu *new_table); int em_dev_register_perf_domain(struct device *dev, unsigned int nr_states, struct em_data_callback *cb, cpumask_t *span, bool microwatts); @@ -376,6 +378,12 @@ struct em_perf_table __rcu *em_table_alloc(struct em_perf_domain *pd) return NULL; } static inline void em_table_free(struct em_perf_table __rcu *table) {} +static inline +int em_dev_update_perf_domain(struct device *dev, + struct em_perf_table __rcu *new_table) +{ + return -EINVAL; +} #endif #endif diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c index 16795743f969..667619b70be7 100644 --- a/kernel/power/energy_model.c +++ b/kernel/power/energy_model.c @@ -209,6 +209,50 @@ static int em_allocate_perf_table(struct em_perf_domain *pd, return 0; } +/** + * em_dev_update_perf_domain() - Update runtime EM table for a device + * @dev : Device for which the EM is to be updated + * @new_table : The new EM table that is going to be used from now + * + * Update EM runtime modifiable table for the @dev using the provided @table. + * + * This function uses a mutex to serialize writers, so it must not be called + * from a non-sleeping context. + * + * Return 0 on success or an error code on failure. + */ +int em_dev_update_perf_domain(struct device *dev, + struct em_perf_table __rcu *new_table) +{ + struct em_perf_table __rcu *old_table; + struct em_perf_domain *pd; + + if (!dev) + return -EINVAL; + + /* Serialize update/unregister or concurrent updates */ + mutex_lock(&em_pd_mutex); + + if (!dev->em_pd) { + mutex_unlock(&em_pd_mutex); + return -EINVAL; + } + pd = dev->em_pd; + + kref_get(&new_table->kref); + + old_table = pd->em_table; + rcu_assign_pointer(pd->em_table, new_table); + + em_cpufreq_update_efficiencies(dev, new_table->state); + + em_table_free(old_table); + + mutex_unlock(&em_pd_mutex); + return 0; +} +EXPORT_SYMBOL_GPL(em_dev_update_perf_domain); + static int em_create_runtime_table(struct em_perf_domain *pd) { struct em_perf_table __rcu *table; From patchwork Thu Feb 8 11:55:47 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lukasz Luba X-Patchwork-Id: 771147 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 27AA87B3DC; Thu, 8 Feb 2024 11:56:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707393407; cv=none; b=rhAvbg7dm4cWC9VnglUoYiNvQvhaYFn00glE0aH7vE3ELYyWMiJcOM+T84SBnXg8p3V6aByFc+txsRCOb55p8TfwfIAR3ciMZf0wa5qHihvyI9Yn3ldAvXqBzhYFbuQMDxFJFJJzobTwUtGQGJy4wOmisOH8ATojT/EsTW4M7fg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707393407; c=relaxed/simple; bh=WRg4wUePEUQi65klNCH4jCt1htraMOOT2ZZsRc/VHwA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=uoMbJpLFrF6YiWA89iSbGZ6ESCdlACNdHPfohqcO+QDAcK088/vY8/lCMibz9nr31kEoUcY5HsMKTNABCV7vO4/WNBf1U+rM0EyOM+8+6GMVHqJ1iQIO4U7polWV/PxRMTJTUXDp6ULYehRcrvlQAnFW3hOSKftkz0QUqqprUZw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 0F5A21FB; Thu, 8 Feb 2024 03:57:27 -0800 (PST) Received: from e129166.arm.com (unknown [10.57.8.23]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id EC0ED3F5A1; Thu, 8 Feb 2024 03:56:41 -0800 (PST) From: Lukasz Luba To: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, rafael@kernel.org Cc: lukasz.luba@arm.com, dietmar.eggemann@arm.com, rui.zhang@intel.com, amit.kucheria@verdurent.com, amit.kachhap@gmail.com, daniel.lezcano@linaro.org, viresh.kumar@linaro.org, len.brown@intel.com, pavel@ucw.cz, mhiramat@kernel.org, qyousef@layalina.io, wvw@google.com, xuewen.yan94@gmail.com Subject: [PATCH v8 13/23] PM: EM: Add performance field to struct em_perf_state and optimize Date: Thu, 8 Feb 2024 11:55:47 +0000 Message-Id: <20240208115557.1273962-14-lukasz.luba@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20240208115557.1273962-1-lukasz.luba@arm.com> References: <20240208115557.1273962-1-lukasz.luba@arm.com> Precedence: bulk X-Mailing-List: linux-pm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 The performance doesn't scale linearly with the frequency. Also, it may be different in different workloads. Some CPUs are designed to be particularly good at some applications e.g. images or video processing and other CPUs in different. When those different types of CPUs are combined in one SoC they should be properly modeled to get max of the HW in Energy Aware Scheduler (EAS). The Energy Model (EM) provides the power vs. performance curves to the EAS, but assumes the CPUs capacity is fixed and scales linearly with the frequency. This patch allows to adjust the curve on the 'performance' axis as well. Code speed optimization: Removing map_util_freq() allows to avoid one division and one multiplication operations from the EAS hot code path. Reviewed-by: Dietmar Eggemann Tested-by: Dietmar Eggemann Signed-off-by: Lukasz Luba --- include/linux/energy_model.h | 24 ++++++++++++------------ kernel/power/energy_model.c | 27 +++++++++++++++++++++++++++ 2 files changed, 39 insertions(+), 12 deletions(-) diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h index 158dad6ea313..ce24ea3fe41c 100644 --- a/include/linux/energy_model.h +++ b/include/linux/energy_model.h @@ -13,6 +13,7 @@ /** * struct em_perf_state - Performance state of a performance domain + * @performance: CPU performance (capacity) at a given frequency * @frequency: The frequency in KHz, for consistency with CPUFreq * @power: The power consumed at this level (by 1 CPU or by a registered * device). It can be a total power: static and dynamic. @@ -21,6 +22,7 @@ * @flags: see "em_perf_state flags" description below. */ struct em_perf_state { + unsigned long performance; unsigned long frequency; unsigned long power; unsigned long cost; @@ -196,25 +198,25 @@ void em_table_free(struct em_perf_table __rcu *table); * em_pd_get_efficient_state() - Get an efficient performance state from the EM * @table: List of performance states, in ascending order * @nr_perf_states: Number of performance states - * @freq: Frequency to map with the EM + * @max_util: Max utilization to map with the EM * @pd_flags: Performance Domain flags * * It is called from the scheduler code quite frequently and as a consequence * doesn't implement any check. * - * Return: An efficient performance state id, high enough to meet @freq + * Return: An efficient performance state id, high enough to meet @max_util * requirement. */ static inline int em_pd_get_efficient_state(struct em_perf_state *table, int nr_perf_states, - unsigned long freq, unsigned long pd_flags) + unsigned long max_util, unsigned long pd_flags) { struct em_perf_state *ps; int i; for (i = 0; i < nr_perf_states; i++) { ps = &table[i]; - if (ps->frequency >= freq) { + if (ps->performance >= max_util) { if (pd_flags & EM_PERF_DOMAIN_SKIP_INEFFICIENCIES && ps->flags & EM_PERF_STATE_INEFFICIENT) continue; @@ -245,9 +247,9 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd, unsigned long max_util, unsigned long sum_util, unsigned long allowed_cpu_cap) { - unsigned long freq, ref_freq, scale_cpu; struct em_perf_table *em_table; struct em_perf_state *ps; + unsigned long scale_cpu; int cpu, i; #ifdef CONFIG_SCHED_DEBUG @@ -260,25 +262,23 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd, /* * In order to predict the performance state, map the utilization of * the most utilized CPU of the performance domain to a requested - * frequency, like schedutil. Take also into account that the real - * frequency might be set lower (due to thermal capping). Thus, clamp + * performance, like schedutil. Take also into account that the real + * performance might be set lower (due to thermal capping). Thus, clamp * max utilization to the allowed CPU capacity before calculating - * effective frequency. + * effective performance. */ cpu = cpumask_first(to_cpumask(pd->cpus)); scale_cpu = arch_scale_cpu_capacity(cpu); - ref_freq = arch_scale_freq_ref(cpu); max_util = min(max_util, allowed_cpu_cap); - freq = map_util_freq(max_util, ref_freq, scale_cpu); /* * Find the lowest performance state of the Energy Model above the - * requested frequency. + * requested performance. */ em_table = rcu_dereference(pd->em_table); i = em_pd_get_efficient_state(em_table->state, pd->nr_perf_states, - freq, pd->flags); + max_util, pd->flags); ps = &em_table->state[i]; /* diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c index 667619b70be7..41418aa6daa6 100644 --- a/kernel/power/energy_model.c +++ b/kernel/power/energy_model.c @@ -46,6 +46,7 @@ static void em_debug_create_ps(struct em_perf_state *ps, struct dentry *pd) debugfs_create_ulong("frequency", 0444, d, &ps->frequency); debugfs_create_ulong("power", 0444, d, &ps->power); debugfs_create_ulong("cost", 0444, d, &ps->cost); + debugfs_create_ulong("performance", 0444, d, &ps->performance); debugfs_create_ulong("inefficient", 0444, d, &ps->flags); } @@ -159,6 +160,30 @@ struct em_perf_table __rcu *em_table_alloc(struct em_perf_domain *pd) return table; } +static void em_init_performance(struct device *dev, struct em_perf_domain *pd, + struct em_perf_state *table, int nr_states) +{ + u64 fmax, max_cap; + int i, cpu; + + /* This is needed only for CPUs and EAS skip other devices */ + if (!_is_cpu_device(dev)) + return; + + cpu = cpumask_first(em_span_cpus(pd)); + + /* + * Calculate the performance value for each frequency with + * linear relationship. The final CPU capacity might not be ready at + * boot time, but the EM will be updated a bit later with correct one. + */ + fmax = (u64) table[nr_states - 1].frequency; + max_cap = (u64) arch_scale_cpu_capacity(cpu); + for (i = 0; i < nr_states; i++) + table[i].performance = div64_u64(max_cap * table[i].frequency, + fmax); +} + static int em_compute_costs(struct device *dev, struct em_perf_state *table, struct em_data_callback *cb, int nr_states, unsigned long flags) @@ -318,6 +343,8 @@ static int em_create_perf_table(struct device *dev, struct em_perf_domain *pd, table[i].frequency = prev_freq = freq; } + em_init_performance(dev, pd, table, nr_states); + ret = em_compute_costs(dev, table, cb, nr_states, flags); if (ret) return -EINVAL; From patchwork Thu Feb 8 11:55:49 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lukasz Luba X-Patchwork-Id: 771146 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 5229F7C6C8; Thu, 8 Feb 2024 11:56:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707393413; cv=none; b=VwqoaV5KPQNe6FSS55Yep7XDj3TKxK2b9dG5nt2jGiEMB6xWKVmUit94ljHB0OxY5ZP1gpJEznZ8QQC3/+zZR79jHf8cHh8ddGUhSh+lasBRYVdyWw6oDRdDyHLeH/Vqx5gxOTaBJ6Wu7v4T6ej9vLNso9bDq7nO/6sWojgCXO0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707393413; c=relaxed/simple; bh=cfHrIfOrgdO4fQ34fhRtpW8gsAWZ5N5Ko4amh1UZZvg=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Dki56+Qb0CR1Kau3+JyemuuKAg4odyq4ydfr1wBf5K7EP/sX+qsROjGtdjfs+M6PIncXOa7CkVgqHsLUA2hptzdjlyPO2hw4WFkEG/2nBKF/PxMiVC/PgLOt1AsIayCIjyxe7JpXJdT8FSu6x9P9J4EdMxK7XBxrB2MtIQQ19Jk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id ED370DA7; Thu, 8 Feb 2024 03:57:32 -0800 (PST) Received: from e129166.arm.com (unknown [10.57.8.23]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 18DD53F5A1; Thu, 8 Feb 2024 03:56:47 -0800 (PST) From: Lukasz Luba To: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, rafael@kernel.org Cc: lukasz.luba@arm.com, dietmar.eggemann@arm.com, rui.zhang@intel.com, amit.kucheria@verdurent.com, amit.kachhap@gmail.com, daniel.lezcano@linaro.org, viresh.kumar@linaro.org, len.brown@intel.com, pavel@ucw.cz, mhiramat@kernel.org, qyousef@layalina.io, wvw@google.com, xuewen.yan94@gmail.com Subject: [PATCH v8 15/23] PM: EM: Optimize em_cpu_energy() and remove division Date: Thu, 8 Feb 2024 11:55:49 +0000 Message-Id: <20240208115557.1273962-16-lukasz.luba@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20240208115557.1273962-1-lukasz.luba@arm.com> References: <20240208115557.1273962-1-lukasz.luba@arm.com> Precedence: bulk X-Mailing-List: linux-pm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 The Energy Model (EM) can be modified at runtime which brings new possibilities. The em_cpu_energy() is called by the Energy Aware Scheduler (EAS) in its hot path. The energy calculation uses power value for a given performance state (ps) and the CPU busy time as percentage for that given frequency. It is possible to avoid the division by 'scale_cpu' at runtime, because EM is updated whenever new max capacity CPU is set in the system. Use that feature and do the needed division during the calculation of the coefficient 'ps->cost'. That enhanced 'ps->cost' value can be then just multiplied simply by utilization: pd_nrg = ps->cost * \Sum cpu_util to get the needed energy for whole Performance Domain (PD). With this optimization and earlier removal of map_util_freq(), the em_cpu_energy() should run faster on the Big CPU by 1.43x and on the Little CPU by 1.69x (RockPi 4B board). Reviewed-by: Dietmar Eggemann Tested-by: Dietmar Eggemann Signed-off-by: Lukasz Luba --- include/linux/energy_model.h | 55 ++++++++++-------------------------- kernel/power/energy_model.c | 7 ++--- 2 files changed, 18 insertions(+), 44 deletions(-) diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h index ce24ea3fe41c..aabfc26fcd31 100644 --- a/include/linux/energy_model.h +++ b/include/linux/energy_model.h @@ -115,27 +115,6 @@ struct em_perf_domain { #define EM_MAX_NUM_CPUS 16 #endif -/* - * To avoid an overflow on 32bit machines while calculating the energy - * use a different order in the operation. First divide by the 'cpu_scale' - * which would reduce big value stored in the 'cost' field, then multiply by - * the 'sum_util'. This would allow to handle existing platforms, which have - * e.g. power ~1.3 Watt at max freq, so the 'cost' value > 1mln micro-Watts. - * In such scenario, where there are 4 CPUs in the Perf. Domain the 'sum_util' - * could be 4096, then multiplication: 'cost' * 'sum_util' would overflow. - * This reordering of operations has some limitations, we lose small - * precision in the estimation (comparing to 64bit platform w/o reordering). - * - * We are safe on 64bit machine. - */ -#ifdef CONFIG_64BIT -#define em_estimate_energy(cost, sum_util, scale_cpu) \ - (((cost) * (sum_util)) / (scale_cpu)) -#else -#define em_estimate_energy(cost, sum_util, scale_cpu) \ - (((cost) / (scale_cpu)) * (sum_util)) -#endif - struct em_data_callback { /** * active_power() - Provide power at the next performance state of @@ -249,8 +228,7 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd, { struct em_perf_table *em_table; struct em_perf_state *ps; - unsigned long scale_cpu; - int cpu, i; + int i; #ifdef CONFIG_SCHED_DEBUG WARN_ONCE(!rcu_read_lock_held(), "EM: rcu read lock needed\n"); @@ -267,9 +245,7 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd, * max utilization to the allowed CPU capacity before calculating * effective performance. */ - cpu = cpumask_first(to_cpumask(pd->cpus)); - scale_cpu = arch_scale_cpu_capacity(cpu); - + max_util = map_util_perf(max_util); max_util = min(max_util, allowed_cpu_cap); /* @@ -282,12 +258,12 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd, ps = &em_table->state[i]; /* - * The capacity of a CPU in the domain at the performance state (ps) - * can be computed as: + * The performance (capacity) of a CPU in the domain at the performance + * state (ps) can be computed as: * - * ps->freq * scale_cpu - * ps->cap = -------------------- (1) - * cpu_max_freq + * ps->freq * scale_cpu + * ps->performance = -------------------- (1) + * cpu_max_freq * * So, ignoring the costs of idle states (which are not available in * the EM), the energy consumed by this CPU at that performance state @@ -295,9 +271,10 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd, * * ps->power * cpu_util * cpu_nrg = -------------------- (2) - * ps->cap + * ps->performance * - * since 'cpu_util / ps->cap' represents its percentage of busy time. + * since 'cpu_util / ps->performance' represents its percentage of busy + * time. * * NOTE: Although the result of this computation actually is in * units of power, it can be manipulated as an energy value @@ -307,9 +284,9 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd, * By injecting (1) in (2), 'cpu_nrg' can be re-expressed as a product * of two terms: * - * ps->power * cpu_max_freq cpu_util - * cpu_nrg = ------------------------ * --------- (3) - * ps->freq scale_cpu + * ps->power * cpu_max_freq + * cpu_nrg = ------------------------ * cpu_util (3) + * ps->freq * scale_cpu * * The first term is static, and is stored in the em_perf_state struct * as 'ps->cost'. @@ -319,11 +296,9 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd, * total energy of the domain (which is the simple sum of the energy of * all of its CPUs) can be factorized as: * - * ps->cost * \Sum cpu_util - * pd_nrg = ------------------------ (4) - * scale_cpu + * pd_nrg = ps->cost * \Sum cpu_util (4) */ - return em_estimate_energy(ps->cost, sum_util, scale_cpu); + return ps->cost * sum_util; } /** diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c index b192b0ac8c6e..a631d7d52c40 100644 --- a/kernel/power/energy_model.c +++ b/kernel/power/energy_model.c @@ -192,11 +192,9 @@ static int em_compute_costs(struct device *dev, struct em_perf_state *table, unsigned long flags) { unsigned long prev_cost = ULONG_MAX; - u64 fmax; int i, ret; /* Compute the cost of each performance state. */ - fmax = (u64) table[nr_states - 1].frequency; for (i = nr_states - 1; i >= 0; i--) { unsigned long power_res, cost; @@ -208,8 +206,9 @@ static int em_compute_costs(struct device *dev, struct em_perf_state *table, return -EINVAL; } } else { - power_res = table[i].power; - cost = div64_u64(fmax * power_res, table[i].frequency); + /* increase resolution of 'cost' precision */ + power_res = table[i].power * 10; + cost = power_res / table[i].performance; } table[i].cost = cost; From patchwork Thu Feb 8 11:55:51 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lukasz Luba X-Patchwork-Id: 771145 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id EA7297D3ED; Thu, 8 Feb 2024 11:56:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707393419; cv=none; b=pugkKUKYyjcHcYh+SoQlz4CLDEVgGBuGv5YFFPObHEjB9foixXnwwcl+RudmdiBgKwtaO2i9B6OewlooYGP3zyzEyefpiwbBJhDAtTW/Jy9GFO/TeFzKF2WLhMkIKFfVEfWjYAivYG3bzvba0xsfvy81KLGbWGACDE7H8WMvK04= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707393419; c=relaxed/simple; bh=NpxfFJNwCfR0NV7bGJK2imc9EJ1KwDO/FM+S2qjTl28=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=AkvOo4LqiYmWt9sLgrjnAvfl5k6c4c09qozFkSXBh+EhrGjWBg46iqo9OQLgdene638QPWgCMSKbmuk1ey/YjmEYx2+Og01eHmQGGAvH3aHIp8PuFkg4sykfJLFxt2YLM8J9FNYeJs1omONV6HzLAZHRq5jQIgtinpovWmOreIM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id D326F1FB; Thu, 8 Feb 2024 03:57:38 -0800 (PST) Received: from e129166.arm.com (unknown [10.57.8.23]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id F260D3F5A1; Thu, 8 Feb 2024 03:56:53 -0800 (PST) From: Lukasz Luba To: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, rafael@kernel.org Cc: lukasz.luba@arm.com, dietmar.eggemann@arm.com, rui.zhang@intel.com, amit.kucheria@verdurent.com, amit.kachhap@gmail.com, daniel.lezcano@linaro.org, viresh.kumar@linaro.org, len.brown@intel.com, pavel@ucw.cz, mhiramat@kernel.org, qyousef@layalina.io, wvw@google.com, xuewen.yan94@gmail.com Subject: [PATCH v8 17/23] powercap/dtpm_devfreq: Use new Energy Model interface to get table Date: Thu, 8 Feb 2024 11:55:51 +0000 Message-Id: <20240208115557.1273962-18-lukasz.luba@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20240208115557.1273962-1-lukasz.luba@arm.com> References: <20240208115557.1273962-1-lukasz.luba@arm.com> Precedence: bulk X-Mailing-List: linux-pm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Energy Model framework support modifications at runtime of the power values. Use the new EM table API which is protected with RCU. Align the code so that this RCU read section is short. This change is not expected to alter the general functionality. Reviewed-by: Dietmar Eggemann Tested-by: Dietmar Eggemann Signed-off-by: Lukasz Luba --- drivers/powercap/dtpm_devfreq.c | 34 ++++++++++++++++++++++----------- 1 file changed, 23 insertions(+), 11 deletions(-) diff --git a/drivers/powercap/dtpm_devfreq.c b/drivers/powercap/dtpm_devfreq.c index 612c3b59dd5b..f40bce8176df 100644 --- a/drivers/powercap/dtpm_devfreq.c +++ b/drivers/powercap/dtpm_devfreq.c @@ -37,11 +37,16 @@ static int update_pd_power_uw(struct dtpm *dtpm) struct devfreq *devfreq = dtpm_devfreq->devfreq; struct device *dev = devfreq->dev.parent; struct em_perf_domain *pd = em_pd_get(dev); + struct em_perf_state *table; - dtpm->power_min = pd->table[0].power; + rcu_read_lock(); + table = em_perf_state_from_pd(pd); - dtpm->power_max = pd->table[pd->nr_perf_states - 1].power; + dtpm->power_min = table[0].power; + dtpm->power_max = table[pd->nr_perf_states - 1].power; + + rcu_read_unlock(); return 0; } @@ -51,20 +56,23 @@ static u64 set_pd_power_limit(struct dtpm *dtpm, u64 power_limit) struct devfreq *devfreq = dtpm_devfreq->devfreq; struct device *dev = devfreq->dev.parent; struct em_perf_domain *pd = em_pd_get(dev); + struct em_perf_state *table; unsigned long freq; int i; + rcu_read_lock(); + table = em_perf_state_from_pd(pd); for (i = 0; i < pd->nr_perf_states; i++) { - if (pd->table[i].power > power_limit) + if (table[i].power > power_limit) break; } - freq = pd->table[i - 1].frequency; + freq = table[i - 1].frequency; + power_limit = table[i - 1].power; + rcu_read_unlock(); dev_pm_qos_update_request(&dtpm_devfreq->qos_req, freq); - power_limit = pd->table[i - 1].power; - return power_limit; } @@ -89,8 +97,9 @@ static u64 get_pd_power_uw(struct dtpm *dtpm) struct device *dev = devfreq->dev.parent; struct em_perf_domain *pd = em_pd_get(dev); struct devfreq_dev_status status; + struct em_perf_state *table; unsigned long freq; - u64 power; + u64 power = 0; int i; mutex_lock(&devfreq->lock); @@ -100,19 +109,22 @@ static u64 get_pd_power_uw(struct dtpm *dtpm) freq = DIV_ROUND_UP(status.current_frequency, HZ_PER_KHZ); _normalize_load(&status); + rcu_read_lock(); + table = em_perf_state_from_pd(pd); for (i = 0; i < pd->nr_perf_states; i++) { - if (pd->table[i].frequency < freq) + if (table[i].frequency < freq) continue; - power = pd->table[i].power; + power = table[i].power; power *= status.busy_time; power >>= 10; - return power; + break; } + rcu_read_unlock(); - return 0; + return power; } static void pd_release(struct dtpm *dtpm) From patchwork Thu Feb 8 11:55:53 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lukasz Luba X-Patchwork-Id: 771144 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id CC2F07E761; Thu, 8 Feb 2024 11:57:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707393424; cv=none; b=NYGGLXOPm6jOIZv5aMIQOlLZ8gYa4xUWWLS6KF4Lc/kCwsl99mXvbfl7uLLThpcrHzsSzR20iEP6dmrWIKY5E7CwOTWPYlsusEEhhpvlomEVkXnnvzlw8sW3DUCRTXpudvGLhmPyoZL7qQfC+EHcnDZFuyK8/lQbQsUNIztREl8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707393424; c=relaxed/simple; bh=wkTlNeA3yVySpfdCDKAwYbrsIV5UQmUrBVPxFwqVSSo=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=LbhQ/wMVVff/bLlrpsEwbOmBTXh/XMx21ZH4cQyaKHvtposSQmPLuvR4oLMjOwcJ3S0Q9PS3lDsRVGo8/FdvQgHe921GdAf08npJV/boAwLE5MJDWvH9doBo1HGSQ/WmDq9yr8O7K7aj6Q9pczN0SB18Q/s8un4YcaaAj2vmk6k= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B5D031FB; Thu, 8 Feb 2024 03:57:44 -0800 (PST) Received: from e129166.arm.com (unknown [10.57.8.23]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id D750A3F5A1; Thu, 8 Feb 2024 03:56:59 -0800 (PST) From: Lukasz Luba To: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, rafael@kernel.org Cc: lukasz.luba@arm.com, dietmar.eggemann@arm.com, rui.zhang@intel.com, amit.kucheria@verdurent.com, amit.kachhap@gmail.com, daniel.lezcano@linaro.org, viresh.kumar@linaro.org, len.brown@intel.com, pavel@ucw.cz, mhiramat@kernel.org, qyousef@layalina.io, wvw@google.com, xuewen.yan94@gmail.com Subject: [PATCH v8 19/23] drivers/thermal/devfreq_cooling: Use new Energy Model interface Date: Thu, 8 Feb 2024 11:55:53 +0000 Message-Id: <20240208115557.1273962-20-lukasz.luba@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20240208115557.1273962-1-lukasz.luba@arm.com> References: <20240208115557.1273962-1-lukasz.luba@arm.com> Precedence: bulk X-Mailing-List: linux-pm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Energy Model framework support modifications at runtime of the power values. Use the new EM table which is protected with RCU. Align the code so that this RCU read section is short. This change is not expected to alter the general functionality. Reviewed-by: Dietmar Eggemann Tested-by: Dietmar Eggemann Signed-off-by: Lukasz Luba --- drivers/thermal/devfreq_cooling.c | 49 +++++++++++++++++++++++++------ 1 file changed, 40 insertions(+), 9 deletions(-) diff --git a/drivers/thermal/devfreq_cooling.c b/drivers/thermal/devfreq_cooling.c index 262e62ab6cf2..50dec24e967a 100644 --- a/drivers/thermal/devfreq_cooling.c +++ b/drivers/thermal/devfreq_cooling.c @@ -87,6 +87,7 @@ static int devfreq_cooling_set_cur_state(struct thermal_cooling_device *cdev, struct devfreq_cooling_device *dfc = cdev->devdata; struct devfreq *df = dfc->devfreq; struct device *dev = df->dev.parent; + struct em_perf_state *table; unsigned long freq; int perf_idx; @@ -100,7 +101,11 @@ static int devfreq_cooling_set_cur_state(struct thermal_cooling_device *cdev, if (dfc->em_pd) { perf_idx = dfc->max_state - state; - freq = dfc->em_pd->table[perf_idx].frequency * 1000; + + rcu_read_lock(); + table = em_perf_state_from_pd(dfc->em_pd); + freq = table[perf_idx].frequency * 1000; + rcu_read_unlock(); } else { freq = dfc->freq_table[state]; } @@ -123,14 +128,21 @@ static int devfreq_cooling_set_cur_state(struct thermal_cooling_device *cdev, */ static int get_perf_idx(struct em_perf_domain *em_pd, unsigned long freq) { - int i; + struct em_perf_state *table; + int i, idx = -EINVAL; + rcu_read_lock(); + table = em_perf_state_from_pd(em_pd); for (i = 0; i < em_pd->nr_perf_states; i++) { - if (em_pd->table[i].frequency == freq) - return i; + if (table[i].frequency != freq) + continue; + + idx = i; + break; } + rcu_read_unlock(); - return -EINVAL; + return idx; } static unsigned long get_voltage(struct devfreq *df, unsigned long freq) @@ -181,6 +193,7 @@ static int devfreq_cooling_get_requested_power(struct thermal_cooling_device *cd struct devfreq_cooling_device *dfc = cdev->devdata; struct devfreq *df = dfc->devfreq; struct devfreq_dev_status status; + struct em_perf_state *table; unsigned long state; unsigned long freq; unsigned long voltage; @@ -204,7 +217,11 @@ static int devfreq_cooling_get_requested_power(struct thermal_cooling_device *cd state = dfc->capped_state; /* Convert EM power into milli-Watts first */ - dfc->res_util = dfc->em_pd->table[state].power; + rcu_read_lock(); + table = em_perf_state_from_pd(dfc->em_pd); + dfc->res_util = table[state].power; + rcu_read_unlock(); + dfc->res_util /= MICROWATT_PER_MILLIWATT; dfc->res_util *= SCALE_ERROR_MITIGATION; @@ -225,7 +242,11 @@ static int devfreq_cooling_get_requested_power(struct thermal_cooling_device *cd _normalize_load(&status); /* Convert EM power into milli-Watts first */ - *power = dfc->em_pd->table[perf_idx].power; + rcu_read_lock(); + table = em_perf_state_from_pd(dfc->em_pd); + *power = table[perf_idx].power; + rcu_read_unlock(); + *power /= MICROWATT_PER_MILLIWATT; /* Scale power for utilization */ *power *= status.busy_time; @@ -245,13 +266,19 @@ static int devfreq_cooling_state2power(struct thermal_cooling_device *cdev, unsigned long state, u32 *power) { struct devfreq_cooling_device *dfc = cdev->devdata; + struct em_perf_state *table; int perf_idx; if (state > dfc->max_state) return -EINVAL; perf_idx = dfc->max_state - state; - *power = dfc->em_pd->table[perf_idx].power; + + rcu_read_lock(); + table = em_perf_state_from_pd(dfc->em_pd); + *power = table[perf_idx].power; + rcu_read_unlock(); + *power /= MICROWATT_PER_MILLIWATT; return 0; @@ -264,6 +291,7 @@ static int devfreq_cooling_power2state(struct thermal_cooling_device *cdev, struct devfreq *df = dfc->devfreq; struct devfreq_dev_status status; unsigned long freq, em_power_mw; + struct em_perf_state *table; s32 est_power; int i; @@ -288,13 +316,16 @@ static int devfreq_cooling_power2state(struct thermal_cooling_device *cdev, * Find the first cooling state that is within the power * budget. The EM power table is sorted ascending. */ + rcu_read_lock(); + table = em_perf_state_from_pd(dfc->em_pd); for (i = dfc->max_state; i > 0; i--) { /* Convert EM power to milli-Watts to make safe comparison */ - em_power_mw = dfc->em_pd->table[i].power; + em_power_mw = table[i].power; em_power_mw /= MICROWATT_PER_MILLIWATT; if (est_power >= em_power_mw) break; } + rcu_read_unlock(); *state = dfc->max_state - i; dfc->capped_state = *state; From patchwork Thu Feb 8 11:55:55 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lukasz Luba X-Patchwork-Id: 771143 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 0B1AA7F48B; Thu, 8 Feb 2024 11:57:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707393430; cv=none; b=HXVKF0RVldHK28FWfYJVPga+SotaGaEP+byAmnGQNM9A1KpmxAQ3YjdnBinbo5P1FkaH9xNG+liOwTvmRSTI96PDyKkRnAZy4yz9nJR5MWqvBU4SP2PRg1SnttTLzyj22tIiGW5uEZTYgpXpf/1pTmezShkz7BZTdp9AgrXecIs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707393430; c=relaxed/simple; bh=lzHdLn3RnU4KvLq9L4yLPglARNGzesbjMWjFLLMt1MQ=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=fA0unZ3SnH+1hM/ZCfv25flhFrNcmIm0qKc50LpXBDNrMHphJ3qZC709MUkJ/Lhs+K6Y1DjgWdHavKAA5vH7hNeAa81NtufH3qljQNsoQWXiufDp14UyqivdXH8ZjPjOVrMsx8Qi2wC/Gl1aOUfTOjBl7aPLxgSS/gPTN2FRVOI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 9B0C31FB; Thu, 8 Feb 2024 03:57:50 -0800 (PST) Received: from e129166.arm.com (unknown [10.57.8.23]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id BCACA3F5A1; Thu, 8 Feb 2024 03:57:05 -0800 (PST) From: Lukasz Luba To: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, rafael@kernel.org Cc: lukasz.luba@arm.com, dietmar.eggemann@arm.com, rui.zhang@intel.com, amit.kucheria@verdurent.com, amit.kachhap@gmail.com, daniel.lezcano@linaro.org, viresh.kumar@linaro.org, len.brown@intel.com, pavel@ucw.cz, mhiramat@kernel.org, qyousef@layalina.io, wvw@google.com, xuewen.yan94@gmail.com Subject: [PATCH v8 21/23] PM: EM: Remove old table Date: Thu, 8 Feb 2024 11:55:55 +0000 Message-Id: <20240208115557.1273962-22-lukasz.luba@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20240208115557.1273962-1-lukasz.luba@arm.com> References: <20240208115557.1273962-1-lukasz.luba@arm.com> Precedence: bulk X-Mailing-List: linux-pm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Remove the old EM table which wasn't able to modify the data. Clean the unneeded function and refactor the code a bit. Reviewed-by: Dietmar Eggemann Tested-by: Dietmar Eggemann Signed-off-by: Lukasz Luba --- include/linux/energy_model.h | 2 -- kernel/power/energy_model.c | 46 ++++++------------------------------ 2 files changed, 7 insertions(+), 41 deletions(-) diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h index aabfc26fcd31..92866a81abe4 100644 --- a/include/linux/energy_model.h +++ b/include/linux/energy_model.h @@ -53,7 +53,6 @@ struct em_perf_table { /** * struct em_perf_domain - Performance domain - * @table: List of performance states, in ascending order * @em_table: Pointer to the runtime modifiable em_perf_table * @nr_perf_states: Number of performance states * @flags: See "em_perf_domain flags" @@ -69,7 +68,6 @@ struct em_perf_table { * field is unused. */ struct em_perf_domain { - struct em_perf_state *table; struct em_perf_table __rcu *em_table; int nr_perf_states; unsigned long flags; diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c index 548908e686ed..57838d28af85 100644 --- a/kernel/power/energy_model.c +++ b/kernel/power/energy_model.c @@ -276,17 +276,6 @@ static int em_compute_costs(struct device *dev, struct em_perf_state *table, return 0; } -static int em_allocate_perf_table(struct em_perf_domain *pd, - int nr_states) -{ - pd->table = kcalloc(nr_states, sizeof(struct em_perf_state), - GFP_KERNEL); - if (!pd->table) - return -ENOMEM; - - return 0; -} - /** * em_dev_update_perf_domain() - Update runtime EM table for a device * @dev : Device for which the EM is to be updated @@ -331,24 +320,6 @@ int em_dev_update_perf_domain(struct device *dev, } EXPORT_SYMBOL_GPL(em_dev_update_perf_domain); -static int em_create_runtime_table(struct em_perf_domain *pd) -{ - struct em_perf_table __rcu *table; - int table_size; - - table = em_table_alloc(pd); - if (!table) - return -ENOMEM; - - /* Initialize runtime table with existing data */ - table_size = sizeof(struct em_perf_state) * pd->nr_perf_states; - memcpy(table->state, pd->table, table_size); - - rcu_assign_pointer(pd->em_table, table); - - return 0; -} - static int em_create_perf_table(struct device *dev, struct em_perf_domain *pd, struct em_perf_state *table, struct em_data_callback *cb, @@ -409,6 +380,7 @@ static int em_create_pd(struct device *dev, int nr_states, struct em_data_callback *cb, cpumask_t *cpus, unsigned long flags) { + struct em_perf_table __rcu *em_table; struct em_perf_domain *pd; struct device *cpu_dev; int cpu, ret, num_cpus; @@ -435,17 +407,15 @@ static int em_create_pd(struct device *dev, int nr_states, pd->nr_perf_states = nr_states; - ret = em_allocate_perf_table(pd, nr_states); - if (ret) + em_table = em_table_alloc(pd); + if (!em_table) goto free_pd; - ret = em_create_perf_table(dev, pd, pd->table, cb, flags); + ret = em_create_perf_table(dev, pd, em_table->state, cb, flags); if (ret) goto free_pd_table; - ret = em_create_runtime_table(pd); - if (ret) - goto free_pd_table; + rcu_assign_pointer(pd->em_table, em_table); if (_is_cpu_device(dev)) for_each_cpu(cpu, cpus) { @@ -458,7 +428,7 @@ static int em_create_pd(struct device *dev, int nr_states, return 0; free_pd_table: - kfree(pd->table); + kfree(em_table); free_pd: kfree(pd); return -EINVAL; @@ -629,7 +599,7 @@ int em_dev_register_perf_domain(struct device *dev, unsigned int nr_states, dev->em_pd->flags |= flags; - em_cpufreq_update_efficiencies(dev, dev->em_pd->table); + em_cpufreq_update_efficiencies(dev, dev->em_pd->em_table->state); em_debug_create_pd(dev); dev_info(dev, "EM: created perf domain\n"); @@ -666,8 +636,6 @@ void em_dev_unregister_perf_domain(struct device *dev) mutex_lock(&em_pd_mutex); em_debug_remove_pd(dev); - kfree(dev->em_pd->table); - em_table_free(dev->em_pd->em_table); kfree(dev->em_pd); From patchwork Thu Feb 8 11:55:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lukasz Luba X-Patchwork-Id: 771142 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 07AD280C05; Thu, 8 Feb 2024 11:57:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707393437; cv=none; b=kQw+dbNy81eAz+//O9CWXiwMTu9zb6X8nr5euZgOPCBDI4nLNZUtGWO7IeLey/zQFVWi07HEZs9yv7Xvgu5ib9Z/EhQLDSkDFwyuhbtv3KIHtSFxga+bMv/uUtHArS/SYe1zp8zMUNWO0coctLpMss2Yo4qzKZTDMMdyZFBLkUs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707393437; c=relaxed/simple; bh=mOT08XEByO+LeskfYMHlZjEPu2b3L6qYci807q41tVw=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=C9sEMfRqKqAQJgjEbFTc2YnfWq1ZRsl4Vqz3XddNyIE1JUX5UUpOjLjKxxlrxOg6tUS6C+DkN1RDpzYarSNKopC5lgtca+xAnNzRENH8KlUKmr6iLDBUGF5Ml9GtfvM+QmjBeRVrlWJ17jBhHfoIOch1WUnuf/7J5HT7d8C5iOA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 8C8BBDA7; Thu, 8 Feb 2024 03:57:56 -0800 (PST) Received: from e129166.arm.com (unknown [10.57.8.23]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id AD0A93F5A1; Thu, 8 Feb 2024 03:57:11 -0800 (PST) From: Lukasz Luba To: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, rafael@kernel.org Cc: lukasz.luba@arm.com, dietmar.eggemann@arm.com, rui.zhang@intel.com, amit.kucheria@verdurent.com, amit.kachhap@gmail.com, daniel.lezcano@linaro.org, viresh.kumar@linaro.org, len.brown@intel.com, pavel@ucw.cz, mhiramat@kernel.org, qyousef@layalina.io, wvw@google.com, xuewen.yan94@gmail.com Subject: [PATCH v8 23/23] Documentation: EM: Update with runtime modification design Date: Thu, 8 Feb 2024 11:55:57 +0000 Message-Id: <20240208115557.1273962-24-lukasz.luba@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20240208115557.1273962-1-lukasz.luba@arm.com> References: <20240208115557.1273962-1-lukasz.luba@arm.com> Precedence: bulk X-Mailing-List: linux-pm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Add a new description which covers the information about runtime EM. It contains the design decisions, describes models and how they reflect the reality. Remove description of the default EM. Add example driver code which modifies EM. Add API documentation for the new feature which allows to modify the EM in runtime. Reviewed-by: Dietmar Eggemann Tested-by: Dietmar Eggemann Signed-off-by: Lukasz Luba --- Documentation/power/energy-model.rst | 183 ++++++++++++++++++++++++++- 1 file changed, 179 insertions(+), 4 deletions(-) diff --git a/Documentation/power/energy-model.rst b/Documentation/power/energy-model.rst index 13225965c9a4..ada4938c37e5 100644 --- a/Documentation/power/energy-model.rst +++ b/Documentation/power/energy-model.rst @@ -71,6 +71,31 @@ whose performance is scaled together. Performance domains generally have a required to have the same micro-architecture. CPUs in different performance domains can have different micro-architectures. +To better reflect power variation due to static power (leakage) the EM +supports runtime modifications of the power values. The mechanism relies on +RCU to free the modifiable EM perf_state table memory. Its user, the task +scheduler, also uses RCU to access this memory. The EM framework provides +API for allocating/freeing the new memory for the modifiable EM table. +The old memory is freed automatically using RCU callback mechanism when there +are no owners anymore for the given EM runtime table instance. This is tracked +using kref mechanism. The device driver which provided the new EM at runtime, +should call EM API to free it safely when it's no longer needed. The EM +framework will handle the clean-up when it's possible. + +The kernel code which want to modify the EM values is protected from concurrent +access using a mutex. Therefore, the device driver code must run in sleeping +context when it tries to modify the EM. + +With the runtime modifiable EM we switch from a 'single and during the entire +runtime static EM' (system property) design to a 'single EM which can be +changed during runtime according e.g. to the workload' (system and workload +property) design. + +It is possible also to modify the CPU performance values for each EM's +performance state. Thus, the full power and performance profile (which +is an exponential curve) can be changed according e.g. to the workload +or system property. + 2. Core APIs ------------ @@ -175,10 +200,82 @@ CPUfreq governor is in use in case of CPU device. Currently this calculation is not provided for other type of devices. More details about the above APIs can be found in ```` -or in Section 2.4 +or in Section 2.5 + + +2.4 Runtime modifications +^^^^^^^^^^^^^^^^^^^^^^^^^ + +Drivers willing to update the EM at runtime should use the following dedicated +function to allocate a new instance of the modified EM. The API is listed +below:: + + struct em_perf_table __rcu *em_table_alloc(struct em_perf_domain *pd); + +This allows to allocate a structure which contains the new EM table with +also RCU and kref needed by the EM framework. The 'struct em_perf_table' +contains array 'struct em_perf_state state[]' which is a list of performance +states in ascending order. That list must be populated by the device driver +which wants to update the EM. The list of frequencies can be taken from +existing EM (created during boot). The content in the 'struct em_perf_state' +must be populated by the driver as well. + +This is the API which does the EM update, using RCU pointers swap:: + + int em_dev_update_perf_domain(struct device *dev, + struct em_perf_table __rcu *new_table); + +Drivers must provide a pointer to the allocated and initialized new EM +'struct em_perf_table'. That new EM will be safely used inside the EM framework +and will be visible to other sub-systems in the kernel (thermal, powercap). +The main design goal for this API is to be fast and avoid extra calculations +or memory allocations at runtime. When pre-computed EMs are available in the +device driver, than it should be possible to simply re-use them with low +performance overhead. + +In order to free the EM, provided earlier by the driver (e.g. when the module +is unloaded), there is a need to call the API:: + + void em_table_free(struct em_perf_table __rcu *table); + +It will allow the EM framework to safely remove the memory, when there is +no other sub-system using it, e.g. EAS. + +To use the power values in other sub-systems (like thermal, powercap) there is +a need to call API which protects the reader and provide consistency of the EM +table data:: + + struct em_perf_state *em_perf_state_from_pd(struct em_perf_domain *pd); + +It returns the 'struct em_perf_state' pointer which is an array of performance +states in ascending order. +This function must be called in the RCU read lock section (after the +rcu_read_lock()). When the EM table is not needed anymore there is a need to +call rcu_real_unlock(). In this way the EM safely uses the RCU read section +and protects the users. It also allows the EM framework to manage the memory +and free it. More details how to use it can be found in Section 3.2 in the +example driver. + +There is dedicated API for device drivers to calculate em_perf_state::cost +values:: + + int em_dev_compute_costs(struct device *dev, struct em_perf_state *table, + int nr_states); + +These 'cost' values from EM are used in EAS. The new EM table should be passed +together with the number of entries and device pointer. When the computation +of the cost values is done properly the return value from the function is 0. +The function takes care for right setting of inefficiency for each performance +state as well. It updates em_perf_state::flags accordingly. +Then such prepared new EM can be passed to the em_dev_update_perf_domain() +function, which will allow to use it. + +More details about the above APIs can be found in ```` +or in Section 3.2 with an example code showing simple implementation of the +updating mechanism in a device driver. -2.4 Description details of this API +2.5 Description details of this API ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ .. kernel-doc:: include/linux/energy_model.h :internal: @@ -187,8 +284,11 @@ or in Section 2.4 :export: -3. Example driver ------------------ +3. Examples +----------- + +3.1 Example driver with EM registration +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The CPUFreq framework supports dedicated callback for registering the EM for a given CPU(s) 'policy' object: cpufreq_driver::register_em(). @@ -242,3 +342,78 @@ EM framework:: 39 static struct cpufreq_driver foo_cpufreq_driver = { 40 .register_em = foo_cpufreq_register_em, 41 }; + + +3.2 Example driver with EM modification +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This section provides a simple example of a thermal driver modifying the EM. +The driver implements a foo_thermal_em_update() function. The driver is woken +up periodically to check the temperature and modify the EM data:: + + -> drivers/soc/example/example_em_mod.c + + 01 static void foo_get_new_em(struct foo_context *ctx) + 02 { + 03 struct em_perf_table __rcu *em_table; + 04 struct em_perf_state *table, *new_table; + 05 struct device *dev = ctx->dev; + 06 struct em_perf_domain *pd; + 07 unsigned long freq; + 08 int i, ret; + 09 + 10 pd = em_pd_get(dev); + 11 if (!pd) + 12 return; + 13 + 14 em_table = em_table_alloc(pd); + 15 if (!em_table) + 16 return; + 17 + 18 new_table = em_table->state; + 19 + 20 rcu_read_lock(); + 21 table = em_perf_state_from_pd(pd); + 22 for (i = 0; i < pd->nr_perf_states; i++) { + 23 freq = table[i].frequency; + 24 foo_get_power_perf_values(dev, freq, &new_table[i]); + 25 } + 26 rcu_read_unlock(); + 27 + 28 /* Calculate 'cost' values for EAS */ + 29 ret = em_dev_compute_costs(dev, table, pd->nr_perf_states); + 30 if (ret) { + 31 dev_warn(dev, "EM: compute costs failed %d\n", ret); + 32 em_free_table(em_table); + 33 return; + 34 } + 35 + 36 ret = em_dev_update_perf_domain(dev, em_table); + 37 if (ret) { + 38 dev_warn(dev, "EM: update failed %d\n", ret); + 39 em_free_table(em_table); + 40 return; + 41 } + 42 + 43 /* + 44 * Since it's one-time-update drop the usage counter. + 45 * The EM framework will later free the table when needed. + 46 */ + 47 em_table_free(em_table); + 48 } + 49 + 50 /* + 51 * Function called periodically to check the temperature and + 52 * update the EM if needed + 53 */ + 54 static void foo_thermal_em_update(struct foo_context *ctx) + 55 { + 56 struct device *dev = ctx->dev; + 57 int cpu; + 58 + 59 ctx->temperature = foo_get_temp(dev, ctx); + 60 if (ctx->temperature < FOO_EM_UPDATE_TEMP_THRESHOLD) + 61 return; + 62 + 63 foo_get_new_em(ctx); + 64 }