diff mbox series

[V4,3/4] pmdomain: core: Fix debugfs node creation failure

Message ID 20241023102148.1698910-4-quic_sibis@quicinc.com
State New
Headers show
Series firmware: arm_scmi: Misc Fixes | expand

Commit Message

Sibi Sankar Oct. 23, 2024, 10:21 a.m. UTC
The domain attributes returned by the perf protocol can end up
reporting identical names across domains, resulting in debugfs
node creation failure. Fix this failure by ensuring that pm domains
get a unique name using ida in pm_genpd_init.

Logs: [X1E reports 'NCC' for all its scmi perf domains]
debugfs: Directory 'NCC' with parent 'pm_genpd' already present!
debugfs: Directory 'NCC' with parent 'pm_genpd' already present!

Reported-by: Johan Hovold <johan+linaro@kernel.org>
Closes: https://lore.kernel.org/lkml/ZoQjAWse2YxwyRJv@hovoldconsulting.com/
Fixes: 718072ceb211 ("PM: domains: create debugfs nodes when adding power domains")
Suggested-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
---

v3:
* Update device names only when a name collision occurs [Dmitry/Ulf]
* Drop Johan's T-b from "fix debugfs node creation failure"

 drivers/pmdomain/core.c   | 65 ++++++++++++++++++++++++++++++---------
 include/linux/pm_domain.h |  1 +
 2 files changed, 51 insertions(+), 15 deletions(-)

Comments

Johan Hovold Oct. 25, 2024, 1:53 p.m. UTC | #1
On Wed, Oct 23, 2024 at 03:51:47PM +0530, Sibi Sankar wrote:
> The domain attributes returned by the perf protocol can end up
> reporting identical names across domains, resulting in debugfs
> node creation failure. Fix this failure by ensuring that pm domains
> get a unique name using ida in pm_genpd_init.
> 
> Logs: [X1E reports 'NCC' for all its scmi perf domains]
> debugfs: Directory 'NCC' with parent 'pm_genpd' already present!
> debugfs: Directory 'NCC' with parent 'pm_genpd' already present!
> 
> Reported-by: Johan Hovold <johan+linaro@kernel.org>
> Closes: https://lore.kernel.org/lkml/ZoQjAWse2YxwyRJv@hovoldconsulting.com/
> Fixes: 718072ceb211 ("PM: domains: create debugfs nodes when adding power domains")
> Suggested-by: Ulf Hansson <ulf.hansson@linaro.org>
> Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
> ---
> 
> v3:
> * Update device names only when a name collision occurs [Dmitry/Ulf]
> * Drop Johan's T-b from "fix debugfs node creation failure"

Also seems to do the trick:

Tested-by: Johan Hovold <johan+linaro@kernel.org>

But perhaps you could consider starting enumerating the duplicate
domains from 2 (or 1) instead of 0?:

NCC_1                           on                              0
NCC_0                           on                              0
NCC                             on                              0

Johan
Sibi Sankar Oct. 25, 2024, 2:06 p.m. UTC | #2
On 10/25/24 19:23, Johan Hovold wrote:
> On Wed, Oct 23, 2024 at 03:51:47PM +0530, Sibi Sankar wrote:
>> The domain attributes returned by the perf protocol can end up
>> reporting identical names across domains, resulting in debugfs
>> node creation failure. Fix this failure by ensuring that pm domains
>> get a unique name using ida in pm_genpd_init.
>>
>> Logs: [X1E reports 'NCC' for all its scmi perf domains]
>> debugfs: Directory 'NCC' with parent 'pm_genpd' already present!
>> debugfs: Directory 'NCC' with parent 'pm_genpd' already present!
>>
>> Reported-by: Johan Hovold <johan+linaro@kernel.org>
>> Closes: https://lore.kernel.org/lkml/ZoQjAWse2YxwyRJv@hovoldconsulting.com/
>> Fixes: 718072ceb211 ("PM: domains: create debugfs nodes when adding power domains")
>> Suggested-by: Ulf Hansson <ulf.hansson@linaro.org>
>> Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
>> ---
>>
>> v3:
>> * Update device names only when a name collision occurs [Dmitry/Ulf]
>> * Drop Johan's T-b from "fix debugfs node creation failure"
> 
> Also seems to do the trick:
> 
> Tested-by: Johan Hovold <johan+linaro@kernel.org>
> 
> But perhaps you could consider starting enumerating the duplicate
> domains from 2 (or 1) instead of 0?:
> 
> NCC_1                           on                              0
> NCC_0                           on                              0
> NCC                             on                              0

We are just trying to make sure node names are unique and
can't ensure the pd-name correctness since ida starts its
number generation from 0 and I didn't want to shape the
fix just to cater to our specific case. The firmware fix
will be in charge of ensuring pd-name correctness.

-Sibi

> 
> Johan
Johan Hovold Oct. 25, 2024, 2:11 p.m. UTC | #3
On Fri, Oct 25, 2024 at 07:36:16PM +0530, Sibi Sankar wrote:
> On 10/25/24 19:23, Johan Hovold wrote:

> > Also seems to do the trick:
> > 
> > Tested-by: Johan Hovold <johan+linaro@kernel.org>
> > 
> > But perhaps you could consider starting enumerating the duplicate
> > domains from 2 (or 1) instead of 0?:
> > 
> > NCC_1                           on                              0
> > NCC_0                           on                              0
> > NCC                             on                              0
> 
> We are just trying to make sure node names are unique and
> can't ensure the pd-name correctness since ida starts its
> number generation from 0 and I didn't want to shape the
> fix just to cater to our specific case. The firmware fix
> will be in charge of ensuring pd-name correctness.

Ah, it's a global number space? I didn't really look at the
implementation...

Johan
Sibi Sankar Oct. 30, 2024, 12:52 p.m. UTC | #4
On 10/25/24 19:41, Johan Hovold wrote:
> On Fri, Oct 25, 2024 at 07:36:16PM +0530, Sibi Sankar wrote:
>> On 10/25/24 19:23, Johan Hovold wrote:
> 
>>> Also seems to do the trick:
>>>
>>> Tested-by: Johan Hovold <johan+linaro@kernel.org>
>>>
>>> But perhaps you could consider starting enumerating the duplicate
>>> domains from 2 (or 1) instead of 0?:
>>>
>>> NCC_1                           on                              0
>>> NCC_0                           on                              0
>>> NCC                             on                              0
>>
>> We are just trying to make sure node names are unique and
>> can't ensure the pd-name correctness since ida starts its
>> number generation from 0 and I didn't want to shape the
>> fix just to cater to our specific case. The firmware fix
>> will be in charge of ensuring pd-name correctness.
> 
> Ah, it's a global number space? I didn't really look at the
> implementation...

Thanks for testing out the patch. Yes it was a global number
space but we are changing the implementation again in the next
re-spin. Please try that out instead.

-Sibi

> 
> Johan
diff mbox series

Patch

diff --git a/drivers/pmdomain/core.c b/drivers/pmdomain/core.c
index 76490f0bf1e2..756ed0975788 100644
--- a/drivers/pmdomain/core.c
+++ b/drivers/pmdomain/core.c
@@ -7,6 +7,7 @@ 
 #define pr_fmt(fmt) "PM: " fmt
 
 #include <linux/delay.h>
+#include <linux/idr.h>
 #include <linux/kernel.h>
 #include <linux/io.h>
 #include <linux/platform_device.h>
@@ -23,6 +24,9 @@ 
 #include <linux/cpu.h>
 #include <linux/debugfs.h>
 
+/* Provides a unique ID for each genpd device */
+static DEFINE_IDA(genpd_ida);
+
 #define GENPD_RETRY_MAX_MS	250		/* Approximate */
 
 #define GENPD_DEV_CALLBACK(genpd, type, callback, dev)		\
@@ -189,7 +193,7 @@  static inline bool irq_safe_dev_in_sleep_domain(struct device *dev,
 
 	if (ret)
 		dev_warn_once(dev, "PM domain %s will not be powered off\n",
-				genpd->name);
+			      dev_name(&genpd->dev));
 
 	return ret;
 }
@@ -274,7 +278,7 @@  static void genpd_debug_remove(struct generic_pm_domain *genpd)
 	if (!genpd_debugfs_dir)
 		return;
 
-	debugfs_lookup_and_remove(genpd->name, genpd_debugfs_dir);
+	debugfs_lookup_and_remove(dev_name(&genpd->dev), genpd_debugfs_dir);
 }
 
 static void genpd_update_accounting(struct generic_pm_domain *genpd)
@@ -731,7 +735,7 @@  static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed)
 	genpd->states[state_idx].power_on_latency_ns = elapsed_ns;
 	genpd->gd->max_off_time_changed = true;
 	pr_debug("%s: Power-%s latency exceeded, new value %lld ns\n",
-		 genpd->name, "on", elapsed_ns);
+		 dev_name(&genpd->dev), "on", elapsed_ns);
 
 out:
 	raw_notifier_call_chain(&genpd->power_notifiers, GENPD_NOTIFY_ON, NULL);
@@ -782,7 +786,7 @@  static int _genpd_power_off(struct generic_pm_domain *genpd, bool timed)
 	genpd->states[state_idx].power_off_latency_ns = elapsed_ns;
 	genpd->gd->max_off_time_changed = true;
 	pr_debug("%s: Power-%s latency exceeded, new value %lld ns\n",
-		 genpd->name, "off", elapsed_ns);
+		 dev_name(&genpd->dev), "off", elapsed_ns);
 
 out:
 	raw_notifier_call_chain(&genpd->power_notifiers, GENPD_NOTIFY_OFF,
@@ -1941,7 +1945,7 @@  int dev_pm_genpd_add_notifier(struct device *dev, struct notifier_block *nb)
 
 	if (ret) {
 		dev_warn(dev, "failed to add notifier for PM domain %s\n",
-			 genpd->name);
+			 dev_name(&genpd->dev));
 		return ret;
 	}
 
@@ -1988,7 +1992,7 @@  int dev_pm_genpd_remove_notifier(struct device *dev)
 
 	if (ret) {
 		dev_warn(dev, "failed to remove notifier for PM domain %s\n",
-			 genpd->name);
+			 dev_name(&genpd->dev));
 		return ret;
 	}
 
@@ -2014,7 +2018,7 @@  static int genpd_add_subdomain(struct generic_pm_domain *genpd,
 	 */
 	if (!genpd_is_irq_safe(genpd) && genpd_is_irq_safe(subdomain)) {
 		WARN(1, "Parent %s of subdomain %s must be IRQ safe\n",
-				genpd->name, subdomain->name);
+		     dev_name(&genpd->dev), subdomain->name);
 		return -EINVAL;
 	}
 
@@ -2089,7 +2093,7 @@  int pm_genpd_remove_subdomain(struct generic_pm_domain *genpd,
 
 	if (!list_empty(&subdomain->parent_links) || subdomain->device_count) {
 		pr_warn("%s: unable to remove subdomain %s\n",
-			genpd->name, subdomain->name);
+			dev_name(&genpd->dev), subdomain->name);
 		ret = -EBUSY;
 		goto out;
 	}
@@ -2199,6 +2203,23 @@  static void genpd_lock_init(struct generic_pm_domain *genpd)
 	}
 }
 
+static bool genpd_name_present(const char *name)
+{
+	bool ret = false;
+	const struct generic_pm_domain *gpd;
+
+	mutex_lock(&gpd_list_lock);
+	list_for_each_entry(gpd, &gpd_list, gpd_list_node) {
+		if (!strcmp(dev_name(&gpd->dev), name)) {
+			ret = true;
+			break;
+		}
+	}
+	mutex_unlock(&gpd_list_lock);
+
+	return ret;
+}
+
 /**
  * pm_genpd_init - Initialize a generic I/O PM domain object.
  * @genpd: PM domain object to initialize.
@@ -2226,6 +2247,7 @@  int pm_genpd_init(struct generic_pm_domain *genpd,
 	genpd->status = is_off ? GENPD_STATE_OFF : GENPD_STATE_ON;
 	genpd->device_count = 0;
 	genpd->provider = NULL;
+	genpd->device_id = -ENXIO;
 	genpd->has_provider = false;
 	genpd->accounting_time = ktime_get_mono_fast_ns();
 	genpd->domain.ops.runtime_suspend = genpd_runtime_suspend;
@@ -2266,7 +2288,18 @@  int pm_genpd_init(struct generic_pm_domain *genpd,
 		return ret;
 
 	device_initialize(&genpd->dev);
-	dev_set_name(&genpd->dev, "%s", genpd->name);
+
+	if (!genpd_name_present(genpd->name)) {
+		dev_set_name(&genpd->dev, "%s", genpd->name);
+	} else {
+		ret = ida_alloc(&genpd_ida, GFP_KERNEL);
+		if (ret < 0) {
+			put_device(&genpd->dev);
+			return ret;
+		}
+		genpd->device_id = ret;
+		dev_set_name(&genpd->dev, "%s_%u", genpd->name, genpd->device_id);
+	}
 
 	mutex_lock(&gpd_list_lock);
 	list_add(&genpd->gpd_list_node, &gpd_list);
@@ -2288,13 +2321,13 @@  static int genpd_remove(struct generic_pm_domain *genpd)
 
 	if (genpd->has_provider) {
 		genpd_unlock(genpd);
-		pr_err("Provider present, unable to remove %s\n", genpd->name);
+		pr_err("Provider present, unable to remove %s\n", dev_name(&genpd->dev));
 		return -EBUSY;
 	}
 
 	if (!list_empty(&genpd->parent_links) || genpd->device_count) {
 		genpd_unlock(genpd);
-		pr_err("%s: unable to remove %s\n", __func__, genpd->name);
+		pr_err("%s: unable to remove %s\n", __func__, dev_name(&genpd->dev));
 		return -EBUSY;
 	}
 
@@ -2308,9 +2341,11 @@  static int genpd_remove(struct generic_pm_domain *genpd)
 	genpd_unlock(genpd);
 	genpd_debug_remove(genpd);
 	cancel_work_sync(&genpd->power_off_work);
+	if (genpd->device_id != -ENXIO)
+		ida_free(&genpd_ida, genpd->device_id);
 	genpd_free_data(genpd);
 
-	pr_debug("%s: removed %s\n", __func__, genpd->name);
+	pr_debug("%s: removed %s\n", __func__, dev_name(&genpd->dev));
 
 	return 0;
 }
@@ -3320,12 +3355,12 @@  static int genpd_summary_one(struct seq_file *s,
 	else
 		snprintf(state, sizeof(state), "%s",
 			 status_lookup[genpd->status]);
-	seq_printf(s, "%-30s  %-30s  %u", genpd->name, state, genpd->performance_state);
+	seq_printf(s, "%-30s  %-30s  %u", dev_name(&genpd->dev), state, genpd->performance_state);
 
 	/*
 	 * Modifications on the list require holding locks on both
 	 * parent and child, so we are safe.
-	 * Also genpd->name is immutable.
+	 * Also the device name is immutable.
 	 */
 	list_for_each_entry(link, &genpd->parent_links, parent_node) {
 		if (list_is_first(&link->parent_node, &genpd->parent_links))
@@ -3550,7 +3585,7 @@  static void genpd_debug_add(struct generic_pm_domain *genpd)
 	if (!genpd_debugfs_dir)
 		return;
 
-	d = debugfs_create_dir(genpd->name, genpd_debugfs_dir);
+	d = debugfs_create_dir(dev_name(&genpd->dev), genpd_debugfs_dir);
 
 	debugfs_create_file("current_state", 0444,
 			    d, genpd, &status_fops);
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index 76775ab38898..1106b86de279 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -171,6 +171,7 @@  struct generic_pm_domain {
 	atomic_t sd_count;	/* Number of subdomains with power "on" */
 	enum gpd_status status;	/* Current state of the domain */
 	unsigned int device_count;	/* Number of devices */
+	unsigned int device_id;		/* unique device id */
 	unsigned int suspended_count;	/* System suspend device counter */
 	unsigned int prepared_count;	/* Suspend counter of prepared devices */
 	unsigned int performance_state;	/* Aggregated max performance state */