Message ID | 20220225064815.444571-1-shawnguo@kernel.org |
---|---|
State | New |
Series | PM: domains: Fix sleep-in-atomic bug caused by genpd_debug_remove() |
On Fri, 25 Feb 2022 at 07:48, Shawn Guo <shawnguo@kernel.org> wrote:
>
> From: Shawn Guo <shawn.guo@linaro.org>
>
> When a genpd with GENPD_FLAG_IRQ_SAFE gets removed, the following
> sleep-in-atomic bug will be seen, as genpd_debug_remove() will be called
> with a spinlock being held.
>
> [ 0.029183] BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1460
> [ 0.029204] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1, name: swapper/0
> [ 0.029219] preempt_count: 1, expected: 0
> [ 0.029230] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.17.0-rc4+ #489
> [ 0.029245] Hardware name: Thundercomm TurboX CM2290 (DT)
> [ 0.029256] Call trace:
> [ 0.029265] dump_backtrace.part.0+0xbc/0xd0
> [ 0.029285] show_stack+0x3c/0xa0
> [ 0.029298] dump_stack_lvl+0x7c/0xa0
> [ 0.029311] dump_stack+0x18/0x34
> [ 0.029323] __might_resched+0x10c/0x13c
> [ 0.029338] __might_sleep+0x4c/0x80
> [ 0.029351] down_read+0x24/0xd0
> [ 0.029363] lookup_one_len_unlocked+0x9c/0xcc
> [ 0.029379] lookup_positive_unlocked+0x10/0x50
> [ 0.029392] debugfs_lookup+0x68/0xac
> [ 0.029406] genpd_remove.part.0+0x12c/0x1b4
> [ 0.029419] of_genpd_remove_last+0xa8/0xd4
> [ 0.029434] psci_cpuidle_domain_probe+0x174/0x53c
> [ 0.029449] platform_probe+0x68/0xe0
> [ 0.029462] really_probe+0x190/0x430
> [ 0.029473] __driver_probe_device+0x90/0x18c
> [ 0.029485] driver_probe_device+0x40/0xe0
> [ 0.029497] __driver_attach+0xf4/0x1d0
> [ 0.029508] bus_for_each_dev+0x70/0xd0
> [ 0.029523] driver_attach+0x24/0x30
> [ 0.029534] bus_add_driver+0x164/0x22c
> [ 0.029545] driver_register+0x78/0x130
> [ 0.029556] __platform_driver_register+0x28/0x34
> [ 0.029569] psci_idle_init_domains+0x1c/0x28
> [ 0.029583] do_one_initcall+0x50/0x1b0
> [ 0.029595] kernel_init_freeable+0x214/0x280
> [ 0.029609] kernel_init+0x2c/0x13c
> [ 0.029622] ret_from_fork+0x10/0x20
>
> It doesn't seem necessary to call genpd_debug_remove() with the lock, so
> move it out from locking to fix the problem.
>
> Fixes: 718072ceb211 ("PM: domains: create debugfs nodes when adding power domains")
> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>

Thanks for fixing this!

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

Rafael, I think we should tag this for stable kernels too.

Kind regards
Uffe

> ---
>  drivers/base/power/domain.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index 5db704f02e71..7e8039d1884c 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -2058,9 +2058,9 @@ static int genpd_remove(struct generic_pm_domain *genpd)
>  		kfree(link);
>  	}
>
> -	genpd_debug_remove(genpd);
>  	list_del(&genpd->gpd_list_node);
>  	genpd_unlock(genpd);
> +	genpd_debug_remove(genpd);
>  	cancel_work_sync(&genpd->power_off_work);
>  	if (genpd_is_cpu_domain(genpd))
>  		free_cpumask_var(genpd->cpus);
> --
> 2.25.1
>
On Tue, Mar 1, 2022 at 11:38 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Fri, 25 Feb 2022 at 07:48, Shawn Guo <shawnguo@kernel.org> wrote:
> >
> > From: Shawn Guo <shawn.guo@linaro.org>
> >
> > When a genpd with GENPD_FLAG_IRQ_SAFE gets removed, the following
> > sleep-in-atomic bug will be seen, as genpd_debug_remove() will be called
> > with a spinlock being held.
> >
> > [ 0.029183] BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1460
> > [ 0.029204] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1, name: swapper/0
> > [ 0.029219] preempt_count: 1, expected: 0
> > [ 0.029230] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.17.0-rc4+ #489
> > [ 0.029245] Hardware name: Thundercomm TurboX CM2290 (DT)
> > [ 0.029256] Call trace:
> > [ 0.029265] dump_backtrace.part.0+0xbc/0xd0
> > [ 0.029285] show_stack+0x3c/0xa0
> > [ 0.029298] dump_stack_lvl+0x7c/0xa0
> > [ 0.029311] dump_stack+0x18/0x34
> > [ 0.029323] __might_resched+0x10c/0x13c
> > [ 0.029338] __might_sleep+0x4c/0x80
> > [ 0.029351] down_read+0x24/0xd0
> > [ 0.029363] lookup_one_len_unlocked+0x9c/0xcc
> > [ 0.029379] lookup_positive_unlocked+0x10/0x50
> > [ 0.029392] debugfs_lookup+0x68/0xac
> > [ 0.029406] genpd_remove.part.0+0x12c/0x1b4
> > [ 0.029419] of_genpd_remove_last+0xa8/0xd4
> > [ 0.029434] psci_cpuidle_domain_probe+0x174/0x53c
> > [ 0.029449] platform_probe+0x68/0xe0
> > [ 0.029462] really_probe+0x190/0x430
> > [ 0.029473] __driver_probe_device+0x90/0x18c
> > [ 0.029485] driver_probe_device+0x40/0xe0
> > [ 0.029497] __driver_attach+0xf4/0x1d0
> > [ 0.029508] bus_for_each_dev+0x70/0xd0
> > [ 0.029523] driver_attach+0x24/0x30
> > [ 0.029534] bus_add_driver+0x164/0x22c
> > [ 0.029545] driver_register+0x78/0x130
> > [ 0.029556] __platform_driver_register+0x28/0x34
> > [ 0.029569] psci_idle_init_domains+0x1c/0x28
> > [ 0.029583] do_one_initcall+0x50/0x1b0
> > [ 0.029595] kernel_init_freeable+0x214/0x280
> > [ 0.029609] kernel_init+0x2c/0x13c
> > [ 0.029622] ret_from_fork+0x10/0x20
> >
> > It doesn't seem necessary to call genpd_debug_remove() with the lock, so
> > move it out from locking to fix the problem.
> >
> > Fixes: 718072ceb211 ("PM: domains: create debugfs nodes when adding power domains")
> > Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
>
> Thanks for fixing this!
>
> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

Applied as 5.18 material.

> Rafael, I think we should tag this for stable kernels too.

Done.

Thanks!
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 5db704f02e71..7e8039d1884c 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -2058,9 +2058,9 @@ static int genpd_remove(struct generic_pm_domain *genpd)
 		kfree(link);
 	}
 
-	genpd_debug_remove(genpd);
 	list_del(&genpd->gpd_list_node);
 	genpd_unlock(genpd);
+	genpd_debug_remove(genpd);
 	cancel_work_sync(&genpd->power_off_work);
 	if (genpd_is_cpu_domain(genpd))
 		free_cpumask_var(genpd->cpus);