From patchwork Thu Dec 19 07:31:25 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ahmad Fatoum
X-Patchwork-Id: 852280
Received: from metis.whiteo.stw.pengutronix.de (metis.whiteo.stw.pengutronix.de [185.203.201.7])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id F366D218AAF
	for ; Thu, 19 Dec 2024 07:32:23 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=185.203.201.7
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1734593546; cv=none;
	b=KE6xTjuUKdU2dDIDKDwV3e8ytjjTIqojAvjxkIlVZrUdbpmntoW1MaebMzk0LSkFwsR4RnDscw/T+ZWshfKKe7AL0mQohPX+T2h6vOtSHGnKoX+aF+lSjRDBzTy/GjDLEvW2f7tjPw2s0v7hiZoWL7dRFn5oFolX288KnAQ6Hfw=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1734593546; c=relaxed/simple;
	bh=nnySu/T6vxGoHdqmXCcTqun4NtEMjOGvzUrN/x/++lM=;
	h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References:
	 In-Reply-To:To:Cc;
	b=VUWwuqyj3IftVPWdAM6ezA+iyy6K12XqW804esnyf6xl8/dxjRkhJ2XJJIoDKcw1pluQyasUFrDRU3Zgn5DxfMNr4FkgquDISDuL0LtEiAzYPbpyAiJkFjGDx9wV8wxtrFPN46uc6/Qlsj+yNJvBAz8ro21Sv7mBKLJcRZYu6tg=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
	dmarc=none (p=none dis=none) header.from=pengutronix.de;
	spf=pass smtp.mailfrom=pengutronix.de; arc=none smtp.client-ip=185.203.201.7
Authentication-Results: smtp.subspace.kernel.org;
	dmarc=none (p=none dis=none) header.from=pengutronix.de
Authentication-Results: smtp.subspace.kernel.org;
	spf=pass smtp.mailfrom=pengutronix.de
Received: from drehscheibe.grey.stw.pengutronix.de ([2a0a:edc0:0:c01:1d::a2])
	by metis.whiteo.stw.pengutronix.de with esmtps
	(TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92)
	(envelope-from ) id 1tOB0O-00088W-1f; Thu, 19 Dec 2024 08:31:32 +0100
Received: from dude05.red.stw.pengutronix.de ([2a0a:edc0:0:1101:1d::54])
	by drehscheibe.grey.stw.pengutronix.de with esmtps (TLS1.3)
	tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96)
	(envelope-from ) id 1tOB0L-004APr-25; Thu, 19 Dec 2024 08:31:30 +0100
Received: from localhost ([::1] helo=dude05.red.stw.pengutronix.de)
	by dude05.red.stw.pengutronix.de with esmtp (Exim 4.96)
	(envelope-from ) id 1tOB0M-00GkbH-0n; Thu, 19 Dec 2024 08:31:30 +0100
From: Ahmad Fatoum
Date: Thu, 19 Dec 2024 08:31:25 +0100
Subject: [PATCH 04/11] reboot: rename now misleading hw_protection symbols
Precedence: bulk
X-Mailing-List: linux-pm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Message-Id: <20241219-hw_protection-reboot-v1-4-263a0c1df802@pengutronix.de>
References: <20241219-hw_protection-reboot-v1-0-263a0c1df802@pengutronix.de>
In-Reply-To: <20241219-hw_protection-reboot-v1-0-263a0c1df802@pengutronix.de>
To: Daniel Lezcano, Fabio Estevam, "Rafael J. Wysocki", Zhang Rui,
	Lukasz Luba, Jonathan Corbet, Serge Hallyn, Liam Girdwood, Mark Brown,
	Matti Vaittinen, Benson Leung, Tzung-Bi Shih, Guenter Roeck,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-security-module@vger.kernel.org,
	chrome-platform@lists.linux.dev, devicetree@vger.kernel.org,
	kernel@pengutronix.de, Ahmad Fatoum
X-Mailer: b4 0.14.2
X-SA-Exim-Connect-IP: 2a0a:edc0:0:c01:1d::a2
X-SA-Exim-Mail-From: a.fatoum@pengutronix.de
X-SA-Exim-Scanned: No (on metis.whiteo.stw.pengutronix.de);
	SAEximRunCond expanded to false
X-PTX-Original-Recipient: linux-pm@vger.kernel.org

The __hw_protection_shutdown, hw_failure_emergency_poweroff_work and
hw_failure_emergency_poweroff_func symbol names have become misleading,
because they can either cause a shutdown (poweroff) or a reboot depending
on an argument or a global variable.

To avoid further confusion, let's rename them, so they don't suggest that
a poweroff is all they can do.

Signed-off-by: Ahmad Fatoum
---
 include/linux/reboot.h |  8 ++++----
 kernel/reboot.c        | 22 +++++++++++-----------
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/include/linux/reboot.h b/include/linux/reboot.h
index d6780fbf51535e1f98b576da0a06701402dfd447..b1e2c86d29a281abbcfe69bc00321df185c32c91 100644
--- a/include/linux/reboot.h
+++ b/include/linux/reboot.h
@@ -180,17 +180,17 @@ extern void orderly_reboot(void);
 
 enum hw_protection_action { HWPROT_ACT_SHUTDOWN, HWPROT_ACT_REBOOT };
 
-void __hw_protection_shutdown(const char *reason, int ms_until_forced,
-			      enum hw_protection_action action);
+void __hw_protection_trigger(const char *reason, int ms_until_forced,
+			     enum hw_protection_action action);
 
 static inline void hw_protection_reboot(const char *reason, int ms_until_forced)
 {
-	__hw_protection_shutdown(reason, ms_until_forced, HWPROT_ACT_REBOOT);
+	__hw_protection_trigger(reason, ms_until_forced, HWPROT_ACT_REBOOT);
 }
 
 static inline void hw_protection_shutdown(const char *reason, int ms_until_forced)
 {
-	__hw_protection_shutdown(reason, ms_until_forced, HWPROT_ACT_SHUTDOWN);
+	__hw_protection_trigger(reason, ms_until_forced, HWPROT_ACT_SHUTDOWN);
 }
 
 /*
diff --git a/kernel/reboot.c b/kernel/reboot.c
index 8e3680d36654587b57db44806a3d7b0228b10f67..da6c8bdeeefe627a76c7ec6e8926138ebbe3ae4e 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -947,13 +947,13 @@ static const char *hw_protection_action_str(enum hw_protection_action action)
 static enum hw_protection_action hw_failure_emergency_action;
 
 /**
- * hw_failure_emergency_poweroff_func - emergency poweroff work after a known delay
- * @work: work_struct associated with the emergency poweroff function
+ * hw_failure_emergency_action_func - emergency action after a known delay
+ * @work: work_struct associated with the emergency action function
  *
  * This function is called in very critical situations to force
- * a kernel poweroff after a configurable timeout value.
+ * a kernel poweroff or reboot after a configurable timeout value.
  */
-static void hw_failure_emergency_poweroff_func(struct work_struct *work)
+static void hw_failure_emergency_action_func(struct work_struct *work)
 {
 	const char *action_str = hw_protection_action_str(hw_failure_emergency_action);
 
@@ -981,8 +981,8 @@ static void hw_failure_emergency_poweroff_func(struct work_struct *work)
 	emergency_restart();
 }
 
-static DECLARE_DELAYED_WORK(hw_failure_emergency_poweroff_work,
-			    hw_failure_emergency_poweroff_func);
+static DECLARE_DELAYED_WORK(hw_failure_emergency_action_work,
+			    hw_failure_emergency_action_func);
 
 /**
  * hw_failure_emergency_schedule - Schedule an emergency system shutdown or reboot
@@ -996,12 +996,12 @@ static void hw_failure_emergency_schedule(enum hw_protection_action action,
 	if (poweroff_delay_ms <= 0)
 		return;
 	hw_failure_emergency_action = action;
-	schedule_delayed_work(&hw_failure_emergency_poweroff_work,
+	schedule_delayed_work(&hw_failure_emergency_action_work,
 			      msecs_to_jiffies(poweroff_delay_ms));
 }
 
 /**
- * __hw_protection_shutdown - Trigger an emergency system shutdown or reboot
+ * __hw_protection_trigger - Trigger an emergency system shutdown or reboot
  *
  * @reason:		Reason of emergency shutdown or reboot to be printed.
  * @ms_until_forced:	Time to wait for orderly shutdown or reboot before
@@ -1018,8 +1018,8 @@ static void hw_failure_emergency_schedule(enum hw_protection_action action,
  * pending even if the previous request has given a large timeout for forced
  * shutdown/reboot.
  */
-void __hw_protection_shutdown(const char *reason, int ms_until_forced,
-			      enum hw_protection_action action)
+void __hw_protection_trigger(const char *reason, int ms_until_forced,
+			     enum hw_protection_action action)
 {
 	static atomic_t allow_proceed = ATOMIC_INIT(1);
 
@@ -1039,7 +1039,7 @@ void __hw_protection_shutdown(const char *reason, int ms_until_forced,
 	else
 		orderly_poweroff(true);
 }
-EXPORT_SYMBOL_GPL(__hw_protection_shutdown);
+EXPORT_SYMBOL_GPL(__hw_protection_trigger);
 
 static int __init reboot_setup(char *str)
 {

From patchwork Thu Dec 19 07:31:26 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ahmad Fatoum
X-Patchwork-Id: 852284
Received: from metis.whiteo.stw.pengutronix.de (metis.whiteo.stw.pengutronix.de [185.203.201.7])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id ED65E1B4233
	for ; Thu, 19 Dec 2024 07:31:51 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=185.203.201.7
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1734593513; cv=none;
	b=muxHPkDQgzmAme12nC/JlkvYEHC/2/1lP6jHtSygRjlEvMMv5jT0hk0ocLZKauCdC/nmCA9UqTEr/MBElNPxoRF/4y6up/c3Pj3TysRS0WhxIc5nlRNkhlj699o7p47B52FP/QPUa/cN2GBZTYOBxxgd2y3acOnaPWeFalruP4o=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1734593513; c=relaxed/simple;
	bh=Dbh2Kpv3bXLQD+GsqfTMp3KLQ2xI6YShyiyhisXZNj0=;
	h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References:
	 In-Reply-To:To:Cc;
	b=s8jSvsySnd0wRjN3SFoVRi2Gi8K66c5JspV42N8WJfUZzULu4EfK+ejFSyMOB6XvzbOnGxYV1klFBOEYGzU15rtQy+gt65cgfbLhykvM4xMLnVGDE07zRWH+f3IL4a6hORpqWLLuFV9brKRWbw+R1Sd+NLnOddWrqdieFR+jAdU=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
	dmarc=none (p=none dis=none) header.from=pengutronix.de;
	spf=pass smtp.mailfrom=pengutronix.de; arc=none smtp.client-ip=185.203.201.7
Authentication-Results: smtp.subspace.kernel.org;
	dmarc=none (p=none dis=none) header.from=pengutronix.de
Authentication-Results: smtp.subspace.kernel.org;
	spf=pass smtp.mailfrom=pengutronix.de
Received: from drehscheibe.grey.stw.pengutronix.de ([2a0a:edc0:0:c01:1d::a2])
	by metis.whiteo.stw.pengutronix.de with esmtps
	(TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92)
	(envelope-from ) id 1tOB0O-00088U-1U; Thu, 19 Dec 2024 08:31:32 +0100
Received: from dude05.red.stw.pengutronix.de ([2a0a:edc0:0:1101:1d::54])
	by drehscheibe.grey.stw.pengutronix.de with esmtps (TLS1.3)
	tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96)
	(envelope-from ) id 1tOB0L-004APs-23; Thu, 19 Dec 2024 08:31:30 +0100
Received: from localhost ([::1] helo=dude05.red.stw.pengutronix.de)
	by dude05.red.stw.pengutronix.de with esmtp (Exim 4.96)
	(envelope-from ) id 1tOB0M-00GkbH-0o; Thu, 19 Dec 2024 08:31:30 +0100
From: Ahmad Fatoum
Date: Thu, 19 Dec 2024 08:31:26 +0100
Subject: [PATCH 05/11] reboot: indicate whether it is a HARDWARE PROTECTION
 reboot or shutdown
Precedence: bulk
X-Mailing-List: linux-pm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Message-Id: <20241219-hw_protection-reboot-v1-5-263a0c1df802@pengutronix.de>
References: <20241219-hw_protection-reboot-v1-0-263a0c1df802@pengutronix.de>
In-Reply-To: <20241219-hw_protection-reboot-v1-0-263a0c1df802@pengutronix.de>
To: Daniel Lezcano, Fabio Estevam, "Rafael J. Wysocki", Zhang Rui,
	Lukasz Luba, Jonathan Corbet, Serge Hallyn, Liam Girdwood, Mark Brown,
	Matti Vaittinen, Benson Leung, Tzung-Bi Shih, Guenter Roeck,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-security-module@vger.kernel.org,
	chrome-platform@lists.linux.dev, devicetree@vger.kernel.org,
	kernel@pengutronix.de, Ahmad Fatoum
X-Mailer: b4 0.14.2
X-SA-Exim-Connect-IP: 2a0a:edc0:0:c01:1d::a2
X-SA-Exim-Mail-From: a.fatoum@pengutronix.de
X-SA-Exim-Scanned: No (on metis.whiteo.stw.pengutronix.de);
	SAEximRunCond expanded to false
X-PTX-Original-Recipient: linux-pm@vger.kernel.org

Whether we attempt a hardware protection shutdown (poweroff) or a reboot
currently depends on the caller.

A follow-up commit will make this partially user-configurable, so it's a
good idea to have the emergency message clearly state whether the kernel
is going for a reboot or a shutdown.

Signed-off-by: Ahmad Fatoum
---
 kernel/reboot.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/reboot.c b/kernel/reboot.c
index da6c8bdeeefe627a76c7ec6e8926138ebbe3ae4e..aa6317939af41c9730ec5a74b7faf03f7c0f25a7 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -1023,7 +1023,8 @@ void __hw_protection_trigger(const char *reason, int ms_until_forced,
 {
 	static atomic_t allow_proceed = ATOMIC_INIT(1);
 
-	pr_emerg("HARDWARE PROTECTION shutdown (%s)\n", reason);
+	pr_emerg("HARDWARE PROTECTION %s (%s)\n",
+		 hw_protection_action_str(action), reason);
 
 	/* Shutdown should be initiated only once. */
 	if (!atomic_dec_and_test(&allow_proceed))

From patchwork Thu Dec 19 07:31:27 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ahmad Fatoum
X-Patchwork-Id: 852282
Received: from metis.whiteo.stw.pengutronix.de (metis.whiteo.stw.pengutronix.de [185.203.201.7])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id C81A02185B9
	for ; Thu, 19 Dec 2024 07:32:07 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=185.203.201.7
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1734593529; cv=none;
	b=rMYO/RN5m90mxinzpD1OBxONr3msSTcubvIgpY6ZNe+cIYQ5/KQfqCXrt4YRF3lmZLReKOECmi9l0RFkR6rHcLT7B3qinN33Kpdc/tpUam5ZQjU6IpMmVS+09NtqOEzju3+46GMMM2oG/+bHRuBcrO/Au40TtkF2vJbd2u8fpM8=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1734593529; c=relaxed/simple;
	bh=HHo8+7k6wjJQsKsAJamXMeH66rzfbDBe1zv14LpyGCs=;
	h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References:
	 In-Reply-To:To:Cc;
	b=cFBoOpfpPsLTvBdgSRt4fTzng02l4+YT8ILQsMUvmgZfxlsIOd7XE99ls9wmiJ1bWmmPZsBb11TxtrkxAmmzRNJL2LCJzc+mi9NDZiwmBrhRCuuR73q0Qb68J7ytRelkdRsWTjwS/jNR/VC4BtquRjBcWGyGcfjbY5nGJPLNMD4=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
	dmarc=none (p=none dis=none) header.from=pengutronix.de;
	spf=pass smtp.mailfrom=pengutronix.de; arc=none smtp.client-ip=185.203.201.7
Authentication-Results: smtp.subspace.kernel.org;
	dmarc=none (p=none dis=none) header.from=pengutronix.de
Authentication-Results: smtp.subspace.kernel.org;
	spf=pass smtp.mailfrom=pengutronix.de
Received: from drehscheibe.grey.stw.pengutronix.de ([2a0a:edc0:0:c01:1d::a2])
	by metis.whiteo.stw.pengutronix.de with esmtps
	(TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92)
	(envelope-from ) id 1tOB0O-00088b-1Y; Thu, 19 Dec 2024 08:31:32 +0100
Received: from dude05.red.stw.pengutronix.de ([2a0a:edc0:0:1101:1d::54])
	by drehscheibe.grey.stw.pengutronix.de with esmtps (TLS1.3)
	tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96)
	(envelope-from ) id 1tOB0L-004APt-2B; Thu, 19 Dec 2024 08:31:30 +0100
Received: from localhost ([::1] helo=dude05.red.stw.pengutronix.de)
	by dude05.red.stw.pengutronix.de with esmtp (Exim 4.96)
	(envelope-from ) id 1tOB0M-00GkbH-0p; Thu, 19 Dec 2024 08:31:30 +0100
From: Ahmad Fatoum
Date: Thu, 19 Dec 2024 08:31:27 +0100
Subject: [PATCH 06/11] reboot: add support for configuring emergency hardware
 protection action
Precedence: bulk
X-Mailing-List: linux-pm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Message-Id: <20241219-hw_protection-reboot-v1-6-263a0c1df802@pengutronix.de>
References: <20241219-hw_protection-reboot-v1-0-263a0c1df802@pengutronix.de>
In-Reply-To: <20241219-hw_protection-reboot-v1-0-263a0c1df802@pengutronix.de>
To: Daniel Lezcano, Fabio Estevam, "Rafael J. Wysocki", Zhang Rui,
	Lukasz Luba, Jonathan Corbet, Serge Hallyn, Liam Girdwood, Mark Brown,
	Matti Vaittinen, Benson Leung, Tzung-Bi Shih, Guenter Roeck,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-security-module@vger.kernel.org,
	chrome-platform@lists.linux.dev, devicetree@vger.kernel.org,
	kernel@pengutronix.de, Ahmad Fatoum, Matteo Croce
X-Mailer: b4 0.14.2
X-SA-Exim-Connect-IP: 2a0a:edc0:0:c01:1d::a2
X-SA-Exim-Mail-From: a.fatoum@pengutronix.de
X-SA-Exim-Scanned: No (on metis.whiteo.stw.pengutronix.de);
	SAEximRunCond expanded to false
X-PTX-Original-Recipient: linux-pm@vger.kernel.org

We currently leave the decision of whether to shut down or reboot to
protect hardware in an emergency situation to the individual drivers.
This works in cases where the driver detecting the critical failure has
inside knowledge: for example, it binds to the system management
controller or is guided by a hardware description that defines what to do.

In the general case, however, the driver detecting the issue can't know
what the appropriate course of action is and shouldn't be dictating the
policy for dealing with it.

Therefore, add a global hw_protection toggle that allows the user to
specify whether shutdown or reboot should be the default action when the
driver doesn't set policy.

This introduces no functional change yet as hw_protection_trigger() has no
callers, but these will be added in subsequent commits.

Signed-off-by: Ahmad Fatoum
---
 Documentation/ABI/testing/sysfs-kernel-reboot   |  8 +++++
 Documentation/admin-guide/kernel-parameters.txt |  6 ++++
 include/linux/reboot.h                          | 19 +++++++++-
 include/uapi/linux/capability.h                 |  1 +
 kernel/reboot.c                                 | 46 +++++++++++++++++++++++++
 5 files changed, 79 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-kernel-reboot b/Documentation/ABI/testing/sysfs-kernel-reboot
index 837330fb251134ffdf29cd68f0b2a845b088e5a0..133f54707d533665c68a5946394540ec50b149e5 100644
--- a/Documentation/ABI/testing/sysfs-kernel-reboot
+++ b/Documentation/ABI/testing/sysfs-kernel-reboot
@@ -30,3 +30,11 @@ KernelVersion:	5.11
 Contact:	Matteo Croce
 Description:	Don't wait for any other CPUs on reboot and
 		avoid anything that could hang.
+
+What:		/sys/kernel/reboot/hw_protection
+Date:		Feb 2025
+KernelVersion:	6.14
+Contact:	Ahmad Fatoum
+Description:	Hardware protection action taken on critical events like
+		overtemperature or imminent voltage loss.
+		Valid values are: reboot shutdown
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 3872bc6ec49d63772755504966ae70113f24a1db..ff244e6a0e04d2c172825818defd5d94448f8518 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1921,6 +1921,12 @@
 			which allow the hypervisor to 'idle' the
 			guest on lock contention.
 
+	hw_protection=	[HW]
+			Format: reboot | shutdown
+
+			Hardware protection action taken on critical events like
+			overtemperature or imminent voltage loss.
+
 	i2c_bus=	[HW]	Override the default board specific I2C bus speed
 			or register an additional I2C bus that is not
 			registered from board initialization code.
diff --git a/include/linux/reboot.h b/include/linux/reboot.h
index b1e2c86d29a281abbcfe69bc00321df185c32c91..281696f509932e444eadd453fb0233aa7a07fbce 100644
--- a/include/linux/reboot.h
+++ b/include/linux/reboot.h
@@ -178,11 +178,28 @@ void ctrl_alt_del(void);
 extern void orderly_poweroff(bool force);
 extern void orderly_reboot(void);
 
-enum hw_protection_action { HWPROT_ACT_SHUTDOWN, HWPROT_ACT_REBOOT };
+enum hw_protection_action { HWPROT_ACT_DEFAULT, HWPROT_ACT_SHUTDOWN, HWPROT_ACT_REBOOT };
 
 void __hw_protection_trigger(const char *reason, int ms_until_forced,
 			     enum hw_protection_action action);
 
+/**
+ * hw_protection_trigger - Trigger default emergency system hardware protection action
+ *
+ * @reason:		Reason of emergency shutdown or reboot to be printed.
+ * @ms_until_forced:	Time to wait for orderly shutdown or reboot before
+ *			triggering it. Negative value disables the forced
+ *			shutdown or reboot.
+ *
+ * Initiate an emergency system shutdown or reboot in order to protect
+ * hardware from further damage. The exact action taken is controllable at
+ * runtime and defaults to shutdown.
+ */
+static inline void hw_protection_trigger(const char *reason, int ms_until_forced)
+{
+	__hw_protection_trigger(reason, ms_until_forced, HWPROT_ACT_DEFAULT);
+}
+
 static inline void hw_protection_reboot(const char *reason, int ms_until_forced)
 {
 	__hw_protection_trigger(reason, ms_until_forced, HWPROT_ACT_REBOOT);
diff --git a/include/uapi/linux/capability.h b/include/uapi/linux/capability.h
index 5bb9060986974726025eaabee24a0b720ff94657..2e21b5594f81313e8e17aeeb98a09f098355515f 100644
--- a/include/uapi/linux/capability.h
+++ b/include/uapi/linux/capability.h
@@ -275,6 +275,7 @@ struct vfs_ns_cap_data {
 /* Allow setting encryption key on loopback filesystem */
 /* Allow setting zone reclaim policy */
 /* Allow everything under CAP_BPF and CAP_PERFMON for backward compatibility */
+/* Allow setting hardware protection emergency action */
 
 #define CAP_SYS_ADMIN        21
 
diff --git a/kernel/reboot.c b/kernel/reboot.c
index aa6317939af41c9730ec5a74b7faf03f7c0f25a7..08e7e5f00308ae66120688b83771a1b7fc8403cb 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -36,6 +36,8 @@ enum reboot_mode reboot_mode DEFAULT_REBOOT_MODE;
 EXPORT_SYMBOL_GPL(reboot_mode);
 enum reboot_mode panic_reboot_mode = REBOOT_UNDEFINED;
 
+static enum hw_protection_action hw_protection_action = HWPROT_ACT_SHUTDOWN;
+
 /*
  * This variable is used privately to keep track of whether or not
  * reboot_type is still set to its default value (i.e., reboot= hasn't
@@ -1023,6 +1025,9 @@ void __hw_protection_trigger(const char *reason, int ms_until_forced,
 {
 	static atomic_t allow_proceed = ATOMIC_INIT(1);
 
+	if (action == HWPROT_ACT_DEFAULT)
+		action = hw_protection_action;
+
 	pr_emerg("HARDWARE PROTECTION %s (%s)\n",
 		 hw_protection_action_str(action), reason);
 
@@ -1042,6 +1047,46 @@ void __hw_protection_trigger(const char *reason, int ms_until_forced,
 }
 EXPORT_SYMBOL_GPL(__hw_protection_trigger);
 
+static bool hw_protection_action_parse(const char *str,
+				       enum hw_protection_action *action)
+{
+	if (sysfs_streq(str, "shutdown"))
+		*action = HWPROT_ACT_SHUTDOWN;
+	else if (sysfs_streq(str, "reboot"))
+		*action = HWPROT_ACT_REBOOT;
+	else
+		return false;
+
+	return true;
+}
+
+static int __init hw_protection_setup(char *str)
+{
+	hw_protection_action_parse(str, &hw_protection_action);
+	return 1;
+}
+__setup("hw_protection=", hw_protection_setup);
+
+static ssize_t hw_protection_show(struct kobject *kobj,
+				  struct kobj_attribute *attr, char *buf)
+{
+	return sysfs_emit(buf, "%s\n",
+			  hw_protection_action_str(hw_protection_action));
+}
+static ssize_t hw_protection_store(struct kobject *kobj,
+				   struct kobj_attribute *attr, const char *buf,
+				   size_t count)
+{
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (!hw_protection_action_parse(buf, &hw_protection_action))
+		return -EINVAL;
+
+	return count;
+}
+static struct kobj_attribute hw_protection_attr = __ATTR_RW(hw_protection);
+
 static int __init reboot_setup(char *str)
 {
 	for (;;) {
@@ -1301,6 +1346,7 @@ static struct kobj_attribute reboot_cpu_attr = __ATTR_RW(cpu);
 #endif
 
 static struct attribute *reboot_attrs[] = {
+	&hw_protection_attr.attr,
 	&reboot_mode_attr.attr,
 #ifdef CONFIG_X86
 	&reboot_force_attr.attr,

From patchwork Thu Dec 19 07:31:30 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ahmad Fatoum
X-Patchwork-Id: 852283
Received: from metis.whiteo.stw.pengutronix.de (metis.whiteo.stw.pengutronix.de [185.203.201.7])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id E3E31216E0E
	for ; Thu, 19 Dec 2024 07:31:56 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=185.203.201.7
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1734593518; cv=none;
	b=PrjO1lFwk/x2yE+GY2tN1Z1tFDNw+t3x+NDdMn9oSbx7jPma6zNAACdd5dxbjB5a/Avm+FJSH7YbjvwMIX5gW+PQeAeM012BrU/Y0PvGzi4nSbZpLQuem4pHCOVPRIDhbuxrsXQ9Ftp2FiSxZNRxjCWCPVIcl5AtZi4pejkWW8o=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1734593518; c=relaxed/simple;
	bh=PUPdXfLUsQ8yTAia2r3rOB3zzTzJrVUz+cfOw1iUE6o=;
	h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References:
	 In-Reply-To:To:Cc;
	b=UYKMnV4u6LGzg2/NXiVytiOc6DmuhXdP6bVikBxXd8DR6x1H5oqZfQxdUnwDjFIbkGUm1W74SuGvJa4B5RCKnbdwAOCpfqDX/eM9vyxN5Q5zSmxLywCKu60dkFgLzJ/8rxJKbOow0LLwPNx0Hf81080HiGif+k1O1RdW66U1q6s=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
	dmarc=none (p=none dis=none) header.from=pengutronix.de;
	spf=pass smtp.mailfrom=pengutronix.de; arc=none smtp.client-ip=185.203.201.7
Authentication-Results: smtp.subspace.kernel.org;
	dmarc=none (p=none dis=none) header.from=pengutronix.de
Authentication-Results: smtp.subspace.kernel.org;
	spf=pass smtp.mailfrom=pengutronix.de
Received: from drehscheibe.grey.stw.pengutronix.de ([2a0a:edc0:0:c01:1d::a2])
	by metis.whiteo.stw.pengutronix.de with esmtps
	(TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92)
	(envelope-from ) id 1tOB0O-00088Y-1R; Thu, 19 Dec 2024 08:31:32 +0100
Received: from dude05.red.stw.pengutronix.de ([2a0a:edc0:0:1101:1d::54])
	by drehscheibe.grey.stw.pengutronix.de with esmtps (TLS1.3)
	tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96)
	(envelope-from ) id 1tOB0L-004APw-2C; Thu, 19 Dec 2024 08:31:30 +0100
Received: from localhost ([::1] helo=dude05.red.stw.pengutronix.de)
	by dude05.red.stw.pengutronix.de with esmtp (Exim 4.96)
	(envelope-from ) id 1tOB0M-00GkbH-0s; Thu, 19 Dec 2024 08:31:30 +0100
From: Ahmad Fatoum
Date: Thu, 19 Dec 2024 08:31:30 +0100
Subject: [PATCH 09/11] dt-bindings: thermal: give OS some leeway in absence of
 critical-action
Precedence: bulk
X-Mailing-List: linux-pm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Message-Id: <20241219-hw_protection-reboot-v1-9-263a0c1df802@pengutronix.de>
References: <20241219-hw_protection-reboot-v1-0-263a0c1df802@pengutronix.de>
In-Reply-To: <20241219-hw_protection-reboot-v1-0-263a0c1df802@pengutronix.de>
To: Daniel Lezcano, Fabio Estevam, "Rafael J. Wysocki", Zhang Rui,
	Lukasz Luba, Jonathan Corbet, Serge Hallyn, Liam Girdwood, Mark Brown,
	Matti Vaittinen, Benson Leung, Tzung-Bi Shih, Guenter Roeck,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-security-module@vger.kernel.org,
	chrome-platform@lists.linux.dev, devicetree@vger.kernel.org,
	kernel@pengutronix.de, Ahmad Fatoum
X-Mailer: b4 0.14.2
X-SA-Exim-Connect-IP: 2a0a:edc0:0:c01:1d::a2
X-SA-Exim-Mail-From: a.fatoum@pengutronix.de
X-SA-Exim-Scanned: No (on metis.whiteo.stw.pengutronix.de);
	SAEximRunCond expanded to false
X-PTX-Original-Recipient: linux-pm@vger.kernel.org

An operating system may allow its user to configure the action to be
undertaken on critical overtemperature events.

However, the bindings currently mandate that an absent critical-action
property be treated as critical-action = "shutdown", so any differing
user configuration would violate the bindings.

Resolve this by documenting the absence of the property to mean that the
OS gets to decide.
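
As a rough illustration of the intended semantics (a userspace sketch, not
part of this patch): action_from_property() and resolve_action() are
hypothetical names, and treating an unrecognised value like an absent
property is an assumption of the sketch. An OS could map the property as
follows and fall back to its own policy only when the binding leaves the
choice open:

```c
#include <stddef.h>
#include <strings.h>	/* strcasecmp (POSIX) */

/* Mirrors the three-valued enum introduced earlier in this series. */
enum hw_protection_action {
	HWPROT_ACT_DEFAULT,
	HWPROT_ACT_SHUTDOWN,
	HWPROT_ACT_REBOOT,
};

/*
 * Hypothetical helper: map the critical-action string (NULL when the
 * property is absent) to an action. Absent or unrecognised values are
 * left to the OS, i.e. HWPROT_ACT_DEFAULT (assumption of this sketch).
 */
enum hw_protection_action action_from_property(const char *critical_action)
{
	if (!critical_action)
		return HWPROT_ACT_DEFAULT;
	if (!strcasecmp(critical_action, "reboot"))
		return HWPROT_ACT_REBOOT;
	if (!strcasecmp(critical_action, "shutdown"))
		return HWPROT_ACT_SHUTDOWN;
	return HWPROT_ACT_DEFAULT;
}

/*
 * Hypothetical helper: an explicit request always wins; only DEFAULT is
 * resolved against the OS-wide policy.
 */
enum hw_protection_action resolve_action(enum hw_protection_action requested,
					 enum hw_protection_action os_default)
{
	return requested == HWPROT_ACT_DEFAULT ? os_default : requested;
}
```

With an OS default of shutdown, a node without critical-action still ends
up shutting down, so existing device trees should behave as before.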

Signed-off-by: Ahmad Fatoum
---
 Documentation/devicetree/bindings/thermal/thermal-zones.yaml | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
index 0f435be1dbd8cfb4502be9d198ed6d51058f453b..0de0a9757ccc201ebbb0c8c8efb9f8da662f8e9c 100644
--- a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
+++ b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
@@ -82,9 +82,8 @@ patternProperties:
         $ref: /schemas/types.yaml#/definitions/string
         description: |
           The action the OS should perform after the critical temperature is reached.
-          By default the system will shutdown as a safe action to prevent damage
-          to the hardware, if the property is not set.
-          The shutdown action should be always the default and preferred one.
+          If the property is not set, it is up to the system to select the correct
+          action. The recommended and preferred default is shutdown.
           Choose 'reboot' with care, as the hardware may be in thermal stress,
           thus leading to infinite reboots that may cause damage to the hardware.
           Make sure the firmware/bootloader will act as the last resort and take

From patchwork Thu Dec 19 07:31:31 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ahmad Fatoum
X-Patchwork-Id: 852279
Received: from metis.whiteo.stw.pengutronix.de (metis.whiteo.stw.pengutronix.de [185.203.201.7])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2B9682163A9
	for ; Thu, 19 Dec 2024 07:50:41 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=185.203.201.7
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1734594643; cv=none;
	b=Ptp7YiBPmfYOoGT1T9ibeXlETuhlpu3V5xSpSCYLjPOCMtsI2e9vxhiBhRTGbuhmRPPikh3gMZmiHH4mrhpuWoA6TK4XQEGoXhd7sHXUgiDa5sWVLlJ4GtlWIpXUg+CVHKNs9kAynkKvWCU+uzssgSUbx3InY/Blv2VDLMHTJXo=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1734594643; c=relaxed/simple;
	bh=LN/2GKVWP8SVrflImM2f/d6NIlKudTEI+VcMQwEcCKY=;
	h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References:
	 In-Reply-To:To:Cc;
	b=u7m5fduDaD6LGR4kQMeAZh23/JMiy0xHfRhtLt3cOC7vA0DVKcgNGbIpISmgPUr4KD++/j+SMcccTEGX6EVzgyGLrM1hUeWSpxA/ET5AfD2n2GnYmFT8smit0uecKH69W6f6l9zUR43gUbtHYCalj+5WD6BVgnF34An/Ee9Ibc0=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
	dmarc=none (p=none dis=none) header.from=pengutronix.de;
	spf=pass smtp.mailfrom=pengutronix.de; arc=none smtp.client-ip=185.203.201.7
Authentication-Results: smtp.subspace.kernel.org;
	dmarc=none (p=none dis=none) header.from=pengutronix.de
Authentication-Results: smtp.subspace.kernel.org;
	spf=pass smtp.mailfrom=pengutronix.de
Received: from drehscheibe.grey.stw.pengutronix.de ([2a0a:edc0:0:c01:1d::a2])
	by metis.whiteo.stw.pengutronix.de with esmtps
	(TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92)
	(envelope-from ) id 1tOBId-0003kW-Pr; Thu, 19 Dec 2024 08:50:23 +0100
Received: from dude05.red.stw.pengutronix.de ([2a0a:edc0:0:1101:1d::54])
	by drehscheibe.grey.stw.pengutronix.de with esmtps (TLS1.3)
	tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96)
	(envelope-from ) id 1tOBIc-004AWn-17; Thu, 19 Dec 2024 08:50:23 +0100
Received: from localhost ([::1] helo=dude05.red.stw.pengutronix.de)
	by dude05.red.stw.pengutronix.de with esmtp (Exim 4.96)
	(envelope-from ) id 1tOB0M-00GkbH-0t; Thu, 19 Dec 2024 08:31:30 +0100
From: Ahmad Fatoum
Date: Thu, 19 Dec 2024 08:31:31 +0100
Subject: [PATCH 10/11] thermal: core: allow user configuration of hardware
 protection action
Precedence: bulk
X-Mailing-List: linux-pm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Message-Id: <20241219-hw_protection-reboot-v1-10-263a0c1df802@pengutronix.de>
References: <20241219-hw_protection-reboot-v1-0-263a0c1df802@pengutronix.de>
In-Reply-To: <20241219-hw_protection-reboot-v1-0-263a0c1df802@pengutronix.de>
To: Daniel Lezcano, Fabio Estevam, "Rafael J. Wysocki", Zhang Rui,
	Lukasz Luba, Jonathan Corbet, Serge Hallyn, Liam Girdwood, Mark Brown,
	Matti Vaittinen, Benson Leung, Tzung-Bi Shih, Guenter Roeck,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-security-module@vger.kernel.org,
	chrome-platform@lists.linux.dev, devicetree@vger.kernel.org,
	kernel@pengutronix.de, Ahmad Fatoum
X-Mailer: b4 0.14.2
X-SA-Exim-Connect-IP: 2a0a:edc0:0:c01:1d::a2
X-SA-Exim-Mail-From: a.fatoum@pengutronix.de
X-SA-Exim-Scanned: No (on metis.whiteo.stw.pengutronix.de);
	SAEximRunCond expanded to false
X-PTX-Original-Recipient: linux-pm@vger.kernel.org

In the general case, we don't know which of system shutdown or reboot is
the better action to take to protect hardware in an emergency situation.

We thus allow the policy to come from the device-tree in the form of an
optional critical-action OF property, but so far there was no way for the
end user to configure this.
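
For reference, the user-facing side added earlier in this series accepts
exactly two strings, reboot and shutdown. A minimal userspace model of
that parsing might look like the following; streq_nl() is an assumed
stand-in for the kernel's sysfs_streq(), and parse_action() is an
illustrative name rather than the kernel function:

```c
#include <stdbool.h>

/* Mirrors the three-valued enum introduced earlier in this series. */
enum hw_protection_action {
	HWPROT_ACT_DEFAULT,
	HWPROT_ACT_SHUTDOWN,
	HWPROT_ACT_REBOOT,
};

/*
 * Userspace stand-in for sysfs_streq(): the strings are equal if they
 * match exactly or differ only by one trailing newline, which is what a
 * plain "echo reboot > file" would append.
 */
bool streq_nl(const char *a, const char *b)
{
	while (*a && *a == *b) {
		a++;
		b++;
	}
	if (*a == *b)
		return true;
	if (!*a && *b == '\n' && !b[1])
		return true;
	if (!*b && *a == '\n' && !a[1])
		return true;
	return false;
}

/*
 * Illustrative parser: only "shutdown" and "reboot" are accepted, the
 * match is case-sensitive, and *action is left untouched on failure.
 */
bool parse_action(const char *str, enum hw_protection_action *action)
{
	if (streq_nl(str, "shutdown"))
		*action = HWPROT_ACT_SHUTDOWN;
	else if (streq_nl(str, "reboot"))
		*action = HWPROT_ACT_REBOOT;
	else
		return false;

	return true;
}
```

Leaving the stored value untouched on a failed parse means a mistyped
setting keeps whatever was configured before, shutdown by default.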

With the recent addition of the hw_protection parameter, the user can now
choose a default action for cases where the driver isn't sure which
course of action is better.

Let's make use of this by passing HWPROT_ACT_DEFAULT in the absence of the
critical-action OF property.

As HWPROT_ACT_DEFAULT resolves to shutdown by default, this introduces no
functional change for users unless they start using the new parameter.

Signed-off-by: Ahmad Fatoum
---
 drivers/thermal/thermal_core.c | 17 ++++++++++-------
 drivers/thermal/thermal_core.h |  1 +
 drivers/thermal/thermal_of.c   |  7 +++++--
 3 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 19a3894ad752a91ef621794abbeec9abfb2323ec..abe990b7b40b0c7fa5034093b961e239840a18c1 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -369,7 +369,8 @@ void thermal_governor_update_tz(struct thermal_zone_device *tz,
 		tz->governor->update_tz(tz, reason);
 }
 
-static void thermal_zone_device_halt(struct thermal_zone_device *tz, bool shutdown)
+static void thermal_zone_device_halt(struct thermal_zone_device *tz,
+				     enum hw_protection_action action)
 {
 	/*
 	 * poweroff_delay_ms must be a carefully profiled positive value.
@@ -380,21 +381,23 @@ static void thermal_zone_device_halt(struct thermal_zone_device *tz, bool shutdo
 
 	dev_emerg(&tz->device, "%s: critical temperature reached\n", tz->type);
 
-	if (shutdown)
-		hw_protection_shutdown(msg, poweroff_delay_ms);
-	else
-		hw_protection_reboot(msg, poweroff_delay_ms);
+	__hw_protection_trigger(msg, poweroff_delay_ms, action);
 }
 
 void thermal_zone_device_critical(struct thermal_zone_device *tz)
 {
-	thermal_zone_device_halt(tz, true);
+	thermal_zone_device_halt(tz, HWPROT_ACT_DEFAULT);
 }
 EXPORT_SYMBOL(thermal_zone_device_critical);
 
+void thermal_zone_device_critical_shutdown(struct thermal_zone_device *tz)
+{
+	thermal_zone_device_halt(tz, HWPROT_ACT_SHUTDOWN);
+}
+
 void thermal_zone_device_critical_reboot(struct thermal_zone_device *tz)
 {
-	thermal_zone_device_halt(tz, false);
+	thermal_zone_device_halt(tz, HWPROT_ACT_REBOOT);
 }
 
 static void handle_critical_trips(struct thermal_zone_device *tz,
diff --git a/drivers/thermal/thermal_core.h b/drivers/thermal/thermal_core.h
index be271e7c8f4141146a03efecc82fc4036ec12df6..7d6637126007168ac05010af0f16a4c8012a0d77 100644
--- a/drivers/thermal/thermal_core.h
+++ b/drivers/thermal/thermal_core.h
@@ -262,6 +262,7 @@ int thermal_build_list_of_policies(char *buf);
 void __thermal_zone_device_update(struct thermal_zone_device *tz,
 				  enum thermal_notify_event event);
 void thermal_zone_device_critical_reboot(struct thermal_zone_device *tz);
+void thermal_zone_device_critical_shutdown(struct thermal_zone_device *tz);
 void thermal_governor_update_tz(struct thermal_zone_device *tz,
 				enum thermal_notify_event reason);
 
diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
index fab11b98ca4952d23d0232998433bd0650b53d24..c574e775d686599deddd08f932a5a6dd781d342e 100644
--- a/drivers/thermal/thermal_of.c
+++ b/drivers/thermal/thermal_of.c
@@ -396,9 +396,12 @@ static struct thermal_zone_device *thermal_of_zone_register(struct device_node *
 	of_ops.should_bind = thermal_of_should_bind;
 
 	ret = of_property_read_string(np, "critical-action", &action);
-	if (!ret)
-		if (!of_ops.critical && !strcasecmp(action, "reboot"))
+	if (!ret && !of_ops.critical) {
+		if (!strcasecmp(action, "reboot"))
 			of_ops.critical = thermal_zone_device_critical_reboot;
+		else if (!strcasecmp(action, "shutdown"))
+			of_ops.critical = thermal_zone_device_critical_shutdown;
+	}
 
 	tz = thermal_zone_device_register_with_trips(np->name, trips, ntrips,
 						     data, &of_ops, &tzp,