From patchwork Mon Feb 17 20:39:42 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ahmad Fatoum
X-Patchwork-Id: 866033
From: Ahmad Fatoum
Date: Mon, 17 Feb 2025 21:39:42 +0100
Subject: [PATCH v3 02/12] reboot: reboot, not shutdown, on hw_protection_reboot timeout
X-Mailing-List: linux-pm@vger.kernel.org
Message-Id: <20250217-hw_protection-reboot-v3-2-e1c09b090c0c@pengutronix.de>
References: <20250217-hw_protection-reboot-v3-0-e1c09b090c0c@pengutronix.de>
In-Reply-To: <20250217-hw_protection-reboot-v3-0-e1c09b090c0c@pengutronix.de>
To: Andrew Morton, Daniel Lezcano, Fabio Estevam, "Rafael J. Wysocki",
 Zhang Rui, Lukasz Luba, Jonathan Corbet, Serge Hallyn, Liam Girdwood,
 Mark Brown, Matti Vaittinen, Benson Leung, Tzung-Bi Shih, Guenter Roeck,
 Rob Herring, Krzysztof Kozlowski, Conor Dooley
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
 linux-doc@vger.kernel.org, linux-security-module@vger.kernel.org,
 chrome-platform@lists.linux.dev, devicetree@vger.kernel.org,
 kernel@pengutronix.de, Ahmad Fatoum
X-Mailer: b4 0.14.2

hw_protection_shutdown() will kick off an orderly shutdown and if that
takes longer than a configurable amount of time, an emergency shutdown
will occur.

Recently, hw_protection_reboot() was added for those systems that don't
implement a proper shutdown and are better served by rebooting and
having the boot firmware worry about doing something about the critical
condition.

On timeout of the orderly reboot of hw_protection_reboot(), the system
would go into shutdown, instead of reboot. This is not a good idea, as
going into shutdown was explicitly not asked for.

Fix this by always doing an emergency reboot if hw_protection_reboot() is
called and the orderly reboot takes too long.

Fixes: 79fa723ba84c ("reboot: Introduce thermal_zone_device_critical_reboot()")
Reviewed-by: Tzung-Bi Shih
Reviewed-by: Matti Vaittinen
Signed-off-by: Ahmad Fatoum
---
 kernel/reboot.c | 70 ++++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 49 insertions(+), 21 deletions(-)

diff --git a/kernel/reboot.c b/kernel/reboot.c
index b20b53f08648d88bac533ab18ea66396b44a3045..f348f1ba9e22675ac1183149ba19f39be12edacd 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -932,48 +932,76 @@ void orderly_reboot(void)
 }
 EXPORT_SYMBOL_GPL(orderly_reboot);
 
+static const char *hw_protection_action_str(enum hw_protection_action action)
+{
+	switch (action) {
+	case HWPROT_ACT_SHUTDOWN:
+		return "shutdown";
+	case HWPROT_ACT_REBOOT:
+		return "reboot";
+	default:
+		return "undefined";
+	}
+}
+
+static enum hw_protection_action hw_failure_emergency_action;
+
 /**
- * hw_failure_emergency_poweroff_func - emergency poweroff work after a known delay
- * @work: work_struct associated with the emergency poweroff function
+ * hw_failure_emergency_action_func - emergency action work after a known delay
+ * @work: work_struct associated with the emergency action function
  *
  * This function is called in very critical situations to force
- * a kernel poweroff after a configurable timeout value.
+ * a kernel poweroff or reboot after a configurable timeout value.
  */
-static void hw_failure_emergency_poweroff_func(struct work_struct *work)
+static void hw_failure_emergency_action_func(struct work_struct *work)
 {
+	const char *action_str = hw_protection_action_str(hw_failure_emergency_action);
+
+	pr_emerg("Hardware protection timed-out. Trying forced %s\n",
+		 action_str);
+
 	/*
-	 * We have reached here after the emergency shutdown waiting period has
-	 * expired. This means orderly_poweroff has not been able to shut off
-	 * the system for some reason.
+	 * We have reached here after the emergency action waiting period has
+	 * expired. This means orderly_poweroff/reboot has not been able to
+	 * shut off the system for some reason.
 	 *
-	 * Try to shut down the system immediately using kernel_power_off
-	 * if populated
+	 * Try to shut off the system immediately if possible
 	 */
-	pr_emerg("Hardware protection timed-out. Trying forced poweroff\n");
-	kernel_power_off();
+
+	if (hw_failure_emergency_action == HWPROT_ACT_REBOOT)
+		kernel_restart(NULL);
+	else
+		kernel_power_off();
 
 	/*
 	 * Worst of the worst case trigger emergency restart
 	 */
-	pr_emerg("Hardware protection shutdown failed. Trying emergency restart\n");
+	pr_emerg("Hardware protection %s failed. Trying emergency restart\n",
+		 action_str);
 	emergency_restart();
 }
 
-static DECLARE_DELAYED_WORK(hw_failure_emergency_poweroff_work,
-			    hw_failure_emergency_poweroff_func);
+static DECLARE_DELAYED_WORK(hw_failure_emergency_action_work,
+			    hw_failure_emergency_action_func);
 
 /**
- * hw_failure_emergency_poweroff - Trigger an emergency system poweroff
+ * hw_failure_emergency_schedule - Schedule an emergency system shutdown or reboot
+ *
+ * @action:		The hardware protection action to be taken
+ * @action_delay_ms:	Time in milliseconds to elapse before triggering action
  *
  * This may be called from any critical situation to trigger a system shutdown
- * after a given period of time. If time is negative this is not scheduled.
+ * or reboot after a given period of time.
+ * If time is negative this is not scheduled.
  */
-static void hw_failure_emergency_poweroff(int poweroff_delay_ms)
+static void hw_failure_emergency_schedule(enum hw_protection_action action,
+					  int action_delay_ms)
 {
-	if (poweroff_delay_ms <= 0)
+	if (action_delay_ms <= 0)
 		return;
-	schedule_delayed_work(&hw_failure_emergency_poweroff_work,
-			      msecs_to_jiffies(poweroff_delay_ms));
+	hw_failure_emergency_action = action;
+	schedule_delayed_work(&hw_failure_emergency_action_work,
+			      msecs_to_jiffies(action_delay_ms));
 }
 
 /**
@@ -1006,7 +1034,7 @@ void __hw_protection_shutdown(const char *reason, int ms_until_forced,
 	 * Queue a backup emergency shutdown in the event of
 	 * orderly_poweroff failure
 	 */
-	hw_failure_emergency_poweroff(ms_until_forced);
+	hw_failure_emergency_schedule(action, ms_until_forced);
 	if (action == HWPROT_ACT_REBOOT)
 		orderly_reboot();
 	else

From patchwork Mon Feb 17 20:39:44 2025
X-Patchwork-Submitter: Ahmad Fatoum
X-Patchwork-Id: 866036
From: Ahmad Fatoum
Date: Mon, 17 Feb 2025 21:39:44 +0100
Subject: [PATCH v3 04/12] reboot: describe do_kernel_restart's cmd argument in kernel-doc
Message-Id: <20250217-hw_protection-reboot-v3-4-e1c09b090c0c@pengutronix.de>
References: <20250217-hw_protection-reboot-v3-0-e1c09b090c0c@pengutronix.de>
In-Reply-To: <20250217-hw_protection-reboot-v3-0-e1c09b090c0c@pengutronix.de>

A W=1 build rightfully complains about the function's kernel-doc
being incomplete.

Describe its single parameter to fix this.

Reviewed-by: Tzung-Bi Shih
Signed-off-by: Ahmad Fatoum
---
 kernel/reboot.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/reboot.c b/kernel/reboot.c
index f348f1ba9e22675ac1183149ba19f39be12edacd..6185cfe5d4ee910daf057884a7ff8dcf1e80df28 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -229,6 +229,9 @@ EXPORT_SYMBOL(unregister_restart_handler);
 /**
  * do_kernel_restart - Execute kernel restart handler call chain
  *
+ * @cmd: pointer to buffer containing command to execute for restart
+ *		or %NULL
+ *
  * Calls functions registered with register_restart_handler.
  *
  * Expected to be called from machine_restart as last step of the restart

From patchwork Mon Feb 17 20:39:46 2025
X-Patchwork-Submitter: Ahmad Fatoum
X-Patchwork-Id: 866037
From: Ahmad Fatoum
Date: Mon, 17 Feb 2025 21:39:46 +0100
Subject: [PATCH v3 06/12] reboot: indicate whether it is a HARDWARE PROTECTION reboot or shutdown
Message-Id: <20250217-hw_protection-reboot-v3-6-e1c09b090c0c@pengutronix.de>
References: <20250217-hw_protection-reboot-v3-0-e1c09b090c0c@pengutronix.de>
In-Reply-To: <20250217-hw_protection-reboot-v3-0-e1c09b090c0c@pengutronix.de>

It currently depends on the caller, whether we attempt a hardware
protection shutdown (poweroff) or a reboot.
A follow-up commit will make this partially user-configurable, so it's
a good idea to have the emergency message clearly state whether the
kernel is going for a reboot or a shutdown.

Reviewed-by: Tzung-Bi Shih
Signed-off-by: Ahmad Fatoum
---
 kernel/reboot.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/reboot.c b/kernel/reboot.c
index c1f11d5e085e4d2fffc841a624c8b650aba276b8..faf1ff422634d19ef96c59b74dd4bf94d96af592 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -1027,7 +1027,8 @@ void __hw_protection_trigger(const char *reason, int ms_until_forced,
 {
 	static atomic_t allow_proceed = ATOMIC_INIT(1);
 
-	pr_emerg("HARDWARE PROTECTION shutdown (%s)\n", reason);
+	pr_emerg("HARDWARE PROTECTION %s (%s)\n",
+		 hw_protection_action_str(action), reason);
 
 	/* Shutdown should be initiated only once. */
 	if (!atomic_dec_and_test(&allow_proceed))

From patchwork Mon Feb 17 20:39:47 2025
X-Patchwork-Submitter: Ahmad Fatoum
X-Patchwork-Id: 866035
From: Ahmad Fatoum
Date: Mon, 17 Feb 2025 21:39:47 +0100
Subject: [PATCH v3 07/12] reboot: add support for configuring emergency hardware protection action
Message-Id: <20250217-hw_protection-reboot-v3-7-e1c09b090c0c@pengutronix.de>
References: <20250217-hw_protection-reboot-v3-0-e1c09b090c0c@pengutronix.de>
In-Reply-To: <20250217-hw_protection-reboot-v3-0-e1c09b090c0c@pengutronix.de>
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
 linux-doc@vger.kernel.org, linux-security-module@vger.kernel.org,
 chrome-platform@lists.linux.dev, devicetree@vger.kernel.org,
 kernel@pengutronix.de, Ahmad Fatoum, Matteo Croce

We currently leave the decision of whether to shutdown or reboot to
protect hardware in an emergency situation to the individual drivers.

This works out in some cases, where the driver detecting the critical
failure has inside knowledge: It binds to the system management controller
for example or is guided by hardware description that defines what to do.

In the general case, however, the driver detecting the issue can't know
what the appropriate course of action is and shouldn't be dictating the
policy of dealing with it.

Therefore, add a global hw_protection toggle that allows the user to
specify whether shutdown or reboot should be the default action when the
driver doesn't set policy.

This introduces no functional change yet as hw_protection_trigger() has no
callers, but these will be added in subsequent commits.
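
To illustrate the intended resolution order, here is a minimal userspace
sketch, not the kernel code itself. It assumes plain strcmp() in place of
sysfs_streq() (so, unlike the kernel helper, it does not tolerate a
trailing newline), and the names configured_action, parse_action() and
resolve_action() are hypothetical stand-ins for the toggle, the parser and
the default lookup described above:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Same enumerators as in the patch; this is a userspace model only. */
enum hw_protection_action {
	HWPROT_ACT_DEFAULT,
	HWPROT_ACT_SHUTDOWN,
	HWPROT_ACT_REBOOT,
};

/* Hypothetical stand-in for the global toggle; shutdown is the default. */
static enum hw_protection_action configured_action = HWPROT_ACT_SHUTDOWN;

/*
 * Accept only "shutdown" or "reboot". On any other input, *action is left
 * untouched and false is returned. Assumption: strcmp() replaces the
 * kernel's sysfs_streq(), so a trailing newline is not accepted here.
 */
static bool parse_action(const char *str, enum hw_protection_action *action)
{
	if (strcmp(str, "shutdown") == 0)
		*action = HWPROT_ACT_SHUTDOWN;
	else if (strcmp(str, "reboot") == 0)
		*action = HWPROT_ACT_REBOOT;
	else
		return false;

	return true;
}

/*
 * Map a driver's request to the action actually taken: only
 * HWPROT_ACT_DEFAULT defers to the user-configured toggle.
 */
static enum hw_protection_action
resolve_action(enum hw_protection_action requested)
{
	/* The toggle itself must always hold a concrete action. */
	assert(configured_action != HWPROT_ACT_DEFAULT);

	if (requested == HWPROT_ACT_DEFAULT)
		return configured_action;

	return requested;
}
```

In this model a driver that passes HWPROT_ACT_DEFAULT follows whatever the
user configured, an explicit HWPROT_ACT_REBOOT or HWPROT_ACT_SHUTDOWN is
never overridden, and a rejected string leaves the previous setting in
place.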

Reviewed-by: Tzung-Bi Shih
Signed-off-by: Ahmad Fatoum
---
 Documentation/ABI/testing/sysfs-kernel-reboot   |  8 +++++
 Documentation/admin-guide/kernel-parameters.txt |  6 ++++
 include/linux/reboot.h                          | 22 +++++++++++-
 include/uapi/linux/capability.h                 |  1 +
 kernel/reboot.c                                 | 46 +++++++++++++++++++++++++
 5 files changed, 82 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-kernel-reboot b/Documentation/ABI/testing/sysfs-kernel-reboot
index 837330fb251134ffdf29cd68f0b2a845b088e5a0..e117aba46be0e8d3cdff3abfb678f8847a726122 100644
--- a/Documentation/ABI/testing/sysfs-kernel-reboot
+++ b/Documentation/ABI/testing/sysfs-kernel-reboot
@@ -30,3 +30,11 @@ KernelVersion:	5.11
 Contact:	Matteo Croce
 Description:	Don't wait for any other CPUs on reboot and
 		avoid anything that could hang.
+
+What:		/sys/kernel/reboot/hw_protection
+Date:		April 2025
+KernelVersion:	6.15
+Contact:	Ahmad Fatoum
+Description:	Hardware protection action taken on critical events like
+		overtemperature or imminent voltage loss.
+		Valid values are: reboot shutdown
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index fb8752b42ec8582b8750d7e014c4d76166fa2fc1..b2f04967876f6bea0c63e507f65b97f010845585 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1933,6 +1933,12 @@
 			which allow the hypervisor to 'idle' the guest
 			on lock contention.
 
+	hw_protection=	[HW]
+			Format: reboot | shutdown
+
+			Hardware protection action taken on critical events like
+			overtemperature or imminent voltage loss.
+
 	i2c_bus=	[HW]	Override the default board specific I2C bus speed
 			or register an additional I2C bus that is not
 			registered from board initialization code.
diff --git a/include/linux/reboot.h b/include/linux/reboot.h
index 53c64e31b3cfdcb6b6dfe4def45fbb40c29f5144..79e02876f2ba2b5508f6f26567cbcd5cbe97a277 100644
--- a/include/linux/reboot.h
+++ b/include/linux/reboot.h
@@ -181,16 +181,36 @@ extern void orderly_reboot(void);
 /**
  * enum hw_protection_action - Hardware protection action
  *
+ * @HWPROT_ACT_DEFAULT:
+ *	The default action should be taken. This is HWPROT_ACT_SHUTDOWN
+ *	by default, but can be overridden.
  * @HWPROT_ACT_SHUTDOWN:
  *	The system should be shut down (powered off) for HW protection.
  * @HWPROT_ACT_REBOOT:
  *	The system should be rebooted for HW protection.
  */
-enum hw_protection_action { HWPROT_ACT_SHUTDOWN, HWPROT_ACT_REBOOT };
+enum hw_protection_action { HWPROT_ACT_DEFAULT, HWPROT_ACT_SHUTDOWN, HWPROT_ACT_REBOOT };
 
 void __hw_protection_trigger(const char *reason, int ms_until_forced,
 			     enum hw_protection_action action);
 
+/**
+ * hw_protection_trigger - Trigger default emergency system hardware protection action
+ *
+ * @reason:		Reason of emergency shutdown or reboot to be printed.
+ * @ms_until_forced:	Time to wait for orderly shutdown or reboot before
+ *			triggering it. Negative value disables the forced
+ *			shutdown or reboot.
+ *
+ * Initiate an emergency system shutdown or reboot in order to protect
+ * hardware from further damage. The exact action taken is controllable at
+ * runtime and defaults to shutdown.
+ */
+static inline void hw_protection_trigger(const char *reason, int ms_until_forced)
+{
+	__hw_protection_trigger(reason, ms_until_forced, HWPROT_ACT_DEFAULT);
+}
+
 static inline void hw_protection_reboot(const char *reason, int ms_until_forced)
 {
 	__hw_protection_trigger(reason, ms_until_forced, HWPROT_ACT_REBOOT);
diff --git a/include/uapi/linux/capability.h b/include/uapi/linux/capability.h
index 5bb9060986974726025eaabee24a0b720ff94657..2e21b5594f81313e8e17aeeb98a09f098355515f 100644
--- a/include/uapi/linux/capability.h
+++ b/include/uapi/linux/capability.h
@@ -275,6 +275,7 @@ struct vfs_ns_cap_data {
 /* Allow setting encryption key on loopback filesystem */
 /* Allow setting zone reclaim policy */
 /* Allow everything under CAP_BPF and CAP_PERFMON for backward compatibility */
+/* Allow setting hardware protection emergency action */
 
 #define CAP_SYS_ADMIN        21
 
diff --git a/kernel/reboot.c b/kernel/reboot.c
index faf1ff422634d19ef96c59b74dd4bf94d96af592..5299790a2832d07a55c9d38502763b58d53f927b 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -36,6 +36,8 @@ enum reboot_mode reboot_mode DEFAULT_REBOOT_MODE;
 EXPORT_SYMBOL_GPL(reboot_mode);
 enum reboot_mode panic_reboot_mode = REBOOT_UNDEFINED;
 
+static enum hw_protection_action hw_protection_action = HWPROT_ACT_SHUTDOWN;
+
 /*
  * This variable is used privately to keep track of whether or not
  * reboot_type is still set to its default value (i.e., reboot= hasn't
@@ -1027,6 +1029,9 @@ void __hw_protection_trigger(const char *reason, int ms_until_forced,
 {
 	static atomic_t allow_proceed = ATOMIC_INIT(1);
 
+	if (action == HWPROT_ACT_DEFAULT)
+		action = hw_protection_action;
+
 	pr_emerg("HARDWARE PROTECTION %s (%s)\n",
 		 hw_protection_action_str(action), reason);
 
@@ -1046,6 +1051,46 @@ void __hw_protection_trigger(const char *reason, int ms_until_forced,
 }
 EXPORT_SYMBOL_GPL(__hw_protection_trigger);
 
+static bool hw_protection_action_parse(const char *str,
+				       enum hw_protection_action *action)
+{
+	if (sysfs_streq(str, "shutdown"))
+		*action = HWPROT_ACT_SHUTDOWN;
+	else if (sysfs_streq(str, "reboot"))
+		*action = HWPROT_ACT_REBOOT;
+	else
+		return false;
+
+	return true;
+}
+
+static int __init hw_protection_setup(char *str)
+{
+	hw_protection_action_parse(str, &hw_protection_action);
+	return 1;
+}
+__setup("hw_protection=", hw_protection_setup);
+
+static ssize_t hw_protection_show(struct kobject *kobj,
+				  struct kobj_attribute *attr, char *buf)
+{
+	return sysfs_emit(buf, "%s\n",
+			  hw_protection_action_str(hw_protection_action));
+}
+static ssize_t hw_protection_store(struct kobject *kobj,
+				   struct kobj_attribute *attr, const char *buf,
+				   size_t count)
+{
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (!hw_protection_action_parse(buf, &hw_protection_action))
+		return -EINVAL;
+
+	return count;
+}
+static struct kobj_attribute hw_protection_attr = __ATTR_RW(hw_protection);
+
 static int __init reboot_setup(char *str)
 {
 	for (;;) {
@@ -1305,6 +1350,7 @@ static struct kobj_attribute reboot_cpu_attr = __ATTR_RW(cpu);
 #endif
 
 static struct attribute *reboot_attrs[] = {
+	&hw_protection_attr.attr,
 	&reboot_mode_attr.attr,
 #ifdef CONFIG_X86
 	&reboot_force_attr.attr,

From patchwork Mon Feb 17 20:39:50 2025
X-Patchwork-Submitter: Ahmad Fatoum
X-Patchwork-Id: 866031
From: Ahmad Fatoum
Date: Mon, 17 Feb 2025 21:39:50 +0100
Subject: [PATCH v3 10/12] dt-bindings: thermal: give OS some leeway in absence of critical-action
Message-Id: <20250217-hw_protection-reboot-v3-10-e1c09b090c0c@pengutronix.de>
References: <20250217-hw_protection-reboot-v3-0-e1c09b090c0c@pengutronix.de>
In-Reply-To: <20250217-hw_protection-reboot-v3-0-e1c09b090c0c@pengutronix.de>

An operating system may allow its user to configure the action to be
undertaken on critical overtemperature events.

However, the bindings currently mandate an absence of the critical-action
property to be equal to critical-action = "shutdown", which would mean any
differing user configuration would violate the bindings.

Resolve this by documenting the absence of the property to mean that the
OS gets to decide.

Acked-by: Rob Herring (Arm)
Signed-off-by: Ahmad Fatoum
---
 Documentation/devicetree/bindings/thermal/thermal-zones.yaml | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
index 0f435be1dbd8cfb4502be9d198ed6d51058f453b..0de0a9757ccc201ebbb0c8c8efb9f8da662f8e9c 100644
--- a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
+++ b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
@@ -82,9 +82,8 @@ patternProperties:
         $ref: /schemas/types.yaml#/definitions/string
         description: |
           The action the OS should perform after the critical temperature is reached.
-          By default the system will shutdown as a safe action to prevent damage
-          to the hardware, if the property is not set.
-          The shutdown action should be always the default and preferred one.
+          If the property is not set, it is up to the system to select the correct
+          action. The recommended and preferred default is shutdown.
           Choose 'reboot' with care, as the hardware may be in thermal stress,
           thus leading to infinite reboots that may cause damage to the hardware.
           Make sure the firmware/bootloader will act as the last resort and take