Message ID | 20220920092721.7686-1-git@vladimir.panteleev.md |
---|---|
State | New |
Headers | show |
Series | [v2] watchdog: sp5100_tco: Add "action" module parameter | expand |
On Tue, Sep 20, 2022 at 09:27:21AM +0000, Vladimir Panteleev wrote: > Allow configuring the "action" bit, as documented in [1]. > > Previously, the only action supported by this module was to reset the > system (0). It can now be configured to power off (1) instead. > > [1]: https://www.amd.com/system/files/TechDocs/44413.pdf The referenced AMD document 44413 is over 10 years old, and I'm concerned when I try to line that document up against the newer versions that are implemented in AMD's EPYC processors, that the bit being manipulated as SP5100_WDT_ACTION_RESET is effectively reserved in the newer references, for example: https://www.amd.com/system/files/TechDocs/55772-A1-PUB.zip Is Core::X86::Msr::CpuWdtCfg in the newer document is the same as WatchDogControl in the cited 44413.pdf? If so, then I would point out that bit 2 is now included in what is called, CpuWdtTimeBase where values 2-3H are reserved, meaning bit 2 effectively must be zero.
On 9/22/22 10:36, rwright@hpe.com wrote: > On Tue, Sep 20, 2022 at 09:27:21AM +0000, Vladimir Panteleev wrote: >> Allow configuring the "action" bit, as documented in [1]. >> >> Previously, the only action supported by this module was to reset the >> system (0). It can now be configured to power off (1) instead. >> >> [1]: https://www.amd.com/system/files/TechDocs/44413.pdf > > The referenced AMD document 44413 is over 10 years old, and I'm > concerned when I try to line that document up against the newer versions > that are implemented in AMD's EPYC processors, that the bit being > manipulated as SP5100_WDT_ACTION_RESET is effectively reserved in the > newer references, for example: > > https://www.amd.com/system/files/TechDocs/55772-A1-PUB.zip > > Is Core::X86::Msr::CpuWdtCfg in the newer document is the same as > WatchDogControl in the cited 44413.pdf? If so, then I would point out > that bit 2 is now included in what is called, CpuWdtTimeBase where > values 2-3H are reserved, meaning bit 2 effectively must be zero. > I think those may be different watchdogs. Chapter 9.2.6 ("Watchdog Timer (WDT) Registers") in 55772-A1-PUB still lists bit 2 of the control register as: WatchdogAction. Read-write. Reset: 0. 0=System reset. 1=System power off. This bit determines the action to be taken when the watchdog timer expires. The bit is only valid when the watchdog is enabled. Guenter
On Tue, Sep 20, 2022 at 09:27:21AM +0000, Vladimir Panteleev wrote: > Allow configuring the "action" bit, as documented in [1]. > > Previously, the only action supported by this module was to reset the > system (0). It can now be configured to power off (1) instead. > > [1]: https://www.amd.com/system/files/TechDocs/44413.pdf > > Signed-off-by: Vladimir Panteleev <git@vladimir.panteleev.md> > --- > > Changes since v1: > - Drop the rename of the SP5100_WDT_ACTION_RESET define > - Make the new parameter not visible in sysfs for consistency > > drivers/watchdog/sp5100_tco.c | 13 +++++++++++-- > 1 file changed, 11 insertions(+), 2 deletions(-) > > diff --git a/drivers/watchdog/sp5100_tco.c b/drivers/watchdog/sp5100_tco.c > index ae54dd33e233..fb426b7d81da 100644 > --- a/drivers/watchdog/sp5100_tco.c > +++ b/drivers/watchdog/sp5100_tco.c > @@ -65,6 +65,12 @@ static struct pci_dev *sp5100_tco_pci; > > /* module parameters */ > > +#define WATCHDOG_ACTION 0 > +static bool action = WATCHDOG_ACTION; > +module_param(action, bool, 0); > +MODULE_PARM_DESC(action, "Action taken when watchdog expires, 0 to reset, 1 to poweroff (default=" > + __MODULE_STRING(WATCHDOG_ACTION) ")"); > + > #define WATCHDOG_HEARTBEAT 60 /* 60 sec default heartbeat. */ > static int heartbeat = WATCHDOG_HEARTBEAT; /* in seconds */ > module_param(heartbeat, int, 0); > @@ -297,8 +303,11 @@ static int sp5100_tco_timer_init(struct sp5100_tco *tco) > if (val & SP5100_WDT_FIRED) > wdd->bootstatus = WDIOF_CARDRESET; > > - /* Set watchdog action to reset the system */ > - val &= ~SP5100_WDT_ACTION_RESET; > + /* Set watchdog action */ > + if (action) > + val |= SP5100_WDT_ACTION_RESET; > + else > + val &= ~SP5100_WDT_ACTION_RESET; > writel(val, SP5100_WDT_CONTROL(tco->tcobase)); > > /* Set a reasonable heartbeat before we stop the timer */ > -- > 2.37.3 Looking at other WDT, I see at least three that have an "action" parameter. These appear to have inconsistent semantics with each other and this patch. Is there something we could to create more uniformity with module level parameters like these? Examples: machzwd.c: MODULE_PARM_DESC(action, "after watchdog resets, generate: " "0 = RESET(*) 1 = SMI 2 = NMI 3 = SCI"); pseries-wdt.c static const unsigned long pseries_wdt_action[] = { [0] = PSERIES_WDTF_ACTION_HARD_POWEROFF, [1] = PSERIES_WDTF_ACTION_HARD_RESTART, [2] = PSERIES_WDTF_ACTION_DUMP_RESTART, sbsa_gwdt.c MODULE_PARM_DESC(action, "after watchdog gets WS0 interrupt, do: " "0 = skip(*) 1 = panic");
On Tue, Sep 20, 2022 at 09:27:21AM +0000, Vladimir Panteleev wrote: > Allow configuring the "action" bit, as documented in [1]. > > Previously, the only action supported by this module was to reset the > system (0). It can now be configured to power off (1) instead. > > [1]: https://www.amd.com/system/files/TechDocs/44413.pdf > > Signed-off-by: Vladimir Panteleev <git@vladimir.panteleev.md> Reviewed-by: Guenter Roeck <linux@roeck-us.net> > --- > > Changes since v1: > - Drop the rename of the SP5100_WDT_ACTION_RESET define > - Make the new parameter not visible in sysfs for consistency > > drivers/watchdog/sp5100_tco.c | 13 +++++++++++-- > 1 file changed, 11 insertions(+), 2 deletions(-) > > diff --git a/drivers/watchdog/sp5100_tco.c b/drivers/watchdog/sp5100_tco.c > index ae54dd33e233..fb426b7d81da 100644 > --- a/drivers/watchdog/sp5100_tco.c > +++ b/drivers/watchdog/sp5100_tco.c > @@ -65,6 +65,12 @@ static struct pci_dev *sp5100_tco_pci; > > /* module parameters */ > > +#define WATCHDOG_ACTION 0 > +static bool action = WATCHDOG_ACTION; > +module_param(action, bool, 0); > +MODULE_PARM_DESC(action, "Action taken when watchdog expires, 0 to reset, 1 to poweroff (default=" > + __MODULE_STRING(WATCHDOG_ACTION) ")"); > + > #define WATCHDOG_HEARTBEAT 60 /* 60 sec default heartbeat. */ > static int heartbeat = WATCHDOG_HEARTBEAT; /* in seconds */ > module_param(heartbeat, int, 0); > @@ -297,8 +303,11 @@ static int sp5100_tco_timer_init(struct sp5100_tco *tco) > if (val & SP5100_WDT_FIRED) > wdd->bootstatus = WDIOF_CARDRESET; > > - /* Set watchdog action to reset the system */ > - val &= ~SP5100_WDT_ACTION_RESET; > + /* Set watchdog action */ > + if (action) > + val |= SP5100_WDT_ACTION_RESET; > + else > + val &= ~SP5100_WDT_ACTION_RESET; > writel(val, SP5100_WDT_CONTROL(tco->tcobase)); > > /* Set a reasonable heartbeat before we stop the timer */ > -- > 2.37.3 >
diff --git a/drivers/watchdog/sp5100_tco.c b/drivers/watchdog/sp5100_tco.c index ae54dd33e233..fb426b7d81da 100644 --- a/drivers/watchdog/sp5100_tco.c +++ b/drivers/watchdog/sp5100_tco.c @@ -65,6 +65,12 @@ static struct pci_dev *sp5100_tco_pci; /* module parameters */ +#define WATCHDOG_ACTION 0 +static bool action = WATCHDOG_ACTION; +module_param(action, bool, 0); +MODULE_PARM_DESC(action, "Action taken when watchdog expires, 0 to reset, 1 to poweroff (default=" + __MODULE_STRING(WATCHDOG_ACTION) ")"); + #define WATCHDOG_HEARTBEAT 60 /* 60 sec default heartbeat. */ static int heartbeat = WATCHDOG_HEARTBEAT; /* in seconds */ module_param(heartbeat, int, 0); @@ -297,8 +303,11 @@ static int sp5100_tco_timer_init(struct sp5100_tco *tco) if (val & SP5100_WDT_FIRED) wdd->bootstatus = WDIOF_CARDRESET; - /* Set watchdog action to reset the system */ - val &= ~SP5100_WDT_ACTION_RESET; + /* Set watchdog action */ + if (action) + val |= SP5100_WDT_ACTION_RESET; + else + val &= ~SP5100_WDT_ACTION_RESET; writel(val, SP5100_WDT_CONTROL(tco->tcobase)); /* Set a reasonable heartbeat before we stop the timer */
Allow configuring the "action" bit, as documented in [1]. Previously, the only action supported by this module was to reset the system (0). It can now be configured to power off (1) instead. [1]: https://www.amd.com/system/files/TechDocs/44413.pdf Signed-off-by: Vladimir Panteleev <git@vladimir.panteleev.md> --- Changes since v1: - Drop the rename of the SP5100_WDT_ACTION_RESET define - Make the new parameter not visible in sysfs for consistency drivers/watchdog/sp5100_tco.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)