diff mbox series

[for-next,1/3] selftests/watchdog: add count parameter for watchdog-test

Message ID 20241025013933.6516-1-lizhijian@fujitsu.com
State New
Headers show
Series [for-next,1/3] selftests/watchdog: add count parameter for watchdog-test | expand

Commit Message

Li Zhijian Oct. 25, 2024, 1:39 a.m. UTC
Currently, watchdog-test keep running until it gets a SIGINT. However,
when watchdog-test is executed from the kselftests framework, where it
launches test via timeout which will send SIGTERM in time up. This could
lead to
1. watchdog haven't stop, a watchdog reset is triggered to reboot the OS
   in silent.
2. kselftests gets an timeout exit code, and judge watchdog-test as
  'not ok'

This patch is prepare to fix above 2 issues

Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
Hey,
Cover letter is here.

It's notice that a OS reboot was triggerred after ran the watchdog-test
in kselftests framwork 'make run_tests', that's because watchdog-test
didn't stop feeding the watchdog after enable it.

In addition, current watchdog-test didn't adapt to the kselftests
framework which launchs the test with /usr/bin/timeout and no timeout
is expected.
---
 tools/testing/selftests/watchdog/watchdog-test.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

Comments

Shuah Khan Oct. 27, 2024, 12:28 a.m. UTC | #1
On 10/24/24 19:39, Li Zhijian wrote:
> Currently, watchdog-test keep running until it gets a SIGINT. However,
> when watchdog-test is executed from the kselftests framework, where it
> launches test via timeout which will send SIGTERM in time up. This could
> lead to
> 1. watchdog haven't stop, a watchdog reset is triggered to reboot the OS
>     in silent.
> 2. kselftests gets an timeout exit code, and judge watchdog-test as
>    'not ok'
> 
This test isn't really supposed to be run from kselftest framework.
This is the reason why it isn't included in the default run.

> This patch is prepare to fix above 2 issues

This series needs a separate cover letter explaining how this problem is
being fixed.

> 
> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
> ---
> Hey,
> Cover letter is here.
> 
> It's notice that a OS reboot was triggerred after ran the watchdog-test
> in kselftests framwork 'make run_tests', that's because watchdog-test
> didn't stop feeding the watchdog after enable it.
> 
> In addition, current watchdog-test didn't adapt to the kselftests
> framework which launchs the test with /usr/bin/timeout and no timeout
> is expected.
> ---
>   tools/testing/selftests/watchdog/watchdog-test.c | 13 +++++++++----
>   1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/testing/selftests/watchdog/watchdog-test.c b/tools/testing/selftests/watchdog/watchdog-test.c
> index bc71cbca0dde..2f8fd2670897 100644
> --- a/tools/testing/selftests/watchdog/watchdog-test.c
> +++ b/tools/testing/selftests/watchdog/watchdog-test.c
> @@ -27,7 +27,7 @@
>   
>   int fd;
>   const char v = 'V';
> -static const char sopts[] = "bdehp:st:Tn:NLf:i";
> +static const char sopts[] = "bdehp:st:Tn:NLf:c:i";
>   static const struct option lopts[] = {
>   	{"bootstatus",          no_argument, NULL, 'b'},
>   	{"disable",             no_argument, NULL, 'd'},
> @@ -42,6 +42,7 @@ static const struct option lopts[] = {
>   	{"gettimeleft",		no_argument, NULL, 'L'},
>   	{"file",          required_argument, NULL, 'f'},
>   	{"info",		no_argument, NULL, 'i'},
> +	{"count",         required_argument, NULL, 'c'},
>   	{NULL,                  no_argument, NULL, 0x0}
>   };
>   
> @@ -95,6 +96,7 @@ static void usage(char *progname)
>   	printf(" -n, --pretimeout=T\tSet the pretimeout to T seconds\n");
>   	printf(" -N, --getpretimeout\tGet the pretimeout\n");
>   	printf(" -L, --gettimeleft\tGet the time left until timer expires\n");
> +	printf(" -c, --count\tStop after feeding the watchdog count times\n");
>   	printf("\n");
>   	printf("Parameters are parsed left-to-right in real-time.\n");
>   	printf("Example: %s -d -t 10 -p 5 -e\n", progname);
> @@ -174,7 +176,7 @@ int main(int argc, char *argv[])
>   	unsigned int ping_rate = DEFAULT_PING_RATE;
>   	int ret;
>   	int c;
> -	int oneshot = 0;
> +	int oneshot = 0, stop = 1, count = 0;
>   	char *file = "/dev/watchdog";
>   	struct watchdog_info info;
>   	int temperature;
> @@ -307,6 +309,9 @@ int main(int argc, char *argv[])
>   			else
>   				printf("WDIOC_GETTIMELEFT error '%s'\n", strerror(errno));
>   			break;
> +		case 'c':
> +			stop = 0;
> +			count = strtoul(optarg, NULL, 0);
>   		case 'f':
>   			/* Handled above */
>   			break;
> @@ -336,8 +341,8 @@ int main(int argc, char *argv[])
>   
>   	signal(SIGINT, term);
>   
> -	while (1) {
> -		keep_alive();
> +	while (stop || count--) {
> +		exit_code = keep_alive();
>   		sleep(ping_rate);
>   	}
>   end:
Shuah Khan Oct. 29, 2024, 2:24 a.m. UTC | #2
On 10/28/24 00:44, Shuah Khan wrote:
> On 10/28/24 00:32, Zhijian Li (Fujitsu) wrote:
>>
>>
>> On 28/10/2024 14:25, Shuah Khan wrote:
>>> On 10/28/24 00:06, Zhijian Li (Fujitsu) wrote:
>>>> linux/tools/testing/selftests/watchdog# make run_tests
>>>> TAP version 13
>>>> 1..1
>>>> # timeout set to 45
>>>> # selftests: watchdog: watchdog-test
>>>> # Watchdog Ticking Away!
>>>> # .............................................#
>>>> not ok 1 selftests: watchdog: watchdog-test # TIMEOUT 45 seconds
>>>>
>>>>
>>>> And i got warning in dmesg
>>>>
>>>> [ 1953.229511] watchdog: watchdog0: watchdog did not stop!
>>>>
>>>>
>>>>
>>>>
>>>
>>> Run "make run_tests" under strace and send me the output.
>>
>>
>> Could you share the exact command, how to 'Run "make run_tests" under strace'
>>
> 
> strace make run_tests > strace.out 2>&1
> 
> Send me strace.out

Thank you for the strace output. kselftest uses a timeout to terminate
hung tests - that timeout is 45 seconds. When you run "make run_tests"
under watchdog directory, you are running into this.

Yes your fix to add SIGTERM handling makes sense. Please also handle
other signals - SIGKILL, SIGQUIT.

Thanks for the find.

thanks,
-- Shuah
Li Zhijian Oct. 29, 2024, 3:03 a.m. UTC | #3
On 29/10/2024 10:24, Shuah Khan wrote:
> On 10/28/24 00:44, Shuah Khan wrote:
>> On 10/28/24 00:32, Zhijian Li (Fujitsu) wrote:
>>>
>>>
>>> On 28/10/2024 14:25, Shuah Khan wrote:
>>>> On 10/28/24 00:06, Zhijian Li (Fujitsu) wrote:
>>>>> linux/tools/testing/selftests/watchdog# make run_tests
>>>>> TAP version 13
>>>>> 1..1
>>>>> # timeout set to 45
>>>>> # selftests: watchdog: watchdog-test
>>>>> # Watchdog Ticking Away!
>>>>> # .............................................#
>>>>> not ok 1 selftests: watchdog: watchdog-test # TIMEOUT 45 seconds
>>>>>
>>>>>
>>>>> And i got warning in dmesg
>>>>>
>>>>> [ 1953.229511] watchdog: watchdog0: watchdog did not stop!
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>> Run "make run_tests" under strace and send me the output.
>>>
>>>
>>> Could you share the exact command, how to 'Run "make run_tests" under strace'
>>>
>>
>> strace make run_tests > strace.out 2>&1
>>
>> Send me strace.out
> 
> Thank you for the strace output. kselftest uses a timeout to terminate
> hung tests - that timeout is 45 seconds. When you run "make run_tests"
> under watchdog directory, you are running into this.
> 
> Yes your fix to add SIGTERM handling makes sense. Please also handle
> other signals - SIGKILL, SIGQUIT.


Understood, I will update the patch soon.


Thanks
Zhijian

> 
> Thanks for the find.
> 
> thanks,
> -- Shuah
diff mbox series

Patch

diff --git a/tools/testing/selftests/watchdog/watchdog-test.c b/tools/testing/selftests/watchdog/watchdog-test.c
index bc71cbca0dde..2f8fd2670897 100644
--- a/tools/testing/selftests/watchdog/watchdog-test.c
+++ b/tools/testing/selftests/watchdog/watchdog-test.c
@@ -27,7 +27,7 @@ 
 
 int fd;
 const char v = 'V';
-static const char sopts[] = "bdehp:st:Tn:NLf:i";
+static const char sopts[] = "bdehp:st:Tn:NLf:c:i";
 static const struct option lopts[] = {
 	{"bootstatus",          no_argument, NULL, 'b'},
 	{"disable",             no_argument, NULL, 'd'},
@@ -42,6 +42,7 @@  static const struct option lopts[] = {
 	{"gettimeleft",		no_argument, NULL, 'L'},
 	{"file",          required_argument, NULL, 'f'},
 	{"info",		no_argument, NULL, 'i'},
+	{"count",         required_argument, NULL, 'c'},
 	{NULL,                  no_argument, NULL, 0x0}
 };
 
@@ -95,6 +96,7 @@  static void usage(char *progname)
 	printf(" -n, --pretimeout=T\tSet the pretimeout to T seconds\n");
 	printf(" -N, --getpretimeout\tGet the pretimeout\n");
 	printf(" -L, --gettimeleft\tGet the time left until timer expires\n");
+	printf(" -c, --count\tStop after feeding the watchdog count times\n");
 	printf("\n");
 	printf("Parameters are parsed left-to-right in real-time.\n");
 	printf("Example: %s -d -t 10 -p 5 -e\n", progname);
@@ -174,7 +176,7 @@  int main(int argc, char *argv[])
 	unsigned int ping_rate = DEFAULT_PING_RATE;
 	int ret;
 	int c;
-	int oneshot = 0;
+	int oneshot = 0, stop = 1, count = 0;
 	char *file = "/dev/watchdog";
 	struct watchdog_info info;
 	int temperature;
@@ -307,6 +309,9 @@  int main(int argc, char *argv[])
 			else
 				printf("WDIOC_GETTIMELEFT error '%s'\n", strerror(errno));
 			break;
+		case 'c':
+			stop = 0;
+			count = strtoul(optarg, NULL, 0);
 		case 'f':
 			/* Handled above */
 			break;
@@ -336,8 +341,8 @@  int main(int argc, char *argv[])
 
 	signal(SIGINT, term);
 
-	while (1) {
-		keep_alive();
+	while (stop || count--) {
+		exit_code = keep_alive();
 		sleep(ping_rate);
 	}
 end: