ksefltests: pidfd: Fix wait_states: Test terminated by timeout
authorLi Zhijian <lizhijian@fujitsu.com>
Thu, 1 Sep 2022 03:10:07 +0000 (03:10 +0000)
committerShuah Khan <skhan@linuxfoundation.org>
Sun, 30 Oct 2022 08:23:41 +0000 (02:23 -0600)
0Day/LKP observed that the kselftest blocks forever since one of the
pidfd_wait doesn't terminate in 1 of 30 runs. After digging into
the source, we found that it blocks at:
ASSERT_EQ(sys_waitid(P_PIDFD, pidfd, &info, WCONTINUED, NULL), 0);

wait_states has below testing flow:
  CHILD                 PARENT
  ---------------+--------------
1 STOP itself
2                   WAIT for CHILD STOPPED
3                   SIGNAL CHILD to CONT
4 CONT
5 STOP itself
5'                  WAIT for CHILD CONT
6                   WAIT for CHILD STOPPED

The problem is that the kernel cannot ensure the order of 5 and 5', once
5 goes first, the test will fail.

we can reproduce it by:
$ while true; do make run_tests -C pidfd; done

Introduce a blocking read in child process to make sure the parent can
check its WCONTINUED.

CC: Philip Li <philip.li@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
tools/testing/selftests/pidfd/pidfd_wait.c

index 070c1c876df15146d59ebab18353e62ad5c87362..c3e2a3041f55f396ef26f307fddb8cc7479c358b 100644 (file)
@@ -95,20 +95,28 @@ TEST(wait_states)
                .flags = CLONE_PIDFD | CLONE_PARENT_SETTID,
                .exit_signal = SIGCHLD,
        };
+       int pfd[2];
        pid_t pid;
        siginfo_t info = {
                .si_signo = 0,
        };
 
+       ASSERT_EQ(pipe(pfd), 0);
        pid = sys_clone3(&args);
        ASSERT_GE(pid, 0);
 
        if (pid == 0) {
+               char buf[2];
+
+               close(pfd[1]);
                kill(getpid(), SIGSTOP);
+               ASSERT_EQ(read(pfd[0], buf, 1), 1);
+               close(pfd[0]);
                kill(getpid(), SIGSTOP);
                exit(EXIT_SUCCESS);
        }
 
+       close(pfd[0]);
        ASSERT_EQ(sys_waitid(P_PIDFD, pidfd, &info, WSTOPPED, NULL), 0);
        ASSERT_EQ(info.si_signo, SIGCHLD);
        ASSERT_EQ(info.si_code, CLD_STOPPED);
@@ -117,6 +125,8 @@ TEST(wait_states)
        ASSERT_EQ(sys_pidfd_send_signal(pidfd, SIGCONT, NULL, 0), 0);
 
        ASSERT_EQ(sys_waitid(P_PIDFD, pidfd, &info, WCONTINUED, NULL), 0);
+       ASSERT_EQ(write(pfd[1], "C", 1), 1);
+       close(pfd[1]);
        ASSERT_EQ(info.si_signo, SIGCHLD);
        ASSERT_EQ(info.si_code, CLD_CONTINUED);
        ASSERT_EQ(info.si_pid, parent_tid);