Message ID: 20241024162620.349934-1-jwyatt@redhat.com
State: New
Series: pi_stress: Add memory barrier to resolve crash
On Thu, 2024-10-24 at 12:26 -0400, John B. Wyatt IV wrote:
> @@ -952,12 +953,31 @@ void *high_priority(void *arg)
>  			("high_priority[%d]: pthread_barrier_wait(finish): %x", p->id, status);
>  		return NULL;
>  	}
> +
> +
> +	/**
> +	 * The pthread_barrier_wait should guarantee that only one
> +	 * thread at a time interacts with the variables below that
> +	 * if block.

How is that guaranteed?

> +	 *
> +	 * GCC -O2 rearranges the two increments above the wait
> +	 * function calls causing a race issue if you run this
> +	 * near full cores with one core (2 threads) free for
> +	 * housekeeping. This causes a crash at around 2 hour of
> +	 * running. You can prove this by commenting out the barrier
> +	 * and compiling with `-O0`. The crash does not show with
> +	 * -O0.
> +	 *
> +	 * Add a memory barrier to force GCC to increment the variables
> +	 * below the pthread calls. This funcion depends on C11.
> +	 **/
> +	atomic_thread_fence(memory_order_seq_cst);

That's kind of nuts that pthread_barrier_wait() doesn't even act as a
compile barrier (which a simple external function call should do), much
less a proper memory barrier. In fact I'd go beyond that to call it a
bug, just as if there were a mutex implementation that required users
to do this. And POSIX agrees:

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_12

Is it possible that something else is going on here, and it's just
being hidden by changing the timing? What exactly is the race you're
seeing? It looks like there should be only one high priority thread per
group, so is it racing with a reader, or watchdog_clear(), or...?

How does the crash happen? What is the helgrind output?

-Crystal
Hi John,

On 2024-10-24, "John B. Wyatt IV" <jwyatt@redhat.com> wrote:
> +	/**
> +	 * The pthread_barrier_wait should guarantee that only one
> +	 * thread at a time interacts with the variables below that
> +	 * if block.

The pthread_barrier_wait() does exactly the opposite. It guarantees
that multiple threads continue at the same time. Taking a look at some
users of @finish_barrier, we see this means that low_priority() will
read p->total while high_priority() is performing the non-atomic
operation p->total++.

> +	 *
> +	 * GCC -O2 rearranges the two increments above the wait
> +	 * function calls causing a race issue if you run this
> +	 * near full cores with one core (2 threads) free for
> +	 * housekeeping. This causes a crash at around 2 hour of
> +	 * running. You can prove this by commenting out the barrier
> +	 * and compiling with `-O0`. The crash does not show with
> +	 * -O0.

Turning off optimization is not a proof. Show us the assembly code.

> +	 *
> +	 * Add a memory barrier to force GCC to increment the variables
> +	 * below the pthread calls. This funcion depends on C11.
> +	 **/

What you are talking about are compiler barriers, which are for forcing
the compiler to obey instruction ordering. Memory barriers are for
guaranteeing memory ordering for CPUs at _runtime_.

> +	atomic_thread_fence(memory_order_seq_cst);

A single memory barrier makes no sense. Memory barriers must be paired
because you are ordering memory access between multiple CPUs.

> +
> 	/* update the group stats */
> 	p->total++;

If you want multiple tasks to be able to modify and read these
variables, then either use atomic operations or use locking. That is
what such mechanisms are for.

John Ogness
On Thu, Oct 24, 2024 at 12:26:14PM -0400, John B. Wyatt IV wrote:
> This patch is also an error report seen on RHEL 9 and Fedora 40.
> pi_stress crashes if you run this near full cores with one core
> (2 threads) free for housekeeping. The crash usually happens at
> around 2 hours of pi_stress running. This issue was reproduced
> on a RHEL server with 2 sockets of 10 cores/20 threads (total 40
> threads) and a Fedora server with 2 sockets of 56 cores/112
> threads (total 226 threads).
>
> The pthread_barrier_wait should guarantee that only one
> thread at a time interacts with the variables below that
> if block.
>
> GCC -O2 optimization rearranges the two increments above the wait
> function calls. This causes a race issue found by Helgrind.
> You can prove this by commenting out the memory barrier
> and compiling with `-O0`. The crash does not happen with -O0.
> Thank you to Valentin Schneider <vschneid@redhat.com> for
> suggesting about -O2.
>
> Add a memory barrier to force GCC to increment the variables
> below the pthread calls. The memory barrier prevented any crashes
> for several 24 hours tests. This function depends on C11.
>
> Reported-by: Valgrind's Helgrind tool
>
> Signed-off-by: "John B. Wyatt IV" <jwyatt@redhat.com>
> Signed-off-by: "John B. Wyatt IV" <sageofredondo@gmail.com>
> ---
>  src/pi_tests/pi_stress.c | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/src/pi_tests/pi_stress.c b/src/pi_tests/pi_stress.c
> index 371e906..37023a1 100644
> --- a/src/pi_tests/pi_stress.c
> +++ b/src/pi_tests/pi_stress.c
> @@ -44,6 +44,7 @@
>  #include <stdint.h>
>  #include <inttypes.h>
>  #include <limits.h>
> +#include <stdatomic.h>
>
>  #include "rt-sched.h"
>  #include "rt-utils.h"
> @@ -952,12 +953,31 @@ void *high_priority(void *arg)
>  			("high_priority[%d]: pthread_barrier_wait(finish): %x", p->id, status);
>  		return NULL;
>  	}
> +
> +
> +	/**
> +	 * The pthread_barrier_wait should guarantee that only one
> +	 * thread at a time interacts with the variables below that
> +	 * if block.
> +	 *
> +	 * GCC -O2 rearranges the two increments above the wait
> +	 * function calls causing a race issue if you run this
> +	 * near full cores with one core (2 threads) free for
> +	 * housekeeping. This causes a crash at around 2 hour of
> +	 * running. You can prove this by commenting out the barrier
> +	 * and compiling with `-O0`. The crash does not show with
> +	 * -O0.
> +	 *
> +	 * Add a memory barrier to force GCC to increment the variables
> +	 * below the pthread calls. This funcion depends on C11.
> +	 **/

I agree with Crystal. AFAIU, the compiler should not reorder before
pthread_barrier_wait(). Something odd is going on here. However, I am
not familiar with the code base. What's the repo for this?

Btw, did you look at the generated assembly code?

> +	atomic_thread_fence(memory_order_seq_cst);
> +

If the problem is program order, a simple compiler barrier with acquire
semantics should work:

	atomic_signal_fence(memory_order_acquire);

> 	/* update the group stats */
> 	p->total++;
>
> 	/* update the watchdog counter */
> 	p->watchdog++;
> -
> 	}
> 	set_shutdown_flag();
>
> --
> 2.47.0
On Tue, 29 Oct 2024, Wander Lairson Costa wrote:
> On Thu, Oct 24, 2024 at 12:26:14PM -0400, John B. Wyatt IV wrote:
> > This patch is also an error report seen on RHEL 9 and Fedora 40.
> > pi_stress crashes if you run this near full cores with one core
> > (2 threads) free for housekeeping. The crash usually happens at
> > around 2 hours of pi_stress running. This issue was reproduced
> > on a RHEL server with 2 sockets of 10 cores/20 threads (total 40
> > threads) and a Fedora server with 2 sockets of 56 cores/112
> > threads (total 226 threads).
> >
> > The pthread_barrier_wait should guarantee that only one
> > thread at a time interacts with the variables below that
> > if block.
> >
> > GCC -O2 optimization rearranges the two increments above the wait
> > function calls. This causes a race issue found by Helgrind.
> > You can prove this by commenting out the memory barrier
> > and compiling with `-O0`. The crash does not happen with -O0.
> > Thank you to Valentin Schneider <vschneid@redhat.com> for
> > suggesting about -O2.
> >
> > Add a memory barrier to force GCC to increment the variables
> > below the pthread calls. The memory barrier prevented any crashes
> > for several 24 hours tests. This function depends on C11.
> >
> > Reported-by: Valgrind's Helgrind tool
> >
> > Signed-off-by: "John B. Wyatt IV" <jwyatt@redhat.com>
> > Signed-off-by: "John B. Wyatt IV" <sageofredondo@gmail.com>
> > ---
> >  src/pi_tests/pi_stress.c | 22 +++++++++++++++++++++-
> >  1 file changed, 21 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/pi_tests/pi_stress.c b/src/pi_tests/pi_stress.c
> > index 371e906..37023a1 100644
> > --- a/src/pi_tests/pi_stress.c
> > +++ b/src/pi_tests/pi_stress.c
> > @@ -44,6 +44,7 @@
> >  #include <stdint.h>
> >  #include <inttypes.h>
> >  #include <limits.h>
> > +#include <stdatomic.h>
> >
> >  #include "rt-sched.h"
> >  #include "rt-utils.h"
> > @@ -952,12 +953,31 @@ void *high_priority(void *arg)
> >  			("high_priority[%d]: pthread_barrier_wait(finish): %x", p->id, status);
> >  		return NULL;
> >  	}
> > +
> > +
> > +	/**
> > +	 * The pthread_barrier_wait should guarantee that only one
> > +	 * thread at a time interacts with the variables below that
> > +	 * if block.
> > +	 *
> > +	 * GCC -O2 rearranges the two increments above the wait
> > +	 * function calls causing a race issue if you run this
> > +	 * near full cores with one core (2 threads) free for
> > +	 * housekeeping. This causes a crash at around 2 hour of
> > +	 * running. You can prove this by commenting out the barrier
> > +	 * and compiling with `-O0`. The crash does not show with
> > +	 * -O0.
> > +	 *
> > +	 * Add a memory barrier to force GCC to increment the variables
> > +	 * below the pthread calls. This funcion depends on C11.
> > +	 **/
>
> I agree with Crystal. AFAIU, the compiler should not reorder before
> pthread_barrier_wait(). Something odd is going on here. However, I am
> not familiar with the code base. What's the repo for this?

This is one of the programs in the rt-tests suite.

Clone one of the following:

git://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git
https://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git
https://kernel.googlesource.com/pub/scm/utils/rt-tests/rt-tests.git

> Btw, did you look at the generated assembly code?
>
> > +	atomic_thread_fence(memory_order_seq_cst);
> > +
>
> If the problem is program order, a simple compiler barrier with acquire
> semantics should work:
>
> 	atomic_signal_fence(memory_order_acquire);
>
> > 	/* update the group stats */
> > 	p->total++;
> >
> > 	/* update the watchdog counter */
> > 	p->watchdog++;
> > -
> > 	}
> > 	set_shutdown_flag();
> >
> > --
> > 2.47.0
diff --git a/src/pi_tests/pi_stress.c b/src/pi_tests/pi_stress.c
index 371e906..37023a1 100644
--- a/src/pi_tests/pi_stress.c
+++ b/src/pi_tests/pi_stress.c
@@ -44,6 +44,7 @@
 #include <stdint.h>
 #include <inttypes.h>
 #include <limits.h>
+#include <stdatomic.h>
 
 #include "rt-sched.h"
 #include "rt-utils.h"
@@ -952,12 +953,31 @@ void *high_priority(void *arg)
 			("high_priority[%d]: pthread_barrier_wait(finish): %x", p->id, status);
 		return NULL;
 	}
+
+
+	/**
+	 * The pthread_barrier_wait should guarantee that only one
+	 * thread at a time interacts with the variables below that
+	 * if block.
+	 *
+	 * GCC -O2 rearranges the two increments above the wait
+	 * function calls causing a race issue if you run this
+	 * near full cores with one core (2 threads) free for
+	 * housekeeping. This causes a crash at around 2 hour of
+	 * running. You can prove this by commenting out the barrier
+	 * and compiling with `-O0`. The crash does not show with
+	 * -O0.
+	 *
+	 * Add a memory barrier to force GCC to increment the variables
+	 * below the pthread calls. This funcion depends on C11.
+	 **/
+	atomic_thread_fence(memory_order_seq_cst);
+
 	/* update the group stats */
 	p->total++;
 
 	/* update the watchdog counter */
 	p->watchdog++;
-
 	}
 	set_shutdown_flag();