From ac5f0ef7cf501693a9db2fcbd95b7cde419cbb2a Mon Sep 17 00:00:00 2001
From: Garrett D'Amore
Date: Thu, 10 Aug 2017 00:10:50 -0700
Subject: Thundering herd kills performance.

A little benchmarking showed that we were encountering far too many
wakeups, leading to severe performance degradation; we had a bunch
of threads all sleeping on the same condition variable (taskqs)
and this woke them all up, resulting in heavy mutex contention.

Since we only need one of the threads to wake, and we don't care
which one, let's just wake only one.  This reduced RTT latency from
about 240 us down to about 30 us.  (1/8 of the former cost.)

There's still a bunch of tuning to do; performance remains worse
than we would like.
---
 src/core/thread.h | 3 +++
 1 file changed, 3 insertions(+)

(limited to 'src/core/thread.h')

diff --git a/src/core/thread.h b/src/core/thread.h
index b528af2c..94b2a984 100644
--- a/src/core/thread.h
+++ b/src/core/thread.h
@@ -52,6 +52,9 @@ extern void nni_cv_fini(nni_cv *cv);
 // nni_cv_wake wakes all waiters on the condition variable.
 extern void nni_cv_wake(nni_cv *cv);
 
+// nni_cv_wake1 wakes just one waiter on the condition variable.
+extern void nni_cv_wake1(nni_cv *cv);
+
 // nni_cv_wait waits until nni_cv_wake is called on the condition variable.
 // The wait is indefinite.  Premature wakeups are possible, so the caller
 // must verify any related condition.
-- 
cgit v1.2.3-70-g09d2
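
The wake-one versus wake-all distinction described in the commit message can be sketched with plain POSIX threads. This is only an illustration, not the code from this change: the patch shows just the header declaration, so the sketch assumes that wake-all behaves like pthread_cond_broadcast and wake-one like pthread_cond_signal. Every name prefixed with demo_ is hypothetical, and the counter-based queue merely stands in for the taskq mentioned above.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_WORKERS 64

// Hypothetical job queue; a bare counter stands in for real queued work.
typedef struct {
	pthread_mutex_t mtx;
	pthread_cond_t  cv;
	int             pending; // jobs posted but not yet picked up
	int             done;    // jobs completed
	int             wakeups; // times any worker returned from the wait
	bool            closed;  // true once no more jobs will be posted
} demo_queue;

static void *
demo_worker(void *arg)
{
	demo_queue *q = arg;

	pthread_mutex_lock(&q->mtx);
	for (;;) {
		// Premature wakeups are possible, so recheck the condition.
		while ((q->pending == 0) && (!q->closed)) {
			pthread_cond_wait(&q->cv, &q->mtx);
			q->wakeups++;
		}
		if (q->pending == 0) {
			break; // closed and fully drained
		}
		q->pending--;
		q->done++;
	}
	pthread_mutex_unlock(&q->mtx);
	return (NULL);
}

// demo_run posts njobs jobs to nworkers threads and returns the number of
// jobs completed.  With wake_all set, each post wakes every waiter (the
// thundering herd); otherwise each post wakes at most one waiter.
static int
demo_run(int nworkers, int njobs, bool wake_all, int *wakeups)
{
	demo_queue q = { .pending = 0, .done = 0, .wakeups = 0, .closed = false };
	pthread_t  tids[DEMO_MAX_WORKERS];
	int        started = 0;
	int        done;

	pthread_mutex_init(&q.mtx, NULL);
	pthread_cond_init(&q.cv, NULL);
	if (nworkers > DEMO_MAX_WORKERS) {
		nworkers = DEMO_MAX_WORKERS;
	}
	for (int i = 0; i < nworkers; i++) {
		if (pthread_create(&tids[started], NULL, demo_worker, &q) == 0) {
			started++;
		}
	}
	for (int j = 0; j < njobs; j++) {
		pthread_mutex_lock(&q.mtx);
		q.pending++;
		if (wake_all) {
			pthread_cond_broadcast(&q.cv); // every waiter contends for mtx
		} else {
			pthread_cond_signal(&q.cv); // one waiter is enough per job
		}
		pthread_mutex_unlock(&q.mtx);
	}
	// Shutdown really does need every waiter, so broadcast here.
	pthread_mutex_lock(&q.mtx);
	q.closed = true;
	pthread_cond_broadcast(&q.cv);
	pthread_mutex_unlock(&q.mtx);
	for (int i = 0; i < started; i++) {
		pthread_join(tids[i], NULL);
	}
	done = q.done;
	if (wakeups != NULL) {
		*wakeups = q.wakeups;
	}
	pthread_cond_destroy(&q.cv);
	pthread_mutex_destroy(&q.mtx);
	return (done);
}
```

Calling demo_run(8, 100000, true, &n) and then demo_run(8, 100000, false, &n) from a small main (built with -pthread) should complete the same number of jobs in both modes. The wakeup count is usually much higher with broadcast, though the exact figures depend on the scheduler. Signalling one waiter is safe here only because any worker can take any job, which is the same "we don't care which one" condition the commit message relies on.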