| Commit message | Author | Age |
| |
If the underlying platform fails to initialize a mutex or condition
variable (FreeBSD is the only one I'm aware of that does this!), we use
a global lock or condition variable instead.
This means that our lock initializers never ever fail.
Probably we could eliminate most of this for Linux and Darwin, since
on those platforms, mutex and condvar initialization reasonably never
fails. Initial benchmarks show little difference either way -- so we
can revisit (optimize) later.
This removes a lot of otherwise untested code in error cases and so forth,
improving coverage and resilience in the face of allocation failures.
Platforms other than POSIX should follow a similar pattern if they need
this. (VxWorks, I'm thinking of you.) Most sane platforms won't have
an issue here, since normally these initializations do not need to allocate
memory. (Reportedly, even FreeBSD has plans to "fix" this in libthr2.)
While here, some bugs were fixed in initialization & teardown.
The fallback code is properly tested with dedicated test cases.
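
A minimal sketch of the fallback idea described above, assuming POSIX
threads. All names here (demo_mtx, demo_global_lock, demo_global_cv,
demo_force_init_failure) are hypothetical and are not the identifiers used
in the actual change. In fallback mode the mutex is emulated with a "held"
flag guarded by one statically initialized global lock and condition
variable, so initialization has nothing left that can fail.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Statically initialized, so these can never fail to initialize. */
static pthread_mutex_t demo_global_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  demo_global_cv   = PTHREAD_COND_INITIALIZER;

/* Test hook (an assumption): pretend pthread_mutex_init() failed. */
static bool demo_force_init_failure = false;

typedef struct {
    pthread_mutex_t mtx;      /* native mutex, valid only if !fallback */
    bool            fallback; /* true when native initialization failed */
    bool            held;     /* fallback state, guarded by demo_global_lock */
} demo_mtx;

/* Never fails: on error the mutex switches to the emulated mode. */
static void
demo_mtx_init(demo_mtx *m)
{
    assert(m != NULL);
    m->held     = false;
    m->fallback = demo_force_init_failure ||
        (pthread_mutex_init(&m->mtx, NULL) != 0);
}

static void
demo_mtx_lock(demo_mtx *m)
{
    if (!m->fallback) {
        pthread_mutex_lock(&m->mtx);
        return;
    }
    pthread_mutex_lock(&demo_global_lock);
    while (m->held) {
        pthread_cond_wait(&demo_global_cv, &demo_global_lock);
    }
    m->held = true;
    pthread_mutex_unlock(&demo_global_lock);
}

static void
demo_mtx_unlock(demo_mtx *m)
{
    if (!m->fallback) {
        pthread_mutex_unlock(&m->mtx);
        return;
    }
    pthread_mutex_lock(&demo_global_lock);
    m->held = false;
    /* The condvar is shared by every fallback mutex, so wake all waiters. */
    pthread_cond_broadcast(&demo_global_cv);
    pthread_mutex_unlock(&demo_global_lock);
}

static void
demo_mtx_fini(demo_mtx *m)
{
    if (!m->fallback) {
        pthread_mutex_destroy(&m->mtx);
    }
}
```

Because the global lock is held only long enough to flip the flag, two
fallback mutexes can be held at the same time without deadlocking on the
shared lock. The cost is extra contention on one lock, which only matters
on a platform where initialization actually fails.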
| |
A little benchmarking showed that we were encountering far too many
wakeups, leading to severe performance degradation; we had a bunch
of threads all sleeping on the same condition variable (taskqs),
and each broadcast woke them all up, resulting in heavy mutex contention.
Since we only need one of the threads to wake, and we don't care which
one, let's just wake only one. This reduced RTT latency from about
240 us down to about 30 us (1/8 of the former cost).
There's still a bunch of tuning to do; performance remains worse than
we would like.
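
A minimal sketch of the wake-one pattern, assuming POSIX threads. The
demo_taskq type and its functions are hypothetical stand-ins, not the
project's real task queue. Posting a task calls pthread_cond_signal(), which
wakes at most one sleeping worker, where pthread_cond_broadcast() would wake
all of them and leave them fighting over the mutex. Broadcast is still used
at shutdown, where every worker really does need to wake.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cv;
    int             pending; /* tasks waiting to run */
    int             done;    /* tasks completed */
    bool            closing; /* set once at shutdown */
} demo_taskq;

static void
demo_taskq_init(demo_taskq *q)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->cv, NULL);
    q->pending = 0;
    q->done    = 0;
    q->closing = false;
}

/* Post one task and wake a single worker to handle it. */
static void
demo_taskq_post(demo_taskq *q)
{
    pthread_mutex_lock(&q->lock);
    q->pending++;
    pthread_cond_signal(&q->cv); /* wake one, not all */
    pthread_mutex_unlock(&q->lock);
}

/* Worker loop: drain pending tasks, sleep when idle, exit when closing. */
static void *
demo_taskq_worker(void *arg)
{
    demo_taskq *q = arg;

    pthread_mutex_lock(&q->lock);
    for (;;) {
        if (q->pending > 0) {
            q->pending--;
            q->done++; /* stands in for running the task */
            continue;
        }
        if (q->closing) {
            break;
        }
        pthread_cond_wait(&q->cv, &q->lock);
    }
    pthread_mutex_unlock(&q->lock);
    return NULL;
}

/* Shutdown is the one case where every worker must wake up. */
static void
demo_taskq_close(demo_taskq *q)
{
    pthread_mutex_lock(&q->lock);
    q->closing = true;
    pthread_cond_broadcast(&q->cv);
    pthread_mutex_unlock(&q->lock);
}
```

Waking one thread is safe here because any worker can run any task and each
worker rechecks the queue under the lock before sleeping, so no post is
lost.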
| |
This eliminates the two threads per pipe that were being used to provide
basic I/O handling, replacing them with a single global thread for now,
that uses poll and nonblocking I/O. This should lead to great scalability.
The infrastructure is in place to easily expand to multiple polling worker
threads. Some thought needs to be given to how to scale this to engage
multiple CPUs. Horizontal scaling may also shorten the poll() lists, easing
the C10K problem.
We should look into better solutions than poll() for platforms that have
them (epoll on Linux, kqueue on BSD, and event ports on illumos).
Note that the file descriptors start out in blocking mode for now, but
then are placed into non-blocking mode. This is because the negotiation
phase is not yet callback driven, and so needs to be synchronous.
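
A minimal sketch of the two pieces described above, assuming a POSIX
system: switching a descriptor to non-blocking mode after the synchronous
phase, and one pass of a poll()-based loop such as a single poller thread
might run. The demo_* names and the callback signature are hypothetical and
do not reflect the project's internal API.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Switch a descriptor that started out blocking into non-blocking mode. */
static int
demo_set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0) {
        return -1;
    }
    return (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) ? -1 : 0;
}

/* Hypothetical callback type invoked for each ready descriptor. */
typedef void (*demo_ready_cb)(int fd, short revents, void *arg);

/* One pass of the poller loop: wait, then dispatch ready descriptors. */
static int
demo_poll_once(struct pollfd *fds, nfds_t nfds, int timeout_ms,
    demo_ready_cb cb, void *arg)
{
    int n = poll(fds, nfds, timeout_ms);
    if (n <= 0) {
        return n; /* 0 on timeout, -1 on error */
    }
    for (nfds_t i = 0; i < nfds; i++) {
        if (fds[i].revents != 0) {
            cb(fds[i].fd, fds[i].revents, arg);
        }
    }
    return n;
}

/* Example callback: read until the descriptor would block, counting bytes. */
static void
demo_drain_cb(int fd, short revents, void *arg)
{
    size_t *total = arg;
    char    buf[256];
    ssize_t n;

    if ((revents & POLLIN) == 0) {
        return;
    }
    while ((n = read(fd, buf, sizeof(buf))) > 0) {
        *total += (size_t) n;
    }
    /* n < 0 with errno EAGAIN/EWOULDBLOCK means the data is drained. */
}
```

The linear scan over the pollfd array on every pass is the cost that grows
with the number of pipes, which is why splitting descriptors across several
pollers, or moving to epoll, kqueue, or event ports, is attractive.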