path: root/src/core/taskq.c
Commit message | Author | Age
* fixes #1572 nng creates too many threads | Garrett D'Amore | 2024-01-01
  This further limits some of the thread counts, but principally it offers a new runtime facility, nng_init_set_parameter(), which can be used to set certain runtime parameters on the number of threads, provided it is called before the rest of application start up. This facility is quite intentionally "undocumented", at least for now, as we want to limit our commitment to it. Still this should be helpful for applications that need to reduce the number of threads that are created.
* fixes #1574 Non-blocking version of nng_aio_wait / nng_aio_result | Garrett D'Amore | 2022-04-18
  This introduces a new API, nng_aio_busy(), that can be used to query the status of the aio without blocking. Some minor documentation fixes are included.
* fixes #1335 nni_taskq_thread grabs task lock unnecessarily | Garrett D'Amore | 2020-11-10
* fixes #960 NNG threads inherit application thread name | Garrett D'Amore | 2020-08-08
  This also exposes an nng_thread_set_name() function for applications to use. All NNG thread names start with "nng:". Note that support is highly dependent on the operating system.
* fixes #1236 Deadlock triggered on nng_close | Garrett D'Amore | 2020-05-17
  fixes #1219 nng_close occasionally hang on Windows
* fixes #1202 More than 120 threads was started by NNG | Garrett D'Amore | 2020-02-24
  This introduces a new CMake option, NNG_MAX_TASKQ_THREADS, with a default value of 16. The number of taskq workers will generally be calculated as vcpu * 2. This new value, if not zero, sets an upper bound. Note that the value should be at least two, in order to ensure no deadlocks occur.
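  The sizing rule described in this message can be sketched as follows. This is a minimal illustration, not the code from taskq.c: the function name taskq_thread_count and its parameters are hypothetical, and enforcing the floor of two inside the helper is an assumption (the message only says the configured value should be at least two).

```c
#include <assert.h>

/*
 * Hypothetical helper (not NNG's actual code): compute a taskq worker
 * count as vcpu * 2, apply an upper bound when max_threads is nonzero,
 * and never return fewer than two workers.
 */
int
taskq_thread_count(int ncpu, int max_threads)
{
	int n;

	assert(max_threads >= 0);

	/* Treat an unknown or bogus CPU count as a single CPU. */
	if (ncpu < 1) {
		ncpu = 1;
	}
	n = ncpu * 2;

	/* A zero limit means "no upper bound". */
	if ((max_threads > 0) && (n > max_threads)) {
		n = max_threads;
	}

	/* Assumed floor: keep at least two workers to avoid deadlock. */
	if (n < 2) {
		n = 2;
	}
	return (n);
}
```

  With the default limit of 16, a 4-vcpu machine would get 8 workers under this sketch and a 64-vcpu machine would get 16.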
* fixes #1117 task structures should be inlined | Garrett D'Amore | 2020-01-06
* fixes #1116 task_reap can be eliminated | Garrett D'Amore | 2020-01-05
* fixes #769 How to limit worker threads | Matt Gigli | 2018-12-16
  * Expose cmake variable to set number of DNS resolver threads: NNG_RESOLV_CONCURRENCY
  * Expose cmake variable to set number of taskq threads: NNG_NUM_TASKQ_THREADS
* fixes #589 tsan found races | Garrett D'Amore | 2018-07-16
* fixes #32 autoscale based on CPUs available | Garrett D'Amore | 2018-06-12
  This should work on both Windows and the most common POSIX variants. We will create at least two threads for running completions, but there are numerous other threads in the code.
* fixes #511 Want to be able to have deferred destroy of tasks and aios | Garrett D'Amore | 2018-06-09
  Essentially, if we're destroying an aio, and we are doing so from the thread that is running the callback, then we should defer the destruction of the task until it returns. Note that calling nni_aio_wait() or anything else that calls it from the callback is still verboten and will result in a single party deadlock.
* fixes #451 task finalization could be better/smarter (resolver) | Garrett D'Amore | 2018-05-17
  This changes nni_task_fini to always run synchronously, waiting for the task to finish before cleaning up. Much simpler code.

  Additionally, we've refactored the resolver code to avoid the use of taskqs, which added complexity and inefficiency. The approach of just allocating its own threads and a work queue to process them turns out to be vastly simpler, and actually reduces extra allocations and context switches.

  wip POSIX resolv threads. (Taskqs are just overhead and complexity here.)
  Windows resolver changes.
  Task cleanup.
  fix up windows mutex.
* fixes #445 crash in taskq_thread | Garrett D'Amore | 2018-05-16
  This changes the array of flags, which was confusing, brittle, and racy, into a much simpler reference (busy) count on the task structures. This allows us to support certain kinds of "reentrant" dispatching, where either a synchronous or asynchronous task can reschedule / dispatch itself. The new code also helps reduce certain lock pressure, as a bonus.
* fixes #433 tasks can leak | Garrett D'Amore | 2018-05-15
  While here I also improved the taskq.h comments (and removed a stale prototype for nni_task_cancel), and addressed leaks in the reqstress and multistress test programs.
* fixes #431 hang in taskq_wait | Garrett D'Amore | 2018-05-15
  fixes #429 async websocket reap leads to crash

  This tightens up the code for shutdown, ensuring that transport callbacks are completely stopped before advancing to the next step of teardown of transport pipes or endpoints. It also fixes a problem where task_wait would sometimes get "stuck" as tasks transitioned between asynch and synchronous completions. Finally, it saves a few cycles by only calling a cancellation callback once during cancellation of an aio.
* fixes #352 aio lock is burning hot | Garrett D'Amore | 2018-05-14
  fixes #326 consider nni_taskq_exec_synch()
  fixes #410 kqueue implementation could be smarter
  fixes #411 epoll_implementation could be smarter
  fixes #426 synchronous completion can lead to panic
  fixes #421 pipe close race condition/duplicate destroy

  This is a major refactoring of two significant parts of the code base, which are closely interrelated.

  First the aio and taskq framework have undergone a number of simplifications, and improvements. We have ditched a few parts of the internal API (for example tasks no longer support cancellation) that weren't terribly useful but added a lot of complexity, and we've made aio_schedule something that now checks for cancellation or other "premature" completions. The aio framework now uses the tasks more tightly, so that aio wait can devolve into just nni_task_wait(). We did have to add a "task_prep()" step to prevent race conditions.

  Second, the entire POSIX poller framework has been simplified, and made more robust, and more scalable. There were some fairly inherent race conditions around the shutdown/close code, where we *thought* we were synchronizing against the other thread, but weren't doing so adequately. With a cleaner design, we've been able to tighten up the implementation to remove these race conditions, while substantially reducing the chance for lock contention, thereby improving scalability. The illumos poller also got a performance boost by polling for multiple events.

  In highly "busy" systems, we expect to see vast reductions in lock contention, and therefore greater scalability, in addition to overall improved reliability.

  One area where we currently can do better is that there is still only a single poller thread run. Scaling this out is a task that has to be done differently for each poller, and carefully to ensure that close conditions are safe on all pollers, and that no chance for deadlock/livelock waiting for pfd finalizers can occur.
* fix a number of cppcheck complaints (not all) | Garrett D'Amore | 2018-04-24
* fixes #45 expose aio to applications | Garrett D'Amore | 2017-10-25
  While here we added a test for the aio stuff, and cleaned up some dead code for the old fd notifications. There were a few improvements to shorten & clean code elsewhere, such as short-circuiting task wait when the task has no callback. The legacy sendmsg() and recvmsg() APIs are still in the socket core until we convert the device code to use the aios.
* Provide versions of mutex, condvar, and aio init that never fail. | Garrett D'Amore | 2017-08-16
  If the underlying platform fails (FreeBSD is the only one I'm aware of that does this!), we use a global lock or condition variable instead. This means that our lock initializers never ever fail.

  Probably we could eliminate most of this for Linux and Darwin, since on those platforms, mutex and condvar initialization reasonably never fails. Initial benchmarks show little difference either way -- so we can revisit (optimize) later.

  This removes a lot of otherwise untested code in error cases and so forth, improving coverage and resilience in the face of allocation failures.

  Platforms other than POSIX should follow a similar pattern if they need this. (VxWorks, I'm thinking of you.) Most sane platforms won't have an issue here, since normally these initializations do not need to allocate memory. (Reportedly, even FreeBSD has plans to "fix" this in libthr2.)

  While here, some bugs were fixed in initialization & teardown. The fallback code is properly tested with dedicated test cases.
* Idempotent taskq finalizers. | Garrett D'Amore | 2017-08-14
* Thundering herd kills performance. | Garrett D'Amore | 2017-08-10
  A little benchmarking showed that we were encountering far too many wakeups, leading to severe performance degradation; we had a bunch of threads all sleeping on the same condition variable (taskqs) and this woke them all up, resulting in heavy mutex contention. Since we only need one of the threads to wake, and we don't care which one, let's just wake only one. This reduced RTT latency from about 240 us down to about 30 us. (1/8 of the former cost.) There's still a bunch of tuning to do; performance remains worse than we would like.
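  The wake-one idea in this message can be illustrated with POSIX condition variables. This is a minimal sketch under stated assumptions, not the taskq.c code: the mini_taskq structure and the mini_dispatch and mini_take functions are hypothetical names, and NNG uses its own platform wrappers rather than calling pthreads directly.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical miniature task queue; names are illustrative only. */
struct mini_taskq {
	pthread_mutex_t mtx;
	pthread_cond_t  cv;
	int             pending; /* tasks waiting for a worker */
};

struct mini_taskq mini_q = { PTHREAD_MUTEX_INITIALIZER,
	PTHREAD_COND_INITIALIZER, 0 };

/* Queue one task and wake exactly one sleeping worker. */
void
mini_dispatch(struct mini_taskq *q)
{
	pthread_mutex_lock(&q->mtx);
	q->pending++;
	/* pthread_cond_broadcast() here would wake every worker, and all
	 * but one would contend for the mutex only to sleep again. */
	pthread_cond_signal(&q->cv);
	pthread_mutex_unlock(&q->mtx);
}

/* Worker side: sleep until a task is pending, then claim it. */
void
mini_take(struct mini_taskq *q)
{
	pthread_mutex_lock(&q->mtx);
	while (q->pending == 0) {
		pthread_cond_wait(&q->cv, &q->mtx);
	}
	assert(q->pending > 0);
	q->pending--;
	pthread_mutex_unlock(&q->mtx);
}
```

  Signalling a single waiter is only safe here because any worker can run any task; the while loop around pthread_cond_wait() also guards against spurious wakeups.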
* Subsystem initialize is idempotent; simplify cleanup. | Garrett D'Amore | 2017-08-07
* Refactor AIO logic to close numerous races and reduce complexity. | Garrett D'Amore | 2017-08-04
  This passes valgrind 100% clean for both helgrind and deep leak checks. This represents a complete rethink of how the AIOs work, and much simpler synchronization; the provider API is a bit simpler to boot, as a number of failure modes have been simply eliminated. While here a few other minor bugs were squashed.
* More reliable taskq fini; avoids deadlock during shutdown. | Garrett D'Amore | 2017-08-02
* Eliminate the separate AIO wake callback, making nni_aio_wait block for any AIO completion. | Garrett D'Amore | 2017-07-21
* Simpler taskq API. | Garrett D'Amore | 2017-07-21
  The queue is bound at initialization time of the task, and we call entries just tasks, so we don't have to pass around a taskq pointer across all the calls. Further, nni_task_dispatch is now guaranteed to succeed.
* Yet more race condition fixes. | Garrett D'Amore | 2017-07-20
  We need to remember that protocol stops can run synchronously, and therefore we need to wait for the aio to complete. Further, we need to break apart shutting down aio activity from deallocation, as we need to shut down *all* async activity before deallocating *anything*. Noticed that we had a pipe race in the surveyor pattern too.
* Always run the AIO completion logic. | Garrett D'Amore | 2017-07-19
  We have seen yet another weird situation where we had an orphaned pipe, which was caused by not completing the callback. If we are going to run nni_aio_fini, we should still run the callback (albeit with a return value of NNG_ECANCELED or somesuch) to be sure that we can't orphan stuff.
* Give up on uncrustify; switch to clang-format. | Garrett D'Amore | 2017-07-10
* Refactor stop again, closing numerous races (thanks valgrind!) | Garrett D'Amore | 2017-06-28
* Fix taskq_cancel race. | Garrett D'Amore | 2017-06-08
* Fix leaking taskq data. | Garrett D'Amore | 2017-03-12
* Pipeline protocol now entirely callback driven. | Garrett D'Amore | 2017-03-04
* Taskq implementation. | Garrett D'Amore | 2017-02-18