summaryrefslogtreecommitdiff
path: root/src/core/aio.c
Commit message (Collapse)AuthorAge
* fixes #1572 nng creates too many threadsGarrett D'Amore2024-01-01
| | | | | | | | | | | | This further limits some of the thread counts, but principally it offers a new runtime facility, nng_init_set_parameter(), which can be used to set certain runtime parameters on the number of threads, provided it is called before the rest of application start up. This facility is quite intentionally "undocumented", at least for now, as we want to limit our commitment to it. Still this should be helpful for applications that need to reduce the number of threads that are created.
* fixes #1728 surveyor could be simplified to not use timerGarrett D'Amore2023-12-17
|
* fixes #1523 rare SEGV in sub nni_list_removeGarrett D'Amore2023-11-25
| | | | | | | | | | | | Credit goes to Wu Xuan (@willwu1217) for diagnosing and proposing a fix as part of #1695. This approach takes a revised approach to avoid adding extra memory, and it also is slightly faster as we do not need to update both pointers in the linked list, by reusing the reap node. As part of this a new internal API, nni_aio_completions, is introduced. In all likelihood we will be able to use this to solve some similar crashes in other areas of the code.
* fixes #1619 expose expire threads tunablesPaulo Henrique Silva2023-08-27
| | | | | | | | | | | | | | | | | | This change makes expire threads tunable follows the same strategy as taskq threads tunables. Add NNG_NUM_EXPIRE_THREADS to override the default behavior (`n_cpu` expire threads). The NNG_MAX_EXPIRE_THREADS limit is always applied if > 0, even if you specify the desired number of threads using NNG_NUM_EXPIRE_THREADS. NNG_EXPIRE_THREADS is not used anymore. This was only referenced in the code but never defined on CMake. The logic to cap expire threads between 1 and 256 was removed. If users set no limits, whatever value they choose will be used instead of being silently overridden by us.
* fixes #1619 Expose NNG_MAX_EXPIRE_THREADS via CMakeGarrett D'Amore2023-04-19
|
* Revert "fixes 1543 (#1616)"Garrett D'Amore2023-02-05
| | | | This reverts commit 8461c7207b440f5ba8c51b2236fcfa178f415a6f.
* fixes 1543 (#1616)josh salit2023-02-05
| | | fixes #1543 by aborting tasks that may have been prepped, but not yet started.
* fixes #1574 Non-blocking version of nng_aio_wait / nng_aio_resultGarrett D'Amore2022-04-18
| | | | | | | This introduces a new API, nng_aio_busy(), that can be used to query the status of the aio without blocking. Some minor documentation fixes are included.
* Replace nni_aio_prov_set_extra with nni_aio_prov_set_data.Garrett D'Amore2021-12-31
| | | | | | This takes one less parameter, and is simpler. It will let us reclaim the aio_prov_extra data space as well, so that we can use it for other purposes.
* Remove unused eq_len member.Garrett D'Amore2021-10-11
|
* fixes #1488 aio expiration list performance work neededGarrett D'Amore2021-08-09
| | | | | | There were several problems with the array implementation, both from performance and from correctness. This corrects those errors (hopefully) and restores the expiration lists as linked lists.
* * FIX #1486 by waking up latest aio each time. (#1487)JaylinYu2021-08-09
|
* Fix some unused variables.Garrett D'Amore2021-07-22
|
* fixes #1475 nni_aio_begin should not dispatch task on stopped aioGarrett D'Amore2021-07-22
|
* Fix the wrong ratio when expire queue shrink. (#1470)wangha2021-07-16
|
* Turn aio expire queue from nni list to array for efficiency. (#1449)wangha2021-07-06
|
* fixes #1172 nni_aio_lk is white hotGarrett D'Amore2020-12-20
|
* fixes #1380 nni_aio optimizationsGarrett D'Amore2020-12-20
| | | | fixes #1048 nng_aio reuse error messages are unhelpful
* fixes #1377 nni_aio_fini should not reacquire nni_aio_lkGarrett D'Amore2020-12-20
|
* fixes #1372 nni_reap could be smallerGarrett D'Amore2020-12-19
|
* fixes #1313 support deferred nng_aio destructionGarrett D'Amore2020-12-12
|
* fixes #1337 nni aio user data could be removedGarrett D'Amore2020-11-10
|
* Minor spelling tweaks for the aio framework.Garrett D'Amore2020-11-09
|
* fixes #1311 reduce wasted use for nni_aioGarrett D'Amore2020-10-31
| | | | | | | | | | fixes #1317 IPv6 listener get port is incorrect fixes #1319 Want symbolic service names This is phase 1 of reducing the memory foot-print of aios, and also of pipes. This removes the largest consumer the socket address information, from the aio, which was only used by a few consumers.
* fixes #960 NNG threads inherit application thread nameGarrett D'Amore2020-08-08
| | | | | | This also exposes an nng_thread_set_name() function for applications to use. All NNG thread names start with "nng:". Note that support is highly dependent on the operating system.
* fixes #1236 Deadlock triggered on nng_closeGarrett D'Amore2020-05-17
| | | | fixes #1219 nng_close occasionally hang on Windows
* Fix typos in commentsEvgeny Ermakov2020-02-13
|
* fixes #1094 Consider in-lining task and aioGarrett D'Amore2020-01-08
| | | | | This only does it for rep, but it also has changes that should increase the overall test coverage for the REP protocol
* fixes #1117 task structures should be inlinedGarrett D'Amore2020-01-06
|
* fixes #1096 inline all 16 iovs in aio (also consider reducing -- to 8?)Garrett D'Amore2020-01-04
| | | | fixes #1097 aio prov_data not used at all
* Address complaints found by lgtm.com.Garrett D'Amore2019-12-11
|
* fixes #872 create unified nng_stream APIGarrett D'Amore2019-02-16
| | | | | | | | | This is a major change, and includes changes to use a polymorphic stream API for all transports. There have been related bugs fixed along the way. Additionally the man pages have changed. The old non-polymorphic APIs are removed now. This is a breaking change, but the old APIs were never part of any released public API.
* fixes #772 ineffectual assignment in aio_initGarrett D'Amore2018-11-01
|
* remove redundant zero memsetQXSoftware2018-09-09
|
* fixes #664 aio cancellation could be betterGarrett D'Amore2018-08-20
| | | | | | | | | This changes the signature of the aio cancellation routines to take the argument for cancellation directly, so we do not need to lookup the argument using the nni_aio_get_prov_data. We should probably consider eliminating nni_aio_get_prov_data, and co, and changing the prov_extra to reflect prov_data. Later.
* fixes #625 aio->a_stop/aio_begin may be too severeGarrett D'Amore2018-08-07
|
* fixes #623 nni_aio_stop could be betterGarrett D'Amore2018-08-06
|
* Revert "fixes #599 nng_dial sync should not return until added to socket"Garrett D'Amore2018-08-06
| | | | | This changeset needs work. We are seeing errors described by This reverts commit d7f7c896c0ede24249ef63b1e45b1878bf4bd473.
* fixes #599 nng_dial sync should not return until added to socketGarrett D'Amore2018-08-05
| | | | | | | | | | fixes #208 pipe start should occur before connect / accept fixes #616 Race condition closing between header & body This refactors the transports to handle their own connection handshaking before passing the pipe to the socket. This changes and simplifies the setup. This also fixes a rather challenging race condition described by #616.
* fixes #523 dialers could support multiple outstanding dial requestsGarrett D'Amore2018-07-16
| | | | | | | | | | | | | | | | | | | | | | | | fixes #179 DNS resolution should be done at connect time fixes #586 Windows IO completion port work could be better fixes #339 Windows iocp could use synchronous completions fixes #280 TCP abstraction improvements This is a rather monstrous set of changes, which refactors TCP, and the underlying Windows I/O completion path logic, in order to obtain a cleaner, simpler API, with support for asynchronous DNS lookups performed on connect rather than initialization time, the ability to have multiple connects or accepts pending, as well as fewer extraneous function calls. The Windows code also benefits from greatly reduced context switching, fewer lock operations performed, and a reduced number of system calls on the hot code path. (We use automatic event resetting instead of manual.) Some dead code was removed as well, and a few potential edge case leaks on failure paths (in the websocket code) were plugged. Note that all TCP based transports benefit from this work. The IPC code on Windows still uses the legacy IOCP for now, as does the UDP code (used for ZeroTier.) We will be converting those soon too.
* fixes #535 aio->a_closed and aio->a_stop could be consolidatedGarrett D'Amore2018-06-12
|
* fixes #533 nni_aio_begin should not dispatch task on NNG_ECLOSED.Garrett D'Amore2018-06-12
| | | | | | | | | | This changes nni_aio_begin so that it immediately terminates when it encounters aio->a_closed, much like it does for aio->a_stop. The semantic for nni_aio_close() is supposed to be like nni_aio_stop(), but without blocking. I suspect that this might be responsible for use-after-free bugs that seem to have been rearing their head lately.
* fixes #511 Want to be able to have deferred destroy of tasks and aiosGarrett D'Amore2018-06-09
| | | | | | | | | | Essentially, if we're destroying an aio, and we are doing so from the thread that is running the callback, then we should defer the destruction of the task until it returns. Note that calling nni_aio_wait() or anything else that calls it from the callback is still verboten and will result in a single party deadlock.
* fixes #445 crash in taskq_threadGarrett D'Amore2018-05-16
| | | | | | | | This changes the array of flags, which was confusing, brittle, and racy, into a much simpler reference (busy) count on the task structures. This allows us to support certain kinds of "reentrant" dispatching, where either a synchronous or asynchronous task can reschedule / dispatch itself. The new code also helps reduce certain lock pressure, as a bonus.
* fixes #431 hang in taskq_waitGarrett D'Amore2018-05-15
| | | | | | | | | | | | | | fixes #429 async websocket reap leads to crash This tightens up the code for shutdown, ensuring that transport callbacks are completely stopped before advancing to the next step of teardown of transport pipes or endpoints. It also fixes a problem where task_wait would sometimes get "stuck" as tasks transitioned between asynch and synchronous completions. Finally, it saves a few cycles by only calling a cancellation callback once during cancellation of an aio.
* fixes #419 want to nni_aio_stop without blocking (#428)Garrett D'Amore2018-05-15
| | | | | | | | | | | | | | | | * fixes #419 want to nni_aio_stop without blocking This actually introduces an nni_aio_close() API that causes nni_aio_begin to return NNG_ECLOSED, while scheduling a callback on the AIO to do an NNG_ECLOSED as well. This should be called in non-blocking close() contexts instead of nni_aio_stop(), and the cases where we call nni_aio_fini() multiple times are updated updated to add nni_aio_stop() calls on all "interlinked" aios before finalizing them. Furthermore, we call nni_aio_close() as soon as practical in the close path. This closes an annoying race condition where the callback from a lower subsystem could wind up rescheduling an operation that we wanted to abort.
* fixes #352 aio lock is burning hotGarrett D'Amore2018-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fixes #326 consider nni_taskq_exec_synch() fixes #410 kqueue implementation could be smarter fixes #411 epoll_implementation could be smarter fixes #426 synchronous completion can lead to panic fixes #421 pipe close race condition/duplicate destroy This is a major refactoring of two significant parts of the code base, which are closely interrelated. First the aio and taskq framework have undergone a number of simplifications, and improvements. We have ditched a few parts of the internal API (for example tasks no longer support cancellation) that weren't terribly useful but added a lot of complexity, and we've made aio_schedule something that now checks for cancellation or other "premature" completions. The aio framework now uses the tasks more tightly, so that aio wait can devolve into just nni_task_wait(). We did have to add a "task_prep()" step to prevent race conditions. Second, the entire POSIX poller framework has been simplified, and made more robust, and more scalable. There were some fairly inherent race conditions around the shutdown/close code, where we *thought* we were synchronizing against the other thread, but weren't doing so adequately. With a cleaner design, we've been able to tighten up the implementation to remove these race conditions, while substantially reducing the chance for lock contention, thereby improving scalability. The illumos poller also got a performance boost by polling for multiple events. In highly "busy" systems, we expect to see vast reductions in lock contention, and therefore greater scalability, in addition to overall improved reliability. One area where we currently can do better is that there is still only a single poller thread run. Scaling this out is a task that has to be done differently for each poller, and carefuly to ensure that close conditions are safe on all pollers, and that no chance for deadlock/livelock waiting for pfd finalizers can occur.
* fixes #422 nng_aio_fini_cb() could go awayGarrett D'Amore2018-05-09
|
* fixes #346 nng_recv() sometimes acts on null `msg` pointerGarrett D'Amore2018-04-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | This closes a fundamental flaw in the way aio structures were handled. In paticular, aio expiration could race ahead, and fire before the aio was properly registered by the provider. This ultimately led to the possibility of duplicate completions on the same aio. The solution involved breaking up nni_aio_start into two functions. nni_aio_begin (which can be run outside of external locks) simply validates that nni_aio_fini() has not been called, and clears certain fields in the aio to make it ready for use by the provider. nni_aio_schedule does the work to register the aio with the expiration thread, and should only be called when the aio is actually scheduled for asynchronous completion. nni_aio_schedule_verify does the same thing, but returns NNG_ETIMEDOUT if the aio has a zero length timeout. This change has a small negative performance impact. We have plans to rectify that by converting nni_aio_begin to use a locklesss flag for the aio->a_fini bit. While we were here, we fixed some error paths in the POSIX subsystem, which would have returned incorrect error codes, and we made some optmizations in the message queues to reduce conditionals while holding locks in the hot code path.
* Fix valgrind warning in assert.Garrett D'Amore2018-04-10
| | | | | | | This technically doesn't need the lock, as we're only trying to catch a possible problem during development rather than in the field (and this should never occur), but the false positive in valgrind obscures other possible errors, so leave it in the lock.