summaryrefslogtreecommitdiff
path: root/src/core/msgqueue.c
Commit message (Collapse)AuthorAge
* use index together with increment operator(postfix form)QXSoftware2018-10-02
|
* fixes #4 Statistics supportGarrett D'Amore2018-09-03
| | | | | | | | | | | | | | | | This introduces new public APIs for obtaining statistics, and adds some generic stats for dialers, listeners, pipes, and sockets. Also added are stats for inproc and pairv1 protocol. The other protocols and transports will have stats added incrementally as time goes on. A simple test program, and man pages are provided for this. Start by looking at nng_stat(5). Statistics does have some impact, and they can be disabled by using the advanced NNG_ENABLE_STATS (setting it to OFF, it's ON by default) if you need to build a minimized configuration.
* fixes #664 aio cancellation could be betterGarrett D'Amore2018-08-20
| | | | | | | | | This changes the signature of the aio cancellation routines to take the argument for cancellation directly, so we do not need to lookup the argument using the nni_aio_get_prov_data. We should probably consider eliminating nni_aio_get_prov_data, and co, and changing the prov_extra to reflect prov_data. Later.
* fixes #605 NNI_ALLOC_STRUCT/NNI_ALLOC_STRUCTS should zero memoryGarrett D'Amore2018-07-24
|
* fixes #459 SUB should be more aggressive about discarding messagesGarrett D'Amore2018-05-21
| | | | | | | | | As part of this code fix, we needed to add filtering support to the msgq_tryput code path -- it turns out that code path was bypassing the filterfn altogether. Eventually we'll remove all this filtering stuff from the msgq code and replace it with inline filtering directly in sub.
* fixes #352 aio lock is burning hotGarrett D'Amore2018-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fixes #326 consider nni_taskq_exec_synch() fixes #410 kqueue implementation could be smarter fixes #411 epoll_implementation could be smarter fixes #426 synchronous completion can lead to panic fixes #421 pipe close race condition/duplicate destroy This is a major refactoring of two significant parts of the code base, which are closely interrelated. First the aio and taskq framework have undergone a number of simplifications, and improvements. We have ditched a few parts of the internal API (for example tasks no longer support cancellation) that weren't terribly useful but added a lot of complexity, and we've made aio_schedule something that now checks for cancellation or other "premature" completions. The aio framework now uses the tasks more tightly, so that aio wait can devolve into just nni_task_wait(). We did have to add a "task_prep()" step to prevent race conditions. Second, the entire POSIX poller framework has been simplified, and made more robust, and more scalable. There were some fairly inherent race conditions around the shutdown/close code, where we *thought* we were synchronizing against the other thread, but weren't doing so adequately. With a cleaner design, we've been able to tighten up the implementation to remove these race conditions, while substantially reducing the chance for lock contention, thereby improving scalability. The illumos poller also got a performance boost by polling for multiple events. In highly "busy" systems, we expect to see vast reductions in lock contention, and therefore greater scalability, in addition to overall improved reliability. One area where we currently can do better is that there is still only a single poller thread run. Scaling this out is a task that has to be done differently for each poller, and carefuly to ensure that close conditions are safe on all pollers, and that no chance for deadlock/livelock waiting for pfd finalizers can occur.
* fix a number of cppcheck complaints (not all)Garrett D'Amore2018-04-24
|
* fixes #364 kill off nni_msgq_set_best_effort()Garrett D'Amore2018-04-24
|
* fixes #342 Want Surveyor/Respondent context supportGarrett D'Amore2018-04-24
| | | | | | | | | | | | | | | | fixes #360 core should nng_aio_begin before nng_aio_finish_error fixes #361 nng_send_aio should check for NULL message fixes #362 nni_msgq does not signal pollable on certain events This adds support for contexts for both sides of the surveyor pattern. Prior to this commit, the raw mode was completely broken, and there were numerous other bugs found and fixed. This integration includes *much* deeper validation of this pattern. Some changes to the core and other patterns have been made, where it was obvioius that we could make such improvements. (The obviousness stemming from the fact that RESPONDENT in particular is very closely derived from REP.)
* fixes #346 nng_recv() sometimes acts on null `msg` pointerGarrett D'Amore2018-04-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | This closes a fundamental flaw in the way aio structures were handled. In paticular, aio expiration could race ahead, and fire before the aio was properly registered by the provider. This ultimately led to the possibility of duplicate completions on the same aio. The solution involved breaking up nni_aio_start into two functions. nni_aio_begin (which can be run outside of external locks) simply validates that nni_aio_fini() has not been called, and clears certain fields in the aio to make it ready for use by the provider. nni_aio_schedule does the work to register the aio with the expiration thread, and should only be called when the aio is actually scheduled for asynchronous completion. nni_aio_schedule_verify does the same thing, but returns NNG_ETIMEDOUT if the aio has a zero length timeout. This change has a small negative performance impact. We have plans to rectify that by converting nni_aio_begin to use a locklesss flag for the aio->a_fini bit. While we were here, we fixed some error paths in the POSIX subsystem, which would have returned incorrect error codes, and we made some optmizations in the message queues to reduce conditionals while holding locks in the hot code path.
* fixes #344 nn_poll() legacy API missingGarrett D'Amore2018-04-16
| | | | | | | | | | | | This includes the test from legacy libnanomsg and a man page. We have refactored the message queue notification system so that it uses nni_pollable, leading we hope to a more consistent system, and reducing the code size and complexity. We also fixed the size of the NN_RCVFD and NN_SNDFD so that they are a SOCKET on Windows systems, rather than an integer. This addresses 64-bit compilation problems.
* fixes #308 Close can blockGarrett D'Amore2018-04-14
| | | | | | | | | | | | | | | | | | | | Ultimately, this just removes the support for lingering altogether. Based on prior experience, lingering has always been unreliable, and was removed in legacy libnanomsg ages ago. The problem is that operating system support for lingering is very inconsistent at best, and for some transports the very concept is somewhat meaningless. Making things worse, we were never able to adequately capture an exit() event from another thread -- so lingering was always a false promise. Applications that need to be sure that messages are delivered should either include an ack in their protocol, use req/rep (which has an ack), or inject a suitable delay of their own. For things going over local networks, an extra delay of 100 msec should be sufficient *most of the time*.
* fixes #247 nngcat needs TLS optionsGarrett D'Amore2018-03-02
| | | | | | | While here we also fixed a bug in the --file handling that we noticed while writing the TLS handling. We also fixed a warning in the core (msgqueue) for set but unused variables.
* fixes #173 Define public HTTP server APIGarrett D'Amore2018-02-01
| | | | | | | | | | | | | | | | | | | | | | | This introduces enough of the HTTP API to support fully server applications, including creation of websocket style protocols, pluggable handlers, and so forth. We have also introduced scatter/gather I/O (rudimentary) for aios, and made other enhancements to the AIO framework. The internals of the AIOs themselves are now fully private, and we have eliminated the aio->a_addr member, with plans to remove the pipe and possibly message members as well. A few other minor issues were found and fixed as well. The HTTP API includes request, response, and connection objects, which can be used with both servers and clients. It also defines the HTTP server and handler objects, which support server applications. Support for client applications will require a client object to be exposed, and that should be happening shortly. None of this is "documented" yet, bug again, we will follow up shortly.
* fixes #143 Protocols and transports should be "configurable"Garrett D'Amore2017-11-02
| | | | | | | | | | | | | | | | | | | | This makes all the protocols and transports optional. All of them except ZeroTier are enabled by default, but you can now disable them (remove from the build) with cmake options. The test suite is modified so that tests still run as much as they can, but skip over things caused by missing functionality from the library (due to configuration). Further, the constant definitions and prototypes for functions that are specific to transports or protocols are moved into appropriate headers, which should be included directly by applications wishing to use these. We have also added and improved documentation -- all of the transports are documented, and several more man pages for protocols have been added. (Req/Rep and Surveyor are still missing.)
* fixes #132 Implement saner notification for file descriptorsGarrett D'Amore2017-10-24
| | | | | | | | | This eliminates the "quasi-functional" notify API altogether. The aio framework will be coming soon to replace it. As a bonus, apps (legacy apps) that use the notification FDs will see improved performance, since we don't have to context switch to give them a notification.
* fixes #112 Need to move some stuff from socket to message queuesGarrett D'Amore2017-10-23
|
* Allocate AIOs dynamically.Garrett D'Amore2017-09-22
| | | | | | | | | | | | | | We allocate AIO structures dynamically, so that we can use them abstractly in more places without inlining them. This will be used for the ZeroTier transport to allow us to create operations consisting of just the AIO. Furthermore, we provide accessors for some of the aio members, in the hopes that we will be able to wrap these for "safe" version of the AIO capability to export to applications, and to protocol and transport implementors. While here we cleaned up the protocol details to use consistently shorter names (no nni_ prefix for static symbols needed), and we also fixed a bug in the surveyor code.
* Provide versions of mutex, condvar, and aio init that never fail.Garrett D'Amore2017-08-16
| | | | | | | | | | | | | | | | | | | | | | | If the underlying platform fails (FreeBSD is the only one I'm aware of that does this!), we use a global lock or condition variable instead. This means that our lock initializers never ever fail. Probably we could eliminate most of this for Linux and Darwin, since on those platforms, mutex and condvar initialization reasonably never fails. Initial benchmarks show little difference either way -- so we can revisit (optimize) later. This removes a lot of otherwise untested code in error cases and so forth, improving coverage and resilience in the face of allocation failures. Platforms other than POSIX should follow a similar pattern if they need this. (VxWorks, I'm thinking of you.) Most sane platforms won't have an issue here, since normally these initializations do not need to allocate memory. (Reportedly, even FreeBSD has plans to "fix" this in libthr2.) While here, some bugs were fixed in initialization & teardown. The fallback code is properly tested with dedicated test cases.
* Fix a few problems found by codacy. One was a real bug.Garrett D'Amore2017-08-14
|
* Fence post error in queue drain, close or fini.Garrett D'Amore2017-08-11
|
* Remove some dead code; msg queue depths are always unsigned.Garrett D'Amore2017-08-07
|
* Refactor AIO logic to close numerous races and reduce complexity.Garrett D'Amore2017-08-04
| | | | | | | | | This passes valgrind 100% clean for both helgrind and deep leak checks. This represents a complete rethink of how the AIOs work, and much simpler synchronization; the provider API is a bit simpler to boot, as a number of failure modes have been simply eliminated. While here a few other minor bugs were squashed.
* Fix EAGAIN (timeout thread can run before we finish scheduling!)Garrett D'Amore2017-07-16
|
* AIO timeouts work correctly now, using their own timer logic.Garrett D'Amore2017-07-16
| | | | | | | | We closed a few subtle races in the AIO subsystem as well, and now we were able to eliminate the separate timer handling the MQ code. There appear to be some opportunities to further enhance the code for MQs as well -- eventually probably the only access to MQs will be with AIOs.
* Give up on uncrustify; switch to clang-format.Garrett D'Amore2017-07-10
|
* SRWLocks FTW!Garrett D'Amore2017-07-07
| | | | | | | | | Modern Windows (Vista and later) have light weight Slim Read/Write locks which only occupy 64 bits, and don't require any memory allocation to create. While here clean up a few more unreferenced variables found with the Microsoft compilers.
* Improved routines for list management.Garrett D'Amore2017-07-04
|
* Remove the unused infinite timeout versions of msgq.Garrett D'Amore2017-07-03
|
* Use common aio cancellation.Garrett D'Amore2017-07-02
|
* Refactor stop again, closing numerous races (thanks valgrind!)Garrett D'Amore2017-06-28
|
* Cancellation plumbing for message queues.Garrett D'Amore2017-06-27
|
* Put errors go on the putq.Garrett D'Amore2017-06-27
|
* Notification working - separate thread now.Garrett D'Amore2017-03-11
|
* Removing some dead code.Garrett D'Amore2017-03-11
|
* Surveyor pattern callback-driven.Garrett D'Amore2017-03-10
|
* Start of close related race fixes. Scalability test.Garrett D'Amore2017-03-10
|
* Pub/Sub now callback driven.Garrett D'Amore2017-03-06
|
* Pair protocol now callback driven.Garrett D'Amore2017-03-06
|
* Bus protocol now callback-driven.Garrett D'Amore2017-03-05
|
* Pipeline protocol now entirely callback driven.Garrett D'Amore2017-03-04
|
* Timer implementation. Operations can timeout now?Garrett D'Amore2017-03-03
|
* Start of msgq aio.Garrett D'Amore2017-03-01
|
* We don't need putback on message queues after all.Garrett D'Amore2017-02-18
|
* Event notification via pollable FDs verified working.Garrett D'Amore2017-01-22
|
* Fix synchronization problem in msgqueue with multiple consumers.Garrett D'Amore2017-01-19
|
* fixes #12 SURVEY hang in TravisGarrett D'Amore2017-01-18
|
* Recv/Send event plumbing implemented (msgqueue and up).Garrett D'Amore2017-01-16
| | | | | | | | This change provides for a private callback in the message queues, which can be used to notify the socket, and which than arranges for the appropriate event thread to run. Upper layer hooks to access this still need to be written.
* Initialize some mq vars.Garrett D'Amore2017-01-09
|
* Add survey test (and fix survey pattern).Garrett D'Amore2017-01-09
| | | | | | | As part of this, we've added a way to unblock callers in a message queue with an error, even without a signal channel. This was necessary to interrupt blockers upon survey timeout. They will get NNG_ETIMEDOUT, but afterwards callers get NNG_ESTATE.