summaryrefslogtreecommitdiff
path: root/src/protocol
Commit message (Collapse)AuthorAge
* fixes #484 crashes in websocket transportGarrett D'Amore2018-05-29
| | | | | | | | | | | | | | | | | | | | | | | fixes #490 posix_epdesc use-after-free bug fixes #489 Sanitizer based testing would help fixes #492 Numerous memory leaks found with sanitizer This introduces support for compiler-based sanitizers when using clang or gcc (and not on Windows). See NNG_SANITIZER for possible settings such as "thread" or "address". Furthermore, we have fixed the issues we found with both the thread and address sanitizers. We believe that the thread issues pointed to a low frequency use-after-free responsible for rare crashes in some of the tests. The tests generally have their timeouts doubled when running under a sanitizer, to account for the extra long times that the sanitizer can cause these to take. While here, we also changed the compat_ws test to avoid a particularly painful and time consuming DNS lookup, and we made the nngcat_unlimited test a bit more robust by waiting before sending traffic.
* fixes #459 SUB should be more aggressive about discarding messagesGarrett D'Amore2018-05-21
| | | | | | | | | As part of this code fix, we needed to add filtering support to the msgq_tryput code path -- it turns out that code path was bypassing the filterfn altogether. Eventually we'll remove all this filtering stuff from the msgq code and replace it with inline filtering directly in sub.
* fixes #449 Want more flexible pipe eventsGarrett D'Amore2018-05-17
| | | | | | | | | This changes the signature of nng_pipe_notify(), and the associated events. The documentation is updated to reflect this. We have also broken the lock up so that we don't hold the master socket lock for some of these things, which may have beneficial impact on performance.
* fixes #441 Unintentional semantic in bus protocolGarrett D'Amore2018-05-17
|
* fixes #440 leak in bus protocolGarrett D'Amore2018-05-16
|
* fixes #419 want to nni_aio_stop without blocking (#428)Garrett D'Amore2018-05-15
| | | | | | | | | | | | | | | | * fixes #419 want to nni_aio_stop without blocking This actually introduces an nni_aio_close() API that causes nni_aio_begin to return NNG_ECLOSED, while scheduling a callback on the AIO to do an NNG_ECLOSED as well. This should be called in non-blocking close() contexts instead of nni_aio_stop(), and the cases where we call nni_aio_fini() multiple times are updated updated to add nni_aio_stop() calls on all "interlinked" aios before finalizing them. Furthermore, we call nni_aio_close() as soon as practical in the close path. This closes an annoying race condition where the callback from a lower subsystem could wind up rescheduling an operation that we wanted to abort.
* fixes #352 aio lock is burning hotGarrett D'Amore2018-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fixes #326 consider nni_taskq_exec_synch() fixes #410 kqueue implementation could be smarter fixes #411 epoll_implementation could be smarter fixes #426 synchronous completion can lead to panic fixes #421 pipe close race condition/duplicate destroy This is a major refactoring of two significant parts of the code base, which are closely interrelated. First the aio and taskq framework have undergone a number of simplifications, and improvements. We have ditched a few parts of the internal API (for example tasks no longer support cancellation) that weren't terribly useful but added a lot of complexity, and we've made aio_schedule something that now checks for cancellation or other "premature" completions. The aio framework now uses the tasks more tightly, so that aio wait can devolve into just nni_task_wait(). We did have to add a "task_prep()" step to prevent race conditions. Second, the entire POSIX poller framework has been simplified, and made more robust, and more scalable. There were some fairly inherent race conditions around the shutdown/close code, where we *thought* we were synchronizing against the other thread, but weren't doing so adequately. With a cleaner design, we've been able to tighten up the implementation to remove these race conditions, while substantially reducing the chance for lock contention, thereby improving scalability. The illumos poller also got a performance boost by polling for multiple events. In highly "busy" systems, we expect to see vast reductions in lock contention, and therefore greater scalability, in addition to overall improved reliability. One area where we currently can do better is that there is still only a single poller thread run. Scaling this out is a task that has to be done differently for each poller, and carefuly to ensure that close conditions are safe on all pollers, and that no chance for deadlock/livelock waiting for pfd finalizers can occur.
* fixes #424 reqstress low frequency crashGarrett D'Amore2018-05-09
|
* fixes #342 Want Surveyor/Respondent context supportGarrett D'Amore2018-04-24
| | | | | | | | | | | | | | | | fixes #360 core should nng_aio_begin before nng_aio_finish_error fixes #361 nng_send_aio should check for NULL message fixes #362 nni_msgq does not signal pollable on certain events This adds support for contexts for both sides of the surveyor pattern. Prior to this commit, the raw mode was completely broken, and there were numerous other bugs found and fixed. This integration includes *much* deeper validation of this pattern. Some changes to the core and other patterns have been made, where it was obvioius that we could make such improvements. (The obviousness stemming from the fact that RESPONDENT in particular is very closely derived from REP.)
* fixes #368 context options could be emptyGarrett D'Amore2018-04-24
|
* fixes #346 nng_recv() sometimes acts on null `msg` pointerGarrett D'Amore2018-04-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | This closes a fundamental flaw in the way aio structures were handled. In paticular, aio expiration could race ahead, and fire before the aio was properly registered by the provider. This ultimately led to the possibility of duplicate completions on the same aio. The solution involved breaking up nni_aio_start into two functions. nni_aio_begin (which can be run outside of external locks) simply validates that nni_aio_fini() has not been called, and clears certain fields in the aio to make it ready for use by the provider. nni_aio_schedule does the work to register the aio with the expiration thread, and should only be called when the aio is actually scheduled for asynchronous completion. nni_aio_schedule_verify does the same thing, but returns NNG_ETIMEDOUT if the aio has a zero length timeout. This change has a small negative performance impact. We have plans to rectify that by converting nni_aio_begin to use a locklesss flag for the aio->a_fini bit. While we were here, we fixed some error paths in the POSIX subsystem, which would have returned incorrect error codes, and we made some optmizations in the message queues to reduce conditionals while holding locks in the hot code path.
* fixes #355 Possible use after free in REPGarrett D'Amore2018-04-17
| | | | | | | | | While here I've added some code that should help us backtrack on a crash, by linking back to the pipe from the context when we are queued on that pipes sendq. I'm not sure if we've ever seen these or not, but it could explain certain infrequent crashes we think we've seen.
* fixes #334 Separate context for state machines from socketsGarrett D'Amore2018-04-10
| | | | | | | | | | | | | | | This provides context support for REQ and REP sockets. More discussion around this is in the issue itself. Optionally we would like to extend this to the surveyor pattern. Note that we specifically do not support pollable descriptors for non-default contexts, and the results of using file descriptors for polling (NNG_OPT_SENDFD and NNG_OPT_RECVFD) is undefined. In the future, it might be nice to figure out how to factor in optional use of a message queue for users who want more buffering, but we think there is little need for this with cooked mode.
* fixes #331 replace NNG_OPT_RAW option with constructorGarrett D'Amore2018-04-04
| | | | | | | | | | | | | This makes the raw mode something that is immutable, determined at socket construction. This is an enabling change for the separate context support coming soon. As a result, this is an API breaking change for users of the raw mode option (NNG_OPT_RAW). There aren't many of them out there. Cooked mode is entirely unaffected. There are changes to tests and documentation included.
* fixes #329 type checking not done for setoptGarrett D'Amore2018-04-04
|
* fixes #301 String option handling for getoptGarrett D'Amore2018-03-20
|
* fixes #296 Typed options should validate option typeGarrett D'Amore2018-03-20
| | | | | | | | | | | | | fixes #302 nng_dialer/listener/pipe_getopt_sockaddr desired This adds plumbing to pass and check the type of options all the way through. NNG_ZT_OPT_ORBIT is type UINT64, but you can use the untyped form to pass two of them if needed. No typed access for retrieving strings yet. I think this should allocate a pointer and copy that out, but that's for later.
* fixes #295 boolean options should use C99 bool typeGarrett D'Amore2018-03-18
| | | | | | | | | | | fixes #275 nng_pipe_getopt_ptr() missing? fixes #285 nng_setopt_ptr MIS fixes #297 nng_listener/dialer_close does not validate mode This change adds some missing APIs, and changes others. In particular, certain options are now of type bool, with size of just one. This is a *breaking* change for code that uses those options -- NNG_OPT_RAW, NNG_OPT_PAIR1_POLY, NNG_OPT_TLS_VERIFIED.
* fixes #240 nngcat is MIAGarrett D'Amore2018-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is intended to provide compatibility with, and has been tested against, legacy nanocat. There are a few differences though. At this time support for the alias names (where argv[0] is set to something like nngreq or somesuch) is missing. By default this library operations without NNG_FLAG_NONBLOCK on dial and listen, so that failures here are immediately diagnosable. (This behavior can be changed with the --async flag.) By default --pair means PAIRv1, but you can specify --pair0 or --pair1 explicitly. (There is also a --compat mode, and in that mode --pair means PAIRv0. The --compat mode also turns on NNG_FLAG_NONBLOCK by default.) The "quoted" mode also quotes tabs. (Legacy nanocat did not.) It is possible to connect to *multiple* peers by using the --dial or --listen (or similar) options multiple times. Shorthands can be used for long options that are not ambiguous. For example, --surv can be used to mean surveyor, but --re is invalid because it can mean req, rep, or respondent. We assume you have a reasonable standard C environment. This won't work in embedded environments without support for FILE *. TLS options are missing but to be added soon. A man page is still to be written.
* CMake & CPack improvements.Garrett D'Amore2018-02-21
| | | | | | | | | These are incremental updates... we avoid using install() in the subdirectories, so that we can adapt properly to them in the single parent directory. We have started some of the work to improve support for CPack. This is still not yet done, but work in progress.
* fixes #234 Investigate enabling more verbose compiler warningsGarrett D'Amore2018-02-14
| | | | | | | We enabled verbose compiler warnings, and found a lot of issues. Some of these were even real bugs. As a bonus, we actually save some initialization steps in the compat layer, and avoid passing some variables we don't need.
* fixes #173 Define public HTTP server APIGarrett D'Amore2018-02-01
| | | | | | | | | | | | | | | | | | | | | | | This introduces enough of the HTTP API to support fully server applications, including creation of websocket style protocols, pluggable handlers, and so forth. We have also introduced scatter/gather I/O (rudimentary) for aios, and made other enhancements to the AIO framework. The internals of the AIOs themselves are now fully private, and we have eliminated the aio->a_addr member, with plans to remove the pipe and possibly message members as well. A few other minor issues were found and fixed as well. The HTTP API includes request, response, and connection objects, which can be used with both servers and clients. It also defines the HTTP server and handler objects, which support server applications. Support for client applications will require a client object to be exposed, and that should be happening shortly. None of this is "documented" yet, bug again, we will follow up shortly.
* fixes #196 surveyor pattern hangs after second surveyGarrett D'Amore2018-01-09
|
* fixes #147 surveyor protocol needs NNG_OPT_MAXTTLGarrett D'Amore2017-11-03
|
* fixes #143 Protocols and transports should be "configurable"Garrett D'Amore2017-11-02
| | | | | | | | | | | | | | | | | | | | This makes all the protocols and transports optional. All of them except ZeroTier are enabled by default, but you can now disable them (remove from the build) with cmake options. The test suite is modified so that tests still run as much as they can, but skip over things caused by missing functionality from the library (due to configuration). Further, the constant definitions and prototypes for functions that are specific to transports or protocols are moved into appropriate headers, which should be included directly by applications wishing to use these. We have also added and improved documentation -- all of the transports are documented, and several more man pages for protocols have been added. (Req/Rep and Surveyor are still missing.)
* fixes #137 Remove public access to numeric protocolsGarrett D'Amore2017-10-31
|
* fixes #45 expose aio to applicationsGarrett D'Amore2017-10-25
| | | | | | | | | | While here we added a test for the aio stuff, and cleaned up some dead code for the old fd notifications. There were a few improvements to shorten & clean code elsewhere, such as short-circuiting task wait when the task has no callback. The legacy sendmsg() and recvmsg() APIs are still in the socket core until we convert the device code to use the aios.
* fixes #112 Need to move some stuff from socket to message queuesGarrett D'Amore2017-10-23
|
* fixes #123 latency tests with wrong message sizeGarrett D'Amore2017-10-19
|
* fixes #84 Consider using msec for durationsGarrett D'Amore2017-10-19
| | | | | | There is now a public nng_duration type. We have also updated the zerotier work to work with the signed int64_t's that the latst ZeroTier dev branch is using.
* fixes #99 Still seeing segfaults in pair1 (rarely)Garrett D'Amore2017-10-03
|
* fixes #85 Protocols need to set msg pipeGarrett D'Amore2017-09-27
| | | | | | | We looked at other options, but this is the least intrusive, even though it means that the protocols have to set it up. The reason is that transports have different methods of receiving messages, and there is no framework code between the transport and the protocol.
* Remove last vestiges of integer option numbers.Garrett D'Amore2017-09-27
|
* Refactor option handling APIs.Garrett D'Amore2017-09-27
| | | | | | | | | | | | This makes the APIs use string keys, and largely eliminates the use of integer option IDs altogether. The underlying registration for options is also now a bit richer, letting protcols and transports declare the actual options they use, rather than calling down into each entry point carte blanche and relying on ENOTSUP. This code may not be as fast as the integers was, but it is more intuitive, easier to extend, and is not on any hot code paths. (If you're diddling options on a hot code path you're doing something wrong.)
* More pipe option handling, pipe API support. Url option.Garrett D'Amore2017-09-22
| | | | | | | | | | This fleshes most of the pipe API out, making it available to end user code. It also adds a URL option that is independent of the address options (which would be sockaddrs.) Also, we are now setting the pipe for req/rep. The other protocols need to have the same logic added to set the receive pipe on the message. (Pair is already done.)
* Add improved getopt functions, pass integers by value.Garrett D'Amore2017-09-22
|
* Allocate AIOs dynamically.Garrett D'Amore2017-09-22
| | | | | | | | | | | | | | We allocate AIO structures dynamically, so that we can use them abstractly in more places without inlining them. This will be used for the ZeroTier transport to allow us to create operations consisting of just the AIO. Furthermore, we provide accessors for some of the aio members, in the hopes that we will be able to wrap these for "safe" version of the AIO capability to export to applications, and to protocol and transport implementors. While here we cleaned up the protocol details to use consistently shorter names (no nni_ prefix for static symbols needed), and we also fixed a bug in the surveyor code.
* Eliminate legacy option settings, provide easier option IDs.Garrett D'Amore2017-08-24
| | | | | | | | | | | | | | | | | | This eliminates all the old #define's or enum values, making all option IDs now totally dynamic, and providing well-known string values for well-behaved applications. We have added tests of some of these options, including lookups, and so forth. We have also fixed a few problems; including at least one crasher bug when the timeouts on reconnect were zero. Protocol specific options are now handled in the protocol. We will be moving the initialization for a few of those well known entities to the protocol startup code, following the PAIRv1 pattern, later. Applications must therefore not depend on the value of the integer IDs, at least until the application has opened a socket of the appropriate type.
* Add init/fini to protocols to allow them to register options.Garrett D'Amore2017-08-23
|
* Implement dynamic option numbering.Garrett D'Amore2017-08-23
| | | | | | | This permits option numbers to be allocated based on string name. Eventually all the option values will be replaced with option names. This will facilitate transports (ZeroTier) that may need further options.
* Provide versions of mutex, condvar, and aio init that never fail.Garrett D'Amore2017-08-16
| | | | | | | | | | | | | | | | | | | | | | | If the underlying platform fails (FreeBSD is the only one I'm aware of that does this!), we use a global lock or condition variable instead. This means that our lock initializers never ever fail. Probably we could eliminate most of this for Linux and Darwin, since on those platforms, mutex and condvar initialization reasonably never fails. Initial benchmarks show little difference either way -- so we can revisit (optimize) later. This removes a lot of otherwise untested code in error cases and so forth, improving coverage and resilience in the face of allocation failures. Platforms other than POSIX should follow a similar pattern if they need this. (VxWorks, I'm thinking of you.) Most sane platforms won't have an issue here, since normally these initializations do not need to allocate memory. (Reportedly, even FreeBSD has plans to "fix" this in libthr2.) While here, some bugs were fixed in initialization & teardown. The fallback code is properly tested with dedicated test cases.
* Convert duration to usec.Garrett D'Amore2017-08-14
|
* REP drops peers a little too aggressively.Garrett D'Amore2017-08-14
| | | | | | | | We noticed that certain failure modes were exposed in tests that were caused by us closing the underlying pipe when certain messaging errors occurred. Discarding the pipe is the wrong answer; instead we should discard the message and keep the pipe open (unless the message is so malformed that the remote party cannot be trusted.)
* Fix leaks found in pairv1 test suite.Garrett D'Amore2017-08-11
|
* Fail to another stream on default (no pipe requested).Garrett D'Amore2017-08-11
|
* Fix TTL handling for pair1 protocol; more tests.Garrett D'Amore2017-08-11
|
* Test support for pairv1 including polyamorous mode.Garrett D'Amore2017-08-11
|
* Unify the msg API.Garrett D'Amore2017-08-10
| | | | | | | | | | | | | This makes the operations that work on headers start with nni_msg_header or nng_msg_header. It also renames _trunc to _chop (same strlen as _trim), and renames prepend to insert. We add a shorthand for clearing message content, and make better use of the endian safe 32-bit accessors too. This also fixes a bug in inserting large headers into messages. A test suite for message handling is included.
* Add new PAIR_V1 protocol.Garrett D'Amore2017-08-10
| | | | | | | | | | The PAIR_V1 protocol supports both raw and cooked modes, and has loop prevention included. It also has a polyamorous mode, wherein it allows multiple connections to be established. In polyamorous mode (set by an option), the sender requests a paritcular pipe by setting it on the message. We default to PAIR_V1 now.
* fixes #44 open protocol by "name" (symbol) instead numberGarrett D'Amore2017-08-09
| | | | | | | | | | | | | | fixes #38 Make protocols "pluggable", or at least optional This is a breaking change, as we've done away with the central registered list of protocols, and instead demand the user call nng_xxx_open() where xxx is a protocol name. (We did keep a table around in the compat framework though.) There is a nice way for protocols to plug in via an nni_proto_open(), where they can use a generic constructor that they use to build a protocol specific constructor (passing their ops vector in.)