aboutsummaryrefslogtreecommitdiff
path: root/src/protocol/reqrep0
Commit message (Collapse)AuthorAge
* fixes #1345 Restructure the source treeGarrett D'Amore2021-01-01
| | | | | This is not quite complete, but it sets the stage for other protocols (such as zmq or mqtt) to be added to the project.
* fixes #1386 remove NNI_PROTO_FLAG_NOMSGQGarrett D'Amore2020-12-27
|
* New NUTS test framework (NNG Unit Test Support).Garrett D'Amore2020-11-23
| | | | | | | | | | | | | This is based on testutil/acutest, but is cleaner and fixes some short-comings. We will be adding more support for additional common paradigms to better facilitate transport tests. While here we added some more test cases, and fixed a possible symbol collision in the the stats framework (due to Linux use of a macro definition of "si_value" in a standard OS header). Test coverage may regress slightly as we are no longer using some of the legacy APIs.
* Work for test refactoring.Garrett D'Amore2020-11-18
| | | | | | | | | | | | | | | | | | | | | | There are a few major areas in this change. * CMake options are now located in a common cmake/NNGOptions.cmake file. This should make it easier for folks to figure out what the options are, and how they are used. * Tests are now scoped with their directory name, which should avoid possible name collisions with test names. * A number of tests have been either moved or incorporated into the newer testutil/acutest framework. We are moving away from my old c-convey framework to something easier to debug. * We use CMake directories a bit more extensively leading to a much cleaner CMake structure. It's not complete, but a big step in the right direction, and a preview of future work. * Tests are now run with verbose flags, so we get more test results in the CI/CD logs.
* Minor spelling tweaks for the aio framework.Garrett D'Amore2020-11-09
|
* Add callback form of receive test for rep.Garrett D'Amore2020-11-09
|
* Add test case for close of socket with pending REP.Garrett D'Amore2020-11-08
|
* fixes #1300 REQ should fast fail if retry timer is zero (on disconnect)Garrett D'Amore2020-10-31
|
* fixes #1304 Non-blocking send on rep sockets always fails (#1305)Garrett D'Amore2020-10-25
|
* fixes #1289 zerotier should have it's own copy of the id hashing codeGarrett D'Amore2020-08-16
| | | | | | | | | | | fixes #1288 id allocation can overallocate fixes #1126 consider removing lock from idhash This substantially refactors the id hash code, giving a cleaner API, and eliminating a extra locking as well as some wasteful allocations. The ZeroTier code has it's own copy, that is 64-bit friendly, as the rest of the consumers need only a simpler 32-bit API.
* fixes #1241 SIGSEGV in RepReq's rep0 recv - use after freeGarrett D'Amore2020-05-25
| | | | | This also affects the respondent protocol. Examination of the other protocols did not turn up any evidence of the same issue.
* fixes #1171 message header could be inlined in the message structureGarrett D'Amore2020-02-26
| | | | | This uses a maximum 64-byte header and should avoid allocations and cache misses, leading to a small performance boost overall.
* Survey test rewrite.Garrett D'Amore2020-02-05
| | | | | This bumps the coverage for survey up. While here fixed a few nits in req test, and removed the now pointless legacy survey and respond tests.
* fixes #1169 survey and xsurvey could use message cloningGarrett D'Amore2020-01-20
| | | | | | | | | | | fixes #1160 Consider limiting maximum hop count to 15 fixes #1098 Maximum maxTTL should be compile time defined This doesn't expose the max-MaxTTL in the CMakeList.txt -- there is really no reason anyone should be changing it. This does not yet inline the message header into the nni_msg_t, but it is my intention to do so soon, and eliminate most of the conditional cases for failure on inserting into the header.
* fixes #1162 req_test needs to have better retry testsGarrett D'Amore2020-01-20
|
* fixes #1163 compat tests are very brittleGarrett D'Amore2020-01-20
| | | | | | | This only addresses the newly rewitten compat_tcp test, but it sets the groundwork for the other tests, so that when they are updated to the new acutest.h they can use the new marry code to establish connections cleanly and safely.
* fixes #1156 Message cloning could help reduce copies a lotGarrett D'Amore2020-01-20
| | | | | | | | | | | | This introduces reference counting on messages to reduce the data copies. This should have a marked improvement when moving large messages through the system, or when publishing to many subscribers. For some transports, when using large messages, the copy time can be the dominant factor. Note that when a message is actually shared, inproc will still perform an extra copy in order to ensure that it can modify the headers. This will unfortunately always be the case with REQ, as the REQ protocol keeps a copy of the original message so it can retry.
* fixes #1154 perf programs could use options to select different protocolsGarrett D'Amore2020-01-19
| | | | | | | | | | | This allows --reqrep0, --pair0, --pair1, and --bus0 to be used with the inproc_lat, and also inproc_thr (though only with pair options for that for now). It also introduced --url for both programs to support testing over different transports. Also, we no longer pass out the header for REQ reply -- that is an error, and led to some unfortunate failures when reusing the message.
* fixes #1142 raw mode use of message headers is inconsistentGarrett D'Amore2020-01-18
| | | | | | | | This correctly moves the entire protocol header for XREQ and XRESPONDENT protocols to the message header (not the body). This is where it should always have been. There is some small chance that applications which were coded to parse the header from the body will break. We don't think there are any such applications in use.
* TTL reads could be fewer.Garrett D'Amore2020-01-18
| | | | | | | Specifically, we don't need to read the atomic value each loop iteration. We can just get it when a message is first received, and then use that value. This should make receiving through proxies a little more efficient.
* Give time for REQ backpressure to ease in test suite.Garrett D'Amore2020-01-13
|
* Test coverage improvements for REQ/REP.Garrett D'Amore2020-01-12
| | | | | | | | This also fixes a possible bug if mixing poll file descriptors and contexts on the same socket. Most folks are unlikely to ever run into this bug. At this point the REQ/REP coverage is nearly complete (over 95%).
* XREQ and others race on TTL.Garrett D'Amore2020-01-11
| | | | | | | | | | The TTL in these cases should have been atomic. To facilitate things we actually introduce an atomic int for convenience. We also introduce a convenience nni_msg_must_append_u32() and nni_msg_header_must_append_u32(), so that we can eliminate some failure tests that cannot ever happen. Combined with a new test for xreq, we have 100% coverage for xreq and more coverage for the other REQ/REP protocols.
* fixes #1124 data race on ttl in xrepGarrett D'Amore2020-01-09
|
* fixes #1094 Consider in-lining task and aioGarrett D'Amore2020-01-08
| | | | | This only does it for rep, but it also has changes that should increase the overall test coverage for the REP protocol
* fixes #1105 pollable can be inlined, and use atomicsGarrett D'Amore2020-01-04
| | | | | This also introduces an nni_atomic_cas64 to help with lock-free designs. Some mechanical renaming was done in some of the protocols for spelling.
* fixes #1104 move allocation of protocol objects to common coreGarrett D'Amore2020-01-03
| | | | fixes #1103 respondent could inline backtrace
* More REP protocol coverage tests.Garrett D'Amore2020-01-02
|
* fixes #1099 Should inline backtrace buffer in REPGarrett D'Amore2020-01-02
|
* fixes #1088 REP protocol does not signal SENDFD properlyGarrett D'Amore2020-01-01
| | | | | We've also added some TEST_NNG_SEND_STR and TEST_NNG_RECV_STR to help with convenience when writing test code.
* fixes #1081 Use after free possible in statsGarrett D'Amore2020-01-01
| | | | | | | fixes #1080 Desire better way to access statistics for NNG objects We've also added a test that uses some of this, in order to verify that the req protocol rejects invalid peers.
* fixes #1065 resolver leaks work structuresGarrett D'Amore2019-12-29
| | | | | | This includes changes to support setting the sanitizer *correctly* (the old code CMake stuff didn't quite get it right), and addresses a number of failures in the test code found by the address sanitizer.
* fixes #1057 reqpoll test fails (bad test logic) sometimesGarrett D'Amore2019-12-27
| | | | The reqpoll test is now moved into the common req/rep logic.
* fixes #1040 Convert rest of the protocols to new CMake infraGarrett D'Amore2019-12-25
|
* fixes #1032 Figure out Darwin bustednessGarrett D'Amore2019-12-24
| | | | | | | | | | | | | | | | | | | | | | | | fixes #1035 Convey is awkward -- consider acutest.h This represents a rather large effort towards cleaning up our testing and optional configuration infrastructure. A separate test library is built by default, which is static, and includes some useful utilities design to make it easier to write shorter and more robust (not timing dependent) tests. This also means that we can cover pretty nearly all the tests (protocols etc.) in every case, even if the shipped image will be minimized. Subsystems which are optional can now use a few new macros to configure what they need see nng_sources_if, nng_headers_if, and nng_defines_if. This goes a long way to making the distributed CMakefiles a lot simpler. Additionally, tests for different parts of the tree can now be located outside of the tests/ tree, so that they can be placed next to the code that they are testing. Beyond the enabling work, the work has only begun, but these changes have resolved the most often failing tests for Darwin in the cloud.
* fixes #857 NNG_OPT_REQ_RESENDTIME does not honor NNG_DURATION_INFINITEGarrett D'Amore2019-02-17
|
* fixes #871 panic when sharing rep between threadsGarrett D'Amore2019-02-17
|
* fixes #831 Unify option structures, o_type is unusedGarrett D'Amore2018-12-29
|
* move all public headers to include/nng/ folderGregor Burger2018-11-22
| | | | | | | | | | This change makes embedding nng + nggpp (or other projects depending on nng) in cmake easier. The header files are moved to a separate include directory. This also makes installation of the headers easier, and allows clearer identification of private vs public heade files. Some additional cleanups were performed by @gedamore, but the main credit for this change belongs with @gregorburger.
* fixes #577 target library dependencies should be publicGarrett D'Amore2018-11-05
| | | | | | | | | | | This is a significant refactor of the library configuration. We use the modern package configuration helper, with a template script that also does the find_package dance for any of our dependencies. We also have restructured the code so that most protocols and transports have their configuration isolated to their own CMakeLists file, reducing the size of the global CMakeLists file.
* fixes #664 aio cancellation could be betterGarrett D'Amore2018-08-20
| | | | | | | | | This changes the signature of the aio cancellation routines to take the argument for cancellation directly, so we do not need to lookup the argument using the nni_aio_get_prov_data. We should probably consider eliminating nni_aio_get_prov_data, and co, and changing the prov_extra to reflect prov_data. Later.
* fixes #648 REQ protocol can hang on closeGarrett D'Amore2018-08-14
| | | | | | | | | | | Actually the problem was in socket core, in particular in the shutdown code. The socket shutdown is supposed to ensure that no pipes were present on the socket, so that protocols need not concern themselves with this. The code unfortunately was busted, due to an ordering problem compounded by a race condition. This fixes that, and changes the REQ protocol to avoid the blocking condition altogether, and sprinkles a few assertions to validate these rules are being adhered to.
* fixes #628 Hang in closing REQGarrett D'Amore2018-08-07
| | | | | | This adds a proper boolean condition for the pipe being closed (removing the unused sending flag), and adds checks for both the pipe closed and the socket closed flags at key points.
* fixes #568 Want a single reader/write lock on socket child objectsGarrett D'Amore2018-07-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | fixes #170 Make more use of reaper This is a complete restructure/rethink of how child objects interact with the socket. (This also backs out #576 as it turns out not to be needed.) While 568 says reader/writer lock, for now we have settled for a single writer lock. Its likely that this is sufficient. Essentially we use the single socket lock to guard lists of the socket children. We also use deferred deletion in the idhash to facilitate teardown, which means endpoint closes are no longer synchronous. We use the reaper to clean up objects when the reference count drops to zero. We make a special exception for pipes, since they really are not reference counted by their parents, and they are leaf objects anyway. We believe this addresses the main outstanding race conditions in a much more correct and holistic way. Note that endpoint shutdown is a little tricky, as it makes use of atomic flags to guard against double entry, and against recursive lock entry. This is something that would be nice to make a bit more obvious, but what we have is safe, and the complexity is at least confined to one place.
* fixes #572 Several locking errors foundGarrett D'Amore2018-07-03
| | | | | | | | | | fixes #573 atomic flags could help This introduces a new atomic flag, and reduces some of the global locking. The lock refactoring work is not yet complete, but this is a positive step forward, and should help with certain things. While here we also fixed a compile warning due to incorrect types.
* fixes #540 nni_ep_opttype serves no purposeGarrett D'Amore2018-06-13
| | | | | | | | | | | | fixes #538 setopt should have an explicit chkopt routine fixes #537 Internal TCP API needs better name separation fixes #524 Option types should be "typed" This is a rework of the option management code, to make it both clearer and to prepare for further work to break up endpoints. This reduces a certain amount of dead or redundant code, and actually saves cycles when setting options, as some loops were not terminated that should have been.
* fixes #484 crashes in websocket transportGarrett D'Amore2018-05-29
| | | | | | | | | | | | | | | | | | | | | | | fixes #490 posix_epdesc use-after-free bug fixes #489 Sanitizer based testing would help fixes #492 Numerous memory leaks found with sanitizer This introduces support for compiler-based sanitizers when using clang or gcc (and not on Windows). See NNG_SANITIZER for possible settings such as "thread" or "address". Furthermore, we have fixed the issues we found with both the thread and address sanitizers. We believe that the thread issues pointed to a low frequency use-after-free responsible for rare crashes in some of the tests. The tests generally have their timeouts doubled when running under a sanitizer, to account for the extra long times that the sanitizer can cause these to take. While here, we also changed the compat_ws test to avoid a particularly painful and time consuming DNS lookup, and we made the nngcat_unlimited test a bit more robust by waiting before sending traffic.
* fixes #449 Want more flexible pipe eventsGarrett D'Amore2018-05-17
| | | | | | | | | This changes the signature of nng_pipe_notify(), and the associated events. The documentation is updated to reflect this. We have also broken the lock up so that we don't hold the master socket lock for some of these things, which may have beneficial impact on performance.
* fixes #419 want to nni_aio_stop without blocking (#428)Garrett D'Amore2018-05-15
| | | | | | | | | | | | | | | | * fixes #419 want to nni_aio_stop without blocking This actually introduces an nni_aio_close() API that causes nni_aio_begin to return NNG_ECLOSED, while scheduling a callback on the AIO to do an NNG_ECLOSED as well. This should be called in non-blocking close() contexts instead of nni_aio_stop(), and the cases where we call nni_aio_fini() multiple times are updated updated to add nni_aio_stop() calls on all "interlinked" aios before finalizing them. Furthermore, we call nni_aio_close() as soon as practical in the close path. This closes an annoying race condition where the callback from a lower subsystem could wind up rescheduling an operation that we wanted to abort.
* fixes #352 aio lock is burning hotGarrett D'Amore2018-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fixes #326 consider nni_taskq_exec_synch() fixes #410 kqueue implementation could be smarter fixes #411 epoll_implementation could be smarter fixes #426 synchronous completion can lead to panic fixes #421 pipe close race condition/duplicate destroy This is a major refactoring of two significant parts of the code base, which are closely interrelated. First the aio and taskq framework have undergone a number of simplifications, and improvements. We have ditched a few parts of the internal API (for example tasks no longer support cancellation) that weren't terribly useful but added a lot of complexity, and we've made aio_schedule something that now checks for cancellation or other "premature" completions. The aio framework now uses the tasks more tightly, so that aio wait can devolve into just nni_task_wait(). We did have to add a "task_prep()" step to prevent race conditions. Second, the entire POSIX poller framework has been simplified, and made more robust, and more scalable. There were some fairly inherent race conditions around the shutdown/close code, where we *thought* we were synchronizing against the other thread, but weren't doing so adequately. With a cleaner design, we've been able to tighten up the implementation to remove these race conditions, while substantially reducing the chance for lock contention, thereby improving scalability. The illumos poller also got a performance boost by polling for multiple events. In highly "busy" systems, we expect to see vast reductions in lock contention, and therefore greater scalability, in addition to overall improved reliability. One area where we currently can do better is that there is still only a single poller thread run. Scaling this out is a task that has to be done differently for each poller, and carefuly to ensure that close conditions are safe on all pollers, and that no chance for deadlock/livelock waiting for pfd finalizers can occur.