| Commit message (Collapse) | Author | Age |
| |
|
|
|
| |
This is not quite complete, but it sets the stage for other
protocols (such as zmq or mqtt) to be added to the project.
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are a few major areas in this change.
* CMake options are now located in a common cmake/NNGOptions.cmake
file. This should make it easier for folks to figure out what
the options are, and how they are used.
* Tests are now scoped with their directory name, which should
avoid possible name collisions with test names.
* A number of tests have been either moved or incorporated into
the newer testutil/acutest framework. We are moving away from
my old c-convey framework to something easier to debug.
* We use CMake directories a bit more extensively leading to a much
cleaner CMake structure. It's not complete, but a big step in the
right direction, and a preview of future work.
* Tests are now run with verbose flags, so we get more test results
in the CI/CD logs.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a sweeping cleanup of the transport logic around options,
and also harmonizes the names used when setting or getting options.
Additionally, legacy methods are now moved into a separate file and
can be elided via CMake or a preprocessor define.
Fundamentally, the ability to set to transport options via the socket
is deprecated; there are numerous problems with this and my earlier
approaches to deal with this have been somewhat misguided. Further
these approaches will not work with future protocol work that is
planned (were some options need to be negotiated with peers at the
time of connection establishment.)
Documentation has been updated to reflect this. The test suites still
make rather broad use of the older APIs, and will be converted later.
|
| |
|
|
|
|
| |
This should reduce the amount of copying, and the overall size
used by pipes and other objects quite a bit. (On my system, the
sizeof nni_pipe shrank by 400 bytes, for example.)
|
| | |
|
| |
|
|
|
|
|
|
|
|
| |
fixes #1317 IPv6 listener get port is incorrect
fixes #1319 Want symbolic service names
This is phase 1 of reducing the memory foot-print of aios, and
also of pipes. This removes the largest consumer the socket
address information, from the aio, which was only used by a few
consumers.
|
| |
|
|
|
| |
Also, addressed a number of Clang-tidy complaints. Potential hangs
in close addressed as well.
|
| |
|
|
|
|
|
|
| |
This introduces support for an external wolfSSL plugin, and generally
creates the framework for pluggable TLS implementations.
The wolfSSL engine is provided via an external module (git submodule),
available either under a GPLv3 license or a commercial license.
|
| |
|
|
|
| |
This only does it for rep, but it also has changes that should increase
the overall test coverage for the REP protocol
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fixes #698 Need TCP stats
fixes #699 Need IPC stats
fixes #701 Need TLS stats
This commit addresses a problem when negotiating using one of the stream
based negotiation APIs -- a slow or misbehaving peer can prevent well
behaved ones from establishing a connection. The fix is a fairly
significant change in how these transports link up, and it does rely
on the fact that the socket only has a single accept() or connect()
pending at a time (on a given endpoint that is).
While here, we have completely revamped the way transport statistics are
done, offering a standard API for collecting these statistics.
Unfortunately, this completely borks the statistics for inproc. As we
are planning to change the way inproc works soon, in order to provide
more control and work on performance fixes for the message queue, we feel
this is an acceptable trade-off. Furthermore, almost nobody uses inproc
for anything, and even fewer people are making use of the statistics
at this time.
|
| |
|
|
|
|
| |
We also have made some support changes, including new APIs for printing
URLs, and some improvements to the NNG_OPT_URL to make use of this new
property.
|
| |
|
|
|
|
|
|
|
| |
This is a major change, and includes changes to use a polymorphic
stream API for all transports. There have been related bugs fixed
along the way. Additionally the man pages have changed.
The old non-polymorphic APIs are removed now. This is a breaking
change, but the old APIs were never part of any released public API.
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This changes much of the internal API for TCP option handling, and
includes hooks for some of this in various consumers. Note that the
consumers still need to have additional work done to complete them,
which will be part of providing public "raw" TLS and WebSocket APIs.
We would also like to finish addressing the call sites of
nni_tcp_listener_start() that assume the sockaddr is modified --
it would be superior to use the NNG_OPT_LOCADDR option. Thaat will be
addressed in a follow up PR.
|
| | |
|
| |
|
|
|
|
|
|
|
|
| |
This change makes embedding nng + nggpp (or other projects depending on
nng) in cmake easier. The header files are moved to a separate include
directory. This also makes installation of the headers easier, and
allows clearer identification of private vs public heade files.
Some additional cleanups were performed by @gedamore, but the main
credit for this change belongs with @gregorburger.
|
| |
|
|
| |
fixes #776 Configuration of mbedTLS should warn about license
|
| |
|
|
|
|
|
|
|
|
|
| |
This is a significant refactor of the library configuration.
We use the modern package configuration helper, with a template
script that also does the find_package dance for any of our
dependencies.
We also have restructured the code so that most protocols and
transports have their configuration isolated to their own CMakeLists
file, reducing the size of the global CMakeLists file.
|
| | |
|
| | |
|
| |
|
|
|
|
|
|
|
| |
This changes the signature of the aio cancellation routines
to take the argument for cancellation directly, so we do not
need to lookup the argument using the nni_aio_get_prov_data.
We should probably consider eliminating nni_aio_get_prov_data,
and co, and changing the prov_extra to reflect prov_data. Later.
|
| | |
|
| |
|
|
|
| |
This changeset needs work. We are seeing errors described by
This reverts commit d7f7c896c0ede24249ef63b1e45b1878bf4bd473.
|
| |
|
|
|
|
|
|
|
|
| |
fixes #208 pipe start should occur before connect / accept
fixes #616 Race condition closing between header & body
This refactors the transports to handle their own connection
handshaking before passing the pipe to the socket. This
changes and simplifies the setup. This also fixes a rather
challenging race condition described by #616.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fixes #596 POSIX IPC should move away from pipedesc/epdesc
fixes #598 TLS and TCP listeners could support NNG_OPT_LOCADDR
fixes #594 Windows IPC should use "new style" win_io code.
fixes #597 macOS could support PEER PID
This large change set cleans up the IPC support on Windows and
POSIX. This has the beneficial impact of significantly reducing
the complexity of the code, reducing locking, increasing
concurrency (multiple dial and accepts can be outstanding now),
reducing context switches (we complete thins synchronously now).
While here we have added some missing option support, and fixed a
few more bugs that we found in the TCP code changes from last week.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fixes #179 DNS resolution should be done at connect time
fixes #586 Windows IO completion port work could be better
fixes #339 Windows iocp could use synchronous completions
fixes #280 TCP abstraction improvements
This is a rather monstrous set of changes, which refactors TCP, and
the underlying Windows I/O completion path logic, in order to obtain
a cleaner, simpler API, with support for asynchronous DNS lookups performed
on connect rather than initialization time, the ability to have multiple
connects or accepts pending, as well as fewer extraneous function calls.
The Windows code also benefits from greatly reduced context switching,
fewer lock operations performed, and a reduced number of system calls
on the hot code path. (We use automatic event resetting instead of manual.)
Some dead code was removed as well, and a few potential edge case leaks
on failure paths (in the websocket code) were plugged.
Note that all TCP based transports benefit from this work. The IPC code
on Windows still uses the legacy IOCP for now, as does the UDP code (used
for ZeroTier.) We will be converting those soon too.
|
| |
|
|
|
|
|
|
|
|
| |
fixes #573 atomic flags could help
This introduces a new atomic flag, and reduces some of the global
locking. The lock refactoring work is not yet complete, but this is
a positive step forward, and should help with certain things.
While here we also fixed a compile warning due to incorrect types.
|
| |
|
|
|
|
|
|
|
|
| |
This separates the plumbing for endpoints into distinct
dialer and listeners. Some of the transports could benefit
from further separation, but we've done some rather larger
separation e.g. for the websocket transport.
IPC would be a good one to update later, when we start looking
at exposing a more natural underlying API.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
fixes #538 setopt should have an explicit chkopt routine
fixes #537 Internal TCP API needs better name separation
fixes #524 Option types should be "typed"
This is a rework of the option management code, to make it both clearer
and to prepare for further work to break up endpoints. This reduces
a certain amount of dead or redundant code, and actually saves cycles
when setting options, as some loops were not terminated that should have
been.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fixes #490 posix_epdesc use-after-free bug
fixes #489 Sanitizer based testing would help
fixes #492 Numerous memory leaks found with sanitizer
This introduces support for compiler-based sanitizers when using
clang or gcc (and not on Windows). See NNG_SANITIZER for possible
settings such as "thread" or "address".
Furthermore, we have fixed the issues we found with both the
thread and address sanitizers. We believe that the thread issues
pointed to a low frequency use-after-free responsible for rare
crashes in some of the tests.
The tests generally have their timeouts doubled when running under
a sanitizer, to account for the extra long times that the sanitizer
can cause these to take.
While here, we also changed the compat_ws test to avoid a particularly
painful and time consuming DNS lookup, and we made the nngcat_unlimited
test a bit more robust by waiting before sending traffic.
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
fixes #429 async websocket reap leads to crash
This tightens up the code for shutdown, ensuring that transport
callbacks are completely stopped before advancing to the next step
of teardown of transport pipes or endpoints.
It also fixes a problem where task_wait would sometimes get "stuck"
as tasks transitioned between asynch and synchronous completions.
Finally, it saves a few cycles by only calling a cancellation callback
once during cancellation of an aio.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* fixes #419 want to nni_aio_stop without blocking
This actually introduces an nni_aio_close() API that causes
nni_aio_begin to return NNG_ECLOSED, while scheduling a callback
on the AIO to do an NNG_ECLOSED as well. This should be called
in non-blocking close() contexts instead of nni_aio_stop(), and
the cases where we call nni_aio_fini() multiple times are updated
updated to add nni_aio_stop() calls on all "interlinked" aios before
finalizing them.
Furthermore, we call nni_aio_close() as soon as practical in the
close path. This closes an annoying race condition where the
callback from a lower subsystem could wind up rescheduling an
operation that we wanted to abort.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fixes #326 consider nni_taskq_exec_synch()
fixes #410 kqueue implementation could be smarter
fixes #411 epoll_implementation could be smarter
fixes #426 synchronous completion can lead to panic
fixes #421 pipe close race condition/duplicate destroy
This is a major refactoring of two significant parts of the code base,
which are closely interrelated.
First the aio and taskq framework have undergone a number of simplifications,
and improvements. We have ditched a few parts of the internal API (for
example tasks no longer support cancellation) that weren't terribly useful
but added a lot of complexity, and we've made aio_schedule something that
now checks for cancellation or other "premature" completions. The
aio framework now uses the tasks more tightly, so that aio wait can
devolve into just nni_task_wait(). We did have to add a "task_prep()"
step to prevent race conditions.
Second, the entire POSIX poller framework has been simplified, and made
more robust, and more scalable. There were some fairly inherent race
conditions around the shutdown/close code, where we *thought* we were
synchronizing against the other thread, but weren't doing so adequately.
With a cleaner design, we've been able to tighten up the implementation
to remove these race conditions, while substantially reducing the chance
for lock contention, thereby improving scalability. The illumos poller
also got a performance boost by polling for multiple events.
In highly "busy" systems, we expect to see vast reductions in lock
contention, and therefore greater scalability, in addition to overall
improved reliability.
One area where we currently can do better is that there is still only
a single poller thread run. Scaling this out is a task that has to be done
differently for each poller, and carefuly to ensure that close conditions
are safe on all pollers, and that no chance for deadlock/livelock waiting
for pfd finalizers can occur.
|
| |
|
|
|
|
|
| |
fixes #249 nngcat needs test cases
fixes #416 transports do not permit unlimited message size with 0
fixes #417 nngcat truncates input files to 4k
fixes #348 nngcat should have switch to adjust maximum receive size
|
| |
|
|
| |
fixes #106 TCP keepalive tuning
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This closes a fundamental flaw in the way aio structures were
handled. In paticular, aio expiration could race ahead, and
fire before the aio was properly registered by the provider.
This ultimately led to the possibility of duplicate completions
on the same aio.
The solution involved breaking up nni_aio_start into two functions.
nni_aio_begin (which can be run outside of external locks) simply
validates that nni_aio_fini() has not been called, and clears certain
fields in the aio to make it ready for use by the provider.
nni_aio_schedule does the work to register the aio with the expiration
thread, and should only be called when the aio is actually scheduled
for asynchronous completion. nni_aio_schedule_verify does the same thing,
but returns NNG_ETIMEDOUT if the aio has a zero length timeout.
This change has a small negative performance impact. We have plans to
rectify that by converting nni_aio_begin to use a locklesss flag for
the aio->a_fini bit.
While we were here, we fixed some error paths in the POSIX subsystem,
which would have returned incorrect error codes, and we made some
optmizations in the message queues to reduce conditionals while holding
locks in the hot code path.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ultimately, this just removes the support for lingering altogether.
Based on prior experience, lingering has always been unreliable, and
was removed in legacy libnanomsg ages ago.
The problem is that operating system support for lingering is very
inconsistent at best, and for some transports the very concept is somewhat
meaningless.
Making things worse, we were never able to adequately capture an exit()
event from another thread -- so lingering was always a false promise.
Applications that need to be sure that messages are delivered should
either include an ack in their protocol, use req/rep (which has an ack),
or inject a suitable delay of their own.
For things going over local networks, an extra delay of 100 msec should
be sufficient *most of the time*.
|
| | |
|
| |
|
|
|
|
|
|
| |
fixes #325 synchronous aio completion crash
fixes #327 move nni_clock() operations to outside the nni_aio_lk.
This work was done for the context tree, and is necessary to properly
enable that branch.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fixes #22 Consider using synchronous completions sometimes
Transport improvements for IPC, TCP, and TLS.
This change does three things.
First it permits multiple outstanding receives or sends on the transport.
This change is being made to accomodate some other changes in the protocols
where it might be advantageous to post send or receives directly against
the transport pipe without going through another level of indirection.
Second, it changes the normal completions to be performed synchronously.
This translates into a rather major performance improvement, reducing
latency by some 27%, and thereby improving performance altogether. (This
elminates two extra context switches per transaction!)
FInally, we can save some extra checks and conditions because we know
that completions cannot happen if we don't have a pending operation
(we no longer complete out of sequence), and we only call the dosend
operation when we have something to send. This can eliminate some
pipeline stalls.
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
fixes #302 nng_dialer/listener/pipe_getopt_sockaddr desired
This adds plumbing to pass and check the type of options
all the way through.
NNG_ZT_OPT_ORBIT is type UINT64, but you can use the untyped form to
pass two of them if needed.
No typed access for retrieving strings yet. I think this should allocate
a pointer and copy that out, but that's for later.
|
| |
|
|
|
|
|
|
|
|
|
| |
fixes #275 nng_pipe_getopt_ptr() missing?
fixes #285 nng_setopt_ptr MIS
fixes #297 nng_listener/dialer_close does not validate mode
This change adds some missing APIs, and changes others.
In particular, certain options are now of type bool, with size
of just one. This is a *breaking* change for code that uses those
options -- NNG_OPT_RAW, NNG_OPT_PAIR1_POLY, NNG_OPT_TLS_VERIFIED.
|
| |
|
|
| |
fixes #290 sockaddr improvements
|
| |
|
|
|
|
|
|
| |
This causes TCP, TLS, and ZT endpoints to resolve any
wildcards, and even IP addresses, when reporting the listen
URL. The dialer URL is reported unresolved. Test cases
for this are added as well, and nngcat actually reports this
if --verbose is supplied.
|
| | |
|
| |
|
|
|
|
|
|
|
| |
These are incremental updates... we avoid using install() in the
subdirectories, so that we can adapt properly to them in the
single parent directory.
We have started some of the work to improve support for CPack. This
is still not yet done, but work in progress.
|