| Commit message (Collapse) | Author | Age |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fixes #596 POSIX IPC should move away from pipedesc/epdesc
fixes #598 TLS and TCP listeners could support NNG_OPT_LOCADDR
fixes #594 Windows IPC should use "new style" win_io code.
fixes #597 macOS could support PEER PID
This large change set cleans up the IPC support on Windows and
POSIX. This has the beneficial impact of significantly reducing
the complexity of the code, reducing locking, increasing
concurrency (multiple dial and accepts can be outstanding now),
reducing context switches (we complete thins synchronously now).
While here we have added some missing option support, and fixed a
few more bugs that we found in the TCP code changes from last week.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fixes #326 consider nni_taskq_exec_synch()
fixes #410 kqueue implementation could be smarter
fixes #411 epoll_implementation could be smarter
fixes #426 synchronous completion can lead to panic
fixes #421 pipe close race condition/duplicate destroy
This is a major refactoring of two significant parts of the code base,
which are closely interrelated.
First the aio and taskq framework have undergone a number of simplifications,
and improvements. We have ditched a few parts of the internal API (for
example tasks no longer support cancellation) that weren't terribly useful
but added a lot of complexity, and we've made aio_schedule something that
now checks for cancellation or other "premature" completions. The
aio framework now uses the tasks more tightly, so that aio wait can
devolve into just nni_task_wait(). We did have to add a "task_prep()"
step to prevent race conditions.
Second, the entire POSIX poller framework has been simplified, and made
more robust, and more scalable. There were some fairly inherent race
conditions around the shutdown/close code, where we *thought* we were
synchronizing against the other thread, but weren't doing so adequately.
With a cleaner design, we've been able to tighten up the implementation
to remove these race conditions, while substantially reducing the chance
for lock contention, thereby improving scalability. The illumos poller
also got a performance boost by polling for multiple events.
In highly "busy" systems, we expect to see vast reductions in lock
contention, and therefore greater scalability, in addition to overall
improved reliability.
One area where we currently can do better is that there is still only
a single poller thread run. Scaling this out is a task that has to be done
differently for each poller, and carefuly to ensure that close conditions
are safe on all pollers, and that no chance for deadlock/livelock waiting
for pfd finalizers can occur.
|
| |
|
|
|
|
|
|
| |
This replaces the epoll support with proper illumos/SunOS port
events. The port event support is structured so that it actually
is superior to epoll and kqueue, because it avoids a single master
lock on the poller. In the future we will explore this for macOS
and Linux pollers.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fixes #397 Need to cast zoneid
fixes #395 sun is predefined on illumos/Solaris
fixes #394 alloca needs to #include <alloca.h>
fixes #399 Cannot use SVR4.2 specific msghdr
fixes #402 getpeerucred needs a NULL initialized ucred
fixes #403 syntax error in posix_tcp - attempt to return void
fixes #407 illumos getegid wrong
fixes #406 nni_idhash_count is dead code
fixes #404 idhash typedef redeclared
fixes #405 warning: newline not last character in file
This is basically a slew of related bug fixes required to make this
work on illumos. Note that the fixes are not "complete", because
more work is required to support port events given that epoll is busted
on illumos.
We also fixed a bunch of things that aren't actually "bugs" per se, but
really just warnings. Silencing them makes things better for everyone.
Apparently not all compilers are equally happy with redundant (but
otherwise identical) typedefs; we use structs in some places instead of
shorter type names to silence these complaints.
Note that IPC permissions (the mode bits on the socket vnode) are not
validated on SunOS systems. This change includes documentation to reflect
that.
|
| |
|
|
|
| |
We offer uid, gid, process id, and even zone id where we have them.
Docs and tests are provided.
|
| |
|
|
| |
fixes #106 TCP keepalive tuning
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This closes a fundamental flaw in the way aio structures were
handled. In paticular, aio expiration could race ahead, and
fire before the aio was properly registered by the provider.
This ultimately led to the possibility of duplicate completions
on the same aio.
The solution involved breaking up nni_aio_start into two functions.
nni_aio_begin (which can be run outside of external locks) simply
validates that nni_aio_fini() has not been called, and clears certain
fields in the aio to make it ready for use by the provider.
nni_aio_schedule does the work to register the aio with the expiration
thread, and should only be called when the aio is actually scheduled
for asynchronous completion. nni_aio_schedule_verify does the same thing,
but returns NNG_ETIMEDOUT if the aio has a zero length timeout.
This change has a small negative performance impact. We have plans to
rectify that by converting nni_aio_begin to use a locklesss flag for
the aio->a_fini bit.
While we were here, we fixed some error paths in the POSIX subsystem,
which would have returned incorrect error codes, and we made some
optmizations in the message queues to reduce conditionals while holding
locks in the hot code path.
|
| | |
|
| |
|
|
| |
Turns out that shutdown is sufficient for most needs.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was possible for pollq arm to be called on a node that was removed
in some circumstances -- particularly and ep that was closed in the
callback.
While here, lets use normal booleans for closed state, and only call
the arm function (which is not free -- typicall it involves a mutex
and may even involve a system call) if we are going to arm some events.
We also initialize these things properly, and clean up a stale comment.
This work is done to faciliate the kqueue work by @liamstask.
|
| |
|
|
|
|
|
|
|
|
| |
This change is being made to facilitate the work done for the
kqueue port. We have created two new functions, nni_posix_pollq_init
and nni_posix_pollq_fini, which can be used when creating or destroying
the pollq nodes. Then nodes are *added* and *removed* from the pollq
structure with nni_posix_pollq_add and nni_posix_pollq_remove. The
add function in particular MUST NEVER be called unless the node has
a valid file descriptor.
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This introduces enough of the HTTP API to support fully server
applications, including creation of websocket style protocols,
pluggable handlers, and so forth.
We have also introduced scatter/gather I/O (rudimentary) for
aios, and made other enhancements to the AIO framework. The
internals of the AIOs themselves are now fully private, and we
have eliminated the aio->a_addr member, with plans to remove the
pipe and possibly message members as well.
A few other minor issues were found and fixed as well.
The HTTP API includes request, response, and connection objects,
which can be used with both servers and clients. It also defines
the HTTP server and handler objects, which support server applications.
Support for client applications will require a client object to be
exposed, and that should be happening shortly.
None of this is "documented" yet, bug again, we will follow up shortly.
|
| |
|
|
| |
fixes #155 POSIX TCP & IPC could avoid a lot of context switches
|
| | |
|
| | |
|
| |
|
|
|
|
|
|
|
| |
Added TCP socket address properties on pipes.
This adds the plumbing for the various platform specifics, and
includes both v4 and v6 handling.
We've included a TCPv6 test as well.
|
| |
|
|
|
|
|
|
|
|
| |
We only compile files that are appropriate for the platform. (We
still have guards in place, to allow for a future single .C file
to be built from all the sources.) We also remove the subsystem defines;
if a new platform needs to deviate from POSIX in ways beyond what we
intended here, then that platform should just copy those parts into
a new platform directory, rather than cross including portions from
POSIX.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the underlying platform fails (FreeBSD is the only one I'm aware
of that does this!), we use a global lock or condition variable instead.
This means that our lock initializers never ever fail.
Probably we could eliminate most of this for Linux and Darwin, since
on those platforms, mutex and condvar initialization reasonably never
fails. Initial benchmarks show little difference either way -- so we
can revisit (optimize) later.
This removes a lot of otherwise untested code in error cases and so forth,
improving coverage and resilience in the face of allocation failures.
Platforms other than POSIX should follow a similar pattern if they need
this. (VxWorks, I'm thinking of you.) Most sane platforms won't have
an issue here, since normally these initializations do not need to allocate
memory. (Reportedly, even FreeBSD has plans to "fix" this in libthr2.)
While here, some bugs were fixed in initialization & teardown.
The fallback code is properly tested with dedicated test cases.
|
| |
|
|
|
|
|
|
|
| |
This passes valgrind 100% clean for both helgrind and deep leak
checks. This represents a complete rethink of how the AIOs work,
and much simpler synchronization; the provider API is a bit simpler
to boot, as a number of failure modes have been simply eliminated.
While here a few other minor bugs were squashed.
|
| | |
|
| |
|
|
|
|
|
|
|
| |
We need to remember that protocol stops can run synchronously, and
therefore we need to wait for the aio to complete. Further, we need
to break apart shutting down aio activity from deallocation, as we need
to shut down *all* async activity before deallocating *anything*.
Noticed that we had a pipe race in the surveyor pattern too.
|
| |
|
|
|
|
| |
Apparently there are circumstances when a pipedesc may get orphaned form the
pollq. This triggers an assertion failure when it occurs. I am still
trying to understand how this can occur. Stay tuned.
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It turns out that I had to fix a number of subtle asynchronous
handling bugs, but now TCP is fully asynchronous. We need to
change the high-level dial and listen interfaces to be async
as well.
Some of the transport APIs have changed here, and I've elected
to change what we expose to consumers as endpoints into seperate
dialers and listeners. Under the hood they are the same, but
it turns out that its helpful to know the intended use of the
endpoint at initialization time.
Scalability still occasionally hangs on Linux. Investigation
pending.
|
| | |
|
| |
|