| Commit message (Collapse) | Author | Age |
| ... | |
We need to remember that protocol stops can run synchronously, and
therefore we need to wait for the aio to complete. Further, we need
to break apart shutting down aio activity from deallocation, as we need
to shut down *all* async activity before deallocating *anything*.
Noticed that we had a pipe race in the surveyor pattern too.
|
We have seen yet another weird situation where we had an orphaned
pipe, which was caused by not completing the callback. If we are going
to run nni_aio_fini, we should still run the callback (albeit with a
return value of NNG_ECANCELED or somesuch) to be sure that we can't
orphan stuff.
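The rule above (teardown must still deliver the callback, with a cancellation status) can be sketched with a toy model. This is a single-threaded illustration only, not the library's code: every `toy_` name and the `TOY_ECANCELED` constant are hypothetical stand-ins, and the real nni_aio_fini may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical status codes; TOY_ECANCELED stands in for NNG_ECANCELED. */
enum { TOY_OK = 0, TOY_ECANCELED = 1 };

typedef struct toy_aio toy_aio;
typedef void (*toy_cb)(toy_aio *aio, int result);

struct toy_aio {
    toy_cb cb;       /* completion callback */
    void  *resource; /* e.g. a pipe travelling with the operation */
    int    pending;  /* nonzero while an operation is outstanding */
};

/* Begin an operation that carries a resource along with it. */
static void toy_aio_start(toy_aio *aio, toy_cb cb, void *resource)
{
    assert(!aio->pending);
    aio->cb       = cb;
    aio->resource = resource;
    aio->pending  = 1;
}

/* Deliver a result; the callback runs at most once per operation. */
static void toy_aio_complete(toy_aio *aio, int result)
{
    if (!aio->pending) {
        return;
    }
    aio->pending = 0;
    aio->cb(aio, result);
}

/* Teardown: an outstanding operation is not dropped silently.
 * Its callback still runs, with a cancellation status. */
static void toy_aio_fini(toy_aio *aio)
{
    toy_aio_complete(aio, TOY_ECANCELED);
}

/* Example callback: on failure it releases the resource itself,
 * so nothing is orphaned. */
static int toy_freed_count;

static void toy_pipe_cb(toy_aio *aio, int result)
{
    if (result != TOY_OK) {
        free(aio->resource);
        toy_freed_count++;
    }
    aio->resource = NULL;
}
```

Because the callback is the only place that knows how to release the resource, skipping it during teardown is what would leave the pipe orphaned.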
|
This one is caused by us deallocating the msg queue before we
stop all asynchronous I/O operations; consequently we can wind
up with a thread trying to access a msg queue after it has been
destroyed.
A lesson here is that nni_aio_fini() needs to be treated much like
nni_thr_fini() - you should do this *before* deallocating anything
that callback functions might be referencing.
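A minimal sketch of that ordering, assuming a simplified single-threaded model: all `toy_` names are hypothetical and none of this is the library's actual code. In real multi-threaded code the stop phase would also have to block until any in-flight callback has returned, much as joining a thread does.

```c
#include <assert.h>
#include <stdlib.h>

/* Shared state that callbacks read and write. */
typedef struct {
    int *slots;
} toy_msgq;

/* One source of asynchronous activity that references the queue. */
typedef struct {
    toy_msgq *mq;
    int       active; /* nonzero while a callback may still run */
    int       ran;    /* number of callback runs so far */
} toy_worker;

/* Callback body: it touches the queue, so the queue must still exist. */
static void toy_worker_cb(toy_worker *w)
{
    assert(w->mq != NULL && w->mq->slots != NULL);
    w->mq->slots[0]++;
    w->ran++;
}

/* Phase 1: stop. Flush the outstanding callback, then disable the worker. */
static void toy_worker_stop(toy_worker *w)
{
    if (w->active) {
        toy_worker_cb(w);
        w->active = 0;
    }
}

typedef struct {
    toy_msgq  *mq;
    toy_worker send;
    toy_worker recv;
} toy_sock;

static toy_sock *toy_sock_open(void)
{
    toy_sock *s = calloc(1, sizeof(*s));
    if (s == NULL) {
        return NULL;
    }
    s->mq = calloc(1, sizeof(*s->mq));
    if (s->mq != NULL) {
        s->mq->slots = calloc(4, sizeof(int));
    }
    if (s->mq == NULL || s->mq->slots == NULL) {
        free(s->mq);
        free(s);
        return NULL;
    }
    s->send.mq = s->recv.mq = s->mq;
    s->send.active = s->recv.active = 1;
    return s;
}

/* Returns how many callbacks ran during teardown. */
static int toy_sock_close(toy_sock *s)
{
    int ran;
    /* Phase 1: stop *all* asynchronous activity first. */
    toy_worker_stop(&s->send);
    toy_worker_stop(&s->recv);
    ran = s->send.ran + s->recv.ran;
    /* Phase 2: only now deallocate what the callbacks referenced. */
    free(s->mq->slots);
    free(s->mq);
    free(s);
    return ran;
}
```

Freeing the queue between the two stop calls would let the second worker's final callback touch freed memory, which is the failure described above.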
|
This resolves the orphaned pipedesc, which actually could have affected
Windows too. I think maybe we are race free. Lots more testing is
still required, but stress runs seem to be passing now.
|
Apparently there are circumstances when a pipedesc may get orphaned from the
pollq. This triggers an assertion failure when it occurs. I am still
trying to understand how this can occur. Stay tuned.
|
We have seen leaks of pipes causing test failures (e.g. the Windows
IPC test) due to EADDRINUSE. This was caused by a case where we
failed to pass the pipe up because the AIO had already been canceled,
and we didn't realize that we had orphaned the pipe. The fix is to
add a return value to nni_aio_finish, and verify that we did finish
properly; if we did not, then we must free the pipe ourselves. (The
zero return from nni_aio_finish indicates that it accepts ownership
of resources passed via the aio.)
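The ownership convention described above can be illustrated with a toy model. This is a sketch under assumptions: the `toy_` names, the struct layout and the `TOY_ECANCELED` value are hypothetical, and only the convention itself (zero means the aio took ownership) comes from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical error code for "the waiter already gave up". */
enum { TOY_ECANCELED = 1 };

typedef struct {
    int   canceled; /* set once the waiter has canceled */
    int   done;     /* set once a result has been accepted */
    void *pipe;     /* resource handed to the waiter on success */
} toy_aio;

/* Returns 0 if the aio accepted the result and now owns `pipe`.
 * Returns nonzero if it was too late; the caller still owns `pipe`. */
static int toy_aio_finish(toy_aio *aio, void *pipe)
{
    if (aio->canceled || aio->done) {
        return TOY_ECANCELED;
    }
    aio->pipe = pipe;
    aio->done = 1;
    return 0;
}

static int toy_pipes_freed;

static void toy_pipe_free(void *pipe)
{
    free(pipe);
    toy_pipes_freed++;
}

/* Accept path: check the return value, and free the pipe ourselves
 * when ownership was not accepted. */
static int toy_accept_done(toy_aio *aio, void *pipe)
{
    int rv = toy_aio_finish(aio, pipe);
    if (rv != 0) {
        toy_pipe_free(pipe);
    }
    return rv;
}
```

Ignoring the return value is what would leak the pipe and, with it, the bound address.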
|
This fixes a potential nasty bug associated with the objhash table
resizing, and rewrites the scalability test to use just a single thread
handling some 2000 client sockets. This proves that the framework can
deal with vast numbers of sockets, regardless of the supported number
of operating system threads.
|
This cleans up the pipe creation logic greatly, and eliminates
a nasty potential deadlock (incorrect lock order). It also
adds a correct binary exponential and randomized backoff on both
accept and connect.
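For readers unfamiliar with the technique, here is a minimal sketch of one common variant of binary exponential backoff with randomization: the window doubles per attempt up to a cap, and the actual delay is drawn uniformly from that window. The function names, the millisecond units and the use of `rand()` are assumptions for illustration; the message does not say which variant the change implements.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Upper bound of the backoff window for a given attempt:
 * base_ms doubled once per attempt, capped at max_ms. */
static uint32_t toy_backoff_ceiling(uint32_t base_ms, uint32_t max_ms,
                                    unsigned attempt)
{
    uint32_t ceiling = base_ms;
    while (attempt-- > 0 && ceiling < max_ms) {
        /* Double, but jump straight to the cap rather than overflow. */
        ceiling = (ceiling > max_ms / 2) ? max_ms : ceiling * 2;
    }
    return ceiling < max_ms ? ceiling : max_ms;
}

/* Randomized delay in [0, ceiling]. rand() keeps the sketch short;
 * it is not a high-quality random source. */
static uint32_t toy_backoff_delay(uint32_t base_ms, uint32_t max_ms,
                                  unsigned attempt)
{
    uint32_t ceiling = toy_backoff_ceiling(base_ms, max_ms, attempt);
    return (uint32_t) ((uint64_t) rand() % ((uint64_t) ceiling + 1));
}
```

The randomization matters when many peers fail at once: without it they would all retry on the same schedule.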
|
We closed a few subtle races in the AIO subsystem as well, and
we have now been able to eliminate the separate timer handling the MQ code.
There appear to be some opportunities to further enhance the code
for MQs as well -- eventually probably the only access to MQs will
be with AIOs.
|
Most of the races around close were probably here - the cancellation was
not getting through on endpoint close, which meant that we could actually
toss endpoints while they were in use.
We need to fix the timeout handling -- especially for reconnects etc. -- but
we are just about ready for this stuff to be reintegrated into master.
|
This actually is breaking at the moment, because we don't have
good integration with timeouts, and there are some frustrating
races with timeouts at points that can cause apparent hangs.
|
This logic leaves a race condition in the dial side, which will
be fixed with a subsequent change to convert that to fully asynchronous
as well.
|
This means that pipe_start always succeeds, and we can guarantee that
the pipe_start_cb is always executed, and in another context. This may help
when we need to change the way that sockets and endpoints are associated.
|
This is only lightly tested, and I expect that there remain
some race conditions. Endpoint logic in particular needs
work.
|
We still have endpoint-related races, apparently; we need to examine
the possibility of handling endpoints much like we do pipes, which seem
to be race free.
|
We are still seeing likely errors with pipes outliving their associated
endpoints, so work is still needed here.
|
The IOCP code has been refactored to improve reuse, and hopefully
will be easier to use with TCP now. Windows IPC using Named Pipes
is mostly working -- mostly because there is a gnarly close-race.
It seems that we need to take some more care to ensure that the
pipe is not released while requests may be outstanding -- so some
deeper synchronization between the IOCP callback logic and the
win_event code is needed. In short, we need to add a condvar to
the event, and notice when we have submitted work for async completion,
and make sure we flag the event "idle" after either completion or
cancellation of the event.
|