path: root/src
Commit message (Author, Age)
...
* Let's hold the lock a little while longer. (Garrett D'Amore, 2017-07-20)
|
* Yet more race condition fixes. (Garrett D'Amore, 2017-07-20)
| We need to remember that protocol stops can run synchronously, and therefore we need to wait for the aio to complete. Further, we need to break apart shutting down aio activity from deallocation, as we need to shut down *all* async activity before deallocating *anything*. Noticed that we had a pipe race in the surveyor pattern too.
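The ordering described above (stop every async operation first, free afterwards) can be sketched in a single-threaded model. Everything here is hypothetical and for illustration only: `demo_sock`, `demo_pipe`, `demo_msgq` and their functions are not nng names, and an outstanding operation is modeled as a plain `pending` flag rather than a real thread or aio.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_NPIPES 3

/* Shared state that every completion callback touches (hypothetical). */
typedef struct { int canceled; } demo_msgq;

typedef struct {
    int        pending; /* nonzero while an async operation is outstanding */
    demo_msgq *mq;      /* touched when the operation completes or is canceled */
} demo_pipe;

typedef struct {
    demo_msgq *mq;
    demo_pipe *pipes[DEMO_NPIPES];
} demo_sock;

static demo_sock *demo_sock_create(void)
{
    demo_sock *s = calloc(1, sizeof(*s));
    if (s == NULL) return NULL;
    s->mq = calloc(1, sizeof(*s->mq));
    if (s->mq == NULL) { free(s); return NULL; }
    for (int i = 0; i < DEMO_NPIPES; i++) {
        s->pipes[i] = calloc(1, sizeof(*s->pipes[i]));
        if (s->pipes[i] == NULL) {
            while (i-- > 0) free(s->pipes[i]);
            free(s->mq);
            free(s);
            return NULL;
        }
        s->pipes[i]->pending = 1; /* pretend an operation is in flight */
        s->pipes[i]->mq = s->mq;
    }
    return s;
}

/* Phase 1: cancel everything. The cancel path still uses the shared queue,
 * which is why nothing may be freed yet. */
static void demo_sock_stop(demo_sock *s)
{
    for (int i = 0; i < DEMO_NPIPES; i++) {
        if (s->pipes[i]->pending) {
            s->pipes[i]->pending = 0;
            s->pipes[i]->mq->canceled++;
        }
    }
}

/* Phase 2: deallocate, only after all activity has stopped. */
static void demo_sock_free(demo_sock *s)
{
    for (int i = 0; i < DEMO_NPIPES; i++) {
        assert(!s->pipes[i]->pending);
        free(s->pipes[i]);
    }
    free(s->mq);
    free(s);
}
```

A caller would run `demo_sock_stop(s)` and then `demo_sock_free(s)`. Freeing the queue before the stop pass would let a cancel path write to freed memory, which is the kind of crash these commits describe.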
* Always run the AIO completion logic. (Garrett D'Amore, 2017-07-19)
| We have seen yet another weird situation where we had an orphaned pipe, which was caused by not completing the callback. If we are going to run nni_aio_fini, we should still run the callback (albeit with a return value of NNG_ECANCELED or somesuch) to be sure that we can't orphan stuff.
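A minimal sketch of the rule that teardown completes a pending operation as canceled rather than dropping its callback. The `demo_aio` type, its functions, and `DEMO_ECANCELED` are hypothetical stand-ins, not nng's actual structures or error values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error code; the real NNG_ECANCELED value is not assumed. */
#define DEMO_ECANCELED 1

typedef struct demo_aio demo_aio;
typedef void (*demo_aio_cb)(demo_aio *aio, void *arg);

struct demo_aio {
    int         pending; /* nonzero while an operation is outstanding */
    int         result;  /* 0 on success, otherwise an error code */
    demo_aio_cb cb;
    void       *arg;
};

static void demo_aio_init(demo_aio *aio, demo_aio_cb cb, void *arg)
{
    assert(cb != NULL);
    aio->pending = 0;
    aio->result  = 0;
    aio->cb      = cb;
    aio->arg     = arg;
}

static void demo_aio_start(demo_aio *aio) { aio->pending = 1; }

/* Completes at most once per started operation and always runs the callback. */
static void demo_aio_complete(demo_aio *aio, int result)
{
    if (!aio->pending) return;
    aio->pending = 0;
    aio->result  = result;
    aio->cb(aio, aio->arg);
}

/* Teardown: a still-pending operation is completed as canceled, so its
 * callback can release whatever it owns. */
static void demo_aio_fini(demo_aio *aio)
{
    demo_aio_complete(aio, DEMO_ECANCELED);
}

/* Example callback: counts invocations (arg points to an int). */
static void demo_count_cb(demo_aio *aio, void *arg)
{
    (void) aio;
    (*(int *) arg)++;
}
```

With this shape, the code that owns a pipe only needs one cleanup path: the callback, which checks `result` and frees the pipe on error.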
* Crash on close again. (Garrett D'Amore, 2017-07-19)
| This one is caused by us deallocating the msg queue before we stop all asynchronous I/O operations; consequently we can wind up with a thread trying to access a msg queue after it has been destroyed. A lesson here is that nni_aio_fini() needs to be treated much like nni_thr_fini() - you should do this *before* deallocating anything that callback functions might be referencing.
* fixes #27 Double free found in stress testing (Garrett D'Amore, 2017-07-19)
|
* Possible division by zero error (unset backoff start time). (Garrett D'Amore, 2017-07-18)
|
* fixes #21 Crash in IPC (POSIX) (Garrett D'Amore, 2017-07-18)
| This resolves the orphaned pipedesc, which actually could have affected Windows too. I think maybe we are race free. Lots more testing is still required, but stress runs seem to be passing now.
* Fixes most of the races in posix; but at least one remains outstanding. (Garrett D'Amore, 2017-07-18)
| Apparently there are circumstances when a pipedesc may get orphaned from the pollq. This triggers an assertion failure when it occurs. I am still trying to understand how this can occur. Stay tuned.
* Sometimes providers don't clear the prov data details. (Backoff). (Garrett D'Amore, 2017-07-18)
|
* Fix close-related leak of pipes. (Garrett D'Amore, 2017-07-18)
| We have seen leaks of pipes causing test failures (e.g. the Windows IPC test) due to EADDRINUSE. This was caused by a case where we failed to pass the pipe up because the AIO had already been canceled, and we didn't realize that we had orphaned the pipe. The fix is to add a return value to nni_aio_finish, and verify that we did finish properly; if we did not, then we must free the pipe ourselves. (The zero return from nni_aio_finish indicates that it accepts ownership of resources passed via the aio.)
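The ownership convention described above might look like the following single-threaded sketch. The names (`demo_aio_finish`, `demo_deliver_pipe`, and so on) and signatures are hypothetical and do not reproduce nni_aio_finish itself; a real implementation would also hold a lock around the state check.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int id; } demo_pipe;

typedef struct {
    int        done;   /* nonzero once completed or canceled */
    demo_pipe *result; /* resource handed to the consumer on success */
} demo_aio;

/* Returns 0 if the aio accepted ownership of the pipe, nonzero if the aio
 * had already been completed or canceled (the caller still owns the pipe). */
static int demo_aio_finish(demo_aio *aio, demo_pipe *pipe)
{
    assert(aio != NULL);
    if (aio->done) {
        return -1;
    }
    aio->done   = 1;
    aio->result = pipe;
    return 0;
}

static void demo_aio_cancel(demo_aio *aio) { aio->done = 1; }

/* Producer side: hand the pipe up, or free it if nobody took it.
 * Returns 1 if the pipe had to be freed here, 0 if it was delivered. */
static int demo_deliver_pipe(demo_aio *aio, demo_pipe *pipe)
{
    if (demo_aio_finish(aio, pipe) != 0) {
        free(pipe); /* not accepted: avoid orphaning it */
        return 1;
    }
    return 0;
}
```

The point of returning a status is that the producer cannot otherwise tell whether a cancellation raced ahead of it. Ignoring the return value is what leaks the pipe.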
* Handle is INVALID_HANDLE_VALUE not NULL (Garrett D'Amore, 2017-07-18)
|
* Remove unused variables. (Garrett D'Amore, 2017-07-17)
|
* Fix unreferenced variable warnings and errors reported by MSVC. (Garrett D'Amore, 2017-07-17)
|
* Scalability test fixes. (Garrett D'Amore, 2017-07-17)
| This fixes a potential nasty bug associated with the objhash table resizing, and rewrites the scalability test to use just a single thread handling some 2000 client sockets. This proves that the framework can deal with vast numbers of sockets, regardless of the supported number of operating system threads.
* Fix hang on double-close of socket. (Garrett D'Amore, 2017-07-17)
|
* Ditch unused nni_sock_hold() call. (Garrett D'Amore, 2017-07-17)
|
* Clean up pipes on fini. EP close sync with pipes. (Garrett D'Amore, 2017-07-16)
|
* Close negotiation race. (Garrett D'Amore, 2017-07-16)
|
* Inproc leak fixes. (Garrett D'Amore, 2017-07-16)
|
* Fix EAGAIN (timeout thread can run before we finish scheduling!) (Garrett D'Amore, 2017-07-16)
|
* Bind the pipe to the ep properly, and wake any closers needed. (Garrett D'Amore, 2017-07-16)
|
* Delete old #ifdef 0 pipe_create logic. (Garrett D'Amore, 2017-07-16)
|
* Fix locking errors in endpoints, and simplify some logic. (Garrett D'Amore, 2017-07-16)
| This cleans up the pipe creation logic greatly, and eliminates a nasty potential deadlock (incorrect lock order). It also adds a correct binary exponential and randomized backoff on both accept and connect.
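As an illustration of binary exponential backoff with randomization, here is a minimal sketch. The `demo_backoff` type, its field names, and the choice of `rand()` as the random source are assumptions made for the example and are not taken from nng. The zero-start guard keeps the modulo from dividing by zero when the start time was never set.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical backoff state, in milliseconds. */
typedef struct {
    uint32_t start_ms; /* initial window, forced to be at least 1 */
    uint32_t max_ms;   /* upper bound on the window */
    uint32_t cur_ms;   /* current window */
} demo_backoff;

static void demo_backoff_init(demo_backoff *b, uint32_t start_ms, uint32_t max_ms)
{
    /* An unset (zero) start would make the modulo below divide by zero. */
    b->start_ms = (start_ms == 0) ? 1 : start_ms;
    b->max_ms   = (max_ms < b->start_ms) ? b->start_ms : max_ms;
    b->cur_ms   = b->start_ms;
}

/* Returns a random delay in [0, cur_ms), then doubles the window up to the
 * cap. Call after each failed connect or accept attempt. */
static uint32_t demo_backoff_next(demo_backoff *b)
{
    uint32_t delay;

    assert(b->cur_ms > 0);
    delay = (uint32_t) rand() % b->cur_ms;
    if (b->cur_ms > b->max_ms / 2) {
        b->cur_ms = b->max_ms; /* doubling would pass the cap */
    } else {
        b->cur_ms *= 2;
    }
    return delay;
}

/* Call after a successful attempt so the next failure starts small again. */
static void demo_backoff_reset(demo_backoff *b) { b->cur_ms = b->start_ms; }
```

Randomizing within the window, rather than sleeping for the full window, keeps many peers that failed at the same moment from retrying in lockstep.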
* Reconnect automatically, but do backoff on failures. (Accept too!) (Garrett D'Amore, 2017-07-16)
|
* AIO timeouts work correctly now, using their own timer logic. (Garrett D'Amore, 2017-07-16)
| We closed a few subtle races in the AIO subsystem as well, and we were able to eliminate the separate timer handling in the MQ code. There appear to be some opportunities to further enhance the code for MQs as well -- eventually probably the only access to MQs will be with AIOs.
* Add missing cancellation for inproc endpoints -- the source of much woe. (Garrett D'Amore, 2017-07-15)
| Most of the races around close were probably here - the cancellation was not getting through on endpoint close, which meant that we could actually toss endpoints while they were in use. We need to fix the timeout handling -- especially for reconnects etc. -- but we are just about ready for this stuff to be reintegrated into master.
* Fix incorrect attempt to proceed inproc. (Garrett D'Amore, 2017-07-15)
|
* More s/nni_aio_stop/nni_aio_cancel/ (Garrett D'Amore, 2017-07-15)
|
* Bus, Req/Rep, and Surv/Resp should use aio_cancel instead of aio_stop. (Garrett D'Amore, 2017-07-15)
|
* IPC race condition fixes. These mirror what we did for TCP. (Garrett D'Amore, 2017-07-15)
|
* Race conditions removed... TCP tests work well now. (Garrett D'Amore, 2017-07-15)
|
* Some initial progress on *connect* async. (Garrett D'Amore, 2017-07-15)
| This actually is breaking at the moment, because we don't have good integration with timeouts, and there are some frustrating races with timeouts at points that can cause apparent hangs.
* Close leaking lock for inproc. (Garrett D'Amore, 2017-07-15)
|
* Implemented asynchronous (fully) accept. (Garrett D'Amore, 2017-07-14)
| This logic leaves a race condition in the dial side, which will be fixed with a subsequent change to convert that to fully asynchronous as well.
* Close a race during pipe creation. (Garrett D'Amore, 2017-07-13)
|
* Use the same pipe teardown in all circumstances. (Garrett D'Amore, 2017-07-13)
|
* Use the same flow regardless of whether pipe start is used or not. (Garrett D'Amore, 2017-07-13)
| This means that pipe_start always succeeds, and we can guarantee that the pipe_start_cb is always executed, and in another context. This may help when we need to change the way that sockets and endpoints are associated.
* Simplify pipe logic, going back to idhash. (Garrett D'Amore, 2017-07-13)
|
* Now that idhash is locked, we can ditch some locking in protocols. (Garrett D'Amore, 2017-07-13)
|
* idhash has its own lock now. (Garrett D'Amore, 2017-07-13)
|
* Make idhash non-inlined (so we can add a mutex.) (Garrett D'Amore, 2017-07-13)
|
* Close at least one of the race conditions in ipc closing. (Garrett D'Amore, 2017-07-13)
|
* Remove stale partial printf line causing syntax error. (Garrett D'Amore, 2017-07-13)
|
* Windows implementation of TCP is "working now". (Garrett D'Amore, 2017-07-13)
| This is only lightly tested, and I expect that there remain some race conditions. Endpoint logic in particular needs work.
* Attempts to minimize races, remove unused nni_sock_mtx function. (Garrett D'Amore, 2017-07-12)
| We still have endpoint related races apparently; we need to examine the possibility of handling endpoints much like we do pipes, which seem to be race free.
* Fix likely close race in Windows IPC/IOCP code. (Garrett D'Amore, 2017-07-12)
| We are still seeing likely errors with pipes outliving their associated endpoints, so work is still needed here.
* Windows IPC working, mostly. (Garrett D'Amore, 2017-07-11)
| The IOCP code has been refactored to improve reuse, and hopefully will be easier to use with TCP now. Windows IPC using Named Pipes is mostly working -- mostly because there is a gnarly close-race. It seems that we need to take some more care to ensure that the pipe is not released while requests may be outstanding -- so some deeper synchronization between the IOCP callback logic and the win_event code is needed. In short, we need to add a condvar to the event, and notice when we have submitted work for async completion, and make sure we flag the event "idle" after either completion or cancellation of the event.
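The "condvar plus idle flag" idea can be sketched as follows. This swaps in POSIX threads for the Windows primitives the commit is about, and the `demo_event` type and function names are hypothetical; it shows only the synchronization shape, not the IOCP integration.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical event wrapper: tracks whether async work is outstanding. */
typedef struct {
    pthread_mutex_t mtx;
    pthread_cond_t  cv;
    int             busy; /* nonzero between submit and completion/cancel */
} demo_event;

static int demo_event_init(demo_event *ev)
{
    if (pthread_mutex_init(&ev->mtx, NULL) != 0) {
        return -1;
    }
    if (pthread_cond_init(&ev->cv, NULL) != 0) {
        pthread_mutex_destroy(&ev->mtx);
        return -1;
    }
    ev->busy = 0;
    return 0;
}

/* Call when work has been handed off for asynchronous completion. */
static void demo_event_submit(demo_event *ev)
{
    pthread_mutex_lock(&ev->mtx);
    ev->busy = 1;
    pthread_mutex_unlock(&ev->mtx);
}

/* Call from BOTH the completion path and the cancellation path. */
static void demo_event_idle(demo_event *ev)
{
    pthread_mutex_lock(&ev->mtx);
    ev->busy = 0;
    pthread_cond_broadcast(&ev->cv);
    pthread_mutex_unlock(&ev->mtx);
}

/* Blocks until no work is outstanding. */
static void demo_event_wait_idle(demo_event *ev)
{
    pthread_mutex_lock(&ev->mtx);
    while (ev->busy) {
        pthread_cond_wait(&ev->cv, &ev->mtx);
    }
    pthread_mutex_unlock(&ev->mtx);
}

/* Teardown waits for idle first, so the owner never releases the pipe
 * while a request may still complete against it. */
static void demo_event_fini(demo_event *ev)
{
    int rv;

    demo_event_wait_idle(ev);
    rv = pthread_cond_destroy(&ev->cv);
    assert(rv == 0);
    rv = pthread_mutex_destroy(&ev->mtx);
    assert(rv == 0);
    (void) rv;
}
```

In use, the completion callback (running on another thread) would call `demo_event_idle`, and the close path would call `demo_event_fini` before freeing the pipe that owns the event.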
* Eliminate the separate wrapping structure for platform mtx and cv. (Garrett D'Amore, 2017-07-11)
|
* Make better use of enums (makes clang-format happier.) (Garrett D'Amore, 2017-07-10)
|
* Give up on uncrustify; switch to clang-format. (Garrett D'Amore, 2017-07-10)
|