path: root/src/platform
Commit message, Author, Age
* fixes #767 Want tunable stack sizeGarrett D'Amore2018-10-31
|
* Fix win file type (#749)Hunter2018-10-12
| | | | | Before: nni_file_is_dir returned false for the paths D:\\, D:/ and D:. After: nni_file_is_dir returns true for all three.
* fixes #712 windows nni_atomic_set64 should not return valueGarrett D'Amore2018-09-09
|
* fixes #687 POLLHUP is problematic on macOS too...Garrett D'Amore2018-08-30
| | | | | | | | | Basically, we can ignore EV_EOF, as we wind up still alerting the corresponding events. EV_ERROR we still treat as HUP. (The EV_EOF was responsible for prematurely closing the socket and aborting transactions while there was still data in the socket buffers.)
* fixes #683 atomic 64 stuff broken on pre-C11 stacksGarrett D'Amore2018-08-29
|
* fixes #674 want 64-bit atomics (for stats)Garrett D'Amore2018-08-27
|
* fixes #608 Add TCP support to specify local network interfaceGarrett D'Amore2018-08-27
| | | | | This also fixes a leaked TCP connection on a failure path, which we noticed while working this change.
* fixes #668 Remove the old win_event stuffGarrett D'Amore2018-08-20
|
* fixes #665 Convert Windows UDP to raw IOCPGarrett D'Amore2018-08-20
|
* fixes #664 aio cancellation could be betterGarrett D'Amore2018-08-20
| | | | | | | | | This changes the signature of the aio cancellation routines to take the argument for cancellation directly, so we do not need to look up the argument using nni_aio_get_prov_data. We should probably consider eliminating nni_aio_get_prov_data and co., and changing the prov_extra to reflect prov_data. Later.
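A minimal sketch of the idea behind this signature change. Every name here (`demo_aio`, `demo_cancel_fn`, `demo_aio_schedule`, `demo_aio_abort`, `demo_pipe_cancel`) is hypothetical and much simpler than the real NNG internals; the point is only that the cancellation routine receives its argument directly instead of looking it up on the aio.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are not the NNG types. */
typedef struct demo_aio demo_aio;
typedef void (*demo_cancel_fn)(demo_aio *aio, void *arg, int rv);

struct demo_aio {
    demo_cancel_fn cancel_fn;  /* provider's cancellation routine */
    void          *cancel_arg; /* handed back to cancel_fn unchanged */
    int            result;     /* completion status, 0 while pending */
};

/* Provider side: register the routine together with its argument. */
static void
demo_aio_schedule(demo_aio *aio, demo_cancel_fn fn, void *arg)
{
    aio->cancel_fn  = fn;
    aio->cancel_arg = arg;
    aio->result     = 0;
}

/* Framework side: pass the stored argument straight to the routine. */
static void
demo_aio_abort(demo_aio *aio, int rv)
{
    demo_cancel_fn fn  = aio->cancel_fn;
    void          *arg = aio->cancel_arg;

    /* Clear first so a second abort cannot cancel twice. */
    aio->cancel_fn  = NULL;
    aio->cancel_arg = NULL;
    if (fn != NULL) {
        fn(aio, arg, rv);
    }
}

/* Example provider: arg points at a counter of cancelled operations. */
static void
demo_pipe_cancel(demo_aio *aio, void *arg, int rv)
{
    int *cancelled = arg;

    (*cancelled)++;
    aio->result = rv;
}
```

Because the argument travels with the callback, the provider does not need a separate accessor to recover its own state during cancellation.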
* fixes #653 Weird connect failures in dialer (multistress) (#660)Garrett D'Amore2018-08-20
| | | | | | | The POLLHUP (or rather EPOLLHUP) flag does not quite mean the same thing in Linux, and we've seen random failures where we will sometimes get this event on a socket that is freshly connected. This might be a bug in Linux, but it is easy enough to work around -- we just don't watch for it at all.
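One way to picture "not watching" for the hangup event, as a hedged sketch rather than the actual poller code: Linux reports EPOLLHUP whether or not it was requested, so a poller can simply leave it out when translating epoll bits into its own event mask. The `DEMO_EV_*` flags and `demo_translate_events` are assumptions made for illustration.

```c
#include <stdint.h>
#include <sys/epoll.h>

/* Hypothetical internal event flags; not the NNG definitions. */
#define DEMO_EV_READ  0x1u
#define DEMO_EV_WRITE 0x2u
#define DEMO_EV_ERROR 0x4u

/* Translate the bits returned by epoll_wait() into internal flags. */
static unsigned
demo_translate_events(uint32_t events)
{
    unsigned out = 0;

    if (events & EPOLLIN) {
        out |= DEMO_EV_READ;
    }
    if (events & EPOLLOUT) {
        out |= DEMO_EV_WRITE;
    }
    if (events & EPOLLERR) {
        out |= DEMO_EV_ERROR;
    }
    /* EPOLLHUP is deliberately not mapped to anything. */
    return out;
}
```

A real hangup still surfaces later, typically as a zero-length read or an error on the next I/O attempt, so dropping the bit here does not hide a closed connection.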
* fixed compilation error on OpenBSD, missing sockpeercred (#659)Francisc Simon2018-08-19
|
* fixes #656 Don't connect/remove IPC sockets unless bind failsGarrett D'Amore2018-08-16
|
* fixes #611 Memory Leaks under WindowsGarrett D'Amore2018-08-06
| | | | | | | | | | | | | fixes #622 incorrect assumptions about malloc(0) Windows actually allocates an object of size zero when calling malloc on size zero. This is unusual behavior, and we just add logic to work more like malloc on POSIX systems. Other systems can return non-NULL objects to fixed pages here. We think the best option here is to uniformly return NULL from our APIs in these circumstances, and to include testing to validate that.
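The "uniformly return NULL" policy can be sketched as a thin wrapper. The names `demo_alloc` and `demo_free` are hypothetical, and the size parameter on the free routine is an assumption made for the sketch.

```c
#include <stdlib.h>

/* Hypothetical wrapper: a zero-sized request always yields NULL,
 * regardless of what the platform's malloc(0) would do. */
static void *
demo_alloc(size_t sz)
{
    if (sz == 0) {
        return NULL;
    }
    return malloc(sz);
}

/* Matching release routine; the size is unused in this sketch. */
static void
demo_free(void *ptr, size_t sz)
{
    (void) sz;
    free(ptr); /* free(NULL) is defined to do nothing */
}
```

With this shape, callers can no longer tell "zero bytes requested" apart from "out of memory" by the return value alone, so they need to check the requested size before treating NULL as an allocation failure.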
* Revert "fixes #599 nng_dial sync should not return until added to socket"Garrett D'Amore2018-08-06
| | | | | This changeset needs work. We are seeing errors described by This reverts commit d7f7c896c0ede24249ef63b1e45b1878bf4bd473.
* fixes #599 nng_dial sync should not return until added to socketGarrett D'Amore2018-08-05
| | | | | | | | | | fixes #208 pipe start should occur before connect / accept fixes #616 Race condition closing between header & body This refactors the transports to handle their own connection handshaking before passing the pipe to the socket. This changes and simplifies the setup. This also fixes a rather challenging race condition described by #616.
* fixes #605 NNI_ALLOC_STRUCT/NNI_ALLOC_STRUCTS should zero memoryGarrett D'Amore2018-07-24
|
* Modified code to explicitly set hints.ai_socktype passed to getaddrinfo().Mike Bush2018-07-24
| | | | On QNX, specifying a numeric servname while leaving ai_socktype unspecified would result in EAI_SERVICE.
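A short sketch of the pattern described above, assuming a TCP lookup of a numeric address. The helper name `demo_resolve_tcp` is hypothetical; `getaddrinfo`, `AI_NUMERICHOST` and `AI_NUMERICSERV` are standard POSIX.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* expose getaddrinfo under strict modes */
#endif

#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Resolve a numeric host and port for TCP.
 * Returns 0 on success or an EAI_* code from getaddrinfo(). */
static int
demo_resolve_tcp(const char *host, const char *port, struct addrinfo **res)
{
    struct addrinfo hints;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family   = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM; /* set explicitly, never left as 0 */
    hints.ai_flags    = AI_NUMERICHOST | AI_NUMERICSERV;

    return getaddrinfo(host, port, &hints, res);
}
```

The caller releases the result with `freeaddrinfo()`. Setting `ai_socktype` also keeps the result list from containing one entry per socket type for the same address.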
* fixes #595 mutex leak and other minor errors in TCPGarrett D'Amore2018-07-18
| | | | | | | | | | | | | | | | fixes #596 POSIX IPC should move away from pipedesc/epdesc fixes #598 TLS and TCP listeners could support NNG_OPT_LOCADDR fixes #594 Windows IPC should use "new style" win_io code. fixes #597 macOS could support PEER PID This large change set cleans up the IPC support on Windows and POSIX. This has the beneficial impact of significantly reducing the complexity of the code, reducing locking, increasing concurrency (multiple dials and accepts can be outstanding now), and reducing context switches (we complete things synchronously now). While here we have added some missing option support, and fixed a few more bugs that we found in the TCP code changes from last week.
* fixes #591 incorrect reuse of server instances by websocketGarrett D'Amore2018-07-16
| | | | | This also arranges for server shutdown to be handled using the reaper, leading to more elegant cleanup.
* fixes #523 dialers could support multiple outstanding dial requestsGarrett D'Amore2018-07-16
| | | | | | | | | | | | | | | | | | | | | | | | fixes #179 DNS resolution should be done at connect time fixes #586 Windows IO completion port work could be better fixes #339 Windows iocp could use synchronous completions fixes #280 TCP abstraction improvements This is a rather monstrous set of changes, which refactors TCP, and the underlying Windows I/O completion path logic, in order to obtain a cleaner, simpler API, with support for asynchronous DNS lookups performed on connect rather than initialization time, the ability to have multiple connects or accepts pending, as well as fewer extraneous function calls. The Windows code also benefits from greatly reduced context switching, fewer lock operations performed, and a reduced number of system calls on the hot code path. (We use automatic event resetting instead of manual.) Some dead code was removed as well, and a few potential edge case leaks on failure paths (in the websocket code) were plugged. Note that all TCP based transports benefit from this work. The IPC code on Windows still uses the legacy IOCP for now, as does the UDP code (used for ZeroTier.) We will be converting those soon too.
* fixes #566 Windows iov resubmit routine is not used.Garrett D'Amore2018-07-06
|
* fixes #576 IPC listen unlinks UNIX socket on failureGarrett D'Amore2018-07-05
|
* fixes #575 kqueue spins hardGarrett D'Amore2018-07-04
| | | | | This sets the kqueue events to autoclear, reducing CPU usage to normal sane levels, and eliminating the hard spin.
* fixes #572 Several locking errors foundGarrett D'Amore2018-07-03
| | | | | | | | | | fixes #573 atomic flags could help This introduces a new atomic flag, and reduces some of the global locking. The lock refactoring work is not yet complete, but this is a positive step forward, and should help with certain things. While here we also fixed a compile warning due to incorrect types.
* fixes #32 autoscale based on CPUs availableGarrett D'Amore2018-06-12
| | | | | | This should work on both Windows and the most common POSIX variants. We will create at least two threads for running completions, but there are numerous other threads in the code.
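As a rough POSIX-only illustration of scaling with the CPU count while keeping a floor of two threads: the helper names are hypothetical, and using the online CPU count directly as the thread count is an assumption, since the commit message does not state the exact formula.

```c
#include <unistd.h>

/* Hypothetical helper: pick a completion thread count from a CPU count.
 * A count below two (including "unknown", reported as <= 0) gives two. */
static int
demo_num_threads(long ncpu)
{
    if (ncpu < 2) {
        return 2;
    }
    return (int) ncpu;
}

/* Query the number of online CPUs where the platform exposes it. */
static int
demo_autoscale(void)
{
#ifdef _SC_NPROCESSORS_ONLN
    return demo_num_threads(sysconf(_SC_NPROCESSORS_ONLN));
#else
    return demo_num_threads(-1);
#endif
}
```

On Windows the equivalent query would go through a different API (for example `GetSystemInfo`), which this sketch does not cover.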
* fixes #525 posix nni_plat_tcp_ep_init should not mark mode unusedGarrett D'Amore2018-06-11
|
* Adding sys/stat.h to src/platform/posix/posix_aio.h for building with muslMark Stevens2018-06-04
|
* fixes #499 Eliminate the unused nni_plat_home_dir...Garrett D'Amore2018-05-30
|
* fixes #477 Android NDK build configurationGarrett D'Amore2018-05-30
| | | | | | | | | | | | This enables the software to be built for Android, going back to at least Android SDK r15 (IceCreamSandwich) and at least up to SDK r27 (Oreo). Older versions of Android may work, but we have no way to build them to test. While here we have changed our CMake configuration to disable building tools or tests when we detect a cross-compile situation. Documentation for cross-compilation is updated as well.
* fixes #484 crashes in websocket transportGarrett D'Amore2018-05-29
| | | | | | | | | | | | | | | | | | | | | | | fixes #490 posix_epdesc use-after-free bug fixes #489 Sanitizer based testing would help fixes #492 Numerous memory leaks found with sanitizer This introduces support for compiler-based sanitizers when using clang or gcc (and not on Windows). See NNG_SANITIZER for possible settings such as "thread" or "address". Furthermore, we have fixed the issues we found with both the thread and address sanitizers. We believe that the thread issues pointed to a low frequency use-after-free responsible for rare crashes in some of the tests. The tests generally have their timeouts doubled when running under a sanitizer, to account for the extra long times that the sanitizer can cause these to take. While here, we also changed the compat_ws test to avoid a particularly painful and time consuming DNS lookup, and we made the nngcat_unlimited test a bit more robust by waiting before sending traffic.
* fixes #488 pthread mutex initializer could be simplerGarrett D'Amore2018-05-29
| | | | | | | The fallback logic was unnecessarily complicated, and found to be somewhat data-racy; on modern systems initializing these things never fails, and on BSD systems that only occurs under extreme memory shortage.
* fixes #471 Linux resolver truncates port number silentlyGarrett D'Amore2018-05-21
|
* fixes #469 SO_REUSEADDR should be enabledGarrett D'Amore2018-05-21
| | | | | | | | | | | | | | | | | | | fixes #468 TCP nodelay and keepalive should start usable fixes #467 NN_RCVMAXSZ option does not work (compat) fixes #465 Support NN_OPT_TCPNODELAY (compat) This is a rather larger change set than I'd like, but when adding support for legacy TCP keepalive, I found a number of issues using the legacy TCP test (which we are introducing with this commit.) This fixes the concerns that are relevant and addressable. We have elected not to try to support local address binding at this time, and the IPv6 test case in the old code was wrong, so changes relevant to that are commented out. I've also updated the nng_compat manual page to reflect additional caveats that folks should be aware of, including the previously undocumented caveat around the NN_SNDBUF and NN_RCVBUF options.
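For readers unfamiliar with the socket options named in this change set, a minimal POSIX sketch follows. The helper names are hypothetical, and which options are on by default is left to the caller because the commit message does not spell out the defaults.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Hypothetical helper: switch on an integer-valued boolean option. */
static int
demo_enable(int fd, int level, int opt)
{
    int on = 1;

    return setsockopt(fd, level, opt, &on, sizeof(on));
}

/* Listening socket: allow rebinding to a recently used address. */
static int
demo_prepare_listener(int fd)
{
    return demo_enable(fd, SOL_SOCKET, SO_REUSEADDR);
}

/* Stream socket: optionally disable Nagle and enable keepalive probes.
 * Returns 0 on success, -1 if any requested option could not be set. */
static int
demo_prepare_stream(int fd, int nodelay, int keepalive)
{
    if (nodelay && demo_enable(fd, IPPROTO_TCP, TCP_NODELAY) != 0) {
        return -1;
    }
    if (keepalive && demo_enable(fd, SOL_SOCKET, SO_KEEPALIVE) != 0) {
        return -1;
    }
    return 0;
}
```

`SO_REUSEADDR` has to be set before `bind()` to have any effect, which is why it belongs in listener setup rather than in the accept path.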
* fixes #462 mode_t missing during compilationGarrett D'Amore2018-05-20
|
* fixes #451 task finalization could be better/smarter (resolver)Garrett D'Amore2018-05-17
| | | | | | | | | | | | | | | | | | | | | | | This changes nni_task_fini to always run synchronously, waiting for the task to finish before cleaning up. Much simpler code. Additionally, we've refactored the resolver code to avoid the use of taskqs, which added complexity and inefficiency. The approach of just allocating its own threads and a work queue to process them turns out to be vastly simpler, and actually reduces extra allocations and context switches. wip POSIX resolv threads. (Taskqs are just overhead and complexity here.) Windows resolver changes. Task cleanup. fix up windows mutex.
* fixes #430 Unable to build in MSYS + Win-buildsGarrett D'Amore2018-05-15
| | | | fixes #438 Consider dropping AI_V4MAPPED
* fixes #352 aio lock is burning hotGarrett D'Amore2018-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fixes #326 consider nni_taskq_exec_synch() fixes #410 kqueue implementation could be smarter fixes #411 epoll_implementation could be smarter fixes #426 synchronous completion can lead to panic fixes #421 pipe close race condition/duplicate destroy This is a major refactoring of two significant parts of the code base, which are closely interrelated. First the aio and taskq framework have undergone a number of simplifications, and improvements. We have ditched a few parts of the internal API (for example tasks no longer support cancellation) that weren't terribly useful but added a lot of complexity, and we've made aio_schedule something that now checks for cancellation or other "premature" completions. The aio framework now uses the tasks more tightly, so that aio wait can devolve into just nni_task_wait(). We did have to add a "task_prep()" step to prevent race conditions. Second, the entire POSIX poller framework has been simplified, and made more robust, and more scalable. There were some fairly inherent race conditions around the shutdown/close code, where we *thought* we were synchronizing against the other thread, but weren't doing so adequately. With a cleaner design, we've been able to tighten up the implementation to remove these race conditions, while substantially reducing the chance for lock contention, thereby improving scalability. The illumos poller also got a performance boost by polling for multiple events. In highly "busy" systems, we expect to see vast reductions in lock contention, and therefore greater scalability, in addition to overall improved reliability. One area where we currently can do better is that there is still only a single poller thread run. 
Scaling this out is a task that has to be done differently for each poller, and carefully to ensure that close conditions are safe on all pollers, and that no chance for deadlock/livelock waiting for pfd finalizers can occur.
* Fix double unlock.Austin Wise2018-05-07
|
* fixes #393 panic on illumos - epoll assertion errorGarrett D'Amore2018-05-06
| | | | | | | | This replaces the epoll support with proper illumos/SunOS port events. The port event support is structured so that it actually is superior to epoll and kqueue, because it avoids a single master lock on the poller. In the future we will explore this for macOS and Linux pollers.
* fixes #396 illumos doesn't build (missing NNG_PLATFORM_POSIX ON)Garrett D'Amore2018-05-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | fixes #397 Need to cast zoneid fixes #395 sun is predefined on illumos/Solaris fixes #394 alloca needs to #include <alloca.h> fixes #399 Cannot use SVR4.2 specific msghdr fixes #402 getpeerucred needs a NULL initialized ucred fixes #403 syntax error in posix_tcp - attempt to return void fixes #407 illumos getegid wrong fixes #406 nni_idhash_count is dead code fixes #404 idhash typedef redeclared fixes #405 warning: newline not last character in file This is basically a slew of related bug fixes required to make this work on illumos. Note that the fixes are not "complete", because more work is required to support port events given that epoll is busted on illumos. We also fixed a bunch of things that aren't actually "bugs" per se, but really just warnings. Silencing them makes things better for everyone. Apparently not all compilers are equally happy with redundant (but otherwise identical) typedefs; we use structs in some places instead of shorter type names to silence these complaints. Note that IPC permissions (the mode bits on the socket vnode) are not validated on SunOS systems. This change includes documentation to reflect that.
* fixes #383 Would like peerid for IPCGarrett D'Amore2018-05-03
| | | | | We offer uid, gid, process id, and even zone id where we have them. Docs and tests are provided.
* fixes #6 Security attributes supportGarrett D'Amore2018-04-30
| | | | | | | | | | fixes #382 Permissions support for IPC on POSIX This adds support for permission management on Windows and POSIX systems. There are two different properties, and they are very different. Tests and documentation are included.
* fixes #105 Want NNG_OPT_TCP_NODELAY optionGarrett D'Amore2018-04-26
| | | | fixes #106 TCP keepalive tuning
* fix a number of cppcheck complaints (not all)Garrett D'Amore2018-04-24
|
* Eliminate unused variable.Garrett D'Amore2018-04-24
|
* fixes #346 nng_recv() sometimes acts on null `msg` pointerGarrett D'Amore2018-04-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | This closes a fundamental flaw in the way aio structures were handled. In particular, aio expiration could race ahead, and fire before the aio was properly registered by the provider. This ultimately led to the possibility of duplicate completions on the same aio. The solution involved breaking up nni_aio_start into two functions. nni_aio_begin (which can be run outside of external locks) simply validates that nni_aio_fini() has not been called, and clears certain fields in the aio to make it ready for use by the provider. nni_aio_schedule does the work to register the aio with the expiration thread, and should only be called when the aio is actually scheduled for asynchronous completion. nni_aio_schedule_verify does the same thing, but returns NNG_ETIMEDOUT if the aio has a zero length timeout. This change has a small negative performance impact. We have plans to rectify that by converting nni_aio_begin to use a lockless flag for the aio->a_fini bit. While we were here, we fixed some error paths in the POSIX subsystem, which would have returned incorrect error codes, and we made some optimizations in the message queues to reduce conditionals while holding locks in the hot code path.
* fixes bad address tests in tcp and tls (#345)toppk2018-04-14
| | | | | | | * 127.0.0.1.32 is treated as a hostname, and returns EAI_NODATA on my Fedora 27 box * since EAI_NODATA is not in POSIX, and is deprecated in some libc resolvers, protect it with an ifdef
* fixes #338 possible SIGPIPE on LinuxGarrett D'Amore2018-04-10
|
* Eliminate possible data race on file descriptor.Garrett D'Amore2018-04-10
| | | | Turns out that shutdown is sufficient for most needs.