= nng_url_parse(3) :doctype: manpage :manmanual: nng :mansource: nng :manvolnum: 3 :copyright: Copyright 2018 Staysail Systems, Inc. \ Copyright 2018 Capitar IT Group BV \ This software is supplied under the terms of the MIT License, a \ copy of which should be located in the distribution where this \ file was obtained (LICENSE.txt). A copy of the license may also \ be found online at https://opensource.org/licenses/MIT. == NAME nng_url_parse - create URL structure from a string == SYNOPSIS [source, c] ----------- #include int nng_url_parse(nng_url **urlp, const char *str); ----------- == DESCRIPTION The `nng_url_parse()` function parses the string _str_ containing an https://tools.ietf.org/html/rfc3986[RFC 3986] compliant URL, and creates a structure containing the results. A pointer to the resulting structure is stored in _urlp_. The `nng_url` structure has at least the following members: `char *u_scheme`:: The scheme, such as `http`. Always lower case. `char *u_rawurl`:: An unparsed form of the raw URL, with only minimal canonicalization performed. `char *u_userinfo`:: The userinfo component if one was present, `NULL` otherwise. `char *u_host`:: The full host, including hostname, and colon and port if present, otehrwise the empty string. `char *u_hostname`:: The hostname if present, otherwise the empty string. Always lower case. `char *u_port`:: The port if present. If not present, a default port will be stored here. If no default is available, then the empty string. `char *u_path`:: The path component if present, or the empty string otherwise. `char *u_query`:: The query component if present, NULL otherwise. `char *u_fragment`:: The fragment if present, NULL otherwise. === URL Canonicalization The `nng_url_parse()` function also canonicalizes the results, as follows: 1. The URL is parsed into the various components. 2. The `u_scheme`, `u_hostname`, `u_host`, and `u_port` members are converted to lower case. 3. Percent-encoded values for https://tools.ietf.org/html/rfc3986#section-2.3[unreserved characters] converted to their unencoded forms. 4. Additionally URL percent-encoded values for characters in the path and with numeric values larger than 127 (i.e. not ASCII) are decoded. 5. The resulting `u_path` is checked for invalid UTF-8 sequences, consisting of surrogate pairs, illegal byte sequences, or overlong encodings. If this check fails, then the entire URL is considered invalid, and the function returns `NNG_EINVAL`. 6. Path segments consisting of `.` and `..` are resolved as per https://tools.ietf.org/html/rfc3986#section-6.2.2.3[RFC 3986 6.2.2.3]. 7. Further, empty path segments are removed, meaning that duplicate slash (`/`) separators are removed from the path. 8. If a port was not specified, but the scheme defines a default port, then `u_port` will be filled in with the value of the default port. == RETURN VALUES This function returns 0 on success, and non-zero otherwise. == ERRORS `NNG_ENOMEM`:: Insufficient free memory exists to allocate a message. `NNG_EINVAL`:: An invalid URL was supplied. == SEE ALSO <>, <>, <>, <> == COPYRIGHT Copyright 2018 mailto:info@staysail.tech[Staysail Systems, Inc.] + Copyright 2018 mailto:info@capitar.com[Capitar IT Group BV] This document is supplied under the terms of the https://opensource.org/licenses/MIT[MIT License].