= nng_url_parse(3)
//
// Copyright 2018 Staysail Systems, Inc. <info@staysail.tech>
// Copyright 2018 Capitar IT Group BV <info@capitar.com>
//
// This document is supplied under the terms of the MIT License, a
// copy of which should be located in the distribution where this
// file was obtained (LICENSE.txt).  A copy of the license may also be
// found online at https://opensource.org/licenses/MIT.
//

== NAME

nng_url_parse - create URL structure from a string

== SYNOPSIS

[source, c]
----
#include <nng/nng.h>

int nng_url_parse(nng_url **urlp, const char *str);
----

== DESCRIPTION

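The `nng_url_parse()` function parses the string _str_, which must contain an
https://tools.ietf.org/html/rfc3986[RFC 3986] compliant URL, and allocates
a structure containing the results.  A pointer to the resulting structure
is stored in the location pointed to by _urlp_.  The structure should be
released with xref:nng_url_free.3.adoc[`nng_url_free()`] when it is no
longer needed.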

The `nng_url` structure has at least the following members:

[source, c]
----
struct nng_url {
    char *u_rawurl;   // Unparsed URL, with minimal canonicalization.
    char *u_scheme;   // Scheme, such as "http"; always lower case.
    char *u_userinfo; // Userinfo component, or NULL.
    char *u_host;     // Full host, including port if present.
    char *u_hostname; // Hostname only (or address), or empty string.
    char *u_port;     // Port number, may be default or empty string.
    char *u_path;     // Path if present, empty string otherwise.
    char *u_query;    // Query info if present, NULL otherwise.
    char *u_fragment; // Fragment if present, NULL otherwise.
    char *u_requri;   // Request-URI (path[?query][#fragment])
};
----
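
As a minimal sketch of typical usage, assuming the structure layout shown
above, the following helper parses a string, prints a few members, and then
releases the structure.  The helper name `show_url` and the output format are
illustrative only and are not part of the library.

[source, c]
----
#include <stdio.h>

#include <nng/nng.h>

// Hypothetical helper: parse str, print selected members, free the result.
// Returns 0 on success, or the NNG error code reported by nng_url_parse().
int
show_url(const char *str)
{
    nng_url *url;
    int      rv;

    if ((rv = nng_url_parse(&url, str)) != 0) {
        fprintf(stderr, "nng_url_parse: %s\n", nng_strerror(rv));
        return (rv);
    }

    printf("scheme:   %s\n", url->u_scheme);
    printf("hostname: %s\n", url->u_hostname);
    printf("port:     %s\n", url->u_port);
    printf("path:     %s\n", url->u_path);

    // u_query and u_fragment are NULL when absent from the input.
    printf("query:    %s\n", url->u_query != NULL ? url->u_query : "(none)");
    printf("fragment: %s\n",
        url->u_fragment != NULL ? url->u_fragment : "(none)");

    nng_url_free(url);
    return (0);
}
----

The caller owns the parsed structure until it is passed to `nng_url_free()`,
so the members should not be referenced after that call.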

=== URL Canonicalization

The `nng_url_parse()` function also canonicalizes the results, as
follows:

  1. The URL is parsed into the various components.
  2. The `u_scheme`, `u_hostname`, `u_host`, and `u_port` members are
     converted to lower case.
  3. Percent-encoded values for
     https://tools.ietf.org/html/rfc3986#section-2.3[unreserved characters]
     are converted to their unencoded forms.
  4. Additionally, percent-encoded values in the path with numeric values
     larger than 127 (i.e. not ASCII) are decoded.
  5. The resulting `u_path` is checked for invalid UTF-8 sequences, consisting
     of surrogate pairs, illegal byte sequences, or overlong encodings.
     If this check fails, then the entire URL is considered invalid, and
     the function returns `NNG_EINVAL`.
  6. Path segments consisting of `.` and `..` are resolved as per
     https://tools.ietf.org/html/rfc3986#section-6.2.2.3[RFC 3986 6.2.2.3].
  7. Further, empty path segments are removed, meaning that duplicate
     slash (`/`) separators are removed from the path.
  8. If a port was not specified, but the scheme defines a default
     port, then `u_port` will be filled in with the value of the default port.
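
Steps 6 and 7 above can be illustrated with a small standalone sketch.
This is not the code used by the library; the helper name `normalize_path`
is hypothetical, and the sketch assumes an absolute path (one beginning with
`/`) stored in a writable buffer.

[source, c]
----
#include <string.h>

// Hypothetical helper (not part of NNG): normalize an absolute path in place.
// Drops "." and empty segments, and lets ".." remove the preceding segment.
// Paths that do not begin with '/' are left untouched.
void
normalize_path(char *path)
{
    char *src      = path; // read cursor
    char *dst      = path; // write cursor; never overtakes src
    int   trailing = 0;    // nonzero if the result should end with '/'

    if (path[0] != '/') {
        return;
    }
    while (*src != '\0') {
        size_t len;

        while (*src == '/') {
            src++; // collapses runs of slashes (empty segments)
        }
        len      = strcspn(src, "/");
        trailing = 1;
        if (len == 0) {
            break; // input ended with a slash
        }
        if (len == 1 && src[0] == '.') {
            // "." refers to the current location; drop it
        } else if (len == 2 && src[0] == '.' && src[1] == '.') {
            // ".." removes the most recently written segment, if any
            while (dst > path && *--dst != '/') {
            }
        } else {
            *dst++ = '/';
            memmove(dst, src, len);
            dst += len;
            trailing = 0;
        }
        src += len;
    }
    if (dst == path || trailing) {
        *dst++ = '/';
    }
    *dst = '\0';
}
----

With this sketch, the path `/a/./b/../c//d` becomes `/a/c/d`, and a `..`
segment at the root is simply discarded, so `/..` becomes `/`.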

TIP: Only the `u_userinfo`, `u_query`, and `u_fragment` members will ever be
     `NULL`.  The other members will be filled in with either default values
     or the empty string if they cannot be determined from _str_.
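
Step 3 of the canonicalization list can likewise be sketched in isolation.
The helper names below (`hex_value`, `is_unreserved`, `decode_unreserved`)
are hypothetical and the code is not taken from the library; it only shows
how percent-encoded unreserved characters might be decoded while every other
escape is left intact.  It does not attempt the non-ASCII decoding of step 4.

[source, c]
----
#include <ctype.h>

// Value of a hexadecimal digit, or -1 if c is not one.
static int
hex_value(int c)
{
    if (c >= '0' && c <= '9') {
        return (c - '0');
    }
    if (c >= 'a' && c <= 'f') {
        return (c - 'a' + 10);
    }
    if (c >= 'A' && c <= 'F') {
        return (c - 'A' + 10);
    }
    return (-1);
}

// RFC 3986 section 2.3: ALPHA / DIGIT / "-" / "." / "_" / "~"
static int
is_unreserved(int c)
{
    return (c < 128 &&
        (isalnum(c) || c == '-' || c == '.' || c == '_' || c == '~'));
}

// Hypothetical helper (not part of NNG): rewrite s in place, decoding only
// those %XX escapes that encode an unreserved character.
void
decode_unreserved(char *s)
{
    char *dst = s;

    while (*s != '\0') {
        int hi, lo;

        if (s[0] == '%' && (hi = hex_value((unsigned char) s[1])) >= 0 &&
            (lo = hex_value((unsigned char) s[2])) >= 0 &&
            is_unreserved(hi * 16 + lo)) {
            *dst++ = (char) (hi * 16 + lo);
            s += 3;
        } else {
            *dst++ = *s++; // reserved or malformed escapes stay as written
        }
    }
    *dst = '\0';
}
----

For instance, `%7Euser/%41%2Fb` becomes `~user/A%2Fb` under this sketch:
the escapes for `~` and `A` are decoded, while `%2F` (a reserved `/`) is
kept so that the path structure does not change.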

== RETURN VALUES

This function returns 0 on success, and non-zero otherwise.


== ERRORS

[horizontal]
`NNG_ENOMEM`:: Insufficient free memory exists to allocate the URL structure.
`NNG_EINVAL`:: An invalid URL was supplied.


== SEE ALSO

[.text-left]
xref:nng_url_clone.3.adoc[nng_url_clone(3)],
xref:nng_url_free.3.adoc[nng_url_free(3)],
xref:nng_strerror.3.adoc[nng_strerror(3)],
xref:nng.7.adoc[nng(7)]