tld.h

tld.h — TLD-related functions

Functions

const char * tld_strerror ()
int tld_get_4 ()
int tld_get_4z ()
int tld_get_z ()
const Tld_table * tld_get_table ()
const Tld_table * tld_default_table ()
int tld_check_4t ()
int tld_check_4tz ()
int tld_check_4 ()
int tld_check_4z ()
int tld_check_8z ()
int tld_check_lz ()

Types and Values

#define IDNAPI
struct Tld_table_element
struct Tld_table
enum Tld_rc

Description

TLD-related functions.

Functions

tld_strerror ()

const char *
tld_strerror (Tld_rc rc);

Convert a return code integer to a text string. This string can be used to output a diagnostic message to the user.

TLD_SUCCESS: Successful operation. This value is guaranteed to always be zero; the remaining ones are only guaranteed to hold non-zero values, for logical comparison purposes.

TLD_INVALID: Invalid character found.

TLD_NODATA: No input data was provided.

TLD_MALLOC_ERROR: Error during memory allocation.

TLD_ICONV_ERROR: Character encoding conversion error.

TLD_NO_TLD: No top-level domain found in domain string.

Parameters

rc

tld return code

 

Returns

Returns a pointer to a statically allocated string describing the error that corresponds to the return code rc.


tld_get_4 ()

int
tld_get_4 (const uint32_t *in,
           size_t inlen,
           char **out);

Isolate the top-level domain of in and return it as an ASCII string in out.

Parameters

in

Array of Unicode code points to process. It does not need to be zero-terminated.

 

inlen

Number of Unicode code points in the in array.

 

out

Pointer that receives the zero-terminated ASCII result string.

 

Returns

Return TLD_SUCCESS on success, or the corresponding Tld_rc error code otherwise.
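
The description above can be pictured with a small standalone sketch. It is not the library's implementation: the function name sketch_get_tld, the SK_* codes, and the rule used here (a trailing run of ASCII letters preceded by a plain '.') are assumptions made for illustration only, and the sketch does not call libidn.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical result codes for this sketch only (0 means success). */
enum { SK_OK = 0, SK_NODATA = 1, SK_NO_TLD = 2, SK_NOMEM = 3 };

static int is_ascii_letter(uint32_t c)
{
    return (c >= 0x41 && c <= 0x5A) || (c >= 0x61 && c <= 0x7A);
}

/* Copy the trailing run of ASCII letters that follows the last '.' into a
   newly allocated, lowercased, zero-terminated string. The caller frees it. */
static int sketch_get_tld(const uint32_t *in, size_t inlen, char **out)
{
    size_t start, i;
    char *s;

    assert(out != NULL);
    *out = NULL;
    if (in == NULL || inlen == 0)
        return SK_NODATA;

    /* Walk backwards over the trailing ASCII letters. */
    start = inlen;
    while (start > 0 && is_ascii_letter(in[start - 1]))
        start--;

    /* Need at least one letter, and a dot right before the run. */
    if (start == inlen || start == 0 || in[start - 1] != 0x2E)
        return SK_NO_TLD;

    s = malloc(inlen - start + 1);
    if (s == NULL)
        return SK_NOMEM;
    for (i = start; i < inlen; i++) {
        uint32_t c = in[i];
        /* Fold uppercase ASCII letters to lowercase. */
        s[i - start] = (char)(c >= 0x41 && c <= 0x5A ? c + 0x20 : c);
    }
    s[inlen - start] = '\0';
    *out = s;
    return SK_OK;
}
```

In this sketch, the six code points of "ex.Com" yield the string "com", while an input without a dot yields SK_NO_TLD and leaves *out set to NULL.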


tld_get_4z ()

int
tld_get_4z (const uint32_t *in,
            char **out);

Isolate the top-level domain of in and return it as an ASCII string in out.

Parameters

in

Zero-terminated array of Unicode code points to process.

 

out

Pointer that receives the zero-terminated ASCII result string.

 

Returns

Return TLD_SUCCESS on success, or the corresponding Tld_rc error code otherwise.


tld_get_z ()

int
tld_get_z (const char *in,
           char **out);

Isolate the top-level domain of in and return it as an ASCII string in out. The input string in may be UTF-8, ISO-8859-1, or any ASCII-compatible character encoding.

Parameters

in

Zero-terminated character array to process.

 

out

Pointer that receives the zero-terminated ASCII result string.

 

Returns

Return TLD_SUCCESS on success, or the corresponding Tld_rc error code otherwise.


tld_get_table ()

const Tld_table *
tld_get_table (const char *tld,
               const Tld_table **tables);

Get the TLD table for a named TLD by searching through the given TLD table array.

Parameters

tld

TLD name (e.g. "com") as a zero-terminated ASCII byte string.

 

tables

Zero-terminated array of Tld_table info structures for TLDs.

 

Returns

Return the structure corresponding to TLD tld found by searching through tables, or NULL if no such structure is found.


tld_default_table ()

const Tld_table *
tld_default_table (const char *tld,
                   const Tld_table **overrides);

Get the TLD table for a named TLD, using the internal defaults, possibly overridden by the (optional) supplied tables.

Parameters

tld

TLD name (e.g. "com") as a zero-terminated ASCII byte string.

 

overrides

Additional zero-terminated array of Tld_table info structures for TLDs, or NULL to use only the library default tables.

 

Returns

Return the structure corresponding to TLD tld, first looking through overrides and then through the built-in list, or NULL if no such structure is found.
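
The lookup order described for tld_get_table() and tld_default_table() can be sketched without the library. Everything below is hypothetical: struct sketch_table is a reduced stand-in for Tld_table, and the built-in list is passed in explicitly because the library's internal tables are not accessible from here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for Tld_table, reduced to the fields this sketch needs. */
struct sketch_table {
    const char *name;
    const char *version;
};

/* Return the first entry named tld in a NULL-terminated pointer array. */
static const struct sketch_table *
sketch_find(const char *tld, const struct sketch_table **tables)
{
    const struct sketch_table **p;

    assert(tld != NULL);
    if (tables == NULL)
        return NULL;
    for (p = tables; *p != NULL; p++)
        if (strcmp((*p)->name, tld) == 0)
            return *p;
    return NULL;
}

/* Consult the optional overrides first, then fall back to the defaults. */
static const struct sketch_table *
sketch_default(const char *tld,
               const struct sketch_table **overrides,
               const struct sketch_table **builtin)
{
    const struct sketch_table *t = sketch_find(tld, overrides);

    return t != NULL ? t : sketch_find(tld, builtin);
}
```

Because the search stops at the first match, an override entry for "no" shadows a built-in entry with the same name, and passing NULL for overrides leaves only the built-in list in play.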


tld_check_4t ()

int
tld_check_4t (const uint32_t *in,
              size_t inlen,
              size_t *errpos,
              const Tld_table *tld);

Test whether each code point in in is allowed by the data structure in tld, and return the position of the first code point that is not allowed in errpos.

Parameters

in

Array of Unicode code points to process. It does not need to be zero-terminated.

 

inlen

Number of Unicode code points in the in array.

 

errpos

Position of offending character is returned here.

 

tld

A Tld_table data structure representing the restrictions for which the input should be tested.

 

Returns

Returns the Tld_rc value TLD_SUCCESS if all code points are valid or when tld is null, TLD_INVALID if a character is not allowed, or additional error codes on general failure conditions.
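
A minimal sketch of the check described above, using hypothetical stand-in types rather than the real Tld_table. It assumes inclusive ranges and does not special-case any characters, which the library itself may do. The return values 0 and 1 are placeholders for TLD_SUCCESS and TLD_INVALID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for Tld_table_element and Tld_table (sketch only). */
struct sketch_range { uint32_t start, end; };
struct sketch_table {
    size_t nvalid;
    const struct sketch_range *valid;
};

/* Nonzero if c lies inside any inclusive range of the table. */
static int sketch_allowed(uint32_t c, const struct sketch_table *t)
{
    size_t i;

    for (i = 0; i < t->nvalid; i++)
        if (c >= t->valid[i].start && c <= t->valid[i].end)
            return 1;
    return 0;
}

/* Return 0 if every code point is allowed or no table is given; otherwise
   return 1 and store the index of the first offender in *errpos. */
static int sketch_check(const uint32_t *in, size_t inlen, size_t *errpos,
                        const struct sketch_table *t)
{
    size_t i;

    assert(errpos != NULL);
    if (t == NULL)
        return 0;               /* no restrictions known: accept */
    for (i = 0; i < inlen; i++)
        if (!sketch_allowed(in[i], t)) {
            *errpos = i;
            return 1;
        }
    return 0;
}
```

Note that *errpos is written only on failure in this sketch, so callers should read it only after a non-zero return.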


tld_check_4tz ()

int
tld_check_4tz (const uint32_t *in,
               size_t *errpos,
               const Tld_table *tld);

Test whether each code point in in is allowed by the data structure in tld, and return the position of the first code point that is not allowed in errpos.

Parameters

in

Zero-terminated array of Unicode code points to process.

 

errpos

Position of offending character is returned here.

 

tld

A Tld_table data structure representing the restrictions for which the input should be tested.

 

Returns

Returns the Tld_rc value TLD_SUCCESS if all code points are valid or when tld is null, TLD_INVALID if a character is not allowed, or additional error codes on general failure conditions.


tld_check_4 ()

int
tld_check_4 (const uint32_t *in,
             size_t inlen,
             size_t *errpos,
             const Tld_table **overrides);

Test whether each code point in in is allowed by the information in overrides or by the built-in TLD restriction data. When data for the same TLD is available both internally and in overrides, the information in overrides takes precedence. If several entries for a specific TLD are found, the first one is used. If overrides is NULL, only the built-in information is used. The position of the first offending character is returned in errpos.

Parameters

in

Array of Unicode code points to process. It does not need to be zero-terminated.

 

inlen

Number of Unicode code points in the in array.

 

errpos

Position of offending character is returned here.

 

overrides

A Tld_table array of additional domain restriction structures that complement and supersede the built-in information.

 

Returns

Returns the Tld_rc value TLD_SUCCESS if all code points are valid or when no TLD table applies to the input, TLD_INVALID if a character is not allowed, or additional error codes on general failure conditions.


tld_check_4z ()

int
tld_check_4z (const uint32_t *in,
              size_t *errpos,
              const Tld_table **overrides);

Test whether each code point in in is allowed by the information in overrides or by the built-in TLD restriction data. When data for the same TLD is available both internally and in overrides, the information in overrides takes precedence. If several entries for a specific TLD are found, the first one is used. If overrides is NULL, only the built-in information is used. The position of the first offending character is returned in errpos.

Parameters

in

Zero-terminated array of Unicode code points to process.

 

errpos

Position of offending character is returned here.

 

overrides

A Tld_table array of additional domain restriction structures that complement and supersede the built-in information.

 

Returns

Returns the Tld_rc value TLD_SUCCESS if all code points are valid or when no TLD table applies to the input, TLD_INVALID if a character is not allowed, or additional error codes on general failure conditions.


tld_check_8z ()

int
tld_check_8z (const char *in,
              size_t *errpos,
              const Tld_table **overrides);

Test whether each character in in is allowed by the information in overrides or by the built-in TLD restriction data. When data for the same TLD is available both internally and in overrides, the information in overrides takes precedence. If several entries for a specific TLD are found, the first one is used. If overrides is NULL, only the built-in information is used. The position of the first offending character is returned in errpos. Note that the error position refers to the decoded character offset rather than the byte position in the string.

Parameters

in

Zero-terminated UTF-8 string to process.

 

errpos

Position of offending character is returned here.

 

overrides

A Tld_table array of additional domain restriction structures that complement and supersede the built-in information.

 

Returns

Returns the Tld_rc value TLD_SUCCESS if all characters are valid or when no TLD table applies to the input, TLD_INVALID if a character is not allowed, or additional error codes on general failure conditions.
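
Because errpos counts decoded characters rather than bytes, a caller who wants to point at the offending byte in the original UTF-8 string has to translate the offset. The helper below is a hypothetical sketch of that translation. It is not part of the library and it assumes the input is well-formed UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Return the byte index at which character number charpos (0-based) starts
   in the zero-terminated UTF-8 string s, or the string length if s holds
   fewer characters. Assumes s is well-formed UTF-8. */
static size_t utf8_byte_offset(const char *s, size_t charpos)
{
    size_t i = 0;

    assert(s != NULL);
    while (s[i] != '\0' && charpos > 0) {
        i++;
        /* Skip continuation bytes, which have the form 10xxxxxx. */
        while (((unsigned char)s[i] & 0xC0) == 0x80)
            i++;
        charpos--;
    }
    return i;
}
```

For the string "r\xC3\xA4k" (the letters r, ä, k), character offset 2 maps to byte offset 3, since ä occupies two bytes.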


tld_check_lz ()

int
tld_check_lz (const char *in,
              size_t *errpos,
              const Tld_table **overrides);

Test whether each character in in is allowed by the information in overrides or by the built-in TLD restriction data. When data for the same TLD is available both internally and in overrides, the information in overrides takes precedence. If several entries for a specific TLD are found, the first one is used. If overrides is NULL, only the built-in information is used. The position of the first offending character is returned in errpos. Note that the error position refers to the decoded character offset rather than the byte position in the string.

Parameters

in

Zero-terminated string in the current locale's encoding to process.

 

errpos

Position of offending character is returned here.

 

overrides

A Tld_table array of additional domain restriction structures that complement and supersede the built-in information.

 

Returns

Returns the Tld_rc value TLD_SUCCESS if all characters are valid or when no TLD table applies to the input, TLD_INVALID if a character is not allowed, or additional error codes on general failure conditions.

Types and Values

IDNAPI

#define             IDNAPI

Symbol holding shared library API visibility decorator.

This is used internally by the library header file and should never be used or modified by the application.

https://www.gnu.org/software/gnulib/manual/html_node/Exported-Symbols-of-Shared-Libraries.html


struct Tld_table_element

struct Tld_table_element {
    uint32_t start;
    uint32_t end;
};

Interval of valid code points in the TLD.

Members

uint32_t start;

Start of range.

 

uint32_t end;

End of range; end == start for a single code point.

 

struct Tld_table

struct Tld_table {
    const char *name;
    const char *version;
    size_t nvalid;
    const Tld_table_element *valid;
};

List of valid code points in a TLD.

Members

const char *name;

TLD name, e.g., "no".

 

const char *version;

Version string from TLD file.

 

size_t nvalid;

Number of entries in valid.

 

const Tld_table_element *valid;

Sorted array (of size nvalid) of valid code point ranges.
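
Keeping the ranges sorted makes a binary search possible. The text does not say how the library searches the array, so the following is only a sketch of one option. It uses a hypothetical stand-in for Tld_table_element and assumes the ranges are inclusive, sorted by start, and non-overlapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Tld_table_element (sketch only). */
struct sketch_range { uint32_t start, end; };

/* Binary search over ranges sorted by start and not overlapping.
   Returns nonzero if c falls inside one of the n inclusive ranges. */
static int sketch_in_ranges(uint32_t c, const struct sketch_range *r, size_t n)
{
    size_t lo = 0, hi = n;      /* search window is [lo, hi) */

    assert(r != NULL || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (c < r[mid].start)
            hi = mid;           /* c lies before this range */
        else if (c > r[mid].end)
            lo = mid + 1;       /* c lies after this range */
        else
            return 1;           /* start <= c <= end */
    }
    return 0;
}
```

A single code point is expressed as a range with start equal to end, so it needs no separate code path.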

 

enum Tld_rc

Enumerated return codes of the TLD checking functions. The value 0 is guaranteed to always correspond to success.

Members

TLD_SUCCESS

Successful operation. This value is guaranteed to always be zero; the remaining ones are only guaranteed to hold non-zero values, for logical comparison purposes.

 

TLD_INVALID

Invalid character found.

 

TLD_NODATA

No input data was provided.

 

TLD_MALLOC_ERROR

Error during memory allocation.

 

TLD_ICONV_ERROR

Character encoding conversion error.

 

TLD_NO_TLD

No top-level domain found in domain string.

 

TLD_NOTLD

Same as TLD_NO_TLD, kept for compatibility with a typo in earlier versions.