For the purposes of this regex, the authority (domain) of a URL is the part that comes after the scheme and "//". It consists of the host plus an optional username, password, and port.
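To make the definition concrete, here is a minimal Python sketch of pulling the authority out of a full URL before handing it to a domain-parsing regex. It uses the standard library's `urllib.parse.urlsplit` rather than a second regex; that choice, and the helper name `extract_authority`, are assumptions for illustration and are not part of this entry.

```python
from urllib.parse import urlsplit

def extract_authority(url):
    """Return the authority part of a URL (hypothetical helper).

    The authority is everything between "scheme://" and the first
    "/", "?" or "#", e.g. "user:secret@example.com:8080".
    """
    return urlsplit(url).netloc

print(extract_authority("https://user:secret@example.com:8080/path?q=1"))
```

The returned string is the kind of input the regex described here expects.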
This is a Perl-compatible regex (PCRE) that captures the various parts of a domain: the (optional) username, (optional) password, host, and (optional) port. The capturing groups are as follows: 1 = username, 2 = password, 3 = host, 4 = port. See the source link for the logic behind parsing the domain.
NOTE: This is NOT intended to parse entire URLs; you will need a separate regular expression to extract the domain first. Technically, the only strings that fail to match are those containing newline characters. Every other string matches, though it may yield empty capturing groups.
ANOTHER NOTE: This does NOT verify that only ASCII characters are used in domain names. It is intended to extract pieces from domains that should already be valid.
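The group layout above can be sketched in Python as follows. The pattern is a simplified, hypothetical stand-in with the same four groups, not the regex published in this entry. It assumes the host contains no ":" or "@" (so bracketed IPv6 literals are not handled), and it is stricter than the original because it rejects some malformed input instead of returning empty groups. Python's `re` module is also not PCRE, although the syntax used here is common to both.

```python
import re

# Hypothetical stand-in pattern, NOT the regex from this entry.
# Groups: 1 = username, 2 = password, 3 = host, 4 = port.
AUTHORITY_RE = re.compile(
    r"(?:([^:@\n]*)(?::([^@\n]*))?@)?"  # optional "username[:password]@"
    r"([^:@\n]*)"                        # host (no IPv6 literals here)
    r"(?::(\d+))?"                       # optional ":port"
)

def split_authority(authority):
    """Return (username, password, host, port), or None if there is no match.

    Parts that are absent from the input come back as None.
    """
    match = AUTHORITY_RE.fullmatch(authority)
    return match.groups() if match else None

print(split_authority("user:secret@example.com:8080"))
print(split_authority("example.com"))
```

As with the original, this only splits an authority string into pieces and does not check that the host is a valid domain name.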