Expression |
(::|(([a-fA-F0-9]{1,4}):){7}(([a-fA-F0-9]{1,4}))|(:(:([a-fA-F0-9]{1,4})){1,6})|((([a-fA-F0-9]{1,4}):){1,6}:)|((([a-fA-F0-9]{1,4}):)(:([a-fA-F0-9]{1,4})){1,6})|((([a-fA-F0-9]{1,4}):){2}(:([a-fA-F0-9]{1,4})){1,5})|((([a-fA-F0-9]{1,4}):){3}(:([a-fA-F0-9]{1,4})){1,4})|((([a-fA-F0-9]{1,4}):){4}(:([a-fA-F0-9]{1,4})){1,3})|((([a-fA-F0-9]{1,4}):){5}(:([a-fA-F0-9]{1,4})){1,2})) |
Description |
This regular expression recognizes IPv6 addresses in all the representations described by RFC 2373:
1) extended format (with both uppercase and lowercase hex digits)
2) compressed format (e.g. 2001::6:a)
3) IPv4-embedded format (e.g. ::ffff:1.2.3.4), limited to addresses of the traditional dual-stack configuration
Based on observation of real-world implementations, case 2) is extended to accept "::" standing for a single zero group. Although the RFC is clear that "::" is for "multiple groups of 16-bits of zeros" only, some tools, such as "dig" on the Mac, produce such addresses.
The expression is simple and quite elegant. It has been tested on over 300 IPv6 addresses collected by running dig against IPv6-enabled domains, and it is used in sshguard's log parser; see http://www.sshguard.net. |
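
A minimal Python sketch of how the expression might be exercised, assuming whole-string matching is what is wanted. The names IPV6_PATTERN and is_ipv6 are hypothetical helpers introduced here for illustration; they are not part of sshguard. The pattern is transcribed from the entry above, one alternative per line. Note that the expression as printed here contains no dotted-quad alternative, so this sketch does not accept "::ffff:1.2.3.4" as a complete string.

```python
import re

# The expression from the entry above, split so that each alternative
# sits on its own line. Adjacent raw strings are concatenated by Python.
IPV6_PATTERN = re.compile(
    r"(::"                                                      # all-zero address
    r"|(([a-fA-F0-9]{1,4}):){7}(([a-fA-F0-9]{1,4}))"            # full 8-group form
    r"|(:(:([a-fA-F0-9]{1,4})){1,6})"                           # "::" at the start
    r"|((([a-fA-F0-9]{1,4}):){1,6}:)"                           # "::" at the end
    r"|((([a-fA-F0-9]{1,4}):)(:([a-fA-F0-9]{1,4})){1,6})"       # 1 group, "::", 1-6 groups
    r"|((([a-fA-F0-9]{1,4}):){2}(:([a-fA-F0-9]{1,4})){1,5})"    # 2 groups, "::", 1-5 groups
    r"|((([a-fA-F0-9]{1,4}):){3}(:([a-fA-F0-9]{1,4})){1,4})"    # 3 groups, "::", 1-4 groups
    r"|((([a-fA-F0-9]{1,4}):){4}(:([a-fA-F0-9]{1,4})){1,3})"    # 4 groups, "::", 1-3 groups
    r"|((([a-fA-F0-9]{1,4}):){5}(:([a-fA-F0-9]{1,4})){1,2}))"   # 5 groups, "::", 1-2 groups
)


def is_ipv6(text: str) -> bool:
    """Return True if the entire string matches the expression.

    Hypothetical helper: fullmatch is used so that a valid prefix of a
    longer, invalid string is not reported as a match.
    """
    return IPV6_PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    samples = [
        "2001:0DB8:85a3:0000:0000:8A2E:0370:7334",  # extended format, mixed case
        "2001::6:a",                                # compressed format
        "1::3:4:5:6:7:8",                           # "::" standing for one zero group
        "1:2:3",                                    # too few groups, no "::"
        "1::2::3",                                  # "::" used twice
    ]
    for sample in samples:
        print(f"{sample!r}: {is_ipv6(sample)}")
```

The choice of fullmatch matters because the expression carries no anchors of its own: with an unanchored match, the leading "::" alternative is tried first, so matching "::1" from the start of the string would stop after "::". When scanning free-form log lines, surrounding context (such as delimiters around the address) would need to supply the boundaries instead.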