Last modified: 2013-04-08 11:02:21 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T16977, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 14977 - $wgServer lacks brackets in IPv6 URLs
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: General/Unknown
Version: 1.14.x
Hardware: PC Linux
Importance: Low major
Target Milestone: ---
Assigned To: Nobody - You can work on this!
URL: http://[2a01:e0b:1:47:240:63ff:fee8:c...
Keywords: accessibility, ipv6
Duplicates: 21365
Depends on:
Blocks:
Reported: 2008-07-29 22:50 UTC by Gilles Bedel
Modified: 2013-04-08 11:02 UTC (History)
14 users (show)


Attachments
Source of http://[2a01:e0b:1:47:240:63ff:fee8:c3a]/index.php/Accueil (13.57 KB, text/html)
2008-07-29 22:50 UTC, Gilles Bedel

Description Gilles Bedel 2008-07-29 22:50:57 UTC
Created attachment 5103 [details]
Source of http://[2a01:e0b:1:47:240:63ff:fee8:c3a]/index.php/Accueil

I use Linux 2.6.24-gentoo-r3 on Gentoo release 1.12.11.1 with PHP 5.2.6RC4-pl0-gentoo, and mediawiki SVN r38180.

My MediaWiki website is inaccessible from the given URL. The browser tries to access the path "/" and is redirected to
http://2a01:e0b:1:47:240:63ff:fee8:c3a/index.php/Accueil, which lacks the brackets.
Here are the HTTP headers sent:

> GET / HTTP/1.1
> Host: [2a01:e0b:1:47:240:63ff:fee8:c3a]
> User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9) Gecko/2008062122 Minefield/3.0
> Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> Accept-Language: fr-fr,fr;q=0.8,en-us;q=0.5,en;q=0.3
> Accept-Encoding: gzip,deflate
> Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
> Keep-Alive: 300
> Connection: keep-alive
>
> HTTP/1.1 301 Moved Permanently
> Date: Tue, 29 Jul 2008 14:06:12 GMT
> Server: Apache
> X-Powered-By: PHP/5.2.6RC4-pl0-gentoo
> Vary: Accept-Encoding,Cookie
> X-Vary-Options: Accept-Encoding;list-contains=gzip,Cookie;string-contains=wikidbToken;string-contains=wikidbLoggedOut;string-contains=wikidb_session
> Expires: Thu, 01 Jan 1970 00:00:00 GMT
> Cache-Control: private, must-revalidate, max-age=0
> Last-modified: Tue, 29 Jul 2008 14:06:12 GMT
> Location: http://2a01:e0b:1:47:240:63ff:fee8:c3a/index.php/Accueil
> Content-Encoding: gzip
> Content-Length: 26
> Keep-Alive: timeout=15, max=100
> Connection: Keep-Alive
> Content-Type: text/html; charset=utf-8

As you can see, the "Location: ..." line is wrong. I looked into why, and discovered that all the full URLs are constructed using $wgServer, which equals "http://2a01:e0b:1:47:240:63ff:fee8:c3a" in the example above.
This variable is built by includes/DefaultSettings.php from $_SERVER['SERVER_NAME'], which equals "2a01:e0b:1:47:240:63ff:fee8:c3a".

In short, we need to add brackets to IPv6 URLs. All the URLs of the website that include the server IP are wrong, such as the RSS and ATOM links. No problems with IPv4 (88.191.47.48).

I'm wondering whether this bug should be redirected to PHP, as the $_SERVER['SERVER_NAME'] variable is involved in URL building 99% of the time. Also, correcting our URLs would require a heavy test on the variable to determine whether it's an IPv6 address. However, $_SERVER['HTTP_HOST'] (the "Host: ..." line sent by the browser) could be used because it contains the brackets.

I set the bug severity to minor, because navigation from the main page is still possible, as many of the href links do not include the server IP. I also attached the wiki main page source so you can see what's in it if you don't have IPv6 access.
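
A minimal sketch of the difference described in this report, using the values from the request above. The two function names are hypothetical and only illustrate the two sources of the host; they are not MediaWiki code.

```php
<?php
// Hypothetical helper names; only the $_SERVER keys come from the report.

/** Builds the server URL straight from SERVER_NAME, as the report describes. */
function serverUrlFromServerName(array $server, string $proto = 'http'): string
{
    return $proto . '://' . $server['SERVER_NAME'];
}

/** Alternative raised in the report: use the Host header, which keeps the brackets. */
function serverUrlFromHttpHost(array $server, string $proto = 'http'): string
{
    return $proto . '://' . $server['HTTP_HOST'];
}

// Values taken from the request shown above.
$server = [
    'SERVER_NAME' => '2a01:e0b:1:47:240:63ff:fee8:c3a',
    'HTTP_HOST'   => '[2a01:e0b:1:47:240:63ff:fee8:c3a]',
];

$broken = serverUrlFromServerName($server); // brackets missing, as in the Location header
$usable = serverUrlFromHttpHost($server);   // brackets preserved
```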
Comment 1 Brion Vibber 2008-07-29 22:53:53 UTC
*shudder*

Ideally this shouldn't happen very much but yes, we should handle the case properly.
Comment 2 Chad H. 2008-07-30 03:25:30 UTC
Fixed in r38214. Ideally, PHP would give us the IP with the brackets around it (if that indeed is the standard; I know nothing about IPv6, I'm afraid). In any case, DefaultSettings now checks whether $wgServer is an IPv6 address and wraps it accordingly, prior to adding https:// and any port numbers.
Comment 3 Gilles Bedel 2008-07-30 12:44:15 UTC
Thanks for your patch ^demon, it worked well :)

Brackets in IPv6 URLs were introduced by RFC 2732 (http://www.ietf.org/rfc/rfc2732.txt), because ':' was already used to separate the port number from the server name/IP. The brackets are URL-specific; for instance, they are not used in DNS records:

> $ dig AAAA www.kame.net
> www.kame.net.           86400   IN      AAAA    2001:200:0:8002:203:47ff:fea5:3085

IMHO, logically $_SERVER['SERVER_NAME'] should remain the server name, because that's the name of the variable. So it would be the IPv6 address, without the brackets.
However, adding them in $_SERVER['SERVER_NAME'] would greatly simplify things, as I believe MediaWiki is not the only software with this issue.
Comment 4 Brion Vibber 2008-07-30 19:34:16 UTC
Looks like this'll still break if you're also on a non-standard port due to the check for ':' below:

# If the port is a non-standard one, add it to the URL
if(    isset( $_SERVER['SERVER_PORT'] )
    && !strpos( $wgServerName, ':' )
    && (    ( $wgProto == 'http' && $_SERVER['SERVER_PORT'] != 80 )
         || ( $wgProto == 'https' && $_SERVER['SERVER_PORT'] != 443 ) ) ) {

    $wgServer .= ":" . $_SERVER['SERVER_PORT'];
}
Comment 5 Brion Vibber 2008-07-30 20:17:03 UTC
Reverted r38214 due to failures to load IP class on live Wikimedia sites.

Couldn't narrow it down to any particular action I could find, but it was spewing the logs horribly:

Jul 30 20:15:14 srv126 httpd[24718]: PHP Fatal error:  Class 'IP' not found in /usr/local/apache/common-local/php-1.5/includes/DefaultSettings.php on line 74 


Autoloader should be set up in WebStart.php before LocalSettings.php gets called (which then includes us), so there *shouldn't* be a problem here, but...
Comment 6 Chad H. 2008-07-30 23:27:19 UTC
That's odd. Good you reverted though, because it also caused a regression (just found this out) in the installer too, as the Autoloader _definitely_ hasn't loaded up the IP class at this point of execution.
Comment 7 Chad H. 2008-07-31 13:42:50 UTC
Got to thinking: What if the IP class was explicitly included in DefaultSettings? Then making use of IP::isIPv6() and friends would make $wgServer tweaking more doable (during setup and actual usage). Not sure about the overhead though, I would think it would be negligible.
Comment 8 Tim Starling 2008-08-06 16:05:47 UTC
DefaultSettings.php should be for data, not code. Set $wgServer to some special value and put the code in Setup.php.
Comment 9 Gilles Bedel 2008-08-06 20:02:01 UTC
Tim, it would be a bit hackish... If DefaultSettings.php is strictly for data setup (i.e. only variable assignment), we shouldn't build $wgServer in it. So what about moving the entire $wgServer building code into Setup.php? Provided that doesn't cause a regression again, of course...
Comment 10 Daniel Friesen 2008-08-06 20:21:43 UTC
Uhm... what about all the wiki host type people running on a single installation?
Where they break up a bit of $wgServerName in order to find out the id of the individual wiki to use configuration for?

Wouldn't moving the server name code out of DefaultSettings.php be a regression to most of the people using that fairly common technique of combining multiple wiki into one installation?
Comment 11 Tim Starling 2008-08-07 01:12:37 UTC
(In reply to comment #10)
> Uhm... what about all the wiki host type people running on a single
> installation?
> Where they break up a bit of $wgServerName in order to find out the id of the
> individual wiki to use configuration for?
> 
> Wouldn't moving the server name code out of DefaultSettings.php be a regression
> to most of the people using that fairly common technique of combining multiple
> wiki into one installation?
> 

I'm not aware of any such code. On all the wiki hosts I've written or worked with, superglobals are used for detection, and $wgServer is set, not read.

If it really is necessary to set $wgServer in DefaultSettings.php, I would suggest doing it without loading massive amounts of code:

if ( preg_match( '/^(:(:([0-9A-Fa-f]{1,4})){1,7}|([0-9A-Fa-f]{1,4})(:{1,2}([0-9A-Fa-f]{1,4})|::$){1,7})(\/(12[0-8]|1[01][0-9]|[1-9]?\d)|)$/', $wgServerName ) ) {
     $wgServerName = '[' . $wgServerName . ']';
}

Comment 12 Gilles Bedel 2008-08-07 06:23:27 UTC
Tim, I see you copied that from IP::IPv6()... Why not after all.
Alternatively PHP provides a builtin function for IPv6 validation, here is an example:
  http://www.w3schools.com/PHP/filter_validate_ip.asp

But I don't know about the overhead it may cause. And filter_var() needs PHP >= 5.2.0...
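
A minimal sketch of the filter_var() approach mentioned above, assuming PHP >= 5.2.0 for the filter extension (the type declarations additionally assume PHP 7). The function name bracketIfIPv6() is hypothetical, not an existing MediaWiki function.

```php
<?php
/**
 * Wraps a bare IPv6 literal in brackets; leaves everything else untouched.
 * bracketIfIPv6() is a hypothetical name, not a MediaWiki function.
 */
function bracketIfIPv6(string $serverName): string
{
    // filter_var() returns the address on success and false on failure.
    $isIPv6 = filter_var($serverName, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6) !== false;
    return $isIPv6 ? '[' . $serverName . ']' : $serverName;
}
```

Hostnames, IPv4 addresses and values that already carry brackets fail the IPv6 validation and are returned unchanged.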

And for the port problem pointed out by Brion, I suggest this:

if( isset( $_SERVER['SERVER_PORT'] )
    && (    ( $wgServerName{0} != '[' && !strpos( $wgServerName, ':' ) )
         || !strpos( $wgServerName, ']:' ) )
    && (    ( $wgProto == 'http' && $_SERVER['SERVER_PORT'] != 80 )
         || ( $wgProto == 'https' && $_SERVER['SERVER_PORT'] != 443 ) ) ) {

        $wgServer .= ":" . $_SERVER['SERVER_PORT'];
}
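
One possible way to combine the bracket wrapping and the port check discussed in this thread, as a sketch only. buildServerUrl() and hostHasPort() are hypothetical names, the function takes a $_SERVER-style array as a parameter so it can be exercised outside a web server, and filter_var() stands in for whatever IPv6 test is finally chosen.

```php
<?php
/**
 * Sketch of building a server URL from $_SERVER-style values.
 * buildServerUrl() and hostHasPort() are hypothetical names.
 */
function hostHasPort(string $host): bool
{
    if ($host !== '' && $host[0] === '[') {
        // Bracketed IPv6 literal: a port can only follow the closing bracket.
        return strpos($host, ']:') !== false;
    }
    // Bare hostname or IPv4 address: any colon introduces a port.
    return strpos($host, ':') !== false;
}

function buildServerUrl(array $server, string $proto = 'http'): string
{
    $host = $server['SERVER_NAME'];
    // Bracket bare IPv6 literals first, so the later colon check sees the brackets.
    if (filter_var($host, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6) !== false) {
        $host = '[' . $host . ']';
    }
    $url = $proto . '://' . $host;

    // If the port is a non-standard one and not already present, add it.
    $defaultPort = ($proto === 'https') ? 443 : 80;
    if (isset($server['SERVER_PORT'])
        && !hostHasPort($host)
        && (int)$server['SERVER_PORT'] !== $defaultPort
    ) {
        $url .= ':' . $server['SERVER_PORT'];
    }
    return $url;
}
```

The comparison against false (rather than a bare !strpos()) matters for values such as "::1", where the first colon sits at offset 0.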
Comment 13 Antoine "hashar" Musso (WMF) 2008-08-31 19:42:47 UTC
Using Debian's lighttpd 1.4.19-4, SERVER_NAME is built with brackets, even when using a port such as 777.

[SERVER_NAME] => [2a01:e35:2eb4:xxxx::xxxx]:777

There is an apache bug report opened :
https://issues.apache.org/bugzilla/show_bug.cgi?id=26005
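
Since web servers differ in what they put into SERVER_NAME, a tolerant parser may help. The following is a sketch only; splitHostPort() is a hypothetical helper, and the input shapes it accepts are assumptions drawn from the values quoted in this thread.

```php
<?php
/**
 * Splits a SERVER_NAME-style value into host and optional port.
 * splitHostPort() is a hypothetical helper name.
 *
 * @return array [host without brackets, port as int or null]
 */
function splitHostPort(string $value): array
{
    // "[ipv6]" or "[ipv6]:port"
    if (preg_match('/^\[([^\]]+)\](?::(\d+))?$/', $value, $m)) {
        return [$m[1], isset($m[2]) ? (int)$m[2] : null];
    }
    // Exactly one colon: "host:port" or "ipv4:port"
    if (substr_count($value, ':') === 1) {
        [$host, $port] = explode(':', $value);
        return [$host, (int)$port];
    }
    // No colon (hostname/IPv4) or several colons (bare IPv6): no port present
    return [$value, null];
}
```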
Comment 14 Philippe Verdy 2008-11-14 01:41:09 UTC
Those IPv6 designers should have thought twice before choosing the colon as the separator for groups of digits in literal numeric host addresses. OK for the extra [] brackets, but why colons and not dots like in IPv4?

They wanted to support a dotted four-number syntax for the 32 least significant bits of the 128-bit IPv6 address, distinct from the hexadecimal notation, just to allow easy conversion of IPv4 to IPv6 (by prepending the 96-bit hexadecimal prefix, which can have a variable number of colons). So it would have created an ambiguity if they had used the same dot both between hexadecimal address parts and between the four 8-bit decimal components specifying the least significant 32 bits.

But why did they not specify that [brackets] are a REQUIRED part of the hostname created from an IPv6 address? This would have avoided the two representations (with or without brackets) for the hostname, and getHostName() would never have returned an address in the form "2001:800:xxx:xxx" but only "[2001:800:xxx:xxx]".

Instead they chose to make the hostname without any brackets, requiring applications such as URLs to support the addition of [] to avoid the ambiguity with a port number (the ambiguity exists only because IPv6 addresses can be valid with fewer than 8 hex parts and 7 colons).
Comment 15 Alexandre Emsenhuber [IAlex] 2009-10-31 20:14:43 UTC
*** Bug 21365 has been marked as a duplicate of this bug. ***
Comment 16 Allen Stambaugh 2010-09-14 03:41:52 UTC
Would someone take a look at the page I created at http://www.mediawiki.org/wiki/User:Allen4names/Bug_14977 ? I am hoping that this will help resolve this bug.
Comment 17 Philippe Verdy 2010-10-01 18:23:56 UTC
It would be much simpler if $_SERVER['SERVER_NAME'], which equals "2a01:e0b:1:47:240:63ff:fee8:c3a",
was simply replaced very early by a version with surrounding square brackets every time it contains colons.

My opinion is that a server name should always be delimited, independently of where it is used.

Then no more special code is needed to test anywhere else... except if this value is used in a DNS resolver such as gethostbyname(), which may fail if it does not accept the square brackets (my opinion is that such an API should ALWAYS accept these brackets).

So check for occurrences of

* gethostbyname(String name) or gethostsbyname(String name), which return one or several addresses (IPv4, IPv6, or other) from a hostname, using the locally configured name resolver (local hosts file, DNS, WINS, NetBIOS, or other service), so that all IPv6 addresses in the returned set are reformatted within brackets. Normally the addresses returned by this socket API are in binary format, but its PHP binding serializes them into a string, forgetting the brackets. However, in most installations of PHP, the variable is set by Apache in an environment string, and Apache (and other servers like IIS, or ISAPI and other interfaces) also forgets to add the brackets for IPv6 addresses (there's an old bug about this; it was not changed because too many applications now expect to detect the colons themselves and add the brackets automatically: we should also do the same, by using these APIs indirectly via a proxying compatibility function).

* gethostname(sockaddr_t address), which attempts to retrieve a hostname from the specified address (in binary IPv4 or IPv6 format, or other). The API normally expects an address in binary format (not restricted to IPv4 or IPv6 only, for example a NetBEUI "address" which is a short ASCII-only string).

In all cases, occurrences of ":" or "/" in a canonical hostname (as used in the socket API or received from the webserver API or environment variables) should be blackboxed by enclosing the hostname within brackets; if these characters are not present, then NO surrounding brackets should be present. The algorithm is described in the RFCs describing URI formats and how to encapsulate a hostname into a valid URI.

Note: MediaWiki should NOT be restricted to be used over Internet, it should be compatible with various network protocols to reach the webserver, not just IPv4 or even IPv6, so it should be neutral about which address type is effectively used: it should even work over a LAN with NetBEUI addresses (with the additional convenience that network datagrams not using an IP protocol are easily firewalled, without complex rules, or within a completely separate domain administration, as NetBEUI addresses will not pass through any router, unless it is proxied or transported over a private tunnel, usually secured and encrypted).

So the solution is effectively to blackbox the content of the $_SERVER variable, i.e. the environment variables set by web servers (or FastCGI slave servers) when instantiating PHP, or the socket APIs used through the PHP APIs, before even setting our "$wgServer" variable from these values.
Comment 18 Philippe Verdy 2010-10-01 18:50:25 UTC
Note to Allen: your regexp test is too complex, and not sufficient:

* any occurrence of a colon (:) or slash (/) or question mark (?) or number sign (#) in the hostname should force brackets to be added around it.

* and if the hostname already contains brackets anywhere (other than only at the start and end), these characters should be URI-encoded (this may occur in transport protocols other than IP, such as NetBEUI, possibly also AppleTalk, Token Ring, and various OSI/ISO transport or link layers).

Yes, we know that most networks have adopted the IP protocol, including for their private LANs (because of the ease of deployment and compatibility with routers, proxies, tunnel servers, firewalls, and network administration tools, and also because this protocol is the most widely scrutinized one for security and performance tuning, and is supported in the widest range of devices)... But these protocols continue to exist in limited private domains.

There's even a way to reach a server on the link layer only (using an Ethernet address to enforce the link-local-only restriction): an Ethernet address also typically contains colons (it's a 48-bit address also formatted in hexadecimal 16-bit blocks), and such a topology does not use any port number (instead it uses a protocol number).

I know one example of private network where the IP protocol is reimplemented using a private protocol number over Ethernet, in that case URL formats are like this:

eth://[C000:0000:0000]:9999:80/path

where "C000:0000:0000" is the Ethernet address, "9999" the private protocol number used, and 80 is the port number used in that protocol. For this case, $_SERVER[SERVER_NAME] contains "C000:0000:0000" (remapped to "[C000:0000:0000]" in applications), and $_SERVER[SERVER_PORT] contains "9999:80". To deploy such a protocol, one needs to use raw sockets, which are administratively restricted in OSes so that they will check the protocol number used (to avoid collisions with IPv4, IPv6, and a few other well-known protocols such as ICMP, gateway-to-gateway protocols or router administration protocols, for which the standard socket layer should be used instead) before deciding what to do with the rest of the port specification.
Comment 19 Allen Stambaugh 2010-10-02 07:03:19 UTC
Philippe

MediaWiki is a web application, and since some web servers cause $wgServerName to include a port number, you need to expect a single colon. For example...

www.example.org:8080

If you want MediaWiki to support the ethernet protocol you should make a feature or extension request.
Comment 20 Philippe Verdy 2010-10-02 21:21:42 UTC
That was not the purpose of my message. Actually it ALREADY works with the raw Ethernet protocol, when the web server is already configured to accept this protocol and PHP works as a slave server (it does not directly access the protocol, which is controlled only by the webserver).

But what I wanted to say is that the RFC-described syntax of URIs still applies and that MediaWiki should just comply with the most generic syntax of URIs. Only minor patches are needed (it is much more complex to support raw sockets in PHP itself, or in the webserver, due to evident security issues or conflicts with the binding of IP and ICMP protocols into other OS kernel drivers that will restrict their use).

Yes, $wgServer may contain a colon, but only after parsing $_SERVER[SERVER_NAME], before appending $_SERVER[SERVER_PORT] with an intermediate colon.

The blackboxing will occur only when handling $_SERVER[SERVER_NAME] (which should then never be used elsewhere in MediaWiki's PHP code).

For Ethernet (this was just an example) you would have found:
$_SERVER[SERVER_PROTOCOL]="http:"
$_SERVER[SERVER_NAME]="C000:0000:0000" and
$_SERVER[SERVER_PORT]="9999:99" (i.e. the port may not be just an integer; here it indicates a protocol number and an optional port, as supported by the underlying protocol handler, outside the Ethernet handler itself).

All that needs to be done is to put the SERVER_NAME between brackets; what you get is [C000:0000:0000], which looks very similar to an IPv6 hostname, except that it uses a shorter bit pattern (with 3 and only 3 groups of hex numbers), so there's no confusion with an IPv6 hostname (which would require extra colons).

In that case $wgServerName will be "[C000:0000:0000]:9999:99", and you can build the HTTP protocol on top of it (HTTP is not restricted to IP transport only; it just depends on an unspecified underlying reliable bidirectional end-to-end transport layer, for example TCP, or even UDP in some cases where it may be reliable, or some other transport protocol built on top of UDP, or a serial link, or a named pipe, or a Unix pipe, or any kind of interprocess communication channel, but not IP multicast because it is not end-to-end).

All the syntaxes that allow creating a valid URI from a specified protocol, hostnames, port numbers, or transport options should be specifiable in $wgServerName (because it's not up to MediaWiki to control these options, but up to the webserver hosting or relaying PHP). MediaWiki can perfectly well be protocol-agnostic; it just has to assume that the underlying protocol will support any kind of hierarchical URI scheme.

That's why the server APIs give you the $_SERVER array: these variables are not meant to be simplified into a single $wgServerName, but if you do that, you have to take care that the transform from $_SERVER[] to $wgServerName remains bijective (within the allowed limits of the URI RFCs).
Comment 21 Allen Stambaugh 2010-10-08 01:30:42 UTC
Philippe

Your opinions aside there is no need for MediaWiki to support the use of ethernet addresses within $wgServerName. Unless you have something constructive to add concerning this bug you need not respond.

To review. MediaWiki is designed for use on a web server using PHP and a SQL database. (MySQL, PostgreSQL, etc.) This bug concerns the lack of brackets around IPv6 addresses. Because MediaWiki is designed to be dependent (by default) on IP, and IPv4 address depletion is expected sometime in 2011 this bug should be resolved soon.
Comment 22 Chad H. 2010-10-08 01:48:19 UTC
(In reply to comment #21)
> Your opinions aside there is no need for MediaWiki to support the use of
> ethernet addresses within $wgServerName. Unless you have something constructive
> to add concerning this bug you need not respond.

Could've said it a bit nicer, but...I agree wholeheartedly, there is no reason for us to start supporting every possible protocol under the sun. Without a solid use case, it simply is not worth the effort.

> To review. MediaWiki is designed for use on a web server using PHP and a SQL
> database. (MySQL, PostgreSQL, etc.) This bug concerns the lack of brackets
> around IPv6 addresses. Because MediaWiki is designed to be dependent (by
> default) on IP, and IPv4 address depletion is expected sometime in 2011 this
> bug should be resolved soon.

Yes, it should be fixed (sooner rather than later). Tagging this with +bugsmash. Shouldn't be *too* hard to crank out a solution over the course of that weekend.
Comment 23 Chad H. 2011-02-12 20:38:42 UTC
Bumping priority a bit. In doing some local testing, one major issue immediately becomes clear:

When no title is specified in the URL (for example, http://[::1]/wiki/), OutputPage::output() will try to 301 redirect you to http://::1/wiki which does not work. This goes for anything using OutputPage::redirect() really.

If you manually set $wgServer in LocalSettings (eg: 'http://[::1]'), then everything works without a hitch as far as I can tell.
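
As a fragment, the workaround described here would look roughly like this in LocalSettings.php. The address is the loopback example used in this comment; a real wiki would use its own address.

```php
<?php
// LocalSettings.php (fragment): pin the server URL instead of relying on
// auto-detection. '[::1]' is the loopback example from this comment;
// substitute the wiki's real bracketed address.
$wgServer = 'http://[::1]';
```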

(In reply to comment #13)
> There is an apache bug report opened :
> https://issues.apache.org/bugzilla/show_bug.cgi?id=26005

Yes. This is really Apache (IIS? Lighty? Nginx?) giving us a raw ::1 without brackets in SERVER_NAME, which we choose in DefaultSettings over HTTP_HOST (which does have the brackets, interestingly, and works if I switch it).

Maybe we should switch our order of preference here. I saw a table comparing $_SERVER variable support between the different webservers some time ago, but I can't seem to dig it up at the moment.
Comment 24 Antoine "hashar" Musso (WMF) 2011-02-13 10:41:29 UTC
One way would be to check the SERVER_NAME variable for colons, and enclose it in brackets automatically.
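
A rough sketch of that colon check. wrapIfColons() is a hypothetical name, and the "two or more colons" threshold is an assumption added here so that a "host:port" value, as raised in comment 19, is not bracketed by mistake.

```php
<?php
/**
 * Heuristic bracket wrapping based only on counting colons.
 * wrapIfColons() is a hypothetical name.
 */
function wrapIfColons(string $serverName): string
{
    // Already bracketed (possibly with a port): leave as is.
    if ($serverName !== '' && $serverName[0] === '[') {
        return $serverName;
    }
    // Two or more colons cannot be "host:port", so treat the value as IPv6.
    if (substr_count($serverName, ':') >= 2) {
        return '[' . $serverName . ']';
    }
    return $serverName;
}
```

This does no validation of the address itself; it only decides whether brackets are needed.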
Comment 25 Philippe Verdy 2011-03-05 00:12:48 UTC
Extensive tests must now be handled as a high priority. In the next few months, we'll start experiencing the problem of users who won't be able to use Wikimedia sites in some countries, just because they won't be able to get a stable enough IPv4 address (IPv4 support will be through proxies, and there will be problems in validating the proxies to make sure that they properly handle proxy signaling on their HTTP requests, or identify sessions), and who also won't be able to use HTTPS for strong identification to mitigate this problem.

Promoting the use of IPv6 should also be an easy alternative if their temporary (and shared) IPv4 address is compromised (by some unrelated abusers), because it won't be acceptable to block IPv4 addresses without making sure that IPv6 support is there and working for those users.

If we don't act now, we will be left without an easy solution to fight against abuse. Those users may eventually still use HTTPS, but it will have a severe performance impact on servers if HTTPS usage suddenly increases.

So there's an immediate need to test for complete support, at least for the major MediaWiki servers, notably those from Wikimedia due to their huge worldwide traffic, but also all other services that should be candidates for testing their deployment if they use a webserver system other than Apache. The tests should then include IIS, and progressively all webservers that support PHP builds (including the various incarnations of FastCGI), on Linux, BSD, Windows (IIS), or application servers (Oracle, IBM WebSphere...).

Now the minutes are being counted. The "Bug Bang" is about to appear; do we have to wait for a major connectivity failure or the start of major abuse attacks through 6to4 or Teredo servers or the many proxies that we'll consider unsecured open relays throughout the world? Are we ready to support a sudden increase in the use of HTTPS? And the loss of support for HTTP/1.1 sessions in favor of a sudden increase in isolated HTTP sessions (one for each request)?

Note how the various ISPs worldwide are very late in deploying IPv6; the only things they have tested for now are Teredo or 6to4 relays, but only for a small part of the traffic, as they think that almost everything is cacheable and shareable to support the load. This may be true for media delivery sites (including images, sound and videos), but not for interactive sites like Wikimedia and all wikis and blogs in general.

The largest interactive sites are already OK for IPv6 (including Google, Yahoo, Microsoft, Facebook). If we don't resolve this connectivity problem very soon, Wikimedia's popularity could decline extremely rapidly (remember that it depends not only on individual users, but also on their ability to socialize on these sites; if a friend or important team member can no longer participate easily, due to the technical measures that their ISP may take to mitigate the lack of IPv4 addresses to support all their users, or due to the increased cost of their IPv4 address pool and a drastic reduction of this pool, as seems to be the case given that they seem to favor solutions like LSN = Large-Scale NAT, it could be catastrophic for Wikimedia).

Note that some interactive networks have already chosen to stop using MediaWiki for their wikis (note the sudden rise of Mediapress, which also offers more interesting 2.0 social features, simpler administration by users, a much less important role for superusers, and better syndication mechanisms).
Comment 26 Philippe Verdy 2011-03-05 00:51:16 UTC
Note: if the canonical IPv6 address format (with its colons) causes so many compatibility problems to fix, why not convert it to a DNS-compatible address format in the .arpa domain?

This provides an immediate fix, as this special IPv6 domain immediately resolves all fully specified IPv6 addresses in this domain, without needing any request or prior registration with any DNS server (this is resolved locally, without needing any status response from a DNS server, as the only acceptable AAAA response from a DNS server, if successful, can ONLY be the same IPv6 address, and everything else must be marked as an incorrect bogus response, for obvious security reasons).

You only need to perform a DNS request if the domain name specified in the special ipv6 .arpa domain is only partly specified (i.e. missing one or more digits). IP resolver libraries can also validate the format of names in this special ipv6 .arpa domain. They also do this local resolution for obvious performance reasons. A true DNS request is only needed to request something other than an AAAA record (for example an ANY request, or a request for the highly recommended, but still optional, associated CNAME).

We don't really need this CNAME to forward requests from a webserver to a slave PHP or FastCGI process, or from a caching proxy to a webserver running in a firewalled local network (like on Wikimedia sites).

Can this be tested on BIND, or in the Windows, OSX, Linux, and Unix IP resolvers, i.e. through the socket/Winsock API gethostbyname("(...).arpa"), to see if they really need to perform a DNS request to resolve those IPv6 addresses fully specified in this special ipv6 .arpa domain? If it works, it could immediately help solve some integration problems with various software that needs an update to support the brackets in hostnames built from the canonical IPv6 address format (or any one of its abbreviated formats using "::" and stripping any other leading zeroes).
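
For anyone wanting to run such a test, the conversion itself can be sketched as below, using the nibble format that RFC 3596 defines for ip6.arpa. Note that RFC 3596 defines these names for reverse (PTR) lookups; whether a resolver would accept them as forward host names is exactly what would need to be tested. toIp6Arpa() is a hypothetical name.

```php
<?php
/**
 * Converts an IPv6 address to its ip6.arpa nibble-format name (RFC 3596).
 * toIp6Arpa() is a hypothetical name. Returns null for non-IPv6 input.
 */
function toIp6Arpa(string $address): ?string
{
    if (filter_var($address, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6) === false) {
        return null;
    }
    $packed = inet_pton($address);                  // 16 raw bytes
    $nibbles = str_split(strrev(bin2hex($packed))); // 32 hex digits, lowest first
    return implode('.', $nibbles) . '.ip6.arpa';
}
```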

Can we enumerate all the integration problems that need to be fixed (by various developer teams, not necessarily synchronized with each other)? Of course, bug tickets must be opened extensively with all these teams. Wikimedia sites are large enough and deployed worldwide, so this could push all the other teams to correct their own software.

This ticket is really slow; it has not moved in more than 2 years. Don't blame the ISPs too much for not pushing IPv6 to all users for now. This is a chicken-and-egg problem: everyone must advance as soon as possible in this area, without waiting for others to fix their own bugs/limitations.

Like it or not, the world is going to use mobile networks increasingly, and the number of IP-enabled devices for each user is exploding worldwide. On the mobile internet, IPv6 is already widely deployed (sometimes without any support for IPv4, or with extremely severe protocol restrictions, allowing only access to cacheable shared content through HTTP proxies, and very slow connections over SSL/TLS, or not allowing the use of SSL/TLS for anything other than identification or small commercial transactions through HTTP form data POST requests limited in volume for both the data sent and the reply, and strict limitations on the number of those requests forwarded by users with HTTPS over their proxies!).

This means that HTTPS will not work for interacting in wikis (this is already a problem for social networks, and that's why most mobile networks have integrated their own proprietary support platforms for supporting Facebook, Twitter, or MSN, and sometimes even require additional subscriptions in their data plans, plus a compatible terminal, whose firmware has been specially tweaked to allow the interaction).

There's no alternative other than full IPv6 support as soon as possible. Otherwise there's a huge risk of fragmentation of the Internet as a unified, open and interoperable network (these risks are already measurable today; interoperability is already decreasing, with lots of black holes on the Internet and severe access restrictions, much less dependent on political decisions than heavily influenced by very bad commercial practices and laziness from commercial services to change their technical platform or to invest in its evolution).
Comment 27 Alexandre Emsenhuber [IAlex] 2011-04-02 19:36:19 UTC
Back to General/Unknown, this is not an installation bug since $wgServer is discovered on each request and not only once in the installation process.
Comment 28 Philippe Verdy 2011-04-04 16:09:54 UTC
You don't need to discover it on each request; it can be detected at installation time because this is a property of the web server software running the PHP instance where MediaWiki is installed. And this property can effectively be tested automatically at installation time: you just need to check whether the site has IPv6 connectivity, and you can perfectly well run a client test against that server, going to a small PHP script returning the values sent to the PHP environment variables (you don't even need to perform the full installation of MediaWiki to perform this discovery test; all you need is to check that PHP is available and runs; to test it, just perform a connection from the setup shell, using a localnet IPv6 address).
Another note: the port number used for IPv6 to reach the web server is not necessarily the same as the port number used for IPv4. To reach the webserver in order to install MediaWiki, you first need to check the connectivity. In addition, in some webservers, you may have distinct IPv4 and IPv6 connectivity between various hosted domains, due to virtualization (notably by access proxies).
When MediaWiki is installed, it should be in a state where users cannot connect or see any content or create an account, until the site admin has run an integration test and enabled the wiki publication. So there's normally a post-install test script to run, to test the various front-proxy/webserver/PHP/MediaWiki settings (including security restrictions and .access files).
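
The "small PHP script" mentioned earlier in this comment could look roughly like the sketch below. It is only an illustration: the function name describeServerVars() and the choice of variables are assumptions, not part of MediaWiki or its installer. It reports what the web server hands to PHP for the current request, so that a truncated or unbracketed IPv6 host value can be spotted before installing anything else.

```php
<?php
/**
 * Hypothetical probe helper (not part of MediaWiki): format the
 * host-related entries of a $_SERVER-like array as "KEY=value" lines.
 */
function describeServerVars( array $server ) {
	// The selection of keys is an assumption made for this sketch.
	$keys = array( 'SERVER_NAME', 'HTTP_HOST', 'SERVER_ADDR', 'SERVER_PORT' );
	$lines = array();
	foreach ( $keys as $key ) {
		$value = isset( $server[$key] ) ? $server[$key] : '(unset)';
		$lines[] = $key . '=' . $value;
	}
	return implode( "\n", $lines );
}

// When served by the web server (not the command line), print what PHP
// received for this request as plain text.
if ( PHP_SAPI !== 'cli' ) {
	header( 'Content-Type: text/plain' );
	echo describeServerVars( $_SERVER ), "\n";
}
```

Requesting such a script once through an IPv4 address and once through a bracketed IPv6 address, then comparing the two outputs, would show whether the server passes the IPv6 host through intact.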
Comment 29 Chad H. 2011-04-04 17:48:06 UTC
(In reply to comment #28)
> When MediaWiki is installed, it should be in a state where users cannot connect
> or see any content or create an account, until the site admin has run an
> integration test and enabled the wiki publication. So there's normally a
> post-install test script to run, to test the various
> front-proxy/webserver/PHP/MediaWiki settings (including security restrictions
> and .access files).

What in blazes are you going on about?
Comment 30 p858snake 2011-05-10 06:06:21 UTC
(In reply to comment #28)
> When MediaWiki is installed, it should be in a state where users cannot connect
> or see any content or create an account, until the site admin has run an
> integration test and enabled the wiki publication.
No... That quite frankly is a stupid idea.
Comment 31 Philippe Verdy 2011-05-10 06:12:51 UTC
(In reply to comment #30)
> No... That quite frankly is a stupid idea.

Moderate your words! It is certainly not "stupid" (sic!) to check the security by running a series of assertion tests to detect deficiencies before putting the site online. If you don't do that, the wiki admins will only learn about security problems once the site has already been compromised. So a post-install test suite should run and generate a report, allowing wiki admins to fix their setup before putting the site fully online, using the provided wiki admin tools, which would first show them the report.
Comment 32 Chad H. 2011-05-10 15:23:13 UTC
Then open a new bug, it has absolutely *nothing* to do with adding brackets to IPv6 addresses.

(And also, basic security checks like register_globals, 5.3.1, etc are done by the installer anyway...)
Comment 33 Philippe Verdy 2011-05-10 17:14:41 UTC
Impolite, once again...

When you don't want to read and just give insults, I won't follow your advice; others than you will read. IPv6 support is not just an RFE: we have seen above that this was a security issue, and that full support for it required compatibility and security tests, depending on the webserver used and on what is compiled into the PHP version used on that server.

IPv6 is now already deployed and is the only safe alternative for connecting to many wiki sites (not just Wikimedia), because many users have limited support for sessions in IPv4 (this works for browsing, but not very well for contributing/editing), or need to pass through shared proxies provided by their ISP (which cause lots of problems on large wikis, notably abuse that can't simply be filtered by IPv4 address). This now includes many users connected via broadband mobile networks.
Comment 34 Antoine "hashar" Musso (WMF) 2011-05-11 03:24:08 UTC
Looks like I partly fixed this bug some weeks ago.

r83847 - commit log:
------------8<---------------------8<--------------------
setting servername with an IPv6 request must ensure we have both brackets

On lighttpd 1.4.28, the SERVER_NAME CGI variable is truncated at the first
colon. This makes it return an incorrect value for SERVER_NAME when the user
makes the request to an IPv6 address; it outputs something like [2001.

This patch makes sure we have either both opening and closing brackets or no
brackets at all (hence the 'xor' boolean check).
------------8<---------------------8<--------------------
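
For readers without access to the revision, the kind of check described in the commit log can be sketched as follows. This is not the code from r83847: the helper name normalizeHostForUrl() and the decision to return null for a truncated value are assumptions made for illustration. The sketch rejects a host with only one of the two brackets (the 'xor' case) and additionally wraps a bare IPv6 literal in brackets, as a URL requires.

```php
<?php
/**
 * Hypothetical helper (not MediaWiki's actual code): return a host string
 * usable in a URL, or null when the value looks truncated.
 */
function normalizeHostForUrl( $host ) {
	$hasOpen = substr( $host, 0, 1 ) === '[';
	$hasClose = substr( $host, -1 ) === ']';

	// Exactly one bracket present, e.g. "[2001" as reported for
	// lighttpd 1.4.28: the value cannot be trusted.
	if ( $hasOpen xor $hasClose ) {
		return null;
	}

	// A bare IPv6 literal must be wrapped in brackets inside a URL.
	if ( !$hasOpen
		&& filter_var( $host, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6 ) !== false
	) {
		return '[' . $host . ']';
	}

	// Domain names, IPv4 addresses and already-bracketed literals pass through.
	return $host;
}
```

A caller that receives null would then need to fall back to another source for the host name; which source is appropriate depends on the web server, so that part is deliberately left out here.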
Comment 35 Allen Stambaugh 2011-11-20 03:23:50 UTC
Unless I am mistaken, this bug will be closed with or after the release of version 1.18. The version 1.18.0rc1 installer sets $wgServer in LocalSettings.php. I think that automatic setting of that variable will be deprecated. Until then, I suggest adding the following code to LocalSettings.php and, if necessary, modifying it to work with your web server.

## This sets $wgServer including the protocol. Uncomment the last line if you need to use different port number(s).
#$wgServer = ( isset( $_SERVER['HTTPS'] ) && $_SERVER['HTTPS'] == 'on' ) ? 'https://'.$_SERVER['SERVER_NAME'] : 'http://'.$_SERVER['SERVER_NAME'];
#$wgServer = ( isset( $_SERVER['HTTPS'] ) && $_SERVER['HTTPS'] == 'on' ) ? 'https://'.$_SERVER['HOSTNAME'] : 'http://'.$_SERVER['HOSTNAME'];
$wgServer = ( isset( $_SERVER['HTTPS'] ) && $_SERVER['HTTPS'] == 'on' ) ? 'https://'.$_SERVER['HTTP_HOST'] : 'http://'.$_SERVER['HTTP_HOST'];
#$wgServer = ( isset( $_SERVER['HTTPS'] ) && $_SERVER['HTTPS'] == 'on' ) ? 'https://'.$_SERVER['SERVER_ADDR'] : 'http://'.$_SERVER['SERVER_ADDR'];
#$wgServer .= ( isset( $_SERVER['HTTPS'] ) && $_SERVER['HTTPS'] == 'on' ) ? ':8443' : ':8080';

For reference I am leaving a link to the Lighttpd bug page.

http://redmine.lighttpd.net/issues/2333
Comment 36 Philippe Verdy 2011-11-23 08:16:42 UTC
I also confirm the low priority here. Most wiki websites, including those only for intranet use, will use a domain name and not refer to the server by an IP address, whether IPv4 or IPv6. So this is a non-issue for the creation of URLs.

In any case, no large web site wants to be referred to by IP address; they want a domain name, to help balance the load across multiple servers or front proxies with the help of the DNS system.

This is also needed when changing web hosting providers, or if there are alternate providers, or if for some reason a server must be stopped or fails and the traffic is redirected to another server: IP addresses are not intended to be stable over time (the myth of the "static IP address" has had its day, even in the absence of NAT routing). Everyone uses a domain name because it is really a cheap solution with lots of benefits.

It is only an issue if there are limitations in the webserver software itself (but IPv6 support in Apache, or even IIS, has been present for a long time), or in some server log analysis tools, or with some proxy software (no longer an issue for proxies used by the Wikimedia Foundation). We should really not recommend installing MediaWiki on servers without a domain name, or at least a hostname on a private LAN (and on private LANs, IPv4 still works without limitations; IPv6 is mostly for the Internet).
Comment 37 Allen Stambaugh 2011-11-23 18:57:42 UTC
(In reply to comment #36)
> It is only an issue if there are limitations in the webserver software itself
> (but IPv6 support in Apache, or even IIS, has been present for a long time),
> or in some server log analysis tools, or with some proxy software (no longer
> an issue for proxies used by the Wikimedia Foundation). We should really not
> recommend installing MediaWiki on servers without a domain name, or at least
> a hostname on a private LAN (and on private LANs, IPv4 still works without
> limitations; IPv6 is mostly for the Internet).

With some exceptions, I agree, but we should allow for those who wish to develop content over the Internet before adding a domain name, and for testing and experimentation.
Comment 38 Bawolff (Brian Wolff) 2011-11-23 19:06:52 UTC
Tim fixed this a while back (r90105) and presumably forgot to mark this bug as fixed.


Marking fixed.
Comment 39 Philippe Verdy 2011-11-24 05:30:57 UTC
Domain names are free if you want to try from home: dyndns.org can provide you with a domain name for such trials if you want to see how your website behaves from the Internet.

But for most tests, you just need to use a local hostname. Or you can experiment from the same host as the webserver used for development, and then you can reference it as 127.0.0.1 (yes, only IPv4), or define a hostname entry in your local /etc/hosts to associate a name with an IPv6 address (this also works on Windows).

Really, I hope I will never see any site using URLs containing a [bracketed:IP:v6::address] for its hostname. If such URLs exist, they should only be for the configuration pages used to set up devices on the local network, before the device itself gets assigned a hostname. But everything connected to the Internet should have a hostname (including mobile devices: it's up to the ISP to designate a usable domain name in its reverse IP DNS database).

And even if there's still no hostname assigned, it would be best to use the hostname from the special .arpa TLD, even if it creates longer names: 64 characters plus the special .arpa subdomain for IPv6 registrations. I think, though, that the .arpa registrations will never go beyond the first 64 bits of the IPv6 address, the rest being private, so only 32 characters would be needed for the registered IPv6 address block, plus the length of the .arpa subdomain for IPv6; as a prefix we can simply use the compact hex representation of the remaining 64 bits, creating a 16-hex-digit label. If you have a local DNS server, it will remap this 16-hex-digit label into subnetwork labels if needed, or it will assign the true domain name you want for your website.

In other words: we really don't need to support IPv6 hostnames for the local server, but we need it only in the parser for
- external servers (which are not necessarily webservers running MediaWiki, but may be hardware devices, including those found by default host discovery mechanisms, such as configurable routers, or unconfigured network printers or TV/media decoder sets...),
- or, much more commonly, for the identification of remote clients (in the server logs for example, or in the early stages of setting up a secured channel when a client wants to hide their public identity, for privacy or strict confidentiality reasons, and only wants to reveal it to the server on the secure channel after it has been established on an incoming IPv6 request).
