Last modified: 2014-05-30 06:17:34 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T24709, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 22709 - IIS7.5 is mishandling redirects generated by OutputPage::output() when the URL contains special characters
Status: REOPENED
Product: MediaWiki
Classification: Unclassified
Component: Redirects
Version: 1.16.x
Hardware: All Windows 7
Importance: Low major
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: patch, patch-reviewed, upstream
Depends on:
Blocks: iis
Reported: 2010-03-03 01:06 UTC by Lisa Ridley
Modified: 2014-05-30 06:17 UTC (History)

Description Lisa Ridley 2010-03-03 01:06:16 UTC
This issue was originally reported on the MediaWiki Users forums in this thread:

http://www.mwusers.com/forums/showthread.php?14312-MediaWiki-and-IIS7-URL-bug

The original reporter tested this on MW versions 1.13.2 and 1.15.1, with PHP 5.2.9 and IIS7.5 on Microsoft Server 2008.  I have reproduced the behavior on a VMware virtual machine running Windows 7 Ultimate, with IIS 7.5, PHP 5.2.12, and MW versions 1.15.1 and 1.16alpha.

The test urls used to generate this error are:

http://localhost/w/index.php?title=á:á
http://localhost/w/index.php?title=á:Test

The issue arises when the title contains a colon and one or more special characters precede that colon.

When MediaWiki encodes the title and sends redirect headers in OutputPage::output(), IIS7.5 returns a 404.0 error and reports the requested URL in the following form:

http://localhost:80/w/index.xn--php?title=-14a:%C3%A1 (first url above)
http://localhost:80/w/index.xn--php?title=-14a:Test (second url above)

So, at some point it appears that IIS7.5 is rewriting the URL redirect.

Setting $wgDebugRedirects = true; circumvents this error by generating a link to the redirect page instead.

This may in fact be an IIS7.5 bug, but at this point Microsoft's stance is that this is an application error.

This issue was found and tested against IIS7.5 but may in fact have been introduced in IIS7.

This issue is browser independent; it has been recreated in IE8, IE7 and Firefox 3.6.
Comment 1 Lisa Ridley 2010-03-03 05:44:22 UTC
Update:

A redirect URL in which the colon is encoded as well is accepted by IIS7.5 and redirects to the appropriate page within MediaWiki.

For example, changing wfUrlencode() so that the colon symbol is excluded from the call to str_ireplace() in wfUrlencode() will result in an encoded URL:

http://localhost/w/index.php?title=%C3%81%3A%C3%A1

which is accepted by IIS7.5 and redirected properly rather than generating a 404.0 error; the resulting MediaWiki page that is presented to the user has the correctly formatted title; however the parameters in the url are fully encoded.
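
A minimal sketch of the variant described above, assuming the same replacement list as wfUrlencode() with only the colon pair dropped. The function name wfUrlencodeKeepColon() is hypothetical and is used here only for illustration; it is not part of MediaWiki:

```php
<?php
// Hypothetical helper for illustration only; not part of MediaWiki.
function wfUrlencodeKeepColon( $s ) {
    $s = urlencode( $s );
    // Same list as wfUrlencode(), minus the '%3A' => ':' pair,
    // so every colon stays percent-encoded.
    return str_ireplace(
        array( '%3B', '%40', '%24', '%21', '%2A', '%28', '%29', '%2C', '%2F' ),
        array(   ';',   '@',   '$',   '!',   '*',   '(',   ')',   ',',   '/' ),
        $s
    );
}

// "\xC3\x81" is the UTF-8 encoding of "Á" and "\xC3\xA1" of "á".
echo wfUrlencodeKeepColon( "\xC3\x81:\xC3\xA1" ), "\n"; // prints %C3%81%3A%C3%A1
```

The trade-off is the one noted above: the colon is encoded in every title, including plain ASCII ones such as Talk:Foo.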
Comment 2 Lisa Ridley 2010-03-03 08:36:40 UTC
A possible fix for this issue is as follows:

Change the global function wfUrlencode() to the following:

function wfUrlencode( $s ) {
    $s = urlencode( $s );
    $s = str_ireplace(
        array( '%3B','%3A','%40','%24','%21','%2A','%28','%29','%2C','%2F' ),
        array(   ';',  ':',  '@',  '$',  '!',  '*',  '(',  ')',  ',',  '/' ),
        $s
    );

    ## check to see if server is running Microsoft IIS 7 or greater; if so converts colons back to url encoded values
    if(isset($_SERVER['SERVER_SOFTWARE'])) {
        $server = explode("/", $_SERVER['SERVER_SOFTWARE']);
        if($server[0] == "Microsoft-IIS" && ($server[1]=='7'||$server[1]=='7.5')) {
            if(!(strpos($s, '%')===false)) {
                $s = str_ireplace(array(':'), array('%3A'), $s);
            }
        }
    }
    return $s;
}

I have tested this, and preliminarily it appears to have some promise.  I want to make sure that IIS always sets a value for $_SERVER['SERVER_SOFTWARE'] before creating a patch and/or modifying trunk.

Another option would be to exclude colons from the str_ireplace() process in wfUrlencode() regardless of the web server; however, this would leave the colon encoded in every instance, which is probably not desirable, since the problem only comes into play when there are special characters in the title prefix.
Comment 3 Lisa Ridley 2010-03-03 18:28:00 UTC
Revision to the fix above:

function wfUrlencode( $s ) {
	$s = urlencode( $s );
	$s = str_ireplace(
		array( '%3B','%3A','%40','%24','%21','%2A','%28','%29','%2C','%2F' ),
		array(   ';',  ':',  '@',  '$',  '!',  '*',  '(',  ')',  ',',  '/' ),
		$s
	);

	## check to see if server is running Microsoft IIS 7 or greater; if so converts colons back to url encoded values
	if(isset($_SERVER['SERVER_SOFTWARE'])) {
		$match = preg_match('|(Microsoft-IIS/7)|', $_SERVER['SERVER_SOFTWARE'], $a);
		if($match > 0) {
			if(!(strpos($s, '%')===false)) {
				$s = str_ireplace(array(':'), array('%3A'), $s);
			}
		}
	}
	return $s;
}
Comment 4 Platonides 2010-03-03 18:39:27 UTC
Does it also fail when the domain name has a dot?

From earlier reports it seems that IIS gets confused when the domain does not contain a dot (eg. localhost) and there's a colon in the url.
Comment 5 Lisa Ridley 2010-03-03 18:42:00 UTC
Yes.  I've tested this using http://localhost and using http://www.example.com (setting the domain name in the hosts file).

I don't have an internet accessible server with Windows so I haven't tested this on an actual web-accessible server, just in a local test environment.
Comment 6 Lisa Ridley 2010-03-03 21:14:22 UTC
Just got a report back from the user who initially reported this problem with IIS7; he has indicated that the fix posted above is working on his installation.  His wiki is on a company intranet so there is no publicly available link.
Comment 7 Platonides 2010-03-03 22:02:07 UTC
Fixed with a different implementation on r63228.
Comment 8 Lisa Ridley 2010-03-04 22:26:55 UTC
Microsoft has confirmed that this is a bug in IIS 7.5:

http://forums.iis.net/p/1165517/1936321.aspx#1936321

I am reopening this bug.  The fix that was implemented in r63228 encodes all occurrences of the colon in the title parameter of a url.

This is not needed except in cases where certain special characters precede the colon.  However, since IIS appears to pick characters at random when applying what looks like Punycode encoding to the URL, I am in favor of encoding the colon whenever any special character precedes it.  Encoding every colon may be a bit of overkill.
Comment 9 Platonides 2010-03-05 00:11:18 UTC
Does it only fail if the special characters are /before/ the colon?

Also, do you have any idea on what it considers 'special'? (we should probably go ASCII-7)
Comment 10 p858snake 2010-03-05 00:14:52 UTC
+upstream
This is an issue with IIS according to that forum report, although we can still patch around it.
Comment 11 Lisa Ridley 2010-03-05 01:24:18 UTC
Yes.  It only fails if the special character falls before the colon.  URL parameter strings with no colon, or with special characters only after the colon appear to be fine.

This appears to occur primarily with lowercase Latin Extended characters, lowercase Cyrillic characters, and some lowercase Greek characters (at least that's what I've identified so far).
Comment 12 Lisa Ridley 2010-03-05 01:24:39 UTC
Yes.  It only fails if the special character falls before the colon.  URL parameter strings with no colon, or with special characters only after the colon appear to be fine.

This appears to occur primarily with lowercase Latin Extended characters, lowercase Cyrillic characters, and some lowercase Greek characters (at least that's what I've identified so far).
Comment 13 Bryan Tong Minh 2011-05-14 14:33:13 UTC
If I understand correctly, we only need to encode the colon if a special character is present? A quick fix would be /[A-Za-z0-9]*/.
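
A rough sketch of how that check might look, offered as an illustration and not as the fix committed in r63228. The function name wfUrlencodeColonAware() is hypothetical. It also assumes that testing the text before the last colon is sufficient, so any character outside [A-Za-z0-9] in that prefix (including an earlier colon or an underscore) causes all colons to be encoded:

```php
<?php
// Hypothetical helper for illustration only; not part of MediaWiki.
function wfUrlencodeColonAware( $s ) {
    $encoded = urlencode( $s );
    $encoded = str_ireplace(
        array( '%3B', '%3A', '%40', '%24', '%21', '%2A', '%28', '%29', '%2C', '%2F' ),
        array(   ';',   ':',   '@',   '$',   '!',   '*',   '(',   ')',   ',',   '/' ),
        $encoded
    );

    // Assumption: only the text before the last colon needs to be checked.
    $pos = strrpos( $s, ':' );
    if ( $pos !== false ) {
        $prefix = substr( $s, 0, $pos );
        // Anything outside plain ASCII letters and digits counts as "special".
        if ( preg_match( '/[^A-Za-z0-9]/', $prefix ) === 1 ) {
            $encoded = str_replace( ':', '%3A', $encoded );
        }
    }
    return $encoded;
}

echo wfUrlencodeColonAware( "Talk:Foo" ), "\n";         // prints Talk:Foo
echo wfUrlencodeColonAware( "\xC3\xA1:Test" ), "\n";    // prints %C3%A1%3ATest
```

This errs on the side of over-encoding: a prefix such as User_talk contains an underscore, so its colon would be encoded even though it may not trigger the IIS problem.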
