Last modified: 2007-04-30 21:02:50 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T9629, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 7629 - IE6/IE7 user-agent strings involving "InfoPath" detect as non-Unicode-compliant
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: Page editing
Version: 1.10.x
Hardware: PC Windows XP
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody
URL: http://<found-in-intranet-site-so-no-...
Keywords: patch, patch-need-review
Depends on:
Blocks:

Reported: 2006-10-19 10:00 UTC by CyberCougar
Modified: 2007-04-30 21:02 UTC


Description CyberCougar 2006-10-19 10:00:09 UTC
When editing a page using Internet Explorer 6, MediaWiki says "Your browser is not unicode-compliant, blahblah".

I did some debugging and found that the file includes\DefaultSettings.php defines several regular-expression patterns for non-compliant browsers:

$wgBrowserBlackList = array(
        /**
         * Netscape 2-4 detection
         * The minor version may contain strings such as "Gold" or "SGoldC-SGI"
         * Lots of non-netscape user agents have "compatible", so it's useful to check for that
         * with a negative assertion. The [UIN] identifier specifies the level of security 
         * in a Netscape/Mozilla browser, checking for it rules out a number of fakers. 
         * The language string is unreliable, it is missing on NS4 Mac.
         * 
         * Reference: http://www.psychedelix.com/agents/index.shtml
         */
        '/^Mozilla\/2\.[^ ]+ .*?\((?!compatible).*; [UIN]/',
        '/^Mozilla\/3\.[^ ]+ .*?\((?!compatible).*; [UIN]/',
        '/^Mozilla\/4\.[^ ]+ .*?\((?!compatible).*; [UIN]/',  # NOTE THIS PATTERN

        /**
         * MSIE on Mac OS 9 is teh sux0r, converts þ to <thorn>, ð to <eth>, Þ to <THORN> and Ð to <ETH>
         *
         * Known useragents:
         * - Mozilla/4.0 (compatible; MSIE 5.0; Mac_PowerPC)
         * - Mozilla/4.0 (compatible; MSIE 5.15; Mac_PowerPC)
         * - Mozilla/4.0 (compatible; MSIE 5.23; Mac_PowerPC)
         * - [...]
         *
         * @link http://en.wikipedia.org/w/index. ... &oldid=12355864
         * @link http://en.wikipedia.org/wiki/Template%3AOS9
         */
        '/^Mozilla\/4\.0 \(compatible; MSIE \d+\.\d+; Mac_PowerPC\)/'
);

And in the file includes\EditPage.php, the current browser's USER-AGENT string is checked against the 
patterns:

        function checkUnicodeCompliantBrowser() {
                global $wgBrowserBlackList;
                if( empty( $_SERVER["HTTP_USER_AGENT"] ) ) {
                        // No User-Agent header sent? Trust it by default...
                        return true;
                }
                $currentbrowser = $_SERVER["HTTP_USER_AGENT"];
                foreach ( $wgBrowserBlackList as $browser ) {
                        if ( preg_match($browser, $currentbrowser) ) {
                                return false;
                        }
                }
                return true;
        }

Note the 3rd pattern:
        '/^Mozilla\/4\.[^ ]+ .*?\((?!compatible).*; [UIN]/',
It matches the IE6 User-Agent string on my machine, which is shown below:
  Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; iebar; .NET CLR 1.1.4322; InfoPath.1)

Please note the "InfoPath.1" part near the end of the string. It is there because I installed InfoPath, a component of Microsoft Office 2003. Its leading letter 'I' makes the string match the 3rd pattern.
Comment 1 Brion Vibber 2006-10-19 17:25:14 UTC
IE 6.0 hasn't ever triggered this in the wild that we know of. Have you done something 
strange to customize your user-agent string? Can you confirm that it works properly 
when restored to normal?
Comment 2 Brion Vibber 2006-10-19 17:38:36 UTC
Ahh I see
Comment 3 CyberCougar 2006-10-24 03:33:39 UTC
After installing .NET Framework 1.1 and MS Office 2003 (including the InfoPath component), my IE's user-agent string changed to:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; iebar; .NET CLR 1.1.4322; InfoPath.1)

InfoPath is a standard component of MS Office 2003 Pro, not a WEIRD PLUGIN.
Comment 4 Konstantin V. Bekreyev 2007-04-24 09:55:32 UTC
This regexp array does not handle IE7 correctly when it sends this User-Agent string:

'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Maxthon; MRA 4.8 (build 01705); InfoPath.1; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)'

MediaWiki shows this message:

''WARNING: Your browser is not unicode compliant. A workaround is in place to 
allow you to safely edit articles: non-ASCII characters will appear in the edit 
box as hexadecimal codes.''

But this browser supports UTF-8!
Comment 6 Aryeh Gregor (not reading bugmail, please e-mail directly) 2007-04-24 16:22:43 UTC
It seems like the intent is to allow browsers that don't have "compatible"
there, but the regex doesn't work.  Should probably be

-	'/^Mozilla\/2\.[^ ]+ .*?\((?!compatible).*; [UIN]/',
-	'/^Mozilla\/3\.[^ ]+ .*?\((?!compatible).*; [UIN]/',
-	'/^Mozilla\/4\.[^ ]+ .*?\((?!compatible).*; [UIN]/',
+	'/^Mozilla\/2\.[^ ]+ [^(]*\((?!compatible).*; [UIN]/',
+	'/^Mozilla\/3\.[^ ]+ [^(]*\((?!compatible).*; [UIN]/',
+	'/^Mozilla\/4\.[^ ]+ [^(]*\((?!compatible).*; [UIN]/',

The .*? is screwed up by the nested parentheses (it eats the initial parenthesis
to avoid the prohibited "compatible" string).  Perl regex is all very nice, but
POSIX-style is better here.  Patch needs review.
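
A minimal sketch for reviewers, comparing the Mozilla/4.x pattern before and after the proposed change on the IE7 string from comment 4 (joined onto one line). The Netscape 4 string is an assumed, illustrative sample that does not come from this report, and the variable names are hypothetical.

```php
<?php
// Mozilla/4.x pattern before and after the change proposed in the diff above.
$original = '/^Mozilla\/4\.[^ ]+ .*?\((?!compatible).*; [UIN]/';
$proposed = '/^Mozilla\/4\.[^ ]+ [^(]*\((?!compatible).*; [UIN]/';

// IE7 User-Agent from comment 4, joined onto one line.
$ie7 = 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Maxthon; '
     . 'MRA 4.8 (build 01705); InfoPath.1; .NET CLR 2.0.50727; '
     . '.NET CLR 3.0.04506.30)';

// Assumed Netscape 4 sample, not taken from this report.
$ns4 = 'Mozilla/4.08 [en] (Win98; I ;Nav)';

// preg_match() returns 1 on a match and 0 otherwise.
$results = array(
    'original vs IE7' => preg_match( $original, $ie7 ),
    'proposed vs IE7' => preg_match( $proposed, $ie7 ),
    'original vs NS4' => preg_match( $original, $ns4 ),
    'proposed vs NS4' => preg_match( $proposed, $ns4 ),
);

foreach ( $results as $label => $hit ) {
    echo $label, ': ', $hit ? 'blacklisted' : 'allowed', "\n";
}
```

With the lazy .*?, the engine can skip the first parenthesis and anchor on "(build", after which "; InfoPath" satisfies "; [UIN]". With [^(]*, only the first parenthesis can be tried, and the "compatible" lookahead rejects it, while the Netscape sample is still caught by both variants.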
Comment 7 Brion Vibber 2007-04-30 21:02:50 UTC
Tested the above modifications against actual user-agent strings in our logs to
confirm.

Of 43003 MSIE samples, 3709 listed the InfoPath extension. 93 MSIE hits were
false-positive matches for the regexes, of which 77 listed the InfoPath extension.

In total, less than 0.12% of sampled hits were false positive matches -- 0.22%
of MSIE hits, 2.08% of InfoPath hits. (I did not sample edits specifically, but
all hits.)
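
As a quick check, the MSIE and InfoPath percentages can be recomputed from the counts quoted above. This is only an arithmetic sketch with hypothetical variable names. The 0.12% figure cannot be reproduced here because the total number of sampled hits is not stated.

```php
<?php
// Counts quoted in the comment above.
$msieSamples = 43003;            // MSIE hits sampled
$infoPathSamples = 3709;         // MSIE hits listing the InfoPath extension
$falsePositives = 93;            // MSIE hits wrongly matched by the old regexes
$infoPathFalsePositives = 77;    // of those, hits listing InfoPath

// Percentages rounded to two decimal places.
$msieRate = number_format( 100 * $falsePositives / $msieSamples, 2 );
$infoPathRate = number_format( 100 * $infoPathFalsePositives / $infoPathSamples, 2 );

echo "MSIE false-positive rate: $msieRate%\n";
echo "InfoPath false-positive rate: $infoPathRate%\n";
```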

Fixed in r21726.
