Last modified: 2012-12-31 17:16:51 UTC
MediaWiki supports both ISO 8859-1 and UTF-8 URL encoding:
ISO 8859-1: http://de.wikipedia.org/w/index.php?title=%D6sterreich&action=history
UTF-8: http://de.wikipedia.org/w/index.php?title=%C3%96sterreich&action=history
With ISO 8859-1 URL encoding, mw.util.getParamValue('title') fails at decodeURIComponent('%D6sterreich') with
URIError: malformed URI sequence
Possible solutions:
* mw.util.getParamValue() tries to decode as ISO 8859-1 when UTF-8 decoding fails.
* MediaWiki answers with 301 Moved Permanently and a Location header in UTF-8 encoding.
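For illustration, the first proposed solution could look roughly like the sketch below. decodeWithLatin1Fallback is a hypothetical helper, not an existing mw.util function, and it assumes ISO 8859-1 is the only fallback encoding, which may not hold on every wiki.

```javascript
// Hypothetical helper (not part of mw.util): decode a percent-encoded
// query value as UTF-8 and fall back to ISO 8859-1 when that fails.
function decodeWithLatin1Fallback(raw) {
  try {
    // Normal case: the value is valid percent-encoded UTF-8.
    return decodeURIComponent(raw);
  } catch (e) {
    if (!(e instanceof URIError)) {
      throw e;
    }
    // Assumption: the bytes are ISO 8859-1. Each byte 0x00..0xFF maps
    // directly to the Unicode code point with the same number.
    return raw.replace(/%([0-9A-Fa-f]{2})/g, function (match, hex) {
      return String.fromCharCode(parseInt(hex, 16));
    });
  }
}

// Both URL styles from the report yield the same title.
console.log(decodeWithLatin1Fallback('%D6sterreich'));
console.log(decodeWithLatin1Fallback('%C3%96sterreich'));
```

The fallback only triggers when UTF-8 decoding throws, so an ISO 8859-1 byte sequence that happens to be valid UTF-8 would still be decoded as UTF-8.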
There are several possible fallback decodings depending on the configured site language; replicating that in the JS code would be some annoying extra effort. Redirecting on a GET request would not be untoward, though we could also see it on some POST requests, where that's less reliable... in general though it's a bit flaky to be manually trying to read things out of the query string that JS sees, as various things might not be there *at all* -- if they come in via rewrite rules, PATH_INFO (eg /wiki/Foo) or POST data. Your example URL is a good example of one that could easily be switched to using a different URL style that would not have a 'title' query string parameter at all; if $wgActionPaths is set up for history you might have eg http://de.wikipedia.org/history/%C3%96sterreich. If you're *on* that page, you should be getting the title via mw.config.get('wgTitle') and should make no assumptions about query string parameters. If you're somewhere else and trying to decode some foreign URL that's been provided to you, then it's probably a bit tricky to try to make any claims about it.
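A small sketch of why the query string is unreliable: the same history page can be reached with or without a 'title' parameter. titleFromQuery is a hypothetical helper that uses the standard URL API rather than mw.util, and the /history/ URL assumes $wgActionPaths is configured as described above.

```javascript
// Hypothetical helper: read 'title' from the query string of a URL,
// or return null when the URL has no such parameter.
function titleFromQuery(href) {
  return new URL(href).searchParams.get('title');
}

// index.php style: the title is in the query string.
console.log(titleFromQuery(
  'http://de.wikipedia.org/w/index.php?title=%C3%96sterreich&action=history'
));

// Action path and /wiki/ styles: no 'title' parameter at all.
console.log(titleFromQuery('http://de.wikipedia.org/history/%C3%96sterreich'));
console.log(titleFromQuery('http://de.wikipedia.org/wiki/%C3%96sterreich'));
```

On the page itself, mw.config.get('wgTitle') gives the same answer regardless of which URL style was used.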
Of course mw.util.getParamValue('title') can be replaced by mw.config.get('wgTitle') or mw.config.get('wgPageName'). But on http://de.wikipedia.org/w/index.php?title=Spezial%3ABeitr%E4ge&contribs=user&target=%D6sterreicher there is no suitable substitute for mw.util.getParamValue('target').
I don't think mw.util.getParamValue('target') will help you at http://de.wikipedia.org/w/index.php?title=Spezial%3ABeitr%E4ge/%D6sterreicher or http://de.wikipedia.org/wiki/Spezial%3ABeitr%E4ge/%D6sterreicher, however. This possibly indicates that special pages should be exporting their parameters in a cleaner way.
Yes, at the moment there are a lot of hacks to extract the parameters from the URL, like extractLemma() in https://de.wikipedia.org/wiki/Benutzer:PDD/helperFunctions.js. mw.util.getParamValue() does not solve these problems and has the same problem with ISO 8859-1 encoding. Maybe it is possible to provide a JavaScript object with the normalized URL parameters for each (special) page.
Adding bug #37378 in see also.
(In reply to comment #4)
> Yes, at the moment there are a lots of hacks to extract the parameters from the
> URL like extractLemma() in
> https://de.wikipedia.org/wiki/Benutzer:PDD/helperFunctions.js.
> mw.util.getParamValue() does not solve the problems and has the same problem
> with ISO 8859-1 encoding.
>
> Maybe it is possible allocate a JavaScript object with the normalized URL
> parameters for each (special) page.
+1. I've seen similar problems on Portuguese Wikipedia.
(In reply to comment #0)
> On ISO 8859-1 URL encoding
> mw.util.getParamValue('title')
> stops on decodeURIComponent('%D6sterreich') with
> URIError: malformed URI sequence
BTW: Any chance of this being related to bug 25846?
I see three possibilities to solve the problem:
* Implement the same decoder in JavaScript as in PHP, so that mw.util.getParamValue() tries to decode as ISO 8859-1 when UTF-8 decoding fails.
* MediaWiki answers with 301 Moved Permanently and a Location header in UTF-8 encoding, so that the URL gets normalized.
* The parameters get normalized in PHP and transferred to JavaScript via a wiki global variable. mw.util.getParamValue() uses the normalized parameters from this variable instead of the URL.
(In reply to comment #8)
> I see three possibilities to solve the problem
>
> * Implement in JavaScript the same decoder like in PHP. So
> mw.util.getParamValue() tries to decode as ISO 8859-1 when UTF-8 decoding
> fails.
There are more encodings, and this can depend on the configuration of the server as well.
> * MediaWiki answers with 301 Moved Permanently and Location in UTF-8 encoding.
> So the URL gets normalized.
Not desired or possible in certain cases.
> * The parameters gets normalized in PHP and transfered to JavaScript via wiki
> global variable. mw.util.getParamValue() use the normalized parameters from
> this variable instead of the URL.
Unnecessary bloat.
All three of these solutions share the same problem: they encourage the use of query parameters, which are created for and targeted at one script, outside that script (namely in a gadget or something similar). That is bad, because these parameters are not documented or considered stable. Query parameter names and meanings may change at any time, and must only be used for communication between the script's output and its own input.
Solution 4: Relevant values are exported to JavaScript in a canonical way by the script. For example, MediaWiki itself always exports wgTitle, wgNamespaceNumber etc., and special pages export wgCanonicalSpecialPageName. Other scripts can export their own information (e.g. SpecialContributions could export `spContributionsTarget: ..` or `spContributions: { target: .. }`).
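A rough sketch of how a gadget might consume such an exported value under solution 4. The key spContributionsTarget is only the name suggested above and is not known to exist; getContributionsTarget is a hypothetical helper. It takes any object with a get(key) method, so on a wiki page it would be called with mw.config.

```javascript
// Hypothetical helper: read the contributions target from exported
// configuration instead of parsing the URL.
// `config` is any object with a get(key) method, e.g. mw.config.
function getContributionsTarget(config) {
  // Only meaningful on Special:Contributions.
  if (config.get('wgCanonicalSpecialPageName') !== 'Contributions') {
    return null;
  }
  // 'spContributionsTarget' is the hypothetical key proposed in this
  // thread; it is an assumption, not an existing variable.
  var target = config.get('spContributionsTarget');
  return target === undefined || target === null ? null : target;
}

// Stand-in for mw.config so the sketch runs outside a wiki page.
var fakeConfig = new Map([
  ['wgCanonicalSpecialPageName', 'Contributions'],
  ['spContributionsTarget', '\u00D6sterreicher']
]);
console.log(getContributionsTarget(fakeConfig));
```

Because the server has already decoded the value, the gadget does not need to know whether the URL was ISO 8859-1, UTF-8, or path-based.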
Solution 4 is good. This means that mw.util.getParamValue() should never be used to get parameters from the URL. All parameters must be parsed on the server side. Once all parameters are available to JavaScript, this bug can be closed as WONTFIX.