Last modified: 2014-09-23 23:33:23 UTC
Some sites require that URLs are encoded in ISO-8859-1. To support generating valid URLs from templates, a new function is required. The code for it would look like: return urlencode( utf8_decode( $s ) );
URLs to the external website ''finn.no'' and others require that parameters are encoded in ISO-8859-1, not UTF-8. A link to a map showing the city ''Ålesund'' might be http://www.finn.no/finn/map?mapTitle=%C5lesund,+%C5LESUND,+By. Note that the Å is encoded as %C5, not as %C3%85, which is what urlencode returns. A new colon function, for example urlencode_iso8859-1, that encodes the parameters as ISO-8859-1 would solve this problem. See also http://no.wikipedia.org/wiki/Wikipedia:Tegnsett
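The difference between the two encodings can be checked outside MediaWiki. The following is a minimal sketch in Python rather than the PHP of the proposed patch; the helper name urlencode_iso8859_1 and its safe parameter are assumptions made for illustration, not part of any existing MediaWiki API.

```python
from urllib.parse import quote_plus


def urlencode_iso8859_1(s: str, safe: str = "") -> str:
    """Percent-encode s from its ISO-8859-1 bytes (hypothetical helper).

    Characters outside ISO-8859-1 become "?", which mirrors how PHP's
    utf8_decode() handles them.
    """
    return quote_plus(s, safe=safe, encoding="iso-8859-1", errors="replace")


title = "Ålesund, ÅLESUND, By"
print(quote_plus("Ålesund"))                 # UTF-8 bytes: %C3%85lesund
print(urlencode_iso8859_1("Ålesund"))        # ISO-8859-1 byte: %C5lesund
print(urlencode_iso8859_1(title, safe=","))  # %C5lesund,+%C5LESUND,+By
```

Leaving the comma unescaped via safe="," is only there to reproduce the example URL above character for character; whether the target site also accepts %2C is not something this thread establishes.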
Created attachment 2408: Patch to enable urlencodeiso
There's no way MediaWiki can ever account for every oddball URL encoding method without implementing full-fledged string functions: just the other day I encountered a site that somehow double-urlencoded its URLs, so there was a "25" just before every byte's hex representation. Heck, MediaWiki itself uses an unusual encoding, converting spaces to _ instead of +. What about ISO 8859-2 through -15, for instance?
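For readers wondering where the stray "25" comes from: percent-encoding an already encoded string escapes the "%" itself as %25. A small Python sketch of that effect, assuming the site in question simply ran the encoder twice (the comment above does not say how it actually produced its URLs):

```python
from urllib.parse import quote, unquote

once = quote("Å", encoding="iso-8859-1")  # one pass over the ISO-8859-1 byte
twice = quote(once)                       # second pass escapes the "%" as %25

print(once)   # %C5
print(twice)  # %25C5

# Undoing it takes two decoding passes, the last one in ISO-8859-1.
restored = unquote(unquote(twice), encoding="iso-8859-1")
print(restored)  # Å
```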
ISO 8859-1 "is the basis of two widely-used character maps known as ISO-8859-1 (note the extra hyphen) and Windows-1252. In June 2004, the ISO/IEC working group responsible for maintaining eight-bit coded character sets disbanded and ceased all maintenance of ISO 8859, including ISO 8859-1, in order to concentrate on the Universal Character Set and Unicode. In computing applications, encodings that provide full UCS support (such as UTF-8 and UTF-16) are finding increasing favor over encodings based on ISO 8859-1." (from [[:en:/ISO/IEC_8859-1]]). So ISO-8859-1 is far more relevant than other kinds of encodings?
So a patch is rejected if it only solves one problem and leaves other problems unsolved?
Nobody has rejected your patch; otherwise this would be marked WONTFIX. I'm not even a developer. I'm just giving my opinion: we probably don't want 352 colon functions. That sounds suspiciously like feature creep. If any developer disagrees, which some may well, they'll commit the patch. There's no point discussing it here further either way.
*Bulk BZ Change: +Patch to open bugs with patches attached that are missing the keyword*
+need-review to signal to developers that this patch needs reviewing
Hi Bård, thank you for the patch! As you may already know, MediaWiki is currently revamping its PHP-based parser into a "Parsoid" prototype component, to support the rich-text Visual Editor project: https://www.mediawiki.org/wiki/Parsoid https://www.mediawiki.org/wiki/Visual_editor Folks interested in enhancing the parser's capabilities are very much welcome to join the Parsoid project, and contribute patches as Git branches: https://www.mediawiki.org/wiki/Git/Tutorial#How_to_submit_a_patch Compared to .diff attachments in Bugzilla tickets, Git branches are much easier for us to review, refine and merge features together. Each change set has a distinct URL generated by the "git review" tool, which can be referenced in Bugzilla by pasting its gerrit.wikimedia.org URL as a comment. If you run into any issues with the patch process, please feel free to ask on irc.freenode.net #wikimedia-dev and the wikitext-l mailing list. Thank you!