Last modified: 2014-11-17 10:34:56 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T27934, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 25934 - Optionally enable urldecode for external links
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: Internationalization
Version: 1.17.x
Hardware: All All
Importance: Normal enhancement with 1 vote
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: i18n, patch, patch-reviewed
Depends on:
Blocks:
Reported: 2010-11-15 08:13 UTC by Dmitriy Sintsov
Modified: 2014-11-17 10:34 UTC (History)
CC List: 8 users


Attachments
Optionally enable urldecode for non-ASCII external links (1.20 KB, patch)
2010-11-15 08:13 UTC, Dmitriy Sintsov
urldecode of text for local utf8 characters, just as in major browser's address line (1.90 KB, patch)
2010-11-17 19:21 UTC, Dmitriy Sintsov
Try to urldecode external links similar to the way Firefox does (2.65 KB, patch)
2011-02-24 11:13 UTC, Dmitriy Sintsov

Description Dmitriy Sintsov 2010-11-15 08:13:17 UTC
Created attachment 7818
Optionally enable urldecode for non-ASCII external links

On one wiki site I add external links to other wikis (primarily Russian ones). Such wikis have Cyrillic titles, so their links contain UTF-8 byte sequences that are percent-encoded (%xx). One can add such a wiki to the interwiki table to get decoded title names, but adding every wiki that way is tiresome. Browsers such as Firefox already decode these URLs properly in their address line. I suggest performing urldecode for such links, so that Cyrillic external links become readable in the generated HTML page. A new setting, $wgExternalLinksDecode, could be introduced if this behavior is undesired by default.
Comment 1 Bawolff (Brian Wolff) 2010-11-15 12:50:38 UTC
Just to clarify, this is for decoding the text part of the link, not the url in the href?

The idea itself sounds sane if the user just writes a URL in the wiki (at least imho). However, I don't think we'd want to URL-decode something like:

[http://example.com some text for the link with %25 in it]

If the user specified the text for the link, we should assume they know what they are doing and not decode it. (Your patch would decode both.)
Comment 2 Dmitriy Sintsov 2010-11-15 16:08:15 UTC
Firefox decodes percent-encoded characters in the URL shown in its address line. For example, put the following link on a wiki page (even without the patch), then open it and look at the address line:
http://ru.wikipedia.org/wiki/%D0%94%D1%80%D0%BE%D1%84%D0%B0
The series of %xx hex codes is replaced with Cyrillic characters, which are readable to anyone who knows the Cyrillic alphabet.

However, when you copy and paste such a URL from the address line back into a text editor, the %xx sequences reappear, so internally it is the same binary representation; the decoding is performed only for display.
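
A minimal sketch of that round trip, in Python rather than MediaWiki's PHP. It uses only the standard urllib.parse functions; the variable names are illustrative and nothing here is taken from the patch:

```python
from urllib.parse import quote, unquote

href = "http://ru.wikipedia.org/wiki/%D0%94%D1%80%D0%BE%D1%84%D0%B0"

# Decode the %xx escapes (interpreted as UTF-8) for display only.
shown = unquote(href)

# Re-encoding the displayed form gives back the original escaped URL,
# so the decoded text and the href refer to the same bytes.
copied = quote(shown, safe=":/")

print(shown)
print(copied == href)
```

The href keeps its escaped form throughout; only the text shown to the reader changes.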

Opera and Safari probably do this too. IE8 does not; I haven't checked IE9 yet.

With a link such as [http://ru.wikipedia.org/wiki/%D0%94%D1%80%D0%BE%D1%84%D0%B0 %D0%94%D1%80%D0%BE%D1%84%D0%B0], my patch does not decode the description, only the URL. That is quite the opposite, but it probably matches Firefox's logic.
Comment 3 Derk-Jan Hartman 2010-11-15 20:07:17 UTC
I'm in favor of this idea, and the implementation is only for links in wikicode without text. I think that with all the international versions we have, this would be a welcome change for many.
Comment 4 Bawolff (Brian Wolff) 2010-11-16 01:47:04 UTC
Whoops. You're right, this doesn't affect [http://example.com %E0%B4%B3%E0%B5%8D%E2%80%8D] style links, since the parser sets the $escape argument to false for those kinds of links. However, it still seems a bit weird: if I do something like $sk->makeExternalLink( "http://example.com/some_url", "some text that is not a url, entered by the user, containing a %HH code" ); in an extension, the result would probably not be what is expected.
Comment 5 Dmitriy Sintsov 2010-11-17 19:21:50 UTC
Created attachment 7827
urldecode of text for local utf8 characters, just as in major browser's address line

Improves the readability of such links a lot.
Comment 6 Dmitriy Sintsov 2010-11-17 19:22:03 UTC
New patch, which should be more compatible with the existing Linker / Skin usage.
Comment 7 Brion Vibber 2011-02-12 06:26:31 UTC
Stumbled on this in bugzilla... I like the basic idea of the patch, but there's a couple of issues which'll need to be worked out.

First, not all URLs with encoded characters are encoded in UTF-8... while we like to hope that most of them are in this day and age, there's no guarantee. Russian, Japanese, Chinese, etc sites may still use other national encodings, especially on older links...

Reasonable behavior would at least need to check for UTF-8 validity to avoid outputting garbage characters.
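
A minimal sketch of such a validity check, in Python rather than MediaWiki's PHP. The function name decode_if_valid_utf8 is hypothetical, and this is not the logic of any attached patch:

```python
from urllib.parse import unquote_to_bytes


def decode_if_valid_utf8(url: str) -> str:
    """Decode %xx escapes only when the resulting bytes are valid UTF-8.

    Hypothetical helper for illustration; returns the URL unchanged
    when the decoded bytes are in some other encoding.
    """
    raw = unquote_to_bytes(url)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return url


# UTF-8 escapes are decoded to readable text.
print(decode_if_valid_utf8("http://ru.wikipedia.org/wiki/%D0%94%D1%80%D0%BE%D1%84%D0%B0"))

# The same title escaped as Windows-1251 bytes is left as-is.
print(decode_if_valid_utf8("http://example.com/%C4%F0%EE%F4%E0"))
```

Falling back to the escaped form keeps links in older national encodings displayed exactly as they are today.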


Second, there are lots of meaningful characters in URLs where the difference between being encoded and not actually changes the URL; for instance Firefox will show

http://en.wikipedia.org/wiki/What%27s_Eating_Gilbert_Grape%3F

as:

http://en.wikipedia.org/wiki/What's_Eating_Gilbert_Grape%3F

and not as:

http://en.wikipedia.org/wiki/What%27s_Eating_Gilbert_Grape?
which would actually point to "[[What's Eating Gilbert Grape]]" with an empty query string on the end if you copy-pasted the text.
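
One way to sketch this selective decoding, in Python rather than MediaWiki's PHP. The KEEP_ENCODED list and the function names are assumptions made for illustration; they are not Firefox's actual rule and not taken from any attached patch:

```python
import re
from urllib.parse import unquote_to_bytes

# Assumption: escapes for these characters stay encoded because the
# literal character would change how the URL is parsed. This is an
# illustrative list, not Firefox's actual rule.
KEEP_ENCODED = "%?#&=/+;:@"

# One or more consecutive %xx escapes, so multi-byte UTF-8 stays together.
ESCAPE_RUN = re.compile(r"(?:%[0-9A-Fa-f]{2})+")


def _decode_run(match):
    escaped = match.group(0)
    try:
        text = unquote_to_bytes(escaped).decode("utf-8")
    except UnicodeDecodeError:
        return escaped  # not valid UTF-8: leave the whole run untouched
    out = []
    for ch in text:
        # Re-escape kept characters, spaces and ASCII control characters
        # (written back with uppercase hex digits).
        if ch in KEEP_ENCODED or ord(ch) <= 0x20 or ord(ch) == 0x7F:
            out.append("%{:02X}".format(ord(ch)))
        else:
            out.append(ch)
    return "".join(out)


def display_form(url: str) -> str:
    """Return a readable form of url for link text; the href is not changed."""
    return ESCAPE_RUN.sub(_decode_run, url)


print(display_form("http://en.wikipedia.org/wiki/What%27s_Eating_Gilbert_Grape%3F"))
```

With this list, %27 is shown as an apostrophe while %3F stays escaped, so copy-pasting the displayed text does not introduce a query string.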


Third, if the above are resolved, I'd probably be pretty happy with *not* adding the parameter on makeExternalLink() -- it's probably sane behavior for the automatic formatting of the bare link display in pretty much all cases.
Comment 8 Dmitriy Sintsov 2011-02-24 11:13:25 UTC
Created attachment 8203
Try to urldecode external links similar to the way Firefox does

Useful to improve readability of local UTF-8 encoded external links. Made with 1.17 branch.
Comment 9 Sumana Harihareswara 2011-11-08 14:03:14 UTC
Dmitriy, thank you for the patch.  I'm sorry it has taken so long for us to respond to you!

Because it's been so long, when I tried to apply your patch, it didn't apply cleanly to trunk.  But before you try to revise it so it applies, will you come into the #mediawiki channel on FreeNode IRC and ask for a more thorough review of your patch?  That way you won't waste time redoing work.  Thanks!
Comment 10 Bartosz Dziewoński 2012-09-28 18:17:13 UTC
Removing bug 27292 as blocker, this has nothing to do with skins. (I don't think there's a "Parser" tracking bug, at least I didn't find one.)
