Last modified: 2011-06-20 16:41:44 UTC

Bug 13288 - urlencode on variables get double-encoded
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: Parser
Version: unspecified
Hardware: All All
Importance: Low normal
Target Milestone: ---
Assigned To: Nobody
URL: http://en.wikipedia.org/wiki/User:Ser...
Duplicates: 22508
Reported: 2008-03-07 22:56 UTC by Sergey Chernyshev
Modified: 2011-06-20 16:41 UTC (History)

Description Sergey Chernyshev 2008-03-07 22:56:03 UTC
I use {{urlencode}} to encode the {{PAGENAME}} value, and it looks like the value gets double-encoded.

I created a test page for it on Wikipedia and it has the same issue:
http://en.wikipedia.org/wiki/User:Sergey_Chernyshev/Variable_Urlencode_%27_bug
Comment 1 Nicolas Dumazet 2008-03-16 16:31:15 UTC
Variables are escaped through wfEscapeWikiText, so ' is converted to &#39;
Then the &, #, and ; from "&#39;" are escaped by urlencode()

While wfEscapeWikiText sucks, a simple fix for now would be to html_entity_decode the text before any {{urlencode:}} processing: HTML entities in URLs are invalid anyway (&quot; is a bad title, and &#nn; is interpreted by browsers as & )

Index: CoreParserFunctions.php
===================================================================
--- CoreParserFunctions.php	(revision 32034)
+++ CoreParserFunctions.php	(working copy)
@@ -82,7 +82,7 @@
 	}
 
 	static function urlencode( $parser, $s = '' ) {
-		return urlencode( $s );
+		return urlencode( html_entity_decode($s, ENT_QUOTES) );
 	}
 
 	static function lcfirst( $parser, $s = '' ) {
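The double encoding described in this comment can be reproduced outside MediaWiki. The following is a minimal Python model, not the actual PHP code: `escape_wikitext` is a hypothetical stand-in that covers only the apostrophe case of wfEscapeWikiText, `quote_plus` approximates PHP's urlencode(), and `html.unescape` approximates html_entity_decode() (it decodes a somewhat broader set of entities).

```python
import html
from urllib.parse import quote_plus


def escape_wikitext(text):
    """Hypothetical, simplified stand-in for wfEscapeWikiText (apostrophe only)."""
    return text.replace("'", "&#39;")


def urlencode_current(s):
    """Models the existing behavior: encode whatever string arrives."""
    return quote_plus(s)


def urlencode_patched(s):
    """Models the proposed patch: decode HTML entities first, then encode."""
    return quote_plus(html.unescape(s))


# What a variable such as {{PAGENAME}} would hand to {{urlencode:}} in this model.
page_name = escape_wikitext("Variable Urlencode ' bug")

print(urlencode_current(page_name))  # Variable+Urlencode+%26%2339%3B+bug
print(urlencode_patched(page_name))  # Variable+Urlencode+%27+bug
```

In this model the unpatched path encodes the three entity characters separately (%26, %23, %3B), while the patched path yields the single %27 that a reader would expect for an apostrophe.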
Comment 2 Alexandre Emsenhuber [IAlex] 2010-02-13 15:31:18 UTC
*** Bug 22508 has been marked as a duplicate of this bug. ***
Comment 3 Niklas Laxström 2010-02-13 16:00:26 UTC
Isn't {{PAGENAMEE}} just for this purpose?
Comment 4 Platonides 2010-02-13 16:03:34 UTC
Yes, {{PAGENAMEE}} is a valid workaround. But it should work, nonetheless.
Comment 5 Niklas Laxström 2010-02-13 16:09:31 UTC
I see no way it could possibly work without breaking backward compatibility (BC).
Comment 6 Conrad Irwin 2010-02-13 16:28:28 UTC
{{PAGENAMEE:{{PAGENAME:&}}}} -> %26        (RIGHT)
{{URLENCODE:{{PAGENAME:&}}}} -> %26amp%3B  (WRONG)
{{PAGENAMEE:&amp;}}    -> %26              (WRONG - ?)
{{URLENCODE:&amp;}}    -> %26amp%3B        (RIGHT)

I put the ? there because [[&amp;]] creates a link to [[&]] (perhaps also wrong) and http://en.wikipedia.org/wiki/%26amp; is a server error.


I think the solution would be to have {{PAGENAME}} et al. return a "text-needs-escape" object of some kind. Parser functions could then request unescaped input via a flag, and the parser would escape the text only when escaping is needed.

The Django template engine deals with this issue very nicely, maybe we can copy some of their ideas.
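A rough sketch of the deferred-escaping idea from this comment, written in Python rather than MediaWiki's PHP. Every name here is hypothetical: `NeedsEscape` is the "text-needs-escape" object, the `wants_raw` attribute is the flag a parser function would set, and `escape_for_output` stands in for whatever escaping the parser really applies. It illustrates the design only and does not describe how the MediaWiki parser or Django is implemented.

```python
import html
from urllib.parse import quote_plus


class NeedsEscape:
    """Hypothetical wrapper: raw text whose escaping has been deferred."""

    def __init__(self, raw):
        self.raw = raw


def escape_for_output(text):
    """Simplified stand-in for the parser's escaping step."""
    return html.escape(text, quote=True)


def call_parser_function(func, arg):
    """Give raw text to functions that ask for it, escaped text to the rest."""
    if isinstance(arg, NeedsEscape):
        if getattr(func, "wants_raw", False):
            arg = arg.raw
        else:
            arg = escape_for_output(arg.raw)
    return func(arg)


def urlencode(s):
    return quote_plus(s)


# The flag from the comment: this function requests unescaped input.
urlencode.wants_raw = True


def identity(s):
    """A function that sets no flag, so it receives escaped text."""
    return s


page_name = NeedsEscape("Q&A")
print(call_parser_function(urlencode, page_name))  # Q%26A
print(call_parser_function(identity, page_name))   # Q&amp;A
```

The point of the design is that escaping happens once, at the boundary where it is needed, so a function such as urlencode never sees an already-escaped entity.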
Comment 7 Niklas Laxström 2010-02-13 16:35:34 UTC
(In reply to comment #6)
> {{PAGENAMEE:&amp;}}    -> %26              (WRONG - ?)
> 
> I put the ? there because [[&amp;]] creates a link to [[&]] (perhaps also
> wrong) and http://en.wikipedia.org/wiki/%26amp; is a server error.

& is disabled on wmf due to broken clients. Also, entities in titles are normalised away unless I am mistaken.

I don't know enough about parser to say if that is possible.
Comment 8 Amalthea 2011-06-20 16:41:01 UTC
The core of the issue seems to be that {{PAGENAME}} and others internally escape some characters to entities, which breaks other magic words and parser functions when they use the result directly.


{{#ifeq:{{PAGENAME:File:Aci Sant'Antonio.svg}}|Aci Sant'Antonio.svg|y|n}}
 → "n"

{{#ifeq:{{PAGENAME:File:Aci Sant'Antonio.svg}}|Aci Sant&#39;Antonio.svg|y|n}}
 → "y"

{{FILEPATH:Aci_Sant'Antonio.svg}}
 → "http://upload.wikimedia.org/wikipedia/commons/0/00/Aci_Sant%27Antonio.svg"

{{FILEPATH:Aci Sant&#39;Antonio.svg}}
 → ""

{{str left|{{PAGENAME:File:Aci Sant'Antonio.svg}}|12}}
 → "Aci Sant&#39"
Comment 9 Amalthea 2011-06-20 16:41:44 UTC
More or less duplicated by bug 16474 and bug 14779, as far as I can tell.
