Last modified: 2011-06-20 16:41:44 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T15288, the corresponding Phabricator task, for complete and up-to-date bug report information.

Bug 13288 - urlencode on variables get double-encoded


Summary:	urlencode on variables get double-encoded

Status:	NEW

Product:	MediaWiki
Classification:	Unclassified
Component:	Parser (Other open bugs)
Version:	unspecified
Hardware:	All All

Importance:	Low normal (vote)
Target Milestone:	---
Assigned To:	Nobody - You can work on this!

URL:	http://en.wikipedia.org/wiki/User:Ser...
Whiteboard:
Keywords:

Duplicates:	22508 (view as bug list)
Depends on:
Blocks:

Reported:	2008-03-07 22:56 UTC by Sergey Chernyshev
Modified:	2011-06-20 16:41 UTC (History)
CC List:	5 users (show)

See Also:
Web browser:	---
Mobile Platform:	---
Assignee Huggle Beta Tester:	---


Description Sergey Chernyshev 2008-03-07 22:56:03 UTC

I use {{urlencode:}} to encode the value of {{PAGENAME}}, and it looks like the value gets double-encoded.

I created a test page for it on Wikipedia and it has the same issue:
http://en.wikipedia.org/wiki/User:Sergey_Chernyshev/Variable_Urlencode_%27_bug

Comment 1 Nicolas Dumazet 2008-03-16 16:31:15 UTC

Variables are being escaped through wfEscapeWikiText, so ' is converted to &#39;
Then &, #, and ; from "&#39;" are escaped by urlencode()

While wfEscapeWikiText sucks, a simple fix for now would be to html_entity_decode the text before any {{urlencode:}} processing. HTML entities in URLs are invalid anyway (&quot; is a bad title, and &#nn; is interpreted by browsers as &).

Index: CoreParserFunctions.php
===================================================================
--- CoreParserFunctions.php	(revision 32034)
+++ CoreParserFunctions.php	(working copy)
@@ -82,7 +82,7 @@
 	}
 
 	static function urlencode( $parser, $s = '' ) {
-		return urlencode( $s );
+		return urlencode( html_entity_decode($s, ENT_QUOTES) );
 	}
 
 	static function lcfirst( $parser, $s = '' ) {
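
The double encoding described in comment 1 can be reproduced outside MediaWiki. The following is a minimal Python sketch, not MediaWiki code. It rests on these assumptions: `escape_apostrophe` is a hypothetical stand-in for the single wfEscapeWikiText step that matters here, `urllib.parse.quote_plus` stands in for PHP's urlencode(), and `html.unescape` stands in for html_entity_decode( $s, ENT_QUOTES ).

```python
from html import unescape
from urllib.parse import quote_plus


def escape_apostrophe(text: str) -> str:
    """Hypothetical stand-in: only mimics the ' -> &#39; step from comment 1."""
    return text.replace("'", "&#39;")


def urlencode_current(s: str) -> str:
    """Analogue of the current behaviour: encode the already-escaped text."""
    return quote_plus(s)


def urlencode_patched(s: str) -> str:
    """Analogue of the proposed patch: decode entities first, then encode."""
    return quote_plus(unescape(s))


title = "Sant'Antonio"
escaped = escape_apostrophe(title)      # "Sant&#39;Antonio"

print(urlencode_current(escaped))       # Sant%26%2339%3BAntonio
print(urlencode_patched(escaped))       # Sant%27Antonio
```

Under the same assumptions, `urlencode_patched("&amp;")` yields `%26`, so an entity typed literally by the user would also be decoded before encoding. That side effect is likely where the backwards-compatibility concern in the later comments comes from.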

Comment 2 Alexandre Emsenhuber [IAlex] 2010-02-13 15:31:18 UTC

*** Bug 22508 has been marked as a duplicate of this bug. ***

Comment 3 Niklas Laxström 2010-02-13 16:00:26 UTC

Isn't {{PAGENAMEE}} just for this purpose?

Comment 4 Platonides 2010-02-13 16:03:34 UTC

Yes, {{PAGENAMEE}} is a valid workaround. But it should work, nonetheless.

Comment 5 Niklas Laxström 2010-02-13 16:09:31 UTC

I see no way how it could possibly work without breaking BC.

Comment 6 Conrad Irwin 2010-02-13 16:28:28 UTC

{{PAGENAMEE:{{PAGENAME:&}}}} -> %26        (RIGHT) 
{{URLENCODE:{{PAGENAME:&}}}} -> %26amp%3B  (WRONG)
{{PAGENAMEE:&amp;}}    -> %26              (WRONG - ?)
{{URLENCODE:&amp;}}    -> %26amp%3B        (RIGHT)

I put the ? there because [[&amp;]] creates a link to [[&]] (perhaps also wrong) and http://en.wikipedia.org/wiki/%26amp; is a server error.


I think the solution would be to have {{PAGENAME}} et al. return a "text-needs-escape" object of some kind. Parser functions could then set a flag to request unescaped input, and the parser would escape the text only when escaping is needed.

The Django template engine deals with this issue very nicely, maybe we can copy some of their ideas.
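
One way to picture the "text-needs-escape" idea is the Python sketch below. It is an illustration only: `NeedsEscape`, `escape_wikitext`, `urlencode_pf`, and `render` are hypothetical names, none of them exist in MediaWiki, and the escaper handles just two characters. The design is loosely inspired by how Django defers escaping to output time; it is not Django's implementation.

```python
from dataclasses import dataclass
from urllib.parse import quote_plus


def escape_wikitext(text: str) -> str:
    """Hypothetical, much-reduced escaper: only handles & and '."""
    return text.replace("&", "&amp;").replace("'", "&#39;")


@dataclass(frozen=True)
class NeedsEscape:
    """Raw text returned by a variable; escaping is deferred."""
    raw: str


def urlencode_pf(arg) -> str:
    """Parser function that asks for unescaped input."""
    raw = arg.raw if isinstance(arg, NeedsEscape) else arg
    return quote_plus(raw)


def render(value) -> str:
    """Final output step: escape only values still marked as raw."""
    return escape_wikitext(value.raw) if isinstance(value, NeedsEscape) else value


pagename = NeedsEscape("Sant'Antonio")   # what {{PAGENAME}} would hand back

print(urlencode_pf(pagename))            # Sant%27Antonio
print(render(pagename))                  # Sant&#39;Antonio
print(urlencode_pf("&amp;"))             # %26amp%3B
```

In this sketch, text typed literally by the user is a plain string and is encoded exactly as written, while variable output is encoded from its raw form, so both cases from the table above come out as marked RIGHT.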

Comment 7 Niklas Laxström 2010-02-13 16:35:34 UTC

(In reply to comment #6)
> {{PAGENAMEE:&amp;}}    -> %26              (WRONG - ?)
> 
> I put the ? there because [[&amp;]] creates a link to [[&]] (perhaps also
> wrong) and http://en.wikipedia.org/wiki/%26amp; is a server error.

&amp; is disabled on wmf due to broken clients. Also, entities in titles are normalised away unless I am mistaken.

I don't know enough about parser to say if that is possible.

Comment 8 Amalthea 2011-06-20 16:41:01 UTC

The core of the issue seems to be that {{PAGENAME}} and others internally escape some characters to entities, which breaks other magic words and parser functions when they use the result directly.


{{#ifeq:{{PAGENAME:File:Aci Sant'Antonio.svg}}|Aci Sant'Antonio.svg|y|n}}
 → "n"

{{#ifeq:{{PAGENAME:File:Aci Sant'Antonio.svg}}|Aci Sant&#39;Antonio.svg|y|n}}
 → "y"

{{FILEPATH:Aci_Sant'Antonio.svg}}
 → "http://upload.wikimedia.org/wikipedia/commons/0/00/Aci_Sant%27Antonio.svg"

{{FILEPATH:Aci Sant&#39;Antonio.svg}}
 → ""

{{str left|{{PAGENAME:File:Aci Sant'Antonio.svg}}|12}}
 → "Aci Sant&#39"

Comment 9 Amalthea 2011-06-20 16:41:44 UTC

More or less duplicated by bug 16474 and bug 14779, as far as I can tell.
