Last modified: 2014-07-11 23:48:31 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T69196, the corresponding Phabricator task for complete and up-to-date bug report information.

Bug 67196 - PAGESINCATEGORY should decode HTML entities of input - if {{PAGENAME}} contains ' it will display 0


Summary:	PAGESINCATEGORY should decode HTML entities of input - if {{PAGENAME}} contai...

Status:	PATCH_TO_REVIEW

Product:	MediaWiki
Classification:	Unclassified
Component:	Parser
Version:	1.24rc
Hardware:	All All

Importance:	Normal normal
Target Milestone:	---
Assigned To:	Nobody - You can work on this!

URL:	https://www.mediawiki.org/wiki/Thread...
Whiteboard:
Keywords:

Depends on:	35746
Blocks:

Reported:	2014-06-27 13:57 UTC by Jesús Martínez Novo (Ciencia Al Poder)
Modified:	2014-07-11 23:48 UTC
CC List:	2 users

See Also:	35628
Web browser:	---
Mobile Platform:	---
Assignee Huggle Beta Tester:	---

Description Jesús Martínez Novo (Ciencia Al Poder) 2014-06-27 13:57:11 UTC

From the linked URL:

----

{{PAGESINCATEGORY:{{PAGENAME}}}} doesn't work if {{PAGENAME}} contains ' (it will display 0)

It's {{PAGENAME}} that is the problem, because it works if you replace it with the title in plain text.

----

This can be tested at https://www.mediawiki.org/wiki/Category:Chris_G%27s_botclasses

Indeed, if I use {{subst:PAGENAME}} and hit "show changes", I see it's being substituted as "Chris G&#39;s botclasses".

I don't know why {{subst:PAGENAME}} outputs HTML-encoded entities, but it's odd. Since changing that behavior may break things, PAGESINCATEGORY should instead check its input for HTML entities and decode them before looking up the page name, just as was done in bug 35628.
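
To illustrate the mismatch, here is a minimal sketch in Python (for illustration only; MediaWiki itself is PHP). The `category_counts` table, its count of 3, and the `pages_in_category` helper are all made up for this example and do not correspond to any MediaWiki API. It only shows why an entity-encoded name misses a lookup that the decoded name hits.

```python
import html

# Hypothetical stand-in for the category table: title -> number of member pages.
# The count of 3 is invented for the example.
category_counts = {"Chris G's botclasses": 3}

def pages_in_category(name: str, decode_entities: bool) -> int:
    """Look up a category count, optionally decoding HTML entities first."""
    if decode_entities:
        # Turns "&#39;" back into a literal apostrophe before the lookup.
        name = html.unescape(name)
    return category_counts.get(name, 0)

# The string that {{subst:PAGENAME}} was observed to produce.
encoded = "Chris G&#39;s botclasses"

print(pages_in_category(encoded, decode_entities=False))  # lookup misses
print(pages_in_category(encoded, decode_entities=True))   # lookup hits
```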

Comment 1 Umherirrender 2014-06-27 15:25:43 UTC

The problem is not limited to PAGESINCATEGORY; all other parser functions that take a title have this problem. See also bug 16474.

Comment 2 Philippe Verdy 2014-06-27 21:53:39 UTC

There's a workaround which is to redecode the parameter of PAGESINCATEGORY with #titleparts.

But I'm still convinced that we should not have to use this trick in wikicode, given that there should not exist any valid category name containing verbatim character entities. It is still possible that they exist: we have allowed literal ampersands in page names without requiring them to be HTML-encoded with named entities, which causes an ambiguity. Still, I'm not convinced that we have any valid page name containing verbatim named entities, and note that it's impossible to include verbatim sharp signs "#", so you cannot include verbatim numeric entities.

So what we could do is HTML-encode quotes, ampersands, and less-than/greater-than signs using numeric entities instead of named entities (&quot; &apos; &lt; &gt;), so that they can safely be decoded by PAGESINCATEGORY (which would continue to treat named entities as verbatim, without decoding them automatically as it would numeric entities).
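
A minimal sketch of the decoding rule proposed above, again in Python for illustration only. `decode_numeric_entities` is a hypothetical helper, not an existing MediaWiki function. It decodes decimal and hexadecimal numeric entities and leaves named entities untouched.

```python
import re

# Matches decimal (&#39;) and hexadecimal (&#x27;) numeric character references.
_NUMERIC_ENTITY = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")

def decode_numeric_entities(text: str) -> str:
    """Decode numeric entities only; named entities such as &amp; stay verbatim."""
    def replace(match: re.Match) -> str:
        hex_digits, dec_digits = match.group(1), match.group(2)
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        try:
            return chr(code)
        except (ValueError, OverflowError):
            # Out-of-range code point: leave the original text unchanged.
            return match.group(0)
    return _NUMERIC_ENTITY.sub(replace, text)

print(decode_numeric_entities("Chris G&#39;s botclasses"))
print(decode_numeric_entities("Tom &amp; Jerry"))
```

Under this rule, a page name that really does contain the literal text "&amp;" would still resolve as written, which is the ambiguity the proposal tries to avoid.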

Comment 3 Gerrit Notification Bot 2014-07-11 23:48:29 UTC

Change 145724 had a related patch set uploaded by Brian Wolff:
Have Title::makeTitleSafe decode html entities.

https://gerrit.wikimedia.org/r/145724
