Last modified: 2008-03-24 21:01:28 UTC
Page titles are not HTML-sanitized in various cases. Since page titles can have & in them, this leads to various minor bugs. I'd like to record all I can find here. 1) Neither [{{SERVER}}{{localurl:{{PAGENAME}}}}] nor [{{SERVER}}{{localurl:{{PAGENAMEE}}] works for the page entitled < 2) Special:Watchlist/edit appears to pass page titles literally. If you're watching [[<]], the list shows < with no link. 3) action=unwatch for [[<]] displays "The page "<" has been removed from your watchlist." The variable $1 has not been escaped in [[MediaWiki:Removedwatchtext]]. 4) The same goes for [[MediaWiki:Addedwatchtext]]. 5) And [[MediaWiki:Deletedtext]], parameter $1. 6) And [[MediaWiki:Undeletedtext]] 7) [[Special:Movepage/%26lt%3B]] displays "Move page: <", not "Move page: <" 8) Although the "Confirm delete" page correctly shows (Deleting "<), the "Confirm protection" page shows (Protecting "<"). (The same goes for "Confirm unprotection".) 9) [[Special:Log]] doesn't accept < in the Title field. See http://en.wikipedia.org/w/index.php?title=Special%3ALog&page=%26lt%3B 9.1) The same bug shows up when undeleting [[<]]. Instead of displaying the deletion log for the page, it displays the entire deletion log. 10) A new variable is needed, like {{PAGENAME}} and {{PAGENAMEE}}, that gives an HTML-sanitized version of the page title. At the moment, there's no good way to fix [[MediaWiki:Newarticletext]], as it displays for [[<]]. 11) You can't move a page to [[<]]. It claims that "The requested page title was invalid, empty, or an incorrectly linked inter-language or inter-wiki title." 12) I created [[User:Dbenbenn/<]], then deleted it. When I then edited the page, the "View or restore 2 deleted edits?" message didn't have the usual link to Special:Undelete.
13) If [[<]] redirects to another page, the redirect message is "(Redirected from <)" with no link. The variable in [[MediaWiki:Redirectedfrom]] isn't sanitized.
14) [[MediaWiki:Rclsub]] is another message where the parameter needs to be sanitized, with & converted to &
(In reply to comment #0) > 10) A new variable is needed, like {{PAGENAME}} and {{PAGENAMEE}}, that gives anHTML-sanitized version of the page title. At the moment, there's no good way tofix [[MediaWiki:Newarticletext]], as it displays for [[<]]. Perhaps {{PAGENAME}} should simply be updated. After all, from the point of view of HTML, "&lt;" actually ''is'' the title of the page whose plain-text title is <. There's no reason not to HTML-sanitize &, <, >, [, ], {, }, and | in the PAGENAME variable. As an aside, ' can also cause trouble. See http://en.wikipedia.org/w/index.php?title=Image:%27%27I_can_smell_%27em%21% 27%27.jpg&action=edit for a real-life example. The '' is interpreted as "begin italics", and should be escaped to ' instead.
15) You can't move [[<]] anywhere else. The error message is "Fatal error: Call to a member function on a non-object in /usr/local/apache/common-local/php-1.5/includes/SpecialMovepage.php on line 198"
Created attachment 839 [details] Patch to fix 3, 4, 5, 6, 7, and update Language.php slightly Here's a short patch to fix points 3, 4, 5, 6, and 7. (Incidentally, 8 and 14 got fixed independently.) The patch also changes the defaults for [[MediaWiki:Addedwatchtext]], [[MediaWiki:Removedwatchtext]], [[MediaWiki:Deletedtext]], and [[MediaWiki:Undeletedtext]]. I add <nowiki> tags around the page title parameter, so that the page title displays correctly in case it has '' in it.
Created attachment 840 [details] also fix 2 and 13 The patch also fixes 2 and 13, and adds <nowiki> to the default for [[MediaWiki:prefslogintext]] ([[User:''dbenbenn'']] is a valid user name!)
Created attachment 841 [details] Fix 10 and 12. 1, 9, 11, and 15 remain The patch now adds Sanitizer::wikiSanitize() to transform {{PAGENAME}}, fixing point 10. Note that now {{localurl:{{PAGENAME}}}} works for [[<]], but still not for [[*]]. (The wikiSanitize() function could probably be improved by a PHP expert.) The patch also modifies Skin.php to fix point 12.
Created attachment 842 [details] better fix for point 10, {{PAGENAME}} sanitization Here's a better implementation of Sanitizer:wikiSanitize(). It now makes the display work right if {{PAGENAME}} appears at the beginning of a line in a page called *, or '', or &, or ==foo==, or ;foo, or ----, or ~~~~ ({{subst:PAGENAME}} in this last case).
Created attachment 843 [details] better fix for point 10, also fix 11 and 15 wikiSanitize in the previous patch was broken, due to ;. Fixed now. Also, you can now move [[<]]. Furthermore, to move a page to [[<]], you used to have to enter "&lt;" in the "To new title" field---now "<" works as expected.
*** Bug 6986 has been marked as a duplicate of this bug. ***
This bug seems pretty broad. Maybe someone should consider making it a tracker and opening up separate bugs for the . . . separate bugs.
Category pages also suffer from this problem. Example: http://de.wikipedia.org/wiki/Kategorie:Schriftzeichen (there's a lone & on the page instead of &) Patch to fix it: --- orig/includes/Linker.php +++ mod/includes/Linker.php @@ -316,8 +316,9 @@ $u .= '#' . str_replace(array_keys($replacearray),array_values($replacearray),$anchor); } if ( $text == '' ) { - $text = htmlspecialchars( $nt->getPrefixedText() ); + $text = $nt->getPrefixedText(); } + $text = htmlspecialchars( $text ); if ( $style == '' ) { $style = $this->getInternalLinkAttributesObj( $nt, $text ); } @@ -385,8 +386,10 @@ $u = $nt->escapeLocalURL( $query ); if ( '' == $text ) { - $text = htmlspecialchars( $nt->getPrefixedText() ); + $text = $nt->getPrefixedText(); } + $text = htmlspecialchars( $text ); + $style = $this->getInternalLinkAttributesObj( $nt, $text, 'stub' ); list( $inside, $trail ) = Linker::splitTrail( $trail ); Patch against mediawiki 1.7.1. Please apply.
At least the last issue doesn't seem to appear in the newest MediaWiki. Someone should probably go through the other issues reported here and maybe open separate bugs for them...
* is also not appropriately escaped, e.g. http://en.wikipedia.org/wiki/*foo
(In reply to comment #14) > * is also not appropriately escaped, e.g. > http://en.wikipedia.org/wiki/*foo * is not a metacharacter in HTML or CSS (please correct me if I'm wrong), so it doesn't need to be escaped.
(In reply to comment #15) > (In reply to comment #14) > > * is also not appropriately escaped, e.g. > > http://en.wikipedia.org/wiki/*foo > > * is not a metacharacter in HTML or CSS (please correct me if I'm wrong), so it > doesn't need to be escaped. > Yeah, I realized that a little after I posted that. Should I post another bug for that?
(In reply to comment #16) > (In reply to comment #15) > > (In reply to comment #14) > > > * is also not appropriately escaped, e.g. > > > http://en.wikipedia.org/wiki/*foo > > > > * is not a metacharacter in HTML or CSS (please correct me if I'm wrong), so it > > doesn't need to be escaped. > > Yeah, I realized that a little after I posted that. > Should I post another bug for that? I'm not sure what you mean. The page validates correctly, so I don't see any bug there. Can you elaborate?
These cases seemed to have been cleaned up as of now.