Last modified: 2009-08-23 00:36:50 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from bug reports and their history, links may be broken. See T22346, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 20346 - Bad link causing the whole page failed to parse
Status: RESOLVED DUPLICATE of bug 11143
Product: MediaWiki
Classification: Unclassified
Component: Parser
Version: unspecified
Hardware: All All
Importance: Normal major
Target Milestone: ---
Assigned To: Nobody - You can work on this!
URL: http://zh.wikipedia.org/w/index.php?o...
Keywords: parser
Depends on:
Blocks:
Reported: 2009-08-22 01:55 UTC by Jimmy Xu
Modified: 2009-08-23 00:36 UTC (History)
0 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Jimmy Xu 2009-08-22 01:55:33 UTC
There was a double-encoded link in the page, of the form [[%E5%8B%92%E5%86%85%C2%B7%E6%88%88%E8%A5%BF%E5%B0%BC#.E4.BD.9C.E5.93.81.E5.88.97.E8.A1.A8|作品列表]]. In one edit, a user accidentally inserted a space into the path, producing [[%E5%8B%92%E5%86%85%C2%B7%E6%88%88%E8 %A5%BF%E5%B0%BC#.E4.BD.9C.E5.93.81.E5.88.97.E8.A1.A8|作品列表]]. This link caused the whole page to fail to display, except for the categories.

See http://zh.wikipedia.org/w/index.php?oldid=10944408&uselang=en and http://zh.wikipedia.org/w/index.php?oldid=10944786&uselang=en

For an unencoded link such as [[勒内·戈西尼]], however, there is no problem no matter how many spaces are inserted. Please check this out, thanks.

Best regards.
Comment 1 Jimmy Xu 2009-08-22 01:56:52 UTC
Additionally, the following comments can be found where the article content should appear:

<!-- bodytext -->
<!-- NewPP limit report Preprocessor node count: 2286/1000000 Post-expand include size: 9740/2048000 bytes Template argument size: 4483/2048000 bytes Expensive parser function count: 4/500 -->
Comment 2 Brion Vibber 2009-08-23 00:36:50 UTC
The space is inserted between bytes which make up a single UTF-8 character; the result is not a valid UTF-8 string.
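
To make this concrete, here is a minimal Python sketch (not MediaWiki code; the helper name `is_valid_utf8` is made up for illustration) that percent-decodes the two link titles from the report and checks whether the resulting bytes are valid UTF-8. It also checks an unencoded title with a space inserted between characters, which stays valid because the space never lands inside a multi-byte sequence.

```python
from urllib.parse import unquote_to_bytes

# Link titles taken from the bug description (fragment part omitted).
good = "%E5%8B%92%E5%86%85%C2%B7%E6%88%88%E8%A5%BF%E5%B0%BC"
bad = "%E5%8B%92%E5%86%85%C2%B7%E6%88%88%E8 %A5%BF%E5%B0%BC"  # space after %E8

def is_valid_utf8(percent_encoded: str) -> bool:
    """Percent-decode the title and report whether the bytes are valid UTF-8."""
    raw = unquote_to_bytes(percent_encoded)
    try:
        raw.decode("utf-8")  # strict decoding raises on malformed sequences
    except UnicodeDecodeError:
        return False
    return True

print(is_valid_utf8(good))           # the original link
print(is_valid_utf8(bad))            # 0xE8 is followed by 0x20, not a continuation byte
print(is_valid_utf8("勒内·戈 西尼"))  # unencoded title with a space between characters
```

In the broken title, the lead byte 0xE8 announces a three-byte character, but the next byte is the space (0x20) instead of a continuation byte, so strict decoding fails.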

In development trunk (as of r55512) the system seems to correctly reject the link; our current deployment isn't as careful and lets the bad string through, where it can eventually trigger behavior in the regular expression library which rejects the bad string and results in wiping out the whole article in parsing.
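
As a rough illustration of "rejecting the link" rather than letting the bad string through, the following Python sketch validates each link target before doing anything else with it and leaves an invalid link as literal text, so the rest of the page is unaffected. This is a toy under stated assumptions, not the actual MediaWiki parser: `LINK_RE`, `decode_target`, `render_links` and the output format are all hypothetical, and no HTML escaping is done.

```python
import re
from urllib.parse import unquote_to_bytes

# Simplified [[target|label]] pattern; real wikitext links are more complex.
LINK_RE = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")

def decode_target(target):
    """Return the percent-decoded title, or None if it is not valid UTF-8."""
    try:
        return unquote_to_bytes(target).decode("utf-8")
    except UnicodeDecodeError:
        return None

def render_links(wikitext):
    """Render valid links; leave links with invalid UTF-8 targets untouched."""
    def replace(match):
        title = decode_target(match.group(1))
        if title is None:
            return match.group(0)  # reject only this link, keep it as plain text
        label = match.group(2) or title
        return f'<a href="/wiki/{title}">{label}</a>'
    return LINK_RE.sub(replace, wikitext)

page = "Intro [[%E5%B0%BC|ok]] and [[%E8 %A5%BF|bad]] end"
print(render_links(page))
```

The point of the design is that validation happens per link, before the string reaches any later processing step, so one malformed target cannot take the whole article down with it.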

This issue is covered in bug 11143; duping the issue.

*** This bug has been marked as a duplicate of bug 11143 ***
