Last modified: 2014-09-23 23:09:33 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T28441, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 26441 - quoted and unquoted attributes are not handled the same in Sanitizer.php
quoted and unquoted attributes are not handled the same in Sanitizer.php
Status: NEW
Product: MediaWiki
Classification: Unclassified
Parser (Other open bugs)
1.18.x
All All
: Normal enhancement (vote)
: ---
Assigned To: Nobody - You can work on this!
http://test.wikipedia.org/wiki/MW_ATT...
: need-parsertest
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2010-12-26 21:22 UTC by Umherirrender
Modified: 2014-09-23 23:09 UTC (History)
4 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
change the attribs regex to handle the quoted and unquoted attributes same (594 bytes, patch)
2010-12-26 21:22 UTC, Umherirrender
Details

Description Umherirrender 2010-12-26 21:22:22 UTC
Created attachment 7932 [details]
change the attribs regex to handle the quoted and unquoted attributes same

The regex MW_ATTRIBS_REGEX in Sanitizer.php should change for some cases, because the quoted and unquoted attributes are not handled the same way:

* attributes with empty content
* attributes with < as content

I have create a wikipage linked as url to make that visible.

Thanks.
Comment 1 Bawolff (Brian Wolff) 2010-12-27 02:17:49 UTC
presumably the not reading of < in attributes of tag extensions is some sort of paranoia against XSS. It would perhaps make sense to make it not recognize <tag att=foo<bar > for consistency's sake.

not recognizing <tag someAttribute= > Seems sane to me. I expect to be required to do <tag someAttribute=""> if i want to pass it the empty string.
Comment 2 Mark A. Hershberger 2011-01-29 19:16:07 UTC
Not reading < in attributes is b/c that is how the spec is written, IIRC.  I don't think <> are allowed in attributes.
Comment 3 Krinkle 2011-01-29 19:35:45 UTC
<elem attr=> is a bit weird, not sure if that should be supported.

However <elem attr> should render as <elem attr=""> imho
Comment 4 Bawolff (Brian Wolff) 2011-01-29 20:04:36 UTC
I thought in html, <elem attr> was equivelent to <elem attr="attr">. It would be weird to do the opposite of html imho.
Comment 5 Mark A. Hershberger 2011-01-29 20:35:30 UTC
(In reply to comment #4)
> I thought in html, <elem attr> was equivelent to <elem attr="attr">.

I don't think so. But see http://www.w3.org/TR/html-markup/syntax.html#syntax-attributes for more info.
Comment 6 Bawolff (Brian Wolff) 2011-01-29 21:21:01 UTC
Hmm, maybe its an xhtml thing. I was reading http://www.w3.org/TR/xhtml1/#h-4.5 html is confusing.
Comment 7 Brion Vibber 2011-02-14 00:10:06 UTC
Need to add test cases for the behavior that it's supposed to be changing, and clarify what is supposed to change and why.

Patch seems to be forbidding quoted empty elements, which is definitely wrong.....?

Appears to also remove '<' and '>' from the list of accepted chars for unquoted attribs. Not sure how those chars actually interact with the rest of the sanitizer stuff, but note the HTML 5 parser rules explicitly specify that '>' should close out the tag, while '<' is technically bogus but should be treated as part of the attribute value for consistent fallback behavior:

http://dev.w3.org/html5/spec/Overview.html#attribute-value-unquoted-state
Comment 8 Sumana Harihareswara 2011-11-09 19:09:33 UTC
Umherirrender, I am adding the "reviewed" keyword to this bug since you received a review from Brion in comment 7 in February.  Do you have time and interest in revising the patch in accordance with those suggestions?  Thanks for the patch!
Comment 9 Umherirrender 2011-11-14 12:34:42 UTC
Comment on attachment 7932 [details]
change the attribs regex to handle the quoted and unquoted attributes same

I am not able to provide a new patch with parser tests or which have the right regex for the specifition.

Marking patch as obsolete.
Comment 10 C. Scott Ananian 2013-07-30 19:05:10 UTC
Echoing comment 7: parser tests need to be added, so that we can ensure that the PHP parser and Parsoid have the same behavior.

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links