Last modified: 2009-11-17 22:43:36 UTC

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T7497, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 5497 - Regression in HTML normalization in 1.6 (unclosed <li>)
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: Parser
Version: 1.7.x
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Brion Vibber
URL: http://mail.wikipedia.org/pipermail/m...
Keywords: testme
Duplicates: 4373 5977
Depends on:
Blocks:
Reported: 2006-04-07 20:04 UTC by Brion Vibber
Modified: 2009-11-17 22:43 UTC (History)


Attachments
New removeHtmlTags function which is more accurate and complete (15.98 KB, patch)
2006-06-03 08:13 UTC, Brion Vibber
With fix for crash on </body> (16.37 KB, patch)
2006-06-03 09:52 UTC, Brion Vibber

Description Brion Vibber 2006-04-07 20:04:44 UTC
Pre-XHTML HTML allows a number of elements to be implicitly closed,
such as <li> within a <ul> or <ol>.

In 1.5, MediaWiki allowed this code through:

<ul>
<li>One
<li>Two
</ul>

While invalid XHTML due to not being well-formed, this did render
in browsers the way people would expect from traditional HTML.

In 1.6 and current trunk, this is smashed up in output as:

<ul>
<li>One
&lt;li&gt;Two
&lt;/ul&gt;

</li>
</ul>

The second <li> and the closing </ul> are rejected and escaped, then closing
</li> and </ul> get added at the end of the document.

Proper application of nesting rules *should* normalize it to
something like this:

<ul>
<li>One
</li><li>Two
</li></ul>
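
To illustrate the nesting rule described above, here is a minimal Python sketch. It is not MediaWiki's Sanitizer code (which is PHP), and the function name close_implied_li is hypothetical. It assumes input containing only bare <ul>, <ol> and <li> tags without attributes, and it inserts the implied </li> whenever a new <li> or a closing list tag arrives while an <li> is still open.

```python
import re

# Toy normalizer: hypothetical helper, not MediaWiki's Sanitizer.
# It only understands attribute-free <ul>, <ol> and <li> tags.
TAG_RE = re.compile(r"<(/?)(ul|ol|li)>", re.IGNORECASE)


def close_implied_li(html):
    out = []
    stack = []  # currently open elements, innermost last
    pos = 0
    for m in TAG_RE.finditer(html):
        out.append(html[pos:m.start()])  # copy text before the tag
        pos = m.end()
        closing = m.group(1) == "/"
        name = m.group(2).lower()
        if not closing:
            if name == "li" and stack and stack[-1] == "li":
                # A new <li> implicitly ends the previous sibling <li>.
                out.append("</li>")
                stack.pop()
            stack.append(name)
            out.append("<%s>" % name)
        elif name in stack:
            # Close everything up to and including the matching open tag.
            while stack:
                top = stack.pop()
                out.append("</%s>" % top)
                if top == name:
                    break
        # An end tag with no matching open tag is silently dropped.
    out.append(html[pos:])
    # Close anything still open at the end of the input.
    while stack:
        out.append("</%s>" % stack.pop())
    return "".join(out)


print(close_implied_li("<ul>\n<li>One\n<li>Two\n</ul>"))
```

For the sample from this report, the sketch yields the normalized form shown above. A real sanitizer also has to deal with attributes, other elements with optional end tags, and disallowed tags, none of which this toy attempts.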
Comment 1 lɛʁi לערי ריינהארט 2006-04-08 18:48:41 UTC
*note* because of question at
http://mail.wikipedia.org/pipermail/mediawiki-l/2006-April/011228.html

(Are <ul> and <li> even allowed on a default install?)

This bug cannot be reproduced on Wikimedia Foundation wikis:
http://en.wikipedia.org/wiki/User:Gangleri/tests/bugzilla/05497
http://test.wikipedia.org/wiki/User:Gangleri/tests/bugzilla/05497

It can be seen at
http://test.leuksman.com/view/User:Gangleri/tests/bugzilla/05497

best regards reinhardt [[user:gangleri]]
Comment 2 Antoine "hashar" Musso (WMF) 2006-04-24 19:30:11 UTC
Fixed in trunk and REL1_6. We forgot to handle $htmlsingle
in Sanitizer.php :(
Comment 3 Austin Che 2006-05-02 21:59:37 UTC
I'm not sure if this is the result of fixing this bug or a completely separate bug, but there's 
another regression on both the latest 1.6 and trunk with either of:
<ul>
<li>One</li>
</ul>
or
<table>
<tr><td>blah</td></tr>
</table>
The closing tags are shown (e.g. as &lt;li) in the browser.
Comment 4 Brion Vibber 2006-05-02 23:36:24 UTC
I'm reverting the bogus patch from REL1_6.

A parser change like this absolutely should NOT go into a release branch 
without strict testing. There's not even a parser test case to go with it!
Comment 5 Brion Vibber 2006-05-03 00:26:33 UTC
I've added 8 parser test cases. Currently only 1 passes, both in REL1_6
and trunk (which has the above and another patch hacked in).

At the moment trunk "looks" better for some cases but produces invalid 
output, as 1.5 did.
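
As a rough illustration of the kind of check involved (a sketch only, not the MediaWiki parser test suite), the Python snippet below uses the standard library XML parser to test whether an output fragment is at least well-formed. The helper name is_well_formed and the wrapping <root> element are assumptions made for this example, and well-formedness is only a necessary condition for valid XHTML, not a sufficient one.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as XML inside a dummy root."""
    try:
        # The <root> wrapper is only here so that a fragment with several
        # top-level nodes can be handed to the parser as one document.
        ET.fromstring("<root>%s</root>" % fragment)
        return True
    except ET.ParseError:
        return False


# Markup that 1.5 passed through unchanged: the <li> elements are unclosed.
passed_through = "<ul>\n<li>One\n<li>Two\n</ul>"
# Normalized form with the implied end tags written out.
normalized = "<ul>\n<li>One\n</li><li>Two\n</li></ul>"

print(is_well_formed(passed_through))
print(is_well_formed(normalized))
```

Note that the escaped 1.6 output from the description would also pass this check, since it is well-formed even though it renders wrongly, so a test suite needs expected-output comparisons as well.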
Comment 6 sspecter 2006-05-04 23:11:13 UTC
I have a related problem (I believe it is related) with list parsing. I'm using
MediaWiki 1.6.3, but I believe it must also be in 1.7. Sorry, but I can't test 1.7
with extensions here.

I have an extension that creates MathML tags. It is called <asciimath>. When I do:

* item <asciimath>x</asciimath>

I get:

<ul><li>item <math ...(mathML)...></li> ... (math commands)... </math>

Or, for simplification:

...<li><math></li>...</math>...

As I need XHTML to get MathML working, my wiki page crashes and burns.

Proposed solutions:

1- It appears that <li> closes after the first tag it encounters. It should at
least close before that tag. This is not a very good solution.

2- <li> must allow tags inside it, ESPECIALLY tags from extensions. It should
only look for:
  - the </ul> of the level it is in, not those of child lists, or
  - <li>
...and place its closing tag before that. Or, if it doesn't find one, remove
itself (by emitting <li></li> or by removing the unclosed <li>).
Comment 7 Antoine "hashar" Musso (WMF) 2006-05-13 19:34:02 UTC
Severa
Comment 8 Brion Vibber 2006-05-16 07:53:25 UTC
*** Bug 5977 has been marked as a duplicate of this bug. ***
Comment 9 Dov Grobgeld 2006-05-16 08:09:40 UTC
My bug report 5977 was marked as a duplicate of this bug, which it might be. But
note that even if the <li> tags are closed, the rendering is still not correct:

  <ol>
    <li> Something
      <ol>
        <li> Deeper</li>
      </ol>
    </li>
    <li> Something else</li>
  </ol>

This is rendered as follows:

<p><ol>

</p>
  <li> Something
    <ol>
      &lt;li&gt; Deeper&lt;/li&gt;
    </ol>
  </li>
  <li> Something else</li>

<p></ol>
</p>

Comment 10 Brion Vibber 2006-06-03 08:13:45 UTC
Created attachment 1885 [details]
New removeHtmlTags function which is more accurate and complete

Handles implied end tags properly, and obeys more nesting prohibitions.
Want to test this a little more before applying to trunk.
Comment 11 Brion Vibber 2006-06-03 09:52:47 UTC
Created attachment 1891 [details]
With fix for crash on </body>
Comment 12 Brion Vibber 2006-06-04 01:05:04 UTC
*** Bug 4373 has been marked as a duplicate of this bug. ***
Comment 13 Aaron Axelsen 2006-08-14 19:47:19 UTC
This still appears to be an issue in 1.7.1 with definition lists.  Any ETA on a
permanent fix?
Comment 14 Mark A. Hershberger 2009-11-17 22:43:36 UTC
This appears to have been fixed long ago.
