Last modified: 2010-05-15 15:59:39 UTC
Created attachment 4366 [details] Wiki markup for a page that shows html validation failure. The presence of a TOC in combination with headings in a particular sequence of levels causes w3c html validation to fail. This happens both on a live 1.11.0 wiki and on a freshly installed vanilla copy of mediawiki 1.11.0. <example> __TOC__ == First heading == === subheading === == Second heading == </example> This gives an HTML validation error. If you remove the Second heading, the page validates OK. If you remove the subheading instead, the page validates OK. If you remove the __TOC__, or replace it with __NOTOC__ in articles with more headings, the page validates OK. The attached file contains wiki markup of the above example ready for pasting into a new page.
HTML output (indented here for better readability): <table id="toc" class="toc" summary="Contents"><tr><td><div id="toctitle"><h2>Contents</h2></div> <ul> <li class="toclevel-1"> <a href="#First_heading"><span class="tocnumber">1</span> <span class="toctext">First heading</span></a> <ul> <li class="toclevel-2"> <a href="#subheading"><span class="tocnumber">1.1</span> <span class="toctext">subheading</span></a> </li> </ul> </li> <li class="toclevel-1"> <a href="#Second_heading"><span class="tocnumber">2</span> <span class="toctext">Second heading</span></a> </li> </ul> </li> </ul> </td></tr></table> Note the stray </li></ul>. Probably a stack maintained incorrectly or something. Tidy fixes this on Wikimedia sites, of course.
The fault lies somewhere in Parser::formatHeadings, which gives me a headache, so don't rely on me to fix it. :) If anyone wants to find the problem, or maybe rewrite that whole little mess, feel free to attach a patch and I'll look at it.
The validator reports no problems with my test (despite admittedly ugly HTML output). Could you link to a page that the validator fails? * [[m:User:Pathoschild/Sandbox2]] * http://validator.w3.org/check?uri=http%3A%2F%2Fmeta.wikimedia.org%2Fwiki%2FUser%3APathoschild%2FSandbox2
As I say, Tidy will fix it if enabled, as it is on Wikimedia wikis. This is only an issue for third parties that do not use Tidy.
I have found the cause of this problem and have fixed it my own copy of 1.11.0. The problem seems to be that the $prevtoclevel gets messed up right after an unindent. The solution was to add a line to set it properly after unindenting. Add a line after the first call to tocUnindent in Parser.php. The new line should look like: $prevtoclevel = $toclevel; So, before the change, the code looks like: 3597 if( $toclevel<$wgMaxTocLevel ) { 3598 if($prevtoclevel < $wgMaxTocLevel) { 3599 # Unindent only if the previous toc level was shown :p 3600 $toc .= $sk->tocUnindent( $prevtoclevel - $toclevel ); 3601 } else { 3602 $toc .= $sk->tocLineEnd(); 3603 } 3604 } After the change, it looks like: 3597 if( $toclevel<$wgMaxTocLevel ) { 3598 if($prevtoclevel < $wgMaxTocLevel) { 3599 # Unindent only if the previous toc level was shown :p 3600 $toc .= $sk->tocUnindent( $prevtoclevel - $toclevel ); 3601 $prevtoclevel = $toclevel; 3602 } else { 3603 $toc .= $sk->tocLineEnd(); 3604 } 3605 }
Applied in r32387 with a parser test case. Seems good. :)
*** Bug 13623 has been marked as a duplicate of this bug. ***
Went ahead and stuck this on 1.12 branch, r32998 for 1.12.1.