Last modified: 2012-10-21 19:12:42 UTC
In an RTL Wikipedia, a heading like "==<span dir="ltr">C++</span>==" is displayed correctly as a heading in the text flow, but it appears as "++C" in the table of contents, because the <span> tag is stripped there and the page's RTL direction is applied. Not all HTML is stripped: a heading like "==E = mc<sup>2</sup>==" is displayed correctly both in the heading and in the table of contents.
Confirmed on 1.19wmf1. From the parser:
    $tocline = preg_replace(
        array( '#<(?!/?(sup|sub|i|b)(?: [^>]*)?>).*?'.'>#', '#<(/?(sup|sub|i|b))(?: .*?)?'.'>#' ),
        array( '', '<$1>' ),
        $safeHeadline
    );
Those regexen are rather ugly, but let's see if we can't add a very limited allowance for span:
    $tocline = preg_replace(
        array( '#<(?!/?(sup|sub|i|b|span dir="ltr")(?: [^>]*)?>).*?'.'>#', '#<(/?(sup|sub|i|b|span dir="ltr"))(?: .*?)?'.'>#' ),
        array( '', '<$1>' ),
        $safeHeadline
    );
But I have to ask - someone went through the effort of not having a ?> in that regex in two different places, and then left a ?> in another place, and I notice that the world hasn't exploded.
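For anyone who wants to poke at those two regexes outside MediaWiki, here is a rough Python port. It is only a sketch: `toc_line`, `ORIGINAL_TAGS` and `PROPOSED_TAGS` are made-up names for illustration, and it assumes Python's `re` treats these particular patterns the same way PCRE does. It is not the parser code and not the patch that was eventually merged.

```python
import re

# Allow-list from the quoted parser code, and the variant suggested above.
ORIGINAL_TAGS = r'sup|sub|i|b'
PROPOSED_TAGS = r'sup|sub|i|b|span dir="ltr"'


def toc_line(safe_headline, allowed=ORIGINAL_TAGS):
    """Hypothetical helper mimicking the two-step preg_replace call."""
    # Step 1: drop every tag that is not on the allow-list.
    strip_other = r'<(?!/?(' + allowed + r')(?: [^>]*)?>).*?>'
    # Step 2: remove attributes from the tags that survived step 1.
    strip_attrs = r'<(/?(' + allowed + r'))(?: .*?)?>'
    line = re.sub(strip_other, '', safe_headline)
    return re.sub(strip_attrs, r'<\1>', line)


if __name__ == '__main__':
    print(toc_line('E = mc<sup>2</sup>'))                         # E = mc<sup>2</sup>
    print(toc_line('<span dir="ltr">C++</span>'))                 # C++
    print(toc_line('<span dir="ltr">C++</span>', PROPOSED_TAGS))  # <span dir="ltr">C++
```

In this port the suggested variant keeps the opening `<span dir="ltr">` but still strips `</span>`, because the closing tag does not match the `span dir="ltr"` alternative. A real fix would presumably need to allow the closing tag as well.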
This is still an issue. Not critical, but annoying.
Thank you for the tip, Dan. Proposed fix submitted in https://gerrit.wikimedia.org/r/#/c/22435/ . I'm working on tests for it.
Patch improved and tests added. Thanks to anybody who can review it.
Deployed :)