Last modified: 2014-11-13 17:42:12 UTC
http://parsoid.wmflabs.org:8001/latestresult/zh/EOS contains

  [[佳能 EOS 300D|300D<p>Digital Rebel<p>Kiss Digital]]

Block content inside links is problematic:

  > div = document.createElement('div')
  <div></div>
  > div.innerHTML = '<a href="foo">foo<p>bar</a>'
  "<a href="foo">foo<p>bar</a>"
  > div.outerHTML
  "<div><a href="foo">foo</a><p><a href="foo">bar</a></p></div>"
This is probably related to bug 47326, but bug 47326 is fixable. I'm not sure this particular bug is fixable, since our DOM fundamentally does not let us represent block content inside an <a> tag. OTOH, it's interesting that we currently round-trip

  [[佳能 EOS 300D|300D<p>Digital Rebel<p>Kiss Digital]]

to

  [[佳能 EOS 300D|300D<p>Digital Rebel]]<p>[[佳能 EOS 300D|Kiss Digital]]</p>

i.e., we managed to deal with the first <p> somehow. We might be able to recombine these tags in the html2wt phase.
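To make the recombination idea concrete, here is a minimal sketch of merging split link fragments before serializing. It does not use Parsoid's real DOM or serializer API: the node shapes and the names `soleLinkIn`, `recombineSplitLinks`, and `toWikitext` are all hypothetical, and it assumes the fragments arrive as a link followed by paragraphs that each wrap a single link to the same target.

```javascript
// Hypothetical, simplified node shapes (NOT Parsoid's real DOM or API):
//   { type: 'link', target: string, content: string }
//   { type: 'p', children: Node[] }
//   { type: 'text', text: string }

// Return the link if the node is a paragraph wrapping exactly one link,
// otherwise null.
function soleLinkIn(node) {
  if (node.type === 'p' && node.children.length === 1 &&
      node.children[0].type === 'link') {
    return node.children[0];
  }
  return null;
}

// Merge a link with any directly following paragraphs that contain only a
// link to the same target, re-inserting the <p> that the HTML parser
// hoisted out of the link.
function recombineSplitLinks(nodes) {
  const out = [];
  for (const node of nodes) {
    const prev = out[out.length - 1];
    const inner = soleLinkIn(node);
    if (prev && prev.type === 'link' && inner && inner.target === prev.target) {
      out[out.length - 1] = {
        type: 'link',
        target: prev.target,
        content: prev.content + '<p>' + inner.content,
      };
    } else {
      out.push(node);
    }
  }
  return out;
}

// Serialize the simplified nodes back to wikitext.
function toWikitext(nodes) {
  return nodes.map((n) => {
    if (n.type === 'link') return '[[' + n.target + '|' + n.content + ']]';
    if (n.type === 'p') return '<p>' + toWikitext(n.children) + '</p>';
    return n.text;
  }).join('');
}

// Usage: the fragments a browser-style parse would leave behind.
const target = '佳能 EOS 300D';
const split = [
  { type: 'link', target, content: '300D' },
  { type: 'p', children: [{ type: 'link', target, content: 'Digital Rebel' }] },
  { type: 'p', children: [{ type: 'link', target, content: 'Kiss Digital' }] },
];
console.log(toWikitext(recombineSplitLinks(split)));
```

One caveat with a heuristic like this: it cannot tell a link that the parser split from two adjacent links an editor deliberately wrote with the same target, so a real implementation would presumably need extra information recorded at parse time to merge only the former.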
Note that this is not true in production. The <p> tags in the first half are round-tripped with a meta-tag-based trick that is not safe when content can be edited. The trick is mainly used to hide noise in round-trip testing without selective serialization. Until now parse.js defaulted to this trick; I just submitted a patch to change that.
Also, selective serialization is thrown off by overlapping source ranges (it duplicates the nested paragraph source). Not sure if that can be improved on by forbidding range overlaps in the dsr pass.
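As a rough illustration of what checking for range overlaps could look like, the sketch below flags sibling source ranges that overlap. It is not taken from Parsoid's dsr pass: the function name `findOverlaps` and the input format (an array of `[start, end)` wikitext offsets for sibling nodes, in document order) are assumptions made for the example.

```javascript
// Hypothetical helper, not Parsoid code. Each entry is one sibling node's
// source range as [start, end) offsets into the original wikitext, listed
// in document order.
function findOverlaps(ranges) {
  const overlaps = [];
  let maxEnd = -Infinity; // furthest end offset seen so far
  let maxIdx = -1;        // index of the range that reaches maxEnd
  for (let i = 0; i < ranges.length; i++) {
    const [start, end] = ranges[i];
    // A sibling that starts before an earlier sibling ends would have the
    // shared source copied twice if both were serialized from source.
    if (start < maxEnd) {
      overlaps.push([maxIdx, i]);
    }
    if (end > maxEnd) {
      maxEnd = end;
      maxIdx = i;
    }
  }
  return overlaps;
}

// Usage: the third range starts inside the second one.
console.log(findOverlaps([[0, 10], [10, 20], [15, 30]]));
```

Whether a pass should reject such ranges, clamp them, or drop the range information for the affected nodes is a separate design question that this sketch does not answer.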