Last modified: 2013-07-04 10:34:15 UTC
DSR expands beyond the tsr for the template element, which causes it to include parts of the table syntax. Test case on the English Wikipedia: {| class="toccolours collapsible collapsed" style="width:100%;" |- ! '''Links to related articles''' |- | {{BBC Television}} |- | {{BBC Radio}} |- | {{BBC News}} |- | {{BBC music ensembles}} |- | {{European Broadcasting Union Members}} |- | {{Department for Culture, Media and Sport}} |- | {{Media in the United Kingdom|comporg}} |- | {{Telecommunications}} |}
Reduced test case: {| | {{BBC Radio}} |}
This is quite ugly actually. Details below. The opening content generated by this template is: <td style="text-align:left;border-left-width:2px;border-left-style:solid;width:100%;padding:0px;;;" class="navbox-list navbox-oddhlist"> Note the <td ... > tag parsed as text. Now, if you look at the source of {{BBC Radio}} template, it uses the {{Navbox}} template. And, if you look at the content of the {{Navbox}} template, you see the source of all problems. The td-tag above that is parsed as string is actually generated in multiple pieces as string and pieced together which means the tokenizer cannot really parse this as td-tag without processing all the associated templates and generating the strings and concatenating them. ----------------------------------------- {{#if:{{{group1|}}}|<th scope="row" class="navbox-group {{{groupclass|}}}" <!-- -->style="{{{basestyle|}}};{{#if:{{{groupwidth|}}}|width:{{{groupwidth}}};}}{{{groupstyle|}}};{{{group1style|}}}"><!-- -->{{{group1}}}</th><td style="text-align:left;border-left-width:2px;border-left-style:solid;|<td colspan=2 style="}}<!-- -->{{#if:{{{groupwidth|}}}||width:100%;}}padding:0px;{{{liststyle|}}};{{{oddstyle|}}};{{{list1style|}}}" <!-- -->class="navbox-list navbox-{{#ifeq:{{{evenodd|}}}|swap|even|{{{evenodd|odd}}}}} {{{listclass|}}}">= ----------------------------------------- Hmmm .... {{Navbox}} is actually a very common template. So, we need to find a hack for this or fix the Navbox template.
Normally the template's output should not affect dsr calculations. The template tag has the correct tsr available, so I think this is simply a bug in the dsr propagation algorithm. The navbox template is a known issue, and can be fixed by tweaking the template to use properly nested (at least to some degree..) parser functions.
I think because of the lost <td> tag, the DOM is thrown off sufficiently so that DSR calculations might not be able to capture all RT-able content. Consider this example: {| |{{BBC Radio}} |row2 |} Now, let us look at the token stream that is sent to the tree builder: <table>, <td>, <bbc-meta-start>, ...., <bbc-meta-end>, </td>, <td>, row2, </td>, </table> But, if you look at the DOM that comes out the tree-builder, it looks like: .... </table>, row2. So, a closing </td> tag that should have matched with the lost td-tag is instead closing the outer table. In the bargain, source wikitext information about the leading "|" before row2 and the table closing "|}" is lost because the tree builder discards the mismatched <td>, </td> and </table> tokens in the end. This rts as: {| |{{BBC Radio}} row2 Now, let us take an example closer to the wikitext found on the BBC page: {| |{{BBC Radio}} |{{echo|row3}} |} The trailing DOM for this looks like this: ... </table> <span data-parsoid="{'src':'|{{echo|row2}}','dsr':[18,32]}" about="#mwt2" typeof="mw:Object/Template">row2</span> And, this roundtrips as: {| |{{BBC Radio}} |{{echo|row2}} I'll examine this more, but when the DOM semantics change, I am not sure if DSR can always robustly fix this up in all situations given that DSR computation relies on DOM structure.
For the benefit of anyone else besides me and Gabriel who is monitoring this ticket :), we decided to try and patch the DOM to deal with lost content. The algorithm will patch the DOM by expanding template scope to patch over problem content so that the entire content (in this case the entire wikitext table) can roundtrip. If we manage to patch up the DOM, DSR computation will remain unchanged and will mark the entire expanded content RT-able. We may have a solution for detecting when we should (and can) do this DOM patchup. Coding and testing in progress.
A DOM fix here: https://gerrit.wikimedia.org/r/#/c/28455
[Parsoid component reorg by merging JS/General and General. See bug 50685 for more information. Filter bugmail on this comment. parsoidreorg20130704]