Last modified: 2014-06-16 18:03:27 UTC
Adding wikitable support for thead, tbody, and tfoot elements would be a harmless enhancement, allowing more sophisticated formatting of tables (both in pages' wikitext, and using style attributes or style sheets).
Logically, each element would only need a start tag, and would be closed when the following element starts or the table ends (as "|-" serves for table rows).
Possible wikitable shortcuts:
thead:
  |! associated with table headers (but possibly confusing)
  |^ analogous to GREP start of string
  |< analogous to XML/HTML tag opening
  |[ opening bracket representing start
tbody:
  |= fatter version of table row
  |[ enclosing bracket representing a block
tfoot:
  |_ underscore=bottom line
  |$ analogous to GREP end of string
  |> analogous to HTML/XML tag closing
  |/ analogous to HTML/XML closing tag
  |] closing bracket representing ending
See also Bug 3156: Request not to filter <tbody> and </tbody> codes
Why not use heuristics? Not ideal, but a great improvement over the current situation. Proposal:
1) Any sequence of rows falling at the end of a table and consisting entirely of header cells is a <tfoot>.
2) Any other sequence of rows consisting entirely of header cells is a <thead>.
3) Any other sequence of rows is a <tbody>.
Thus you would get, e.g.
  {|
  |+ Metasyntactic variables
  ! Computing
  |-
  | Foo
  |-
  | Bar
  |-
  ! English names
  |-
  | Jack
  |-
  | Jill
  |}
=
  <table><caption>Metasyntactic variables</caption>
  <thead><tr><th>Computing</th></tr></thead>
  <tbody>
  <tr><td>Foo</td></tr>
  <tr><td>Bar</td></tr>
  </tbody>
  <thead><tr><th>English names</th></tr></thead>
  <tbody>
  <tr><td>Jack</td></tr>
  <tr><td>Jill</td></tr>
  </tbody>
  </table>
which I believe is correct. Any counterexamples?
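For reference, the three rules above can be sketched in a few lines. This is only an illustration in Python, not MediaWiki's parser: the names `group_rows` and `render` and the (kind, text) cell representation are assumptions made for the example. Read literally, rule 1 also turns a table consisting only of header rows into a <tfoot>.

```python
from html import escape
from itertools import groupby


def is_header_row(row):
    """A row counts as a header row when it is non-empty and every cell is a th."""
    return bool(row) and all(kind == "th" for kind, _ in row)


def group_rows(rows):
    """Split rows into (section, rows) pairs following the three proposed rules."""
    runs = [(is_hdr, list(run)) for is_hdr, run in groupby(rows, key=is_header_row)]
    sections = []
    for index, (is_hdr, run) in enumerate(runs):
        if not is_hdr:
            section = "tbody"   # rule 3: any other sequence of rows
        elif index == len(runs) - 1:
            section = "tfoot"   # rule 1: header-only rows at the end of the table
        else:
            section = "thead"   # rule 2: any other header-only sequence
        sections.append((section, run))
    return sections


def render(caption, rows):
    """Render the grouped rows as a single HTML string."""
    parts = ["<table>", f"<caption>{escape(caption)}</caption>"]
    for section, run in group_rows(rows):
        body = "".join(
            "<tr>" + "".join(f"<{k}>{escape(t)}</{k}>" for k, t in row) + "</tr>"
            for row in run
        )
        parts.append(f"<{section}>{body}</{section}>")
    parts.append("</table>")
    return "".join(parts)


rows = [
    [("th", "Computing")], [("td", "Foo")], [("td", "Bar")],
    [("th", "English names")], [("td", "Jack")], [("td", "Jill")],
]
print([section for section, _ in group_rows(rows)])
# → ['thead', 'tbody', 'thead', 'tbody']
```

Note that the whole row list is available before grouping starts, so this sketch sidesteps the question of how to decide on a footer while output is already being written.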
The only trouble is if the heuristic turns out to be wrong. Unlikely, but possible, and if you don't offer any way around it there will be problems.
Still better than the current setup, and it doesn't complicate wikimarkup (which I think is why this isn't enabled).
sorttable.js uses thead and tfoot to know what portions of a table to not sort. Allowing the use of thead and tfoot would make that table sorting script much easier to integrate with complicated tables.
My original suggestion was to pass thead and /thead through. If you can't pass them, can you at least not display them on the output page, in effect ignoring them?
One way would be to translate them with <!-- before and --> after the <thead>, <tbody> etc. and </> closures, or have some way to mark them as non-displayed, so that while they are ignored for functionality, they don't show up on the rendered page.
I'm currently working on an implementation of this bug as per Aryeh's comments above. I have encountered considerable problems implementing his suggestion that a line of header cells at the end of a table be a tfoot. The problem is that when the program encounters:
  ! some !! header !! cells
it outputs:
  <tr><th> some </th><th> header </th><th> cells
We would then only find out subsequently whether it was actually a footer. We could perform a simple search and replace, but that would be greatly complicated by the possibility of embedded tables within the footer cells. As far as I can see, implementing full heuristics would require an almost full rewrite. Or something like:
  {|
  |+ Metasyntactic variables
  ! Computing
  |-
  | Foo
  |-
  | Bar
  |-
  ! English names
  |-
  | Jack
  |-
  | Jill
  |=
  | Footer
  |}
Created attachment 5421
a structural method to implement structural elements: tbody, thead and tfoot
Ignore my previous comments. I've now substantially rewritten the doTableStuff() function by separating the part that reads the wiki syntax from the part that outputs the HTML. doTableStuff() now collects information about the table into an array, which a new function, printTableHtml(), converts into HTML.
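A rough sketch of the two-phase structure described in this comment: one function collects the table into a plain data structure, and a second one turns that structure into HTML. This is Python for illustration only and is not the code in the attachment. `collect_table` and `print_table_html` are hypothetical names loosely mirroring doTableStuff() and printTableHtml(), and cell attributes and nested tables are deliberately ignored.

```python
from html import escape


def collect_table(wikitext):
    """Phase 1: read wikitext lines into {'caption': ..., 'rows': [[(kind, text), ...]]}."""
    table = {"caption": None, "rows": []}
    current = None  # the row currently being filled, or None if the next cell starts a new row
    for raw in wikitext.splitlines():
        line = raw.strip()
        if not line or line.startswith("{|"):
            continue
        if line.startswith("|}"):
            break
        if line.startswith("|+"):
            table["caption"] = line[2:].strip()
        elif line.startswith("|-"):
            current = None
        elif line[0] in "!|":
            kind, sep = ("th", "!!") if line[0] == "!" else ("td", "||")
            if current is None:
                current = []
                table["rows"].append(current)
            for text in line[1:].split(sep):
                current.append((kind, text.strip()))
    return table


def print_table_html(table):
    """Phase 2: convert the collected structure into HTML, one element per line."""
    out = ["<table>"]
    if table["caption"] is not None:
        out.append(f"<caption>{escape(table['caption'])}</caption>")
    for row in table["rows"]:
        cells = "".join(f"<{kind}>{escape(text)}</{kind}>" for kind, text in row)
        out.append(f"<tr>{cells}</tr>")
    out.append("</table>")
    return "\n".join(out)


source = "{|\n|+ Metasyntactic variables\n! Computing\n|-\n| Foo\n|-\n| Bar\n|}"
print(print_table_html(collect_table(source)))
```

Because the second phase sees every row before it writes anything, a decision such as "is this last header row a footer?" can be made there. The cost is that the whole table is held in memory at once.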
I forgot to mention the patch includes changes to wikibits.js, which didn't appear to support tbody, thead or tfoot elements after all. The changes make sortable tables work in Safari v3 and Firefox v3. It needs to be tested in IE6 and other browsers.
Btw...
  {|
  ! header
  |}
  {|
  ! header
  |-
  | content
  |-
  ! footer
  |}
but...
  {|
  ! header
  |-
  | content
  |-
  ! header
  |-
  |}
I did this on purpose just in case people wanted to have headers at the bottom of their tables. It can be changed!
keywords : Patch, need-review
I support the proposal to add these three elements; their availability, with class attributes, will greatly facilitate the use of microformats.
*** Bug 3156 has been marked as a duplicate of this bug. ***
Created attachment 7911
a structural method to implement structural elements: tbody, thead and tfoot v2
Updated patch to apply cleanly to trunk. Fails heaps of parser tests, fixing that now
Created attachment 7912
a structural method to implement structural elements: tbody, thead and tfoot v2
v2 last patch contained unrelated changes
Created attachment 7915
7912: a structural method to implement structural elements: tbody, thead and tfoot v
This one passes all parser tests (except those which get upset by the new <tbody>). The new HTML tags are whitelisted now as well. This patch would enable us to migrate to a better tablesorter script, which would fix a lot of the open table sorting bugs.
I think it would be nice to have a new syntax for tfoot and thead rather than (only) hack around the current one.
Parse with first row in thead:
  {|
  |+ Title
  |-
  ! Head cell !! Head cell
  |-
  | Normal cell || Normal cell
  |-
  | Normal cell || Normal cell
  |}
Parse without thead:
  {|
  |+ Title
  |-
  ! Head cell
  | Normal cell
  |-
  | Normal cell || Normal cell
  |-
  | Normal cell || Normal cell
  |}
Parse rows with "|!-" moved to thead (only if in consecutive rows).
Parse rows with "|>-" moved to tfoot (only if in consecutive rows).
  {|
  |+ Title
  |!-
  ! Head cell !! Head cell
  |!-
  ! Head cell !! Head cell
  |-
  | Normal cell || Normal cell
  |-
  ! Head cell not in thead
  | Normal cell
  |-
  | Normal cell || Normal cell
  |>-
  | Footer cell || Footer cell
  |}
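To make the "|!-" and "|>-" markers in this comment concrete, here is a small sketch in Python that reads a table line by line and emits HTML as it goes, keeping only the name of the currently open row group as state. It is an assumption-laden illustration, not a MediaWiki implementation: `row_section` and `stream_table` are made-up names, cell attributes and nested tables are ignored, and rows are emitted in source order rather than moved, so the "only if in consecutive rows" condition is not enforced.

```python
from html import escape

# Proposed row markers, longest first so "|!-" is not mistaken for a plain cell line.
ROW_MARKERS = (("|!-", "thead"), ("|>-", "tfoot"), ("|-", "tbody"))


def row_section(line):
    """Return the row group a row-marker line selects, or None for other lines."""
    for marker, section in ROW_MARKERS:
        if line.startswith(marker):
            return section
    return None


def stream_table(lines):
    """Yield HTML fragments one input line at a time, without buffering the table."""
    open_section = None      # row group whose start tag has already been emitted
    next_section = "tbody"   # row group the next row belongs to
    row_open = False
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("{|"):
            yield "<table>"
        elif line.startswith("|+"):
            yield f"<caption>{escape(line[2:].strip())}</caption>"
        elif line.startswith("|}"):
            if row_open:
                yield "</tr>"
                row_open = False
            if open_section:
                yield f"</{open_section}>"
                open_section = None
            next_section = "tbody"
            yield "</table>"
        elif row_section(line):
            if row_open:
                yield "</tr>"
                row_open = False
            next_section = row_section(line)
        elif line[0] in "!|":
            if not row_open:
                if open_section != next_section:
                    if open_section:
                        yield f"</{open_section}>"
                    yield f"<{next_section}>"
                    open_section = next_section
                yield "<tr>"
                row_open = True
            kind, sep = ("th", "!!") if line[0] == "!" else ("td", "||")
            for text in line[1:].split(sep):
                yield f"<{kind}>{escape(text.strip())}</{kind}>"


example = """{|
|+ Title
|!-
! Head cell !! Head cell
|-
| Normal cell || Normal cell
|>-
| Footer cell || Footer cell
|}"""
print("".join(stream_table(example.splitlines())))
```

With explicit markers the row group is known before the row's first cell is written, which is what the end-of-table footer heuristic cannot offer.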
Nux: I'd say that it's better to do it on the existing syntax, since I can't see the use case of having a row that looks like a thead but structurally isn't. Fixed in r85922
<tbody> is not working here: https://en.wikipedia.org/w/index.php?title=Wikipedia:Sandbox&oldid=512695930
Hmm... r85922 was reverted on r97145 and we do not have any of <thead>, <tbody> or <tfoot> at https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/core.git;a=blame;f=includes/Sanitizer.php;hb=893b41431c46785856b84ca91810f905c21b6831#l355
So when the patch is implemented, what syntax would I use to divide a table into two or more row groups using tbody elements? This is not clear from the descriptions above.
Can we get an update, please?
Nobody is currently working on this.
I think this proposal needs a clearer description of use cases, and why those use cases justify the complexity costs in:
* the wikitext user interface
* the VisualEditor user interface
* Parsoid
As an example, how would this be sensibly presented in VE?
Gabriel: The use case is outlined in Michael Zajac's initial post (timestamp: 2006-01-24 02:41:52); and in comments 4 & 11. Do you have questions about those? It appeared from comment 15 that this was resolved four years ago; no reason for its reversion has been given here.
Also, the heuristic suggested above won't work, as it's necessary to allow for more than one tbody per table.
(In reply to Andy Mabbett from comment #24)
> Gabriel: The use case is outlined in Michael Zajac's initial post
> (timestamp: 2006-01-24 02:41:52); and in comments 4 & 11. Do you have
> questions about those?
What I see there is
1) allows for more sophisticated formatting (comment 1)
2) sorttables not sorting thead / tfoot (comment 4)
3) facilitation of microformats (comment 11)
Are 1) and 2) actually still issues? To me it sounds like 2) would only be an issue with a footer, which is relatively rare. Otherwise, detecting a row with <th> elements should not be hard in a script. 3) is rather nebulous given that you can just as well attach classes to trs.
What I am asking for is a clear use case: I want to do X, it's not possible because of Y, and it will be possible once thead / tbody / tfoot are supported. This is worth the costs because of Z.
A related use case: it allows Parsoid to handle arbitrary table markup in the WTS phase. Although my proposal (for the record) would be *not* to add new pipes-and-punctuation markup for <thead> <tfoot> etc, but instead to just allow them to be generated by literal HTML embedded in wikitext, e.g. https://en.wikipedia.org/wiki/Help:Table#Other_table_syntax Once your table is sufficiently complicated, it's probably best to use literal HTML, IMO. But we still need to permit thead/tfoot/colgroup etc in literal HTML within wikitext.
(In reply to Gabriel Wicke from comment #26)
> 1) allows for more sophisticated formatting (comment 1)
The main reason I requested this is the ability for an editor to create multiple row groups by adding multiple tbody elements in a table. This would allow grouping data in tables, make these groups accessible to assistive devices like screen readers, allow visual formatting of the groups with CSS (other than redundant inline CSS), and allow behaviours like collapsing groups.
The solution in comment 1 simply automates adding a whole-table tbody element, and does not satisfy the requirement (the HTML DOM implicitly includes a full-table tbody anyway, so this solution is redundant).
Some use-case examples that would benefit from this:
* Thousands of list articles that break up lists into separate tables, often set off with article sections. These would benefit by automatically having columns visually aligned, and in accessibility. Examples:
  http://en.wikipedia.org/wiki/List_of_Jim_Rockford's_answering_machine_gags
  https://en.wikipedia.org/wiki/List_of_Canadian_provincial_and_territorial_symbols
  https://en.wikipedia.org/wiki/List_of_field_guns
* Similarly, data tables and infoboxes that set off groups using th elements, with inline CSS for visual formatting, with rows and cells containing only hr elements, or multiple methods:
  https://en.wikipedia.org/wiki/Ukrainian_alphabet#Unicode
  https://en.wikipedia.org/wiki/Romanization_of_Ukrainian#Tables_of_romanization_systems
  https://en.wikipedia.org/wiki/Template:Infobox_weapon
* Navboxes that use nested tables to create groups:
  https://en.wikipedia.org/wiki/Template:WWIISovietAFVs
* Wiktionaries have thousands of simple and complex inflection tables that need grouping:
  https://en.wiktionary.org/wiki/слушать#Conjugation
  https://en.wiktionary.org/wiki/анулирам#Conjugation
(In reply to C. Scott Ananian from comment #27)
> Once your table is sufficiently complicated, it's probably best to use
> literal HTML, IMO.
But grouping table rows is a very simple concept. There is high demand. Editors are already attempting to do this in tens of thousands of tables using complex, inconsistent, inaccessible, inadequate, and inappropriate hacks (rows of table headers, horizontal rules, inline CSS, nested tables, etc.). It should be possible to accomplish this with dead-simple wikitext, and visually format it consistently and automatically in standard style sheets.
@Michael Zajac: nothing related to table parsing in wikitext is simple, unfortunately. So I'm suggesting to concentrate on *making it possible*, and let the template authors and/or VE, etc, worry about making it "dead-simple".
But I'm open to suggestions. The HTML elements not currently supported in wikitext are thead, tbody, tfoot, colgroup, and col. If someone would like to open a new wikipage proposing concrete "dead-simple" wikitext syntax for these, I'd be happy to re-evaluate. (But please make your proposal on a wikipage, so that this bugzilla isn't bloated out with endless bikeshedding over tweaks to the syntax.)
Note that the original patch was reverted (as I understand the history of this bug) because the implementation constructed an entire in-memory model of the table during processing. Wikipedia tables can be *huge*. So any syntax proposal must be parseable without buffering and with as little table context information as possible. Similarly, you should be prepared to demonstrate (using greps over a wikipedia dump, or similar) that the proposed syntax does not break any existing table markup.
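As a rough idea of what such a check over existing markup could look like, here is a small Python sketch that scans wikitext and reports lines inside a table that already begin with a candidate marker. Everything in it is an assumption for illustration: `find_marker_collisions` is a made-up name, the marker list is taken from suggestions earlier in this thread, and extracting page text from an actual dump is left out.

```python
def find_marker_collisions(wikitext, markers):
    """Return (line_number, line) pairs inside {| ... |} that start with a candidate marker."""
    hits = []
    depth = 0  # nesting depth of open tables
    for number, raw in enumerate(wikitext.splitlines(), start=1):
        line = raw.lstrip()
        if line.startswith("{|"):
            depth += 1
        elif line.startswith("|}"):
            depth = max(depth - 1, 0)
        elif depth and any(line.startswith(marker) for marker in markers):
            # Existing markup that would change meaning if the marker were adopted.
            hits.append((number, raw))
    return hits


sample = "{|\n|-\n|=5 apples\n|>- note\n|}\n|= outside"
print(find_marker_collisions(sample, ("|=", "|!-", "|>-")))
# → [(3, '|=5 apples'), (4, '|>- note')]
```

Any hit is a line whose rendering would change under the new syntax, so a proposal with many hits across a dump would need either a different marker or a migration plan.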
@C. Scott Ananian: I do appreciate that the parsing and programming are likely very complex. And also that whitelisting the HTML is a good improvement and probably a step towards creating a wikitext syntax for these elements. But wikitable syntax is fairly simple for editors to use, and I hope that these efforts can eventually add a simple way to mark the start of a new tbody (row group), and the other elements. I’m sorry that currently I can’t invest time in this, but thanks for the suggestions on how to proceed.
(In reply to C. Scott Ananian from comment #30)
> If someone would like
> to open a new wikipage proposing concrete "dead-simple" wikitext syntax for
> these, I'd be happy to re-evaluate. (But please make your proposal on a
> wikipage, so that this bugzilla isn't bloated out with endless bikeshedding
> over tweaks to the syntax.)
I note the correct place for such a proposal would be https://www.mediawiki.org/wiki/Requests_for_comment
@Brad -- yes, I thought about mentioning that, but reconsidered; I thought it would probably be more useful to stage a draft in some user's talk space (or similar) first and let people hack on it for a while, before making things formal and hoisting the text into the RfC namespace. I didn't want to discourage contributors by forcing the RfC template and formatting on them right away. But if you're not afraid of extra process and formatting and are feeling confident in your proposal, then sure, throw it directly into RfC space.