Last modified: 2014-06-16 18:03:27 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T6740, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 4740 - thead, tbody, tfoot for wikitable syntax
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: Parser
Version: unspecified
Hardware: All All
Importance: Low enhancement with 8 votes
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: patch, patch-need-review
Duplicates: 3156
Depends on:
Blocks: semantic-html 16347
 
Reported: 2006-01-24 02:41 UTC by Michael Zajac
Modified: 2014-06-16 18:03 UTC
CC List: 16 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
a structural method to implement structural elements: tbody, thead and tfoot (13.13 KB, patch)
2008-10-12 15:35 UTC, bluehairedlawyer
a structural method to implement structural elements: tbody, thead and tfoot v2 (18.17 KB, patch)
2010-12-14 17:27 UTC, DieBuche
a structural method to implement structural elements: tbody, thead and tfoot v2 v2 (12.22 KB, patch)
2010-12-14 17:29 UTC, DieBuche
7912: a structural method to implement structural elements: tbody, thead and tfoot v (14.18 KB, patch)
2010-12-15 19:11 UTC, DieBuche

Description Michael Zajac 2006-01-24 02:41:52 UTC
Adding wikitable support for thead, tbody, and tfoot elements would be a harmless enhancement, allowing more sophisticated 
formatting of tables (both in pages' wikitext, and using style attributes or style sheets).  

Logically, each element would only need a start tag, and would be closed when the following element starts or the table ends (as 
"|-" serves for table rows).  Possible wikitable shortcuts:

thead:

|!    associated with table headers (but possibly confusing)
|^    analogous to GREP start of string
|<    analogous to XML/HTML tag opening
|[    opening bracket representing start

tbody:

|=    fatter version of table row
|[    enclosing bracket representing a block

tfoot:

|_    underscore=bottom line
|$    analogous to GREP end of string
|>    analogous to HTML/XML tag closing
|/    analogous to HTML/XML closing tag
|]    closing bracket representing ending

See also Bug 3156: Request not to filter <tbody> and </tbody> codes
Comment 1 Aryeh Gregor (not reading bugmail, please e-mail directly) 2007-01-03 22:00:43 UTC
Why not use heuristics?  Not ideal, but a great improvement over the current
situation.  Proposal:

1) Any sequence of rows falling at the end of a table and consisting entirely of
header cells is a <tfoot>.
2) Any other sequence of rows consisting entirely of header cells is a <thead>.
3) Any other sequence of rows is a <tbody>.

Thus you would get, e.g.

{|
|+ Metasyntactic variables
! Computing
|-
| Foo
|-
| Bar
|-
! English names
|-
| Jack
|-
| Jill
|}

=

<table><caption>Metasyntactic variables</caption>
<thead><tr><th>Computing</th></tr></thead>
<tbody>
<tr><td>Foo</td></tr>
<tr><td>Bar</td></tr>
</tbody>
<thead><tr><th>English names</th></tr></thead>
<tbody>
<tr><td>Jack</td></tr>
<tr><td>Jill</td></tr>
</tbody>
</table>

which I believe is correct.  Any counterexamples?
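
The three rules above can be sketched in a few lines of Python. This is only an illustration of the heuristic, not parser code: the row representation (a list of ("th" or "td", text) pairs) and the name group_rows are made up for the example, and the treatment of a table that consists solely of header rows (kept as <thead> here) is an assumption the proposal does not settle.

```python
from itertools import groupby


def is_header_row(row):
    """True if the row is non-empty and contains only header cells."""
    return bool(row) and all(kind == "th" for kind, _text in row)


def group_rows(rows):
    """Split rows into (section, rows) runs following the three rules.

    rows is a list of rows; each row is a list of (kind, text) pairs,
    where kind is "th" or "td". This layout is hypothetical.
    """
    # Rules 2 and 3: consecutive header-only rows form a thead,
    # any other run of rows forms a tbody.
    groups = [
        ("thead" if header else "tbody", list(run))
        for header, run in groupby(rows, key=is_header_row)
    ]
    # Rule 1: a header-only run at the very end of the table is a tfoot.
    # Assumption: if that run is also the only run, it stays a thead.
    if len(groups) > 1 and groups[-1][0] == "thead":
        groups[-1] = ("tfoot", groups[-1][1])
    return groups


example = [
    [("th", "Computing")],
    [("td", "Foo")],
    [("td", "Bar")],
    [("th", "English names")],
    [("td", "Jack")],
    [("td", "Jill")],
]
sections = [section for section, _rows in group_rows(example)]
```

For the example table, sections comes out as thead, tbody, thead, tbody, which matches the HTML shown above.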
Comment 2 Edward Z. Yang 2007-01-03 22:03:29 UTC
The only trouble is if the heuristic turns out to be wrong. Unlikely, but
possible, and if you don't offer any way around it there will be problems.
Comment 3 Aryeh Gregor (not reading bugmail, please e-mail directly) 2007-01-03 22:09:16 UTC
Still better than the current setup, and it doesn't complicate wikimarkup (which
I think is why this isn't enabled).
Comment 4 Forest 2007-01-22 03:20:47 UTC
sorttable.js uses thead and tfoot to know what portions of a table to not sort.
Allowing the use of thead and tfoot would make that table sorting script much
easier to integrate with complicated tables.

Comment 5 Paul Robinson 2007-04-25 19:58:45 UTC
My original suggestion was to pass thead and /thead through. If you can't pass them, can
you at least not display them on the output page, in effect ignoring them?
Comment 6 Paul Robinson 2007-04-25 20:01:32 UTC
One way would be to translate them with <!-- before and --> after the <thead>,
<tbody> etc. and </> closures, or have some way to mark them as non-displayed,
so that while they are ignored for functionality, they don't show up on the
rendered page.
Comment 7 bluehairedlawyer 2008-09-28 13:58:48 UTC
I'm currently working on an implementation of this bug as per Aryeh's comments above. I have encountered considerable problems implementing his suggestion that a line of header cells at the end of a table be a tfoot. The problem is that when the program encounters:

! some !! header !! cells

it outputs:

<tr><th> some </th><th> header </th><th> cells

We would then only find out subsequently whether it was actually a footer. We could perform a simple search and replace, but that would be greatly complicated by the possibility of embedded tables within the footer cells. As far as I can see, implementing full heuristics would require an almost full rewrite. Or something like:

{|
|+ Metasyntactic variables
! Computing
|-
| Foo
|-
| Bar
|-
! English names
|-
| Jack
|-
| Jill
|=
| Footer
|}
Comment 8 bluehairedlawyer 2008-10-12 15:35:24 UTC
Created attachment 5421
a structural method to implement structural elements: tbody, thead and tfoot

Ignore my previous comments. I've now substantially rewritten the doTableStuff() function, separating the part that reads the wiki syntax from the part that outputs the HTML. doTableStuff() now collects information about the table into an array, which a new function, printTableHtml(), converts into HTML.
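
The shape of that split can be sketched as follows. This is not the code from the attached patch (which is PHP); it is a minimal Python illustration of the same two-phase idea. The names collect_table and print_table_html are hypothetical stand-ins for doTableStuff() and printTableHtml(), and only a small subset of wikitable syntax is handled (no attributes, no nested tables, no multi-line cells).

```python
import html


def collect_table(wikitext):
    """Phase 1: read a simplified wikitable into a plain data structure."""
    table = {"caption": None, "rows": []}
    row = None  # the row currently being filled, or None between rows
    for line in wikitext.splitlines():
        line = line.strip()
        if not line or line.startswith("{|") or line.startswith("|}"):
            continue
        if line.startswith("|+"):
            table["caption"] = line[2:].strip()
        elif line.startswith("|-"):
            row = None  # the next cell line starts a new row
        elif line[0] in "!|":
            kind, sep = ("th", "!!") if line[0] == "!" else ("td", "||")
            if row is None:
                row = []
                table["rows"].append(row)
            row.extend((kind, cell.strip()) for cell in line[1:].split(sep))
    return table


def print_table_html(table):
    """Phase 2: turn the collected structure into HTML."""
    out = ["<table>"]
    if table["caption"] is not None:
        out.append("<caption>%s</caption>" % html.escape(table["caption"]))
    for row in table["rows"]:
        cells = "".join(
            "<%s>%s</%s>" % (kind, html.escape(text), kind) for kind, text in row
        )
        out.append("<tr>%s</tr>" % cells)
    out.append("</table>")
    return "\n".join(out)


source = "{|\n|+ Cap\n! A !! B\n|-\n| 1 || 2\n|}"
collected = collect_table(source)
rendered = print_table_html(collected)
```

Because the second phase sees the whole table before it emits anything, a step that assigns rows to thead, tbody or tfoot could sit between the two calls. The cost is that the whole table is held in memory at once.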
Comment 9 bluehairedlawyer 2008-10-12 15:45:55 UTC
I forgot to mention the patch includes changes to wikibits.js, which didn't appear to support tbody, thead or tfoot elements after all. The changes make sortable tables work in Safari v3 and Firefox v3. It needs to be tested in IE6 and other browsers.

Btw...

{|
! header
|}

{|
! header
|-
| content
|-
! footer
|}

but...

{|
! header
|-
| content
|-
! header
|-
|}

I did this on purpose just in case people wanted to have headers at the bottom of their tables. It can be changed!
Comment 10 Nicolas Dumazet 2008-10-19 14:49:47 UTC
keywords : Patch, need-review
Comment 11 Andy Mabbett 2010-04-17 22:01:42 UTC
I support the proposal to add these three elements; their availability, with class attributes, will greatly facilitate the use of microformats.
Comment 12 DieBuche 2010-12-10 22:01:03 UTC
*** Bug 3156 has been marked as a duplicate of this bug. ***
Comment 13 DieBuche 2010-12-14 17:27:27 UTC
Created attachment 7911
a structural method to implement structural elements: tbody, thead and tfoot  v2

Updated patch to apply cleanly to trunk.
Fails heaps of parser tests, fixing that now
Comment 14 DieBuche 2010-12-14 17:29:09 UTC
Created attachment 7912
a structural method to implement structural elements: tbody, thead and tfoot  v2 v2

last patch contained unrelated changes
Comment 15 DieBuche 2010-12-15 19:11:45 UTC
Created attachment 7915
7912: a structural method to implement structural elements: tbody, thead and tfoot v

This one passes all parser tests (except those which get upset by the new <tbody>). The new HTML tags are whitelisted now as well.
This patch would enable us to migrate to a better tablesorter script, which would fix a lot of the open table sorting bugs.
Comment 16 Nux 2011-02-09 09:23:51 UTC
I think it would be nice to have a new syntax for tfoot and thead rather than (only) hack around the current one.

Parse with first row in thead:
{|
|+ Title
|-
! Head cell !! Head cell
|-
| Normal cell || Normal cell
|-
| Normal cell || Normal cell
|}

Parse without thead:
{|
|+ Title
|-
! Head cell
| Normal cell
|-
| Normal cell || Normal cell
|-
| Normal cell || Normal cell
|}

Parse rows with "|!-" moved to thead (only if in concurrent rows). Parse rows with "|>-" moved to tfoot (only if in concurrent rows).
{|
|+ Title
|!-
! Head cell !! Head cell
|!-
! Head cell !! Head cell
|-
| Normal cell || Normal cell
|-
! Head cell not in thead
| Normal cell
|-
| Normal cell || Normal cell
|>-
| Footer cell || Footer cell
|}
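
A rough sketch of how the explicit row markers could drive the output, written in Python purely for illustration. It handles only the "|!-", "|>-" and "|-" markers from the last example, not the implicit first-row rule; the function name render_sections and the marker table are hypothetical. It also does not enforce the "only if in concurrent rows" condition, so two separated runs of "|!-" rows would produce two thead elements.

```python
import html

# Hypothetical mapping from row marker to the section it opens.
SECTION_FOR_MARKER = {"|!-": "thead", "|>-": "tfoot", "|-": "tbody"}


def render_sections(wikitext):
    """Render a simplified wikitable in one pass, one line at a time."""
    out = []
    section = None    # section element currently open, if any
    row_open = False  # whether a <tr> is currently open

    def start_row(new_section):
        nonlocal section, row_open
        if row_open:
            out.append("</tr>")
        if new_section != section:
            if section is not None:
                out.append("</%s>" % section)
            out.append("<%s>" % new_section)
            section = new_section
        out.append("<tr>")
        row_open = True

    for line in wikitext.splitlines():
        line = line.strip()
        if line.startswith("{|"):
            out.append("<table>")
        elif line.startswith("|}"):
            if row_open:
                out.append("</tr>")
            if section is not None:
                out.append("</%s>" % section)
            out.append("</table>")
            section, row_open = None, False
        elif line.startswith("|+"):
            out.append("<caption>%s</caption>" % html.escape(line[2:].strip()))
        elif line.startswith(("|!-", "|>-", "|-")):
            marker = line[:3] if line[:3] in SECTION_FOR_MARKER else "|-"
            start_row(SECTION_FOR_MARKER[marker])
        elif line[:1] in ("!", "|"):
            if not row_open:
                start_row("tbody")  # cells before any row marker
            kind, sep = ("th", "!!") if line[0] == "!" else ("td", "||")
            for cell in line[1:].split(sep):
                out.append("<%s>%s</%s>" % (kind, html.escape(cell.strip()), kind))
    return "".join(out)


sample = "{|\n|!-\n! H1 !! H2\n|-\n| a || b\n|>-\n| f1 || f2\n|}"
page = render_sections(sample)
```

The only state carried between lines is the open section and whether a row is open, so nothing about earlier rows needs to be kept around.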
Comment 17 DieBuche 2011-04-13 19:50:57 UTC
Nux: I'd say that it's better to do it on the existing syntax, since I can't see the use case of having a row that looks like a thead but structurally isn't. 

Fixed in r85922
Comment 18 Helder 2012-09-15 17:42:25 UTC
<tbody> is not working here:
https://en.wikipedia.org/w/index.php?title=Wikipedia:Sandbox&oldid=512695930
Comment 19 Helder 2012-09-15 17:44:15 UTC
Hmm... r85922 was reverted in r97145 and
we do not have any of <thead>, <tbody> or <tfoot> at
https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/core.git;a=blame;f=includes/Sanitizer.php;hb=893b41431c46785856b84ca91810f905c21b6831#l355
Comment 20 Michael Zajac 2013-03-10 21:47:36 UTC
So when the patch is implemented, what syntax would I use to divide a table into two or more row groups using tbody elements? This is not clear from the descriptions above.
Comment 21 Andy Mabbett 2014-06-12 20:31:22 UTC
Can we get an update, please?
Comment 22 Andre Klapper 2014-06-12 22:07:06 UTC
Nobody is currently working on this.
Comment 23 Gabriel Wicke 2014-06-13 21:16:47 UTC
I think this proposal needs a clearer description of use cases, and why those use cases justify the complexity costs in:

* the wikitext user interface

* the VisualEditor user interface

* Parsoid

As an example, how would this be sensibly presented in VE?
Comment 24 Andy Mabbett 2014-06-14 11:10:01 UTC
Gabriel: The use case is outlined in Michael Zajac's initial post (timestamp: 2006-01-24 02:41:52); and in comments 4 & 11. Do you have questions about those?

It appeared from comment 15 that this was resolved four years ago; no reason for its reversion has been given here.
Comment 25 Andy Mabbett 2014-06-14 11:19:33 UTC
Also, the heuristic suggested above won't work, as it's necessary to allow for more than one tbody per table.
Comment 26 Gabriel Wicke 2014-06-15 19:17:37 UTC
(In reply to Andy Mabbett from comment #24)
> Gabriel: The use case is outlined in Michael Zajac's initial post
> (timestamp:  2006-01-24 02:41:52); and in comments 4 & 11. Do you have
> questions about those? 

What I see there is

1) allows for more sophisticated formatting (comment 1)
2) sorttables not sorting thead / tfoot (comment 4)
3) facilitation of microformats (comment 11)

Are 1) and 2) actually still issues? To me it sounds like 2) would only be an issue with a footer, which is relatively rare. Otherwise, detecting a row with <th> elements should not be hard in a script.

3) Is rather nebulous given that you can just as well attach classes to trs.

What I am asking for is a clear use case: I want to do X, it's not possible because of Y, and it will be possible once thead / tbody / tfoot are supported. This is worth the costs because of Z.
Comment 27 C. Scott Ananian 2014-06-16 15:33:06 UTC
A related use case: allows Parsoid to handle arbitrary table markup in WTS phase.

Although my proposal (for the record) would be *not* to add new pipes-and-punctuation markup for <thead>, <tfoot> etc., but instead to just allow them to be generated by literal HTML embedded in wikitext, e.g. https://en.wikipedia.org/wiki/Help:Table#Other_table_syntax

Once your table is sufficiently complicated, it's probably best to use literal HTML, IMO.  But we still need to permit thead/tfoot/colgroup etc in literal HTML within wikitext.
Comment 28 Michael Zajac 2014-06-16 16:23:21 UTC
(In reply to Gabriel Wicke from comment #26)
> 1) allows for more sophisticated formatting (comment 1)

The main reason I requested this is the ability for an editor to create multiple row groups by adding multiple tbody elements in a table. This would allow grouping data in tables, make these groups accessible to assistive devices like screen readers, allow visual formatting of the groups with CSS (other than redundant inline CSS), and allow behaviours like collapsing groups.

The solution in comment 1 simply automates adding a whole-table tbody element, and does not satisfy the requirement (the HTML DOM implicitly includes a full-table tbody anyway, so this solution is redundant.)

Some use-case examples that would benefit from this:

* Thousands of list articles that break up lists into separate tables, often set off with article sections. These would benefit by automatically having columns visually aligned, and in accessibility. Examples: 

  http://en.wikipedia.org/wiki/List_of_Jim_Rockford's_answering_machine_gags
  https://en.wikipedia.org/wiki/List_of_Canadian_provincial_and_territorial_symbols
  https://en.wikipedia.org/wiki/List_of_field_guns

* Similarly, data tables and infoboxes that set off groups using th elements, with inline CSS for visual formatting, with rows and cells containing only hr elements, or multiple methods:

  https://en.wikipedia.org/wiki/Ukrainian_alphabet#Unicode
  https://en.wikipedia.org/wiki/Romanization_of_Ukrainian#Tables_of_romanization_systems
  https://en.wikipedia.org/wiki/Template:Infobox_weapon

* Navboxes that use nested tables to create groups:

  https://en.wikipedia.org/wiki/Template:WWIISovietAFVs

* Wiktionaries have thousands of simple and complex inflection tables that need grouping:

  https://en.wiktionary.org/wiki/слушать#Conjugation
  https://en.wiktionary.org/wiki/анулирам#Conjugation
Comment 29 Michael Zajac 2014-06-16 16:39:16 UTC
(In reply to C. Scott Ananian from comment #27)
> Once your table is sufficiently complicated, it's probably best to use
> literal HTML, IMO.

But grouping table rows is a very simple concept. 

There is high demand. Editors are already attempting to do this in tens of thousands of tables using complex, inconsistent, inaccessible, inadequate, and inappropriate hacks (rows of table headers, horizontal rules, inline CSS, nested tables, etc.). 

It should be possible to accomplish this with dead-simple wikitext, and visually format it consistently and automatically in standard style sheets.
Comment 30 C. Scott Ananian 2014-06-16 16:54:02 UTC
@Michael Zajac: nothing related to table parsing in wikitext is simple, unfortunately.  So I'm suggesting to concentrate on *making it possible*, and let the template authors and/or VE, etc, worry about making it "dead-simple".

But I'm open to suggestions.  The HTML elements not currently supported in wikitext are thead, tbody, tfoot, colgroup, and col.  If someone would like to open a new wikipage proposing concrete "dead-simple" wikitext syntax for these, I'd be happy to re-evaluate.  (But please make your proposal on a wikipage, so that this bugzilla isn't bloated out with endless bikeshedding over tweaks to the syntax.)

Note that the original patch was reverted (as I understand the history of this bug) because the implementation constructed an entire in-memory model of the table during processing.  Wikipedia tables can be *huge*.  So any syntax proposal must be parseable without buffering and using as little table context information as possible.  Similarly, you should be prepared to demonstrate (using greps over a wikipedia dump, or similar) that the proposed syntax does not break any existing table markup.
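
To illustrate the kind of check meant by that last point, here is a small Python sketch that looks for table lines already beginning with a candidate marker. Everything in it is an assumption for the sake of the example: the function name find_marker_conflicts is made up, "|=" is simply one of the markers floated earlier in this bug, and a real check would stream page text out of a dump rather than take a single string.

```python
def find_marker_conflicts(wikitext, marker):
    """Return (line_number, line) pairs for lines inside a table
    that already start with the candidate marker."""
    depth = 0  # nesting depth of {| ... |} blocks
    hits = []
    for number, raw in enumerate(wikitext.splitlines(), start=1):
        line = raw.lstrip()
        if line.startswith("{|"):
            depth += 1
        elif line.startswith("|}"):
            depth = max(depth - 1, 0)
        elif depth > 0 and line.startswith(marker):
            hits.append((number, raw))
    return hits


page_text = "{|\n|-\n|=\n| x\n|}\n|= outside any table"
conflicts = find_marker_conflicts(page_text, "|=")
```

Each hit is a place where existing markup would change meaning under the new syntax. The scan itself keeps only a nesting counter, so it can run line by line over arbitrarily large input.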
Comment 31 Michael Zajac 2014-06-16 17:12:11 UTC
@C. Scott Ananian I do appreciate that the parsing and programming are likely very complex. And also that whitelisting the HTML is a good improvement and probably a step towards creating a wikitext syntax for these elements.

But wikitable syntax is fairly simple for editors to use, and I hope that these efforts can eventually add a simple way to mark the start of a new tbody (row group), and the other elements. I’m sorry that currently I can’t invest time in this, but thanks for the suggestions on how to proceed.
Comment 32 Brad Jorsch 2014-06-16 17:13:30 UTC
(In reply to C. Scott Ananian from comment #30)
> If someone would like
> to open a new wikipage proposing concrete "dead-simple" wikitext syntax for
> these, I'd be happy to re-evaluate.  (But please make your proposal on a
> wikipage, so that this bugzilla isn't bloated out with endless bikeshedding
> over tweaks to the syntax.)

I note the correct place for such a proposal would be https://www.mediawiki.org/wiki/Requests_for_comment
Comment 33 C. Scott Ananian 2014-06-16 18:03:27 UTC
@Brad -- yes, I thought about mentioning that, but reconsidered; I thought it would probably be more useful to stage a draft in some user's talk space (or similar) first and let people hack on it for a while, before making things formal and hoisting the text into the RfC namespace.  I didn't want to discourage contributors by forcing the RfC template and formatting on them right away.

But: if you're not afraid of extra process and formatting and are feeling confident in your proposal, then sure, throw it directly into RfC space.
