Last modified: 2014-07-09 18:14:44 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T50958, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 48958 - Images: parse caption separately all the way to DOM and use DOMFragment encapsulation
Status: PATCH_TO_REVIEW
Product: Parsoid
Classification: Unclassified
Component: General
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: C. Scott Ananian
Duplicates: 52567
Depends on:
Blocks: 54844
 
Reported: 2013-05-30 00:40 UTC by Inez Korczyński
Modified: 2014-07-09 18:14 UTC (History)
CC List: 6 users

Description Inez Korczyński 2013-05-30 00:40:37 UTC
The image tag produced by this wikitext: [[File:Wiki.png|caption<b>123</b>]] should have its alt attribute set to "caption123".

It would be great for the VE team if we could actually get the HTML DOM of the caption (instead of text with the HTML tags stripped out), because then we would be able to provide a nice experience when converting from an inline image to a block image and the other way round.
Comment 1 Gabriel Wicke 2013-05-30 07:38:09 UTC
Our plan so far has been to only set the alt if it was explicitly specified using the alt=foo option. The caption HTML DOM will be in the data-mw.caption member. This lets VE set the alt and caption separately, while client-side or server-side postprocessing can set the default alt to the textValue of the caption when rendering for viewing.
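A minimal sketch of that fallback, for illustration only. It assumes the caption is available as an HTML string under a "caption" key in data-mw; the function names caption_text and default_alt are hypothetical and are not Parsoid or VE APIs.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects character data and ignores all tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def caption_text(caption_html):
    """Return the text content of a caption given as an HTML string."""
    collector = _TextCollector()
    collector.feed(caption_html)
    collector.close()  # flushes any text still buffered by the parser
    return "".join(collector.parts)


def default_alt(data_mw, explicit_alt=None):
    """Hypothetical post-processing step: an explicit alt wins,
    otherwise fall back to the caption's text, otherwise no alt at all."""
    if explicit_alt is not None:
        return explicit_alt
    caption = data_mw.get("caption")
    return caption_text(caption) if caption else None


print(default_alt({"caption": "caption<b>123</b>"}))  # caption123
```

Because the alt and the caption are kept apart, an editor can change either one without touching the other, and the fallback only runs when rendering for viewing.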

In the current implementation, parsing the caption to DOM is completely missing for inline images, and is only implemented in an implicit way for block-level images (by returning tokens for the figcaption content inline).

We should use a full pipeline to parse captions all the way to DOM, and then use the internal DOMFragment mechanism to preserve these fragments through token transformations. They are then unpacked at the end of DOMPostProcessor processing. As a side effect this will also properly enforce nesting of captions without the hackish closeUnclosedBlockTags helper.
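The encapsulation idea can be sketched roughly as follows. This is not Parsoid's DOMFragment code: FragmentStore, parse_caption, uppercase_text and serialize are hypothetical names, the "nested pipeline" only handles ''italic'' markup, and the fragment is held as an HTML string rather than a real DOM.

```python
import itertools
import re


class FragmentStore:
    """Holds fully parsed fragments while tokens carry only an opaque id."""

    def __init__(self):
        self._fragments = {}
        self._ids = itertools.count(1)

    def encapsulate(self, html):
        key = f"mwf{next(self._ids)}"
        self._fragments[key] = html
        return {"type": "fragment", "id": key}

    def unpack(self, token):
        return self._fragments[token["id"]]


def parse_caption(wikitext):
    """Stand-in for a full nested parse; only handles ''italic'' here."""
    return re.sub(r"''(.+?)''", r"<i>\1</i>", wikitext)


def uppercase_text(tokens):
    """Example token transform: rewrites text tokens, passes fragments through."""
    return [
        {"type": "text", "value": t["value"].upper()} if t["type"] == "text" else t
        for t in tokens
    ]


def serialize(tokens, store):
    """Late stage: replace each placeholder with its stored fragment."""
    return "".join(
        store.unpack(t) if t["type"] == "fragment" else t["value"] for t in tokens
    )


store = FragmentStore()
tokens = [
    {"type": "text", "value": "see "},
    store.encapsulate(parse_caption("a ''nice'' logo")),
]
result = serialize(uppercase_text(tokens), store)
print(result)  # SEE a <i>nice</i> logo
```

The transform in the middle never sees the caption's contents, which is the point: the already-parsed fragment cannot be re-tokenized or broken apart before it is unpacked.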
Comment 2 ssastry 2013-05-30 19:27:11 UTC
Sounds reasonable to me.  Now that we have implemented properly-nested-DOM requirement on some wikitext constructs, we should be able to use that for image captions as well.
Comment 3 Andre Klapper 2013-07-04 10:33:25 UTC
[Parsoid component reorg by merging JS/General and General. See bug 50685 for more information. Filter bugmail on this comment. parsoidreorg20130704]
Comment 4 Gabriel Wicke 2013-08-21 17:33:01 UTC
On a related note, we should *not* set an alt attribute for read-only viewing that just contains an image's file name. The alt attribute should only contain proper alternate descriptions of the image that would be useful for users with screen readers.
Comment 5 C. Scott Ananian 2013-09-06 19:03:26 UTC
I think this is a dup of bug 52567, and should probably be resolved as such.
Comment 6 Gabriel Wicke 2013-09-06 19:05:44 UTC
*** Bug 52567 has been marked as a duplicate of this bug. ***
Comment 7 Gabriel Wicke 2013-09-06 19:09:40 UTC
Some notes re accessibility from a recent meeting with Gerardo Capiel:

* Long-term we should try to store the alt text or long description along with the image itself, and use that to populate the alt attribute if none was provided explicitly. This requires significant work in core to store metadata along with image pages.

* For screenreaders it would be good to also provide a longdesc attribute linking to the textual description on the image page.
Comment 8 C. Scott Ananian 2013-12-23 15:05:55 UTC
Open a separate bug for the longdesc issue?

Focusing this bug on the "parse caption to DOM" issue -- test case is:

[[File:Wiki.png|caption<b>123]]

which should really have the </b> in the data-mw attribute.
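To illustrate what "having the </b>" means for this test case, here is a small sketch that balances a caption on its own before storing it. It treats the caption as HTML-like text and uses Python's standard html.parser; the balance function and the data_mw layout are assumptions for illustration, not Parsoid's implementation.

```python
import json
from html import escape
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "wbr"}  # elements that never take an end tag


class _Balancer(HTMLParser):
    """Re-emits the input while tracking which elements are still open."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        self.out.append(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray end tag: drop it
        # Close everything opened since the matching start tag.
        while True:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def balance(caption_html):
    """Return caption_html with every unclosed element closed at the end."""
    balancer = _Balancer()
    balancer.feed(caption_html)
    balancer.close()  # flushes trailing text such as "123"
    while balancer.open_tags:
        balancer.out.append(f"</{balancer.open_tags.pop()}>")
    return "".join(balancer.out)


data_mw = {"caption": balance("caption<b>123")}
print(json.dumps(data_mw))
```

Since the caption is balanced in isolation, the unclosed <b> cannot leak into the content that follows the image.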
Comment 9 ssastry 2014-02-19 23:09:25 UTC
https://bugzilla.wikimedia.org/show_bug.cgi?id=61566 is the longdesc bug
Comment 10 C. Scott Ananian 2014-06-18 20:25:48 UTC
This appears to have regressed. The data-mw.caption value is wikitext again. It should be Parsoid DOM.
Comment 11 Gerrit Notification Bot 2014-07-09 18:14:41 UTC
Change 145029 had a related patch set uploaded by Cscott:
WIP: Parse caption to DOM using recursive wikitext parse.

https://gerrit.wikimedia.org/r/145029
