Last modified: 2014-10-14 21:41:28 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T68749, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 66749 - Parsoid outputs <figure> inside of <b>
Parsoid outputs <figure> inside of <b>
Status: RESOLVED FIXED
Product: Parsoid
Classification: Unclassified
General (Other open bugs)
unspecified
All All
: Unprioritized normal
: ---
Assigned To: C. Scott Ananian
:
Depends on:
Blocks: 54844
  Show dependency treegraph
 
Reported: 2014-06-17 22:21 UTC by Roan Kattouw
Modified: 2014-10-14 21:41 UTC (History)
6 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Roan Kattouw 2014-06-17 22:21:04 UTC
Wikitext: '''foo[[File:Image.jpg|thumb]]bar'''

Resulting DOM: <body><b><figure><a><img /></a></figure>bar</b>

Block nodes like figure aren't supposed to occur inside of inline-ish (annotation) tags like b. When confronted with this, VisualEditor alienates the image to preserve its sanity (data model consistency), so bolded images can't be edited in VE.

Reported by mattk_ on IRC.

$ echo "'''foo[[File:Image.jpg|thumb]]bar'''" | node tests/parse.js

<!DOCTYPE html>
<html prefix="dc: http://purl.org/dc/terms/ mw: http://mediawiki.org/rdf/"><head prefix="mwr: http://en.wikipedia.org/wiki/Special:Redirect/"><meta property="mw:parsoidVersion" content="0"/><link rel="dc:isVersionOf" href="//en.wikipedia.org/wiki/Main_Page"/><title></title><base href="//en.wikipedia.org/wiki/Main_Page"/><link rel="stylesheet" href="//en.wikipedia.org/w/load.php?modules=mediawiki.skinning.elements|mediawiki.skinning.content|mediawiki.skinning.interface|skins.vector.styles|site|mediawiki.skinning.content.parsoid&amp;only=styles&amp;debug=true&amp;skin=vector"/></head><body data-parsoid='{"dsr":[0,37,0,0]}' lang="en" class="mw-content-ltr mw-body-content" dir="ltr"><b data-parsoid='{"dsr":[0,36,3,3]}'>foo<figure class="mw-default-size" typeof="mw:Image/Thumb" data-parsoid='{"optList":[{"ck":"thumbnail","ak":"thumb"}],"dsr":[6,30,2,2]}'><a href="./File:Image.jpg" data-parsoid='{"a":{"href":"./File:Image.jpg"},"sa":{},"dsr":[8,28,null,null]}'><img resource="./File:Image.jpg" src="//upload.wikimedia.org/wikipedia/en/thumb/7/78/Image.jpg/220px-Image.jpg" height="30" width="220" data-parsoid='{"a":{"resource":"./File:Image.jpg","height":"30","width":"220"},"sa":{"resource":"File:Image.jpg"}}'/></a></figure>bar</b>
</body></html>
Comment 1 C. Scott Ananian 2014-06-17 22:26:20 UTC
Not sure this is a bug:

> div = document.createElement('div')
<div>​</div>​
> div.innerHTML="<b><figure></figure></b>"
"<b><figure></figure></b>"
> div.outerHTML
"<div><b><figure></figure></b></div>"
> div.innerHTML="<b><div></div></b>"
"<b><div></div></b>"
> div.outerHTML
"<div><b><div></div></b></div>"

Note that HTML5 is perfectly happy to let the block element live inside the <b> tag.
Comment 2 Gabriel Wicke 2014-06-17 22:28:55 UTC
At a minimum, we should not create output that has the figcaption inside a figure wrapped into a formatting element.

Yet that's what I seem to get with this input:

<p>'''foo[[File:Image.jpg|thumb|caption]]bar'''
Comment 3 C. Scott Ananian 2014-06-17 22:31:35 UTC
(Taking the bug, since if it ends up being a real bug with our image spec, I'll fix it in Parsoid.  It appears that, to the extent this is a real bug, it's a dup of bug 63642 though.)
Comment 4 Roan Kattouw 2014-06-17 22:44:17 UTC
What Gabriel points out in comment 2 is exactly the same as bug 63642. The DOM I get from <p>'''foo[[File:Image.jpg|thumb|caption]]bar''' looks like <p><b>foo</b></p><figure><b><a><img /></a><figcaption>caption</figcaption></b></figure><p><b>bar</b></p>

(In reply to C. Scott Ananian from comment #1)
> Not sure this is a bug:
> 
> > div = document.createElement('div')
> <div>​</div>​
> > div.innerHTML="<b><figure></figure></b>"
> "<b><figure></figure></b>"
> > div.outerHTML
> "<div><b><figure></figure></b></div>"
> > div.innerHTML="<b><div></div></b>"
> "<b><div></div></b>"
> > div.outerHTML
> "<div><b><div></div></b></div>"
> 
> Note that HTML5 is perfectly happy to let the block element live inside the
> <b> tag.
Yeah, HTML lets you do it, but I don't think it's very sensible, and VE doesn't like it. I also see no reason why the behavior should be radically different depending on whether Parsoid has decided that the text is in a paragraph or free text. In the paragraph case, the bold tag is interrupted (and reapplied inside the figure, but that's bug 63642; really it should be reapplied in the caption instead, or not at all) and the figure isn't bolded, but in the free text case the figure is bolded. Given that the distinction between free text and paragraph text is pretty arbitrary, I don't think this makes much sense.

Speaking of, this paragraph vs free text stuff is also pretty broken. The following wikitext:

'''foo[[File:Image.jpg|thumb|caption]]bar'''

'''foo[[File:Image.jpg|thumb|caption]]bar'''

produces a DOM that looks like  <body><b>foo<figure><a><img /></a></figure>bar</b>\n\n<b>foo<figure><a><img /></a></figure>bar</b></body> which is wrong: bar and foo will be on the same line but shouldn't be.
Comment 5 Gerrit Notification Bot 2014-09-25 03:56:58 UTC
Change 162819 had a related patch set uploaded by Subramanya Sastry:
(Bug 66749) Fix the test for DU.isGeneratedFigure

https://gerrit.wikimedia.org/r/162819
Comment 6 Gerrit Notification Bot 2014-09-25 23:36:45 UTC
Change 162819 merged by jenkins-bot:
(Bug 66749) Fix the test for DU.isGeneratedFigure

https://gerrit.wikimedia.org/r/162819
Comment 7 Roan Kattouw 2014-10-14 20:00:23 UTC
(In reply to Gerrit Notification Bot from comment #6)
> Change 162819 merged by jenkins-bot:
> (Bug 66749) Fix the test for DU.isGeneratedFigure
> 
> https://gerrit.wikimedia.org/r/162819

This makes it better, but the resulting HTML still violates the Parsoid HTML spec.

Old output: <body><b>foo<figure><a><img /></a></figure>bar</b></body>

New output: <body><p><b>foo</b></p><figure><a><b><img /></b></a></figure><p><b>bar</b></p></body>

At least the figure isn't bolded any more, but there's still a <b> tag between the <a> and the <img> whereas according to the spec, the structure should be <figure><a><img>
Comment 8 ssastry 2014-10-14 21:36:54 UTC
Roan: see below .. I don't see a <b> inside the <a> tag. Can you provide the wikitext where you are seeing the problem html?

[subbu@earth lib] echo "'''foo[[File:Foobar.jpg|thumb|caption]]bar'''" | node parse --normalize

<p><b>foo</b></p>
<figure><a href="File:Foobar.jpg"><img src="//upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg" height="26" width="220"/></a><figcaption><b>caption</b></figcaption></figure>
<p><b>bar</b></p>
Comment 9 Roan Kattouw 2014-10-14 21:41:28 UTC
(In reply to ssastry from comment #8)
> Roan: see below .. I don't see a <b> inside the <a> tag. Can you provide the
> wikitext where you are seeing the problem html?
> 
> [subbu@earth lib] echo "'''foo[[File:Foobar.jpg|thumb|caption]]bar'''" |
> node parse --normalize
> 
> <p><b>foo</b></p>
> <figure><a href="File:Foobar.jpg"><img
> src="//upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg" height="26"
> width="220"/></a><figcaption><b>caption</b></figcaption></figure>
> <p><b>bar</b></p>

Sorry, I'm blind. You're right, the output is correct now.

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links