Last modified: 2014-10-14 21:41:28 UTC
Wikitext: '''foo[[File:Image.jpg|thumb]]bar''' Resulting DOM: <body><b><figure><a><img /></a></figure>bar</b> Block nodes like figure aren't supposed to occur inside of inline-ish (annotation) tags like b. When confronted with this, VisualEditor alienates the image to preserve its sanity (data model consistency), so bolded images can't be edited in VE. Reported by mattk_ on IRC. $ echo "'''foo[[File:Image.jpg|thumb]]bar'''" | node tests/parse.js <!DOCTYPE html> <html prefix="dc: http://purl.org/dc/terms/ mw: http://mediawiki.org/rdf/"><head prefix="mwr: http://en.wikipedia.org/wiki/Special:Redirect/"><meta property="mw:parsoidVersion" content="0"/><link rel="dc:isVersionOf" href="//en.wikipedia.org/wiki/Main_Page"/><title></title><base href="//en.wikipedia.org/wiki/Main_Page"/><link rel="stylesheet" href="//en.wikipedia.org/w/load.php?modules=mediawiki.skinning.elements|mediawiki.skinning.content|mediawiki.skinning.interface|skins.vector.styles|site|mediawiki.skinning.content.parsoid&only=styles&debug=true&skin=vector"/></head><body data-parsoid='{"dsr":[0,37,0,0]}' lang="en" class="mw-content-ltr mw-body-content" dir="ltr"><b data-parsoid='{"dsr":[0,36,3,3]}'>foo<figure class="mw-default-size" typeof="mw:Image/Thumb" data-parsoid='{"optList":[{"ck":"thumbnail","ak":"thumb"}],"dsr":[6,30,2,2]}'><a href="./File:Image.jpg" data-parsoid='{"a":{"href":"./File:Image.jpg"},"sa":{},"dsr":[8,28,null,null]}'><img resource="./File:Image.jpg" src="//upload.wikimedia.org/wikipedia/en/thumb/7/78/Image.jpg/220px-Image.jpg" height="30" width="220" data-parsoid='{"a":{"resource":"./File:Image.jpg","height":"30","width":"220"},"sa":{"resource":"File:Image.jpg"}}'/></a></figure>bar</b> </body></html>
Not sure this is a bug: > div = document.createElement('div') <div></div> > div.innerHTML="<b><figure></figure></b>" "<b><figure></figure></b>" > div.outerHTML "<div><b><figure></figure></b></div>" > div.innerHTML="<b><div></div></b>" "<b><div></div></b>" > div.outerHTML "<div><b><div></div></b></div>" Note that HTML5 is perfectly happy to let the block element live inside the <b> tag.
At a minimum, we should not create output that has the figcaption inside a figure wrapped into a formatting element. Yet that's what I seem to get with this input: <p>'''foo[[File:Image.jpg|thumb|caption]]bar'''
(Taking the bug, since if it ends up being a real bug with our image spec, I'll fix it in Parsoid. It appears that, to the extent this is a real bug, it's a dup of bug 63642 though.)
What Gabriel points out in comment 2 is exactly the same as bug 63642. The DOM I get from <p>'''foo[[File:Image.jpg|thumb|caption]]bar''' looks like <p><b>foo</b></p><figure><b><a><img /></a><figcaption>caption</figcaption></b></figure><p><b>bar</b></p> (In reply to C. Scott Ananian from comment #1) > Not sure this is a bug: > > > div = document.createElement('div') > <div></div> > > div.innerHTML="<b><figure></figure></b>" > "<b><figure></figure></b>" > > div.outerHTML > "<div><b><figure></figure></b></div>" > > div.innerHTML="<b><div></div></b>" > "<b><div></div></b>" > > div.outerHTML > "<div><b><div></div></b></div>" > > Note that HTML5 is perfectly happy to let the block element live inside the > <b> tag. Yeah, HTML lets you do it, but I don't think it's very sensible, and VE doesn't like it. I also see no reason why the behavior should be radically different depending on whether Parsoid has decided that the text is in a paragraph or free text. In the paragraph case, the bold tag is interrupted (and reapplied inside the figure, but that's bug 63642; really it should be reapplied in the caption instead, or not at all) and the figure isn't bolded, but in the free text case the figure is bolded. Given that the distinction between free text and paragraph text is pretty arbitrary, I don't think this makes much sense. Speaking of, this paragraph vs free text stuff is also pretty broken. The following wikitext: '''foo[[File:Image.jpg|thumb|caption]]bar''' '''foo[[File:Image.jpg|thumb|caption]]bar''' produces a DOM that looks like <body><b>foo<figure><a><img /></a></figure>bar</b>\n\n<b>foo<figure><a><img /></a></figure>bar</b></body> which is wrong: bar and foo will be on the same line but shouldn't be.
Change 162819 had a related patch set uploaded by Subramanya Sastry: (Bug 66749) Fix the test for DU.isGeneratedFigure https://gerrit.wikimedia.org/r/162819
Change 162819 merged by jenkins-bot: (Bug 66749) Fix the test for DU.isGeneratedFigure https://gerrit.wikimedia.org/r/162819
(In reply to Gerrit Notification Bot from comment #6) > Change 162819 merged by jenkins-bot: > (Bug 66749) Fix the test for DU.isGeneratedFigure > > https://gerrit.wikimedia.org/r/162819 This makes it better, but the resulting HTML still violates the Parsoid HTML spec. Old output: <body><b>foo<figure><a><img /></a></figure>bar</b></body> New output: <body><p><b>foo</b></p><figure><a><b><img /></b></a></figure><p><b>bar</b></p></body> At least the figure isn't bolded any more, but there's still a <b> tag between the <a> and the <img> whereas according to the spec, the structure should be <figure><a><img>
Roan: see below .. I don't see a <b> inside the <a> tag. Can you provide the wikitext where you are seeing the problem html? [subbu@earth lib] echo "'''foo[[File:Foobar.jpg|thumb|caption]]bar'''" | node parse --normalize <p><b>foo</b></p> <figure><a href="File:Foobar.jpg"><img src="//upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg" height="26" width="220"/></a><figcaption><b>caption</b></figcaption></figure> <p><b>bar</b></p>
(In reply to ssastry from comment #8) > Roan: see below .. I don't see a <b> inside the <a> tag. Can you provide the > wikitext where you are seeing the problem html? > > [subbu@earth lib] echo "'''foo[[File:Foobar.jpg|thumb|caption]]bar'''" | > node parse --normalize > > <p><b>foo</b></p> > <figure><a href="File:Foobar.jpg"><img > src="//upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg" height="26" > width="220"/></a><figcaption><b>caption</b></figcaption></figure> > <p><b>bar</b></p> Sorry, I'm blind. You're right, the output is correct now.