Last modified: 2013-09-06 19:05:44 UTC
The spec says that captions in inline images should be encapsulated in a 'data-mw' attribute. Related: bug 49942 (tables in image captions), bug 48958 (missing alt), bug 48924 (empty alt captions).
The caption should be parsed to a DOM separately to ensure proper nesting. The DOMFragment mechanism might be helpful, although it might not work directly inside figure content (I have not checked this).
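A minimal sketch of the idea, not Parsoid code: once the caption has been parsed and serialized on its own, its HTML can be stored as a JSON string inside the 'data-mw' attribute, so markup in the caption (a table, for example) cannot break the nesting of the surrounding inline content. The function name `inline_image_html`, the `{"caption": ...}` JSON shape, and the `typeof="mw:Image"` wrapper are assumptions made for illustration; the spec is the authority on the actual format.

```python
import html
import json


def inline_image_html(src: str, caption_html: str) -> str:
    """Wrap an inline image, carrying the caption inside data-mw.

    Hypothetical helper: the {"caption": ...} shape and the
    typeof value are assumptions, not taken from the spec.
    """
    # The caption is assumed to have been parsed and serialized
    # separately, so caption_html is already well-formed on its own.
    data_mw = json.dumps({"caption": caption_html})
    # Escape for use as an attribute value, so that tags and quotes in
    # the caption never appear as raw markup in the output.
    attr = html.escape(data_mw, quote=True)
    src_attr = html.escape(src, quote=True)
    return (
        f'<span typeof="mw:Image" data-mw="{attr}">'
        f'<img src="{src_attr}"/></span>'
    )


if __name__ == "__main__":
    print(inline_image_html("Foo.jpg", "<table><tr><td>x</td></tr></table>"))
```

A consumer would reverse the two steps (unescape the attribute value, then decode the JSON) to get the caption HTML back unchanged.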
Subbu is working on a fragment parser patch that should help here.
*** This bug has been marked as a duplicate of bug 48958 ***