Last modified: 2013-01-02 16:31:26 UTC
If there is an <h2> tag on the page, action=parse doesn't return proper byte offsets for sections. For example, see http://uk.wikipedia.org/w/api.php?action=parse&prop=sections&oldid=11303632 The corresponding page being parsed is http://uk.wikipedia.org/w/index.php?oldid=11303632
Odd that "byteoffset" is actually the offset in Unicode codepoints, not bytes. The real problem is in includes/parser/Parser.php, method formatHeadings(). It pulls all the <h#> tags out of the parsed HTML, but uses the parsed-to-DOM representation of the original wikitext to calculate the byteoffset. That DOM representation doesn't include any structure for raw <h#> tags from the original wikitext, so when it looks for the DOM node for one of those it searches to the end of the wikitext without finding it, which also breaks the byteoffset for all subsequent headers. Roan, it looks like you added this back in 2009, any ideas here? Otherwise I'll just put together a patch that skips trying to calculate byteoffset when $sectionIndex === false.
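To illustrate the failure mode described above, here is a toy model in Python. It is not the code from Parser.php: the node tuples, heading_offsets(), the skip_unindexed flag and utf8_byte_offset() are all made up for this sketch, and None stands in for $sectionIndex === false. It only assumes what the comment says, i.e. a forward-only scan over DOM nodes that never contains a node for a raw <h#> tag.

```python
from typing import List, Optional, Tuple

# Hypothetical node shape: (kind, source text, section index).
# Only wikitext "== x ==" headings become "h" nodes; a raw <h2> stays
# inside an ordinary "text" node, so it has no "h" node of its own.
Node = Tuple[str, str, Optional[int]]


def heading_offsets(nodes: List[Node],
                    section_indexes: List[Optional[int]],
                    skip_unindexed: bool) -> List[Optional[int]]:
    """Return a codepoint offset (or None) for each heading found in the HTML.

    section_indexes lists the headings in HTML order; None marks a raw <h#>
    tag (the stand-in for $sectionIndex === false).
    """
    pos = 0      # forward-only cursor into nodes
    offset = 0   # codepoints consumed before nodes[pos]
    result: List[Optional[int]] = []
    for wanted in section_indexes:
        if wanted is None and skip_unindexed:
            # Proposed behaviour: don't search for a node that cannot exist.
            result.append(None)
            continue
        found = None
        while pos < len(nodes):
            kind, text, index = nodes[pos]
            if kind == "h" and index == wanted:
                found = offset
                break
            offset += len(text)  # len() of a str counts codepoints
            pos += 1
        result.append(found)
    return result


def utf8_byte_offset(text: str, codepoint_offset: int) -> int:
    """Convert a codepoint offset into a UTF-8 byte offset."""
    return len(text[:codepoint_offset].encode("utf-8"))


nodes: List[Node] = [
    ("text", "intro\n", None),
    ("h", "== A ==\n", 1),
    ("text", "<h2>Raw</h2>\nbody\n", None),
    ("h", "== B ==\n", 2),
]
in_html_order = [1, None, 2]

# Searching for the raw <h2> exhausts the cursor, so "B" is lost too.
print(heading_offsets(nodes, in_html_order, skip_unindexed=False))
# Skipping the raw <h2> leaves the cursor alone, so "B" is still found.
print(heading_offsets(nodes, in_html_order, skip_unindexed=True))
```

In this model, the scan without the skip runs off the end of the node list while looking for the raw <h2>, so the heading after it gets no offset either. With the skip, only the raw heading is left without an offset. The utf8_byte_offset() helper shows why the codepoint/byte distinction matters on a Cyrillic page: utf8_byte_offset("Вступ\n== A ==", 6) is 11, not 6, because each Cyrillic letter takes two bytes in UTF-8.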
Related: bug 25203
Not just related, it's a duplicate. *** This bug has been marked as a duplicate of bug 25203 ***