
Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links may be broken. See T15260, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 13260 - post expand size counted multiple times for nested transclusions
Status: RESOLVED LATER
Product: MediaWiki
Classification: Unclassified
Component: Parser
Version: 1.11.x
Hardware: All
OS: All
Importance: Lowest major (9 votes)
Target Milestone: ---
Assigned To: Tim Starling
Keywords: newparser
Depends on:
Blocks:
Reported: 2008-03-06 01:42 UTC by CBM
Modified: 2014-03-24 11:03 UTC
CC: 13 users

See Also:
Web browser: ---
Mobile Platform: ---


Attachments

Description CBM 2008-03-06 01:42:15 UTC
This is an issue with the way that the new preprocessor computes template limits. Suppose that page A transcludes B and B does nothing but transclude C. The size of C will be counted twice towards the post-expand counter on page A.  This causes pages that have a setup similar to the one described to run into template limits much sooner than expected.
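
A minimal reproduction (hypothetical page names and sizes):

  [[C]] contains: Big Text    <- 8 bytes of plain text
  [[B]] contains: {{:C}}
  [[A]] contains: {{:B}}

Rendering A produces only 8 bytes of output, but its post-expand include size is reported as 16: 8 counted when C is expanded inside B, and 8 again when B's expanded output is counted on A.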
Comment 1 Brion Vibber 2008-03-06 20:59:10 UTC
Assigning to Tim.
Comment 2 CBM 2008-04-22 13:24:43 UTC
Affects parser functions as well: {{#ifexpr: 1 > 0 | {{Foo}} }} will add twice the size of Foo to the post-expand counter. 
Comment 3 Jesse (Pathoschild) 2009-02-12 07:57:19 UTC
Increased severity; this drastically affects labeled section transclusion on Meta, where it is used for language-specific subpages of large multilingual pages. In that case, a simple template that transcludes a localization (with English fallback) multiplies the size of each localization by four, or by eight if it is part of a meta-template like {{language subpage|pt}}. This glitch makes it virtually impossible to cleanly subdivide large pages, which are the ones most in need of subdivision for usability, even when the resulting page has relatively little output.

(Multiplied by four: in {{#if:{{#lst:page|section}}|{{#lst:page|section}}|<fallback>}} the section is expanded twice, and each expansion is counted twice because it occurs inside a template; the total doubles again if that template is itself part of a meta-template.)
Comment 4 Bryan Baron 2009-09-17 23:17:45 UTC
Still an issue?
Comment 5 CBM 2009-09-18 01:01:34 UTC
Confirmed still an issue.

Steps to verify:
* Create a page with a large amount of text
* Transclude it onto page B; check the post-expand size of B
* Transclude only B (once) onto page C; check the post-expand size of C
* The post-expand size of C is twice that of B, when it should be the same
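
Concretely, with hypothetical sizes:

  [[Big]] holds 100,000 bytes of text
  [[B]]   contains {{:Big}}   -> reported post-expand include size: 100,000
  [[C]]   contains {{:B}}     -> reported post-expand include size: 200,000

so C approaches the post-expand include size limit twice as fast as B, even though both render the same 100,000 bytes of output.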
Comment 6 Geometryguy 2010-01-13 14:23:24 UTC
This continues to be a severe inconvenience. Templates have to be written with opaque code in order to minimize nested transclusions, and some dedicated maintainers of dynamic pages (e.g. content review) have to be constantly vigilant about post-expand sizes to avoid breaking pages which would be nowhere near the limit were it not for this bug.
Comment 7 CBM 2011-04-07 18:58:18 UTC
This is still an issue with the workflow on enwiki, particularly for the content review (Featured Article, Peer Review) pages, where conversations from subpages are transcluded onto a "master" page.
Comment 8 Mark A. Hershberger 2011-04-12 16:32:30 UTC
New parser and PHP improvements (HipHop?) are slated that may alleviate the problems.  This is not something we are going to attack in the current parser.
Comment 9 Bawolff (Brian Wolff) 2011-04-13 01:57:33 UTC
(In reply to comment #8)
> New parser and PHP improvements (HipHop?) are slated that may alleviate the
> problems.  This is not something we are going to attack in the current parser.

HipHop is going to help with the fact that the hard-coded limit in the parser is calculated incorrectly? (OTOH, that limit is pretty huge. It scares me to think that people are reaching it, double counting notwithstanding.)
Comment 10 Mark A. Hershberger 2011-04-13 15:13:26 UTC
HipHop would hopefully avoid some problems, but the error in calculation is something that would be fixed in the new parser.
Comment 11 Umherirrender 2012-10-20 17:58:32 UTC
The post-expand size is built to hold the size of all expansions performed by the parser to produce the HTML (including sub-expansions such as templates and parser functions). So this works as expected.


An example:

[[A]] contains "Big Text" (wikitext length: 8)

[[B]] contains "{{:A}}" (wikitext length: 6)

[[C]] contains "{{:B}}" (wikitext length: 6)

The parser starts at C and expands B to "{{:A}}", which must in turn be expanded to "Big Text". This sub-expansion adds its expanded size (8) to the post-expand include size. The parser then returns the expanded text, and the expansion of B adds its own expanded length (also 8), resulting in a final post-expand include size of 16.

This behavior is needed to handle the following scenario:

[[A]] contains "Big Text" (wikitext length: 8)

[[B]] contains "{{#if:{{:A}}|A|B}}" (wikitext length: 18)

This gives a size of 9 (8 from the sub-expansion of A and 1 from the expansion of B). Without adding each sub-expansion, the post-expand size in this scenario would be 1, which would make the limit useless, because the limit is built to avoid overly large expansions while parsing a page.

There is no error in the calculation of the post-expand size; it simply also includes the size of each sub-expansion.
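
A toy model of this accounting (Python, purely illustrative; the real logic lives in MediaWiki's PHP preprocessor, and the page contents here are hypothetical):

# Toy model of post-expand include size accounting; illustrative only.
pages = {
    "A": "Big Text",   # plain text, 8 bytes
    "B": "{{:A}}",     # transcludes A
    "C": "{{:B}}",     # transcludes B
}

post_expand = 0  # running post-expand include size

def expand(title):
    # Expand a page; every transclusion's output is added to the
    # counter once per nesting level, which is what doubles the
    # total for C even though the final output is the same.
    global post_expand
    text = pages[title]
    while "{{:" in text:
        start = text.index("{{:")
        end = text.index("}}", start)
        inner = expand(text[start + 3:end])  # recurse into the transclusion
        post_expand += len(inner)            # count its output at this level
        text = text[:start] + inner + text[end + 2:]
    return text

print(expand("C"), post_expand)  # prints: Big Text 16 (output itself is 8 bytes)

In this model the 8-byte text sits at nesting depth 2 and is therefore counted twice, once per level.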
Comment 12 Tim Starling 2012-11-01 01:15:56 UTC
My position on this is:

* Size multiplied by depth is a defensible cost metric, since there will be a factor in the parse-time equation that is proportional to it: PHP needs to copy the data at each level when it concatenates the outputs from sibling subtrees.

* I'm not keen on lifting traditional parser limits such as post-expand include size, since judging by the parse time of existing large articles, the limits were too high to begin with. Lowering the limits would break existing articles, but refusing to raise them (by a factor of expansion depth in this case) is feasible and will help to limit CPU time.

* The limit impacts most strongly on the use of deeply nested metatemplates, and that's a design pattern I'd like to discourage anyway, especially given that Lua will soon be introduced.

After Lua is introduced globally and the more complex templates have been migrated to it, I think it would be reasonable to consider a severe reduction in parse limits, aimed at reducing the maximum parse time to 10 seconds or so. In the context of such a project, redefinition or removal of the post-expand include size would probably make sense. But by then, we might be switching to Parsoid anyway. So I'm resolving this as "later" for reconsideration at that time.
Comment 13 TMg 2012-11-04 23:33:29 UTC
Maybe it's a good idea to change the name to something that fits the current calculation method?
Comment 14 CBM 2012-11-05 03:01:35 UTC
@Tim Starling (comment 12): Thanks for the update. I want to point out something else this affects besides deeply nested metatemplates: "split" discussion pages, for example pages divided by day. If there are 10 discussion pages transcluded on "A (2012-11-5)" and 10 more transcluded on "A (2012-11-6)", and page B then transcludes both of those "A" pages, the nesting is trivial (just depth 2) but the cost on page B is double what it should be. On enwiki this affects e.g. [[Wikipedia:Peer review]], where there is again a shallow nesting of large-ish discussion pages. If there is a workaround for this particular use case, it would be very helpful.
Comment 15 Andre Klapper 2014-01-10 00:38:33 UTC
[Using keyword instead of tracking bug for HipHop issues as requested in bug 40926 comment 5. Filter bugmail on this message.]
