Last modified: 2004-09-21 05:00:24 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T2060, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 60 - Don't parse templates recursively
Don't parse templates recursively
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Page editing (Other open bugs)
unspecified
All All
: High major with 1 vote (vote)
: ---
Assigned To: Nobody - You can work on this!
: parser
: 464 (view as bug list)
Depends on:
Blocks: 70 80 202 283
  Show dependency treegraph
 
Reported: 2004-08-15 20:03 UTC by Gordon P. Hemsley
Modified: 2004-09-21 05:00 UTC (History)
4 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
Don't parse templates recursively (1.48 KB, patch)
2004-09-16 19:55 UTC, Wil Mahan
Details
Patch for REL1_3 (1.44 KB, patch)
2004-09-18 03:26 UTC, Wil Mahan
Details

Description Gordon P. Hemsley 2004-08-15 20:03:34 UTC
When previewing an edit of an article that contains template parameters inside
an article link in it ([[{{text}}]] or [[{{disambig}}]] doesn't matter), it
causes the following error:
Fatal error: Call to a member function on a non-object in
/home/wikipedia/htdocs/test/w/includes/OutputPage.php on line 814
Comment 1 JeLuF 2004-08-15 20:28:40 UTC
[[{{text}}]] only crashes if Template:text does not exist.
What happens:

{{text}} is checked for existence. since it does not exist, it's replaced by an
edit link to Template:text, which is parsed into <!--LINK Template:text bla bla
bla-->. Than, [[<!--LINK Template:text bla bla bla-->]] is parsed into another
link, and <!-- is not valid inside of titles.

OutputPage.php 1.165 and Skin.php 1.255 contain code that prevents the fatal
error. The generated output is not nice, though: It contains the bla bla bla
part of the above mentioned link.
Comment 2 Gordon P. Hemsley 2004-08-15 22:19:30 UTC
(In reply to comment #1)
> [[{{text}}]] only crashes if Template:text does not exist.
That's not true, because [[{{disambig}}]] causes the same error, and
{{disambig}} exists.

> OutputPage.php 1.165 and Skin.php 1.255 contain code that prevents the fatal
> error. The generated output is not nice, though: It contains the bla bla bla
> part of the above mentioned link.
Are these only in CVS or are they on http://test.wikipedia.org/ too?
Comment 3 Antoine "hashar" Musso (WMF) 2004-08-16 05:04:29 UTC
test.wikipedia.org is updated from time to time using the cvs code
Comment 4 Brion Vibber 2004-09-12 00:19:41 UTC
The fix for bug 267 (removing { and } from legal title chars again) has some
effect on this, but it's not totally fixed. The [[{{text}}]] case still produces
awful broken HTML. Will investigate...
Comment 5 Brion Vibber 2004-09-12 01:01:19 UTC
The problem is that when replaceInternalLinks() is operating, the text it's
looking at has had templates stripped. The replacement text is something like
"NaodW29-item21773b7b0640b5", which is perfectly legal in a title so the link is
generated.

Then, the text gets unstripped and the instances of "NaodW29-item21773b7b0640b5"
in the <a href> tag (1.3) or <!-- LINK --> placeholder (1.4) get replaced with
the template text... which may be itself a link or link placeholder, screwing
everything up.

A simple "fix" is to use a strip placeholder that's not legal in links, however
this has a couple problems. Namely, it means you can't use a template or
template parameter as the source of a link; this is problematic anyway, but
people try it and it might be better to make it work properly than to take it out.

Also, there are other places where text may be replicated or moved into HTML
tags where the unstripping causes breakage. A more formal solution that covers
all might be good.
Comment 6 Brion Vibber 2004-09-12 21:10:21 UTC
*** Bug 464 has been marked as a duplicate of this bug. ***
Comment 7 Brion Vibber 2004-09-12 21:16:18 UTC
Changed summary
Comment 8 Wil Mahan 2004-09-16 19:55:18 UTC
Created attachment 41 [details]
Don't parse templates recursively

I was looking at what Brion described, and here are my thoughts:

This looks like a fundamental result of the way templates are stripped
and replaced by placeholders. Stripping things like <math>, <nowiki>,
and <pre> make sense because they are logically separate from the 
surrounding text; for example, a link would never begin inside a
<math>...</math> pair and end outside. People do want to do this sort
of thing with templates and template arguments, however; there is really
no logical boundary between the template and the rest of the text.

For things like <nowiki> et al, stripping and substituting a placeholder 
is a way of saying "this is parsed, so don't touch it". But perhaps
stripping out templates and template arguments and replacing them with
placeholders is the wrong thing to do.

What if instead of recursively parsing templates and template arguments, we
only recursively _substitute_ them? We could then parse the result once,
non-recursively.

This (experimental, proof-of-concept) patch replaces templates and
template arguments by just shoving in their text.  No parsing is done
immediately, so no placeholders are used. With the the patch,
internalParse() is no longer recursive; replaceVariables() does
not actually do any stripping and parsing, but simply substitutes
variables recursively.

Does anyone see any obvious problems with this approach?
Comment 9 JeLuF 2004-09-16 20:02:44 UTC
Right approach, in my eyes.

I think removeHTMLtags should be called after including templates. 
Else one could insert "bad" HTML using templates.
Comment 10 Wil Mahan 2004-09-16 22:08:00 UTC
Thanks for taking a look. The reason I call stripHTMLtags() before
replaceVariables() is that that's the order used in internalParse(), and I was
trying to preserve the existing behavior.

I think the existing patch is safe from inserting bad HTML, at least in these
two cases:

1. Trying to include a template with bad HTML: we start out in internalParse(),
which calls replaceVariables(). The template with bad HTML gets inserted by
braceSubstitution(). Then right before we substitute any template arguments with
the recursive call to replaceVariables(), the bad HTML gets removed by my
stripHTMLtags() call.
2. Trying to insert bad HTML as a template parameter: if it doesn't initially
get removed by the call in internalParse(), the bad HTML in the argument will
get removed by my call in argSubstitution().
Comment 11 Jamie Bliss 2004-09-17 23:29:58 UTC
2 things:
(1) This bug prevents partial syntax parsing of templates (translation: If you put a row of a table in a 
template, it will appear as data in the last cell ie, 
http://endeavour.zapto.org/furc/Wiki/index.php/User:Astronouth7303/New_roster )

(2) Can you please backport this to 1.3?
Comment 12 Antoine "hashar" Musso (WMF) 2004-09-17 23:35:42 UTC
(In reply to comment #11)
> 2 things:
> (1) This bug prevents partial syntax parsing of templates (translation: If you
put a row of a table in a 
> template, it will appear as data in the last cell ie, 
> http://endeavour.zapto.org/furc/Wiki/index.php/User:Astronouth7303/New_roster )

Jamie Bliss case is solved by the patch submitted (Parser.php:1.272).
Comment 13 Wil Mahan 2004-09-18 03:26:59 UTC
Created attachment 44 [details]
Patch for REL1_3

1) Here is an untested patch with the change for REL1_3. Warning: my fix is a
lot more intrusive than might appear from the small number of lines changed. I
would be wary of using it in production before there is more testing.
2) Another bug that should be fixed by my patch (which is in CVS HEAD) is bug
80.
3) At the end of braceSubstitution(), notice that we suspend the LinkCache
before recursing and then resume it afterwards. I think with my patch this may
no longer be necessary. Since no parsing is done when including templates,
replaceInternalLinks() doesn't get called in this code path. I'm not certain
about this, however, so if someone could confirm I would appreciate it.
Comment 14 JeLuF 2004-09-18 19:07:17 UTC
Tested in CVS HEAD:
Bug 60  Fixed
Bug 70  Fixed
Bug 80  Fixed
        new issue occured while testing, I'll file a 
        bug report for this new problem.
Bug 41  Fixed
Bug 283 Fixed
Bug 161 Fixed
Bug 131 Fixed
Bug 302 Fixed
Comment 15 JeLuF 2004-09-18 19:16:45 UTC
Another one in CVS HEAD:

Bug 16  Fixed

Going to check the patch in REL1_3.
Comment 16 JeLuF 2004-09-18 21:16:53 UTC
Tested in REL1_3:
Bug 16  Fixed
Bug 41  **NOT FIXED**
Bug 60  Fixed
Bug 70  Fixed
Bug 80  Fixed
Bug 283 Fixed
Bug 161 Fixed
Bug 131 Fixed
Bug 302 Fixed
Comment 17 River Tarnell 2004-09-21 02:52:29 UTC
This patch makes the issue at bug 266 more serious.  Instead of template
sections editing the wrong section, _all_ edit links (even for the article)
after a template inclusion with a section header are wrong.  Example at
[[User:Kate/sandbox2]].

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links