Last modified: 2014-09-24 00:46:30 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from the pages displaying bug reports and their history, links might be broken. See T3310, the corresponding Phabricator task, for complete and up-to-date bug report information.

Bug 1310 - Recursive tags in extensions.


Summary:	Recursive tags in extensions.

Status:	NEW

Product:	MediaWiki
Classification:	Unclassified
Component:	Parser
Version:	1.4.x
Hardware:	All All

Importance:	Lowest enhancement with 7 votes
Target Milestone:	---
Assigned To:	Nobody - You can work on this!

URL:
Whiteboard:
Keywords:	need-parsertest, parser, patch, patch-need-review

Duplicates:	11528 20350 21426 35173
Depends on:
Blocks:	20707

Reported:	2005-01-11 04:09 UTC by andy
Modified:	2014-09-24 00:46 UTC
CC List:	16 users

See Also:
Web browser:	---
Mobile Platform:	---
Assignee Huggle Beta Tester:	---

Attachments
patch for the Parser.php file, with line numbers (1.57 KB, patch) 2005-01-11 04:16 UTC, andy
more recent patch (1.48 KB, patch) 2005-01-11 15:30 UTC, andy
diff file from mediawiki 1.11.0 (3.98 KB, patch) 2007-09-22 15:00 UTC, aki.ikgw
Allows extension tags to be nested (5.70 KB, patch) 2012-01-28 03:30 UTC, sharon.dagan

Description andy 2005-01-11 04:09:03 UTC

I made an extension to allow for easier discussions (an example is here
http://moacad.com/wiki/index.php?title=Talk:Changelog) but in doing so I noticed
that the current extractTags function in Parser.php would extract the tag
starting at the beginning of the tag (<tag>) and would stop at the 'first' end
tag (</tag>). For example if someone edited a page and included the text
'<tag><tag>foo</tag>bar</tag>' it would think the tag was only
'<tag><tag>foo</tag>' and would leave off the extra '</tag>' on the end.
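The behaviour described above can be reproduced outside MediaWiki. The following is only a minimal sketch, not the actual extractTags code: matchFirstClose is a hypothetical helper, and it assumes tags without attributes.

```php
<?php
// Hypothetical helper (not MediaWiki's extractTags): pairs the first opening
// tag with the first closing tag that follows it, ignoring any nesting.
function matchFirstClose( string $text, string $tag ): ?array {
    $open = "<$tag>";
    $close = "</$tag>";

    $start = strpos( $text, $open );
    if ( $start === false ) {
        return null; // no opening tag at all
    }
    $contentStart = $start + strlen( $open );

    // The first closing tag wins, even if another opening tag came before it.
    $end = strpos( $text, $close, $contentStart );
    if ( $end === false ) {
        return null; // opening tag is never closed
    }

    return [
        'content' => substr( $text, $contentStart, $end - $contentStart ),
        'rest' => substr( $text, $end + strlen( $close ) ),
    ];
}

$result = matchFirstClose( '<tag><tag>foo</tag>bar</tag>', 'tag' );
echo $result['content'], "\n"; // <tag>foo
echo $result['rest'], "\n";    // bar</tag>
```

The extension only sees '<tag>foo', and the trailing 'bar</tag>' is left behind in the page text.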

Comment 1 andy 2005-01-11 04:16:45 UTC

Created attachment 201 [details]
patch for the Parser.php file, with line numbers

I'm not sure how to include the code for HTML comments with it, so I just put
that separately with an if tag. If anyone would like to change it, go ahead. The
formatting or code may also be sloppy; this is my first patch, so I don't know
much about the style it should be in.

Comment 2 andy 2005-01-11 15:30:11 UTC

Created attachment 202 [details]
more recent patch

I think this should be more useful. It replaces lines 234-285 in the 1.4beta4
version of Parser.php.

Comment 3 Brion Vibber 2005-07-10 22:58:55 UTC

Not sure this would be desirable; may have side effects.

Anyway the patch is very out of date...

Comment 4 Ævar Arnfjörð Bjarmason 2005-10-01 10:38:03 UTC

Not a patch, removing patch keyword.

Comment 5 Zach Dennison 2007-07-18 18:45:04 UTC

Isn't this bug fixed in 1.9 and maybe earlier with Parser::recursiveTagParse()?

Comment 6 aki.ikgw 2007-09-22 15:00:00 UTC

Created attachment 4140 [details]
diff file from mediawiki 1.11.0

Comment 7 Steve Sanbeg 2007-10-01 21:31:07 UTC

*** Bug 11528 has been marked as a duplicate of this bug. ***

Comment 8 Alexandre Emsenhuber [IAlex] 2009-08-25 19:12:57 UTC

*** Bug 20350 has been marked as a duplicate of this bug. ***

Comment 9 Alexandre Emsenhuber [IAlex] 2009-11-07 11:05:18 UTC

*** Bug 21426 has been marked as a duplicate of this bug. ***

Comment 10 Chad H. 2010-07-17 11:00:22 UTC

Removing need-review keyword. Patches are ancient and not very useful, so I've marked them obsolete.

Comment 11 sharon.dagan 2011-03-18 17:49:48 UTC

This bug still exists in 1.16.2 - any plans to fix it?

Comment 12 Mark A. Hershberger 2011-03-18 20:08:27 UTC

Probably exists in 1.17 (about to be released), too.  Can you check out a copy of HEAD from subversion to check?

  svn checkout http://svn.wikimedia.org/svnroot/mediawiki/trunk/phase3 wiki

Also, if you're running into this, let us know the particulars of your case.

Comment 13 Bawolff (Brian Wolff) 2011-03-18 20:13:26 UTC

/me wonders if the proposed change in this bug is really the desired behaviour. Disallowing nested start <tag>'s seems sane to me.

Comment 14 Purodha Blissenbach 2011-03-18 20:53:02 UTC

(In reply to comment #13)
> /me wonders if the proposed change in this bug is really the desired behaviour.
> Disallowing nested start <tag>'s seems sane to me.

That depends on semantics and the code handling them.

Comment 15 sharon.dagan 2011-03-18 22:07:35 UTC

Working with the latest code from trunk/phase3, as suggested.
My test case extension is a very basic tag hook:

File: Bug1310_TestCase.php

<?php

$wgHooks['ParserFirstCallInit'][] = 'onParserFirstCallInit';

function onParserFirstCallInit( &$parser ) {
    $parser->setHook( 'foo', 'onTag' );
    return true;
}

function onTag( $input, $args, $parser, $frame ) {
    wfDebug( $input );
    return 'xxx';
}

?>

In LocalSettings.php, the extension is loaded the normal way.

The input for the test case is:
'<foo>Begin1... <foo>Begin2... ...End2</foo> ...End2</foo>'

The $input that gets into onTag() should be:
'Begin1... <foo>Begin2... ...End2</foo> ...End2'

However,
In wfDebug I get: 'Begin1... <foo>Begin2... ...End2'
And in the browser I get: 'xxx...End1</foo>'

Comment 16 sharon.dagan 2011-03-18 22:10:17 UTC

OOPS! (why can't I edit my comment?)

The input for the test case is:
'<foo>Begin1... <foo>Begin2... ...End2</foo> ...End1</foo>'

The $input that gets into onTag() should be:
'Begin1... <foo>Begin2... ...End2</foo> ...End1'

However,
In wfDebug I get: 'Begin1... <foo>Begin2... ...End2'
And in the browser I get: 'xxx...End1</foo>'

Comment 17 p858snake 2011-04-30 00:10:06 UTC

*Bulk BZ Change: +Patch to open bugs with patches attached that are missing the keyword*

Comment 18 John Du Hart 2011-08-24 15:40:18 UTC

-patch, no non-obsolete patches.

Comment 19 theom3ga 2011-10-09 15:41:27 UTC

Same bug here. I'm developing an extension to semantically tag the document, so there are going to be recursive tags, and I'm getting this bug too. The test case is the same as in comment 15.

Any alternative to tag extensions for this?

Comment 20 Bawolff (Brian Wolff) 2011-10-11 19:39:09 UTC

> Any alternative to tag extensions for this?

Well, parser-function-style things perhaps ({{#foo:...}}).

Comment 21 sharon.dagan 2012-01-28 03:30:56 UTC

Created attachment 9920 [details]
Allows extension tags to be nested

This patch allows tag extensions to be nested. Only the outermost tags are parsed; everything in between is passed to the callback.
 
Given the wiki text "<foo>123<foo>456</foo>789</foo>", foo's callback will be called with the text "123<foo>456</foo>789".
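The matching described here can be sketched with a depth counter. This is only an illustration, not the attached patch: matchBalanced is a hypothetical helper, and it assumes tags without attributes.

```php
<?php
// Hypothetical helper (not the attached patch): returns the content between
// the outermost balanced pair of $tag, or null if the tags do not balance.
function matchBalanced( string $text, string $tag ): ?string {
    $open = "<$tag>";
    $close = "</$tag>";

    $start = strpos( $text, $open );
    if ( $start === false ) {
        return null; // no opening tag
    }
    $contentStart = $start + strlen( $open );
    $pos = $contentStart;
    $depth = 1; // we are inside one opening tag

    while ( true ) {
        $nextClose = strpos( $text, $close, $pos );
        if ( $nextClose === false ) {
            return null; // ran out of closing tags before depth reached 0
        }
        $nextOpen = strpos( $text, $open, $pos );
        if ( $nextOpen !== false && $nextOpen < $nextClose ) {
            // A nested opening tag comes first: go one level deeper.
            $depth++;
            $pos = $nextOpen + strlen( $open );
            continue;
        }
        // A closing tag comes first: leave one level.
        $depth--;
        if ( $depth === 0 ) {
            return substr( $text, $contentStart, $nextClose - $contentStart );
        }
        $pos = $nextClose + strlen( $close );
    }
}

echo matchBalanced( '<foo>123<foo>456</foo>789</foo>', 'foo' ), "\n"; // 123<foo>456</foo>789
```

Note that an unbalanced input such as '<foo><foo></foo>' yields null in this sketch, whereas pairing the first open tag with the first close tag would still produce a match.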

Comment 22 Mark A. Hershberger 2012-02-10 18:27:15 UTC

Thanks for this patch!  We've been in a code slush (not quite a freeze) for a few weeks so we're just getting around to looking at these.

We're also doing a lot of parser work, so I'm not sure how relevant this is, but I'll ask them to take a look.

Comment 23 Gabriel Wicke 2012-02-11 15:33:04 UTC

From an implementation standpoint, simply matching up the closest start/end tag is definitely easier than building a stack to enable nested tag pairs. I am also not convinced that nested tag pairs would be a good UI design, as it seems to make the distinction of regular wiki content and input to an extension harder than necessary.

Could you present a compelling use case that demonstrates the need to use the same tags both to delimit the extension inputs and the input itself?

Comment 24 Gabriel Wicke 2012-02-11 15:35:23 UTC

The last sentence should naturally end with *and in the input itself*. An edit button would be handy sometimes.

Comment 25 Bawolff (Brian Wolff) 2012-02-11 20:15:23 UTC

(In reply to comment #23)
> From an implementation standpoint, simply matching up the closest start/end tag
> is definitely easier than building a stack to enable nested tag pairs. I am
> also not convinced that nested tag pairs would be a good UI design, as it seems
> to make the distinction of regular wiki content and input to an extension
> harder than necessary.
> 
> Could you present a compelling use case that demonstrates the need to use the
> same tags both to delimit the extension inputs and the input itself?

There are two examples I could see where this may be wanted

* <ref> tags so people could do nested ref stuff without {{#tag:ref hackery.
* <source> tags for when highlighting xml-ish things that have a <source> inside them (since they would usually have a closing source tag as well, but that's more like accidentally fixing an issue than actually fixing an issue).

But I also tend to agree that it may not be worth the effort.

Comment 26 Daniel Friesen 2012-02-11 22:20:52 UTC

Using that as a <source> fix sounds to me more like a hack to fix a non-issue. The <source> tag isn't written to explicitly do anything special with any <source> tags inside of it, so that does not sound like the thing we should be aiming for. (And it sounds like it would break if someone used <source> to document example arguments to the opening <source> tag.)

Switching over to complete even tag matching could change the behaviour of existing content -- i.e., <foo><foo></foo> suddenly having different behaviour -- so I'd reject the patch we have on those grounds alone.

We probably also want to write a test to make sure that <nowiki><nowiki></nowiki> doesn't suddenly start turning everything after it into nowiki content when it was written expecting it to display a "<nowiki>" tag verbatim in the page for documentation.

I think that if we do implement recursive tags, it's going to have to be an explicit opt-in feature for extensions, i.e., only tags with a specific option will be parsed recursively. We can enable it on <ref>, but we may not want to enable it for <source>, and definitely don't want it on <nowiki>.
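A rough sketch of what such an opt-in could look like follows. Everything in it is an assumption for illustration: the $nestedTags registry and the extractContent helper are hypothetical, not existing MediaWiki API, and tags are assumed to have no attributes.

```php
<?php
// Hypothetical opt-in registry: only tags marked true are matched recursively.
$nestedTags = [ 'ref' => true, 'source' => false, 'nowiki' => false ];

// Hypothetical helper: with $nested = false the first closing tag wins (the
// existing behaviour); with $nested = true open/close pairs must balance.
function extractContent( string $text, string $tag, bool $nested ): ?string {
    $open = "<$tag>";
    $close = "</$tag>";

    $start = strpos( $text, $open );
    if ( $start === false ) {
        return null;
    }
    $contentStart = $start + strlen( $open );
    $pos = $contentStart;
    $depth = 1;

    while ( true ) {
        $nextClose = strpos( $text, $close, $pos );
        if ( $nextClose === false ) {
            return null; // no (matching) closing tag
        }
        // Nested opening tags are only counted for tags that opted in.
        $nextOpen = $nested ? strpos( $text, $open, $pos ) : false;
        if ( $nextOpen !== false && $nextOpen < $nextClose ) {
            $depth++;
            $pos = $nextOpen + strlen( $open );
            continue;
        }
        $depth--;
        if ( $depth === 0 ) {
            return substr( $text, $contentStart, $nextClose - $contentStart );
        }
        $pos = $nextClose + strlen( $close );
    }
}

// <nowiki> has not opted in, so existing documentation-style content keeps working.
echo extractContent( '<nowiki><nowiki></nowiki>', 'nowiki', $nestedTags['nowiki'] ), "\n"; // <nowiki>
// <ref> has opted in, so the inner pair is passed through to the outer callback.
echo extractContent( '<ref>a<ref>b</ref>c</ref>', 'ref', $nestedTags['ref'] ), "\n"; // a<ref>b</ref>c
```

With the flag off, '<nowiki><nowiki></nowiki>' still yields a literal '<nowiki>' as before; with it on, the same input would no longer match at all, which is the behaviour change the opt-in is meant to avoid.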

Comment 27 Gabriel Wicke 2012-02-11 23:26:49 UTC

I also share Daniel's concerns about changing the behavior of existing content.

In the longer term, I think that it would be desirable to add a fully parsed input mode for extension tag contents. This could take the form of a token stream or a DOM fragment built from those. Extensions could choose between plain-text input and tokens, so this would be opt-in. This is also very close to how this is currently handled in the Parsoid parser (http://mediawiki.org/wiki/Parsoid), although there still are some issues to solve (e.g. an unclosed html comment in the extension content). Nested nowiki is covered by parser tests already, and works as expected.

Are there extensions that need nested extension tags, but otherwise unparsed input?

Comment 28 Brion Vibber 2012-03-13 21:48:08 UTC

*** Bug 35173 has been marked as a duplicate of this bug. ***

Comment 29 Quim Gil 2014-05-17 00:21:02 UTC

This discussion has been stalled during more than two years. Has there been any change helping to its resolution in a way or another?

Comment 30 MZMcBride 2014-05-17 00:26:51 UTC

(In reply to Quim Gil from comment #29)
> This discussion has been stalled during more than two years. Has there been
> any change helping to its resolution in a way or another?

Had there been, this bug would likely already reflect such a change. I'm not sure what you're asking.
