Last modified: 2010-05-15 15:42:48 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T10451, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 8451 - parser passes 'UNIQ' tokens to hook, instead of text
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: Parser
Version: 1.8.x
Hardware: PC Linux
Importance: Normal normal with 1 vote
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2007-01-01 07:20 UTC by Ittay Dror
Modified: 2010-05-15 15:42 UTC (History)

Description Ittay Dror 2007-01-01 07:20:23 UTC
created a parser function:
public function foreach2Hook( &$parser, $text, $pattern, $replacement, $insep = ',', $outsep = ',' ) {
        print( $text );
}

used it in a page like:
{{ #foreach2: <pre>test</pre> }}

the printed string is a UNIQ..QINU identifier

since the actual content is in $matches, a local variable of strip, i can't
access it
Comment 1 Aryeh Gregor (not reading bugmail, please e-mail directly) 2007-01-01 17:09:42 UTC
I think you want to call $parser->mStripState->unstripBoth( $text )?
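
To illustrate the suggestion above, here is a toy model of placeholder stripping and unstripping. It is only a sketch and not MediaWiki's implementation: ToyStripState, its strip() and unstrip() methods, and toyHook() are hypothetical names invented for this example, and the marker format is merely modelled on the UNIQ..QINU strings described in the report.

```php
<?php
// Toy model only: ToyStripState is a hypothetical class, not part of MediaWiki.
class ToyStripState {
    private $map = array();

    // Replace every <pre>...</pre> block with an opaque marker and remember its content.
    public function strip( $text ) {
        return preg_replace_callback(
            '/<pre>(.*?)<\/pre>/s',
            function ( $m ) {
                $marker = "\x07UNIQ" . count( $this->map ) . "QINU\x07";
                $this->map[$marker] = $m[1];
                return $marker;
            },
            $text
        );
    }

    // Put the remembered content back in place of each marker.
    public function unstrip( $text ) {
        return str_replace( array_keys( $this->map ), array_values( $this->map ), $text );
    }
}

// Hypothetical hook: it receives marker text and must unstrip it itself.
function toyHook( ToyStripState $state, $text ) {
    return $state->unstrip( $text );
}

$state    = new ToyStripState();
$stripped = $state->strip( 'a <pre>test</pre> b' ); // contains a UNIQ..QINU marker
$restored = toyHook( $state, $stripped );           // marker replaced by the content
```

In this model the hook only sees the marker unless it asks the strip state to restore the original content, which is the situation the report describes.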
Comment 2 Ittay Dror 2007-01-02 06:24:42 UTC
shouldn't the parser do it for me? i should get either unprocessed text or
completely processed. 
Comment 3 Rob Church 2007-01-02 11:38:56 UTC
You need to look again at the code which is registering the hook callback in the
first place...there are a number of options available to tell the parser what it
should pass to you, and what you're going to pass back to it.
Comment 4 Ittay Dror 2007-01-04 14:02:31 UTC
function setHook( $tag, $callback ) - no parameters
function setFunctionHook( $id, $callback, $flags = 0 ) - flags can only be
SFH_NO_HASH
Comment 5 David 2007-09-23 21:40:05 UTC
Getting the same problem with a clean install of MW v1.11.0.  And I've got the same question as Ittay:  Shouldn't the input the hooked function receives already be processed?  Additionally, I've got to ask: is this behavior intended by design?  Is the solution, that Simetrical provided, documented in the code documentation, on meta.wikimedia.org, and/or mediawiki.org?  And why doesn't setFunctionHook() allow one to specify the level of parsing of the input parameters (i.e. parsed, raw, or something in between)?
Comment 6 joshua bacher 2007-11-06 11:55:08 UTC
I hit the same problem while working on setHook using the parser class.

my code:

$wgHooks['ParserBeforeStrip'][]         = 'tsRegisterInlineQueries'; // execute inline query

function tsRegisterInlineQueries( ) {
        global $wgParser;
        $wgParser->setHook( 'ask', 'testIQ' ); 
        return true;
}

function testIQ($val, $params, &$parser){
        return "blabla";
}    

Now the wikimarkup:
<ask>blub</ask> should be replaced by: "blabla" but isn't.

I think i have hunted it down: there is the line in Parser.php in the parse() method:
303:                wfRunHooks( 'ParserBeforeStrip', array( &$this, &$text, &$this->mStripState ) );
304:                $text = $this->strip( $text, $this->mStripState );

In line 303 the setHook call is executed; in line 304 the strip function is called. The StripState holds all the replacement strings.

Inside strip the replacement strings change, BUT the calling parse function never notices, because the signature of the strip function looks like this:

561:        function strip( $text, $state, $stripcomments = false , $dontstrip = array () ) {

The state is not passed by reference, so any change made to it gets lost!

changing line 561 to:
561:        function strip( $text, &$state, $stripcomments = false , $dontstrip = array () ) {

will solve the problem. This should be a problem everywhere strip is called, since the replacement strings get lost.

the following diff should fix it:
561c561
<       function strip( $text, &$state, $stripcomments = false , $dontstrip = array () ) {
---
>       function strip( $text, $state, $stripcomments = false , $dontstrip = array () ) {
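
The by-value versus by-reference distinction discussed above can be sketched in isolation. The function names below are hypothetical and the state is assumed to be a plain PHP array; the example only shows PHP's parameter-passing semantics, not the actual Parser code.

```php
<?php
// Hypothetical helpers; only the parameter-passing behaviour is the point.
function recordByValue( $text, $state ) {
    $state['marker'] = $text;   // modifies a local copy of the array only
}

function recordByReference( $text, &$state ) {
    $state['marker'] = $text;   // modifies the caller's array
}

$state = array();

recordByValue( 'test', $state );
$afterValue = count( $state );      // the caller's array is still empty

recordByReference( 'test', $state );
$afterReference = count( $state );  // the caller's array now has one entry
```

Note that this applies to arrays. If the state is an object, PHP 5 passes the object handle, so changes made through the object's methods or properties are visible to the caller even without the `&`.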


cheers and have fun joshua bacher
Comment 7 joshua bacher 2007-11-06 13:13:53 UTC
The $state parameter of the strip function was changed from pass-by-reference to pass-by-value in the following revision:

http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/includes/Parser.php?revision=17880&view=markup

I tested it with MW 1.10 and MW 1.11, and the provided test code does not work as expected: it always shows the placeholder, recognizable by the string \x07UNIQ.

the patch provided is broken. Find a better one here:
http://bacher.bash-it.de/download/parser.patch

cheers josh
Comment 8 Steve Sanbeg 2007-11-06 22:39:10 UTC
From the diff, it looks like one reference was dropped in that revision; re-added in r27280
Comment 9 joshua bacher 2007-11-07 08:42:13 UTC
the revision does not fix the bug. the changed line 
415 	                 $text = $this->strip( $text, &$this->mStripState )

does not solve it. All calls to strip() (especially the one in parse(), in my case) need the $state variable passed by reference.
Otherwise the ReplacementArray is not updated in the calling function, so the UNIQ placeholder shows up instead of
the intended replacement, because it is never replaced.
Comment 10 joshua bacher 2007-11-07 09:38:46 UTC
darn. the change to $state in the strip() header happened from 217569 to r27280
http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/includes/Parser.php?r1=17569&r2=17820&pathrev=27280
Comment 11 David 2007-11-08 20:33:51 UTC
After messing around with extensions a bit, here's the issue with the UNIQ place-holder text.  The policy of tags is that any text returned from them be in HTML.  Parser functions have both the input and the returned text parsed.

Let's take the following:

  {{#rflxsv:<tag>text</tag>}}

where "rflxsv" is a parser function that just takes the first argument and spits it back out, and "tag" is a tag that creates an HTML anchor (like Cite or the DPL/DPL2 extension).  Without the "UNIQ" placeholder, the end result is that links generated by the tag extension (along with other tags that aren't normally allowed in wiki-markup, like <form> and <input>) end up being double-parsed/escaped.  So instead of getting:

  <a href="someURL">text</a>

You instead get:

  &lt;a href="someURL"&gt;text&lt;/a&gt;
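
The double-escaping effect described above can be reproduced with a small sketch. Here escapePass() is a hypothetical stand-in for the second parsing pass (it is not a MediaWiki function), and the marker string is only modelled on the UNIQ placeholders mentioned in this report.

```php
<?php
// HTML as it might be returned by a tag extension.
$tagOutput = '<a href="someURL">text</a>';

// Hypothetical stand-in for a later pass that escapes raw HTML.
function escapePass( $text ) {
    return htmlspecialchars( $text, ENT_NOQUOTES );
}

// Without a placeholder, the tag's HTML goes through the escaping pass.
$unprotected = escapePass( $tagOutput );

// With a placeholder, only the marker goes through the pass;
// the real HTML is substituted back afterwards.
$marker    = "\x07UNIQ-0-QINU\x07";
$protected = str_replace( $marker, $tagOutput, escapePass( "before $marker after" ) );
```

Under these assumptions, $unprotected holds the escaped form shown above, while $protected keeps the anchor intact.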
Comment 12 joshua bacher 2007-11-12 08:56:00 UTC
Once again: I had only tested the behaviour with my whole extension. After limiting it to the functions I gave as a proof of concept, I found that, on a fresh MW 1.11 installation, the PoC fails.

This means that somehow my code breaks things, not the Parser class.

cheers josh
