Last modified: 2011-04-30 01:20:31 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T22036, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 20036 - php error in LinkHolderArray
php error in LinkHolderArray
Status: RESOLVED WONTFIX
Product: MediaWiki
Classification: Unclassified
Parser (Other open bugs)
1.15.x
All All
: Lowest normal (vote)
: ---
Assigned To: Tim Starling
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2009-08-02 08:53 UTC by Gero Scholz
Modified: 2011-04-30 01:20 UTC (History)
4 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Gero Scholz 2009-08-02 08:53:26 UTC
if an extension tries to clone the MW parser, MW wants to merge LinkHolderArrays.

This leads to a php error because the cloning command (something like "$myParser = clone $parser;") oviously does not call the constructor of the LinkHolderArray. The reason is probably that the LinkHolderArray in general is only constructed on demand.

The following bugfix works, but maybe a better solution can be found:

<pre>
	/**
	 * Merge another LinkHolderArray into this one
	 */
	function merge( $other ) {
		// add this!!
		// protect against uninitialized instance (which can happen if parser is 'cloned')
		if (!isset($this->interwikis)) {
			$this->internals = array();
			$this->interwikis = array();
			$this->size = 0;
			$this->parent  = $other->parent;
		}
		// end !!
		
		foreach ( $other->internals as $ns => $entries ) {
			$this->size += count( $entries );
			if ( !isset( $this->internals[$ns] ) ) {
				$this->internals[$ns] = $entries;
			} else {
				$this->internals[$ns] += $entries;
			}
		}
		$this->interwikis += $other->interwikis;
	}
</pre>
Comment 1 Subfader 2009-08-02 21:34:33 UTC
that fix makes extensions work. i can confirm it for 1.16alpha. can't confirm the interwiki thingy.
bad line break here in //protect against...  though.
Comment 2 Brion Vibber 2009-08-03 16:30:35 UTC
I'm a little unclear on what's going on in this bit... Tim can you take a peek and confirm? Thanks!
Comment 3 Tim Starling 2009-08-04 03:13:14 UTC
The clone operation in PHP is a shallow copy, it copies member object handles instead of whole objects. So this:

$a = (object)array( 'm' => (object)array('n' => 1) );
$b = clone $a;
$a->m->n = 2;
print $b->m->n;

will print "2". It's basically equivalent to assigning each member variable in turn:

$b = new stdclass;
$b->m = $a->m;

This means that if you do this from a tag hook:

$parser = clone $wgParser;
$lineStart = true;
$clearState = false;
$parser->parse($text, $title, $options, $lineStart, $clearState);

then you mess up the state of any child objects that $wgParser was storing. I think the reporter is triggering this total destruction of parser state from an extension, and misdiagnosing the result.

If so, I think this is a WONTFIX. I used to recommend cloning the parser but it's since become obvious that it's a really bad idea, especially after MW 1.12. You can do it if you call clearState(), wiping out any problematic objects, that's basically how MessageCache manages it. But you can't just call random functions and expect them to work. My current thinking on this is that cloning should be replaced with calls to appropriate re-entrant functions, such as recursiveTagParse(). 
Comment 4 Nikola Kovacs 2009-08-11 21:26:16 UTC
This is probably a stupid question, but why don't you implement the __clone function to deep copy the parser? recursiveTagParse() is not always sufficient, especially with something as complicated as DPL.

In my case I wanted to save the html rendered inside my tag extension to the database, but recursiveTagParse() stripped out all the internal links. These were later reinserted properly on the original page, but the html saved to the database didn't have them. Switching to parse() fixed it.
Comment 5 Tim Starling 2009-09-04 02:19:55 UTC
Deep cloning could be very slow indeed, due to the need to copy complex data structures in mTplDomCache. Also it would be subject to a high rate of bitrot, since the code would never be used on Wikimedia, and it would have to be updated every time someone added an object to the parser state. If parse() works for you then I don't see a problem, I think I'd have to hear a more convincing argument.
Comment 6 Juliano F. Ravasi 2009-09-04 03:06:22 UTC
I was discussing this yesterday with Tim on IRC. The "fix" is much simpler than that (and much more secure) the real problem is in Parser::__destruct():

154	function __destruct() {
155		if ( isset( $this->mLinkHolders ) ) {
156			$this->mLinkHolders->__destruct();
157		}
158		foreach ( $this as $name => $value ) {
159			unset( $this->$name );
160		}
161	}

There is no need to call LinkHolderArray::__destruct() in line 156, since it will be unset() anyways in the following loop. In fact, probably the whole function is needless, but the doc comment says something about circular references... so I dunno. PHP does reference counting and its garbage collector also handles cycles. It will call __destruct() automatically if, and only if, the reference count drops to zero. In fact, this code will call LinkHolderArray::__destruct() twice under almost all conditions.

If lines 155-157 above are erased, this specific issue will be solved. The problem is that when the cloned parser dies, the call in line 156 destroys the LinkHolderArray that was shared between both the clone and the original.

I suggest these lines be removed (which will also resolve this bug), but not intending to promote cloned Parser usage (that as Tim explained, it is deprecated), instead because the code is really really unnecessary, and does something hardly justifiable, PHP documentation says that __destruct() is a magic method that is called automatically, and doesn't endorse manually calling it:

http://br2.php.net/manual/en/language.oop5.decon.php#language.oop5.decon.destructor

So, would it be possible to reconsider this bug, but with this different fix above?

In my case, I fixed my extension to use Parser::recursiveTagParse() conditionally (when included) and the global $wgParser when not including and it is known to be safe. I also had to avoid and work around some MW functions that destroys the state of the global parser (wfMsgWikiHtml(), wfMsgExt() with 'parse' option, etc.).

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links