Last modified: 2014-09-23 23:09:06 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T28207, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 26207 - Split magic linking out of core; create new magic linking extension
Split magic linking out of core; create new magic linking extension
Status: NEW
Product: MediaWiki
Classification: Unclassified
Parser (Other open bugs)
unspecified
All All
: Normal minor with 3 votes (vote)
: ---
Assigned To: Nobody - You can work on this!
: patch, patch-need-review
: 30360 (view as bug list)
Depends on: 29473
Blocks: 29145
  Show dependency treegraph
 
Reported: 2010-12-02 18:24 UTC by MZMcBride
Modified: 2014-09-23 23:09 UTC (History)
13 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
Magical Extension (2.11 KB, text/plain)
2011-05-25 04:33 UTC, William Demchick
Details
Magical Extension (rv2) (2.32 KB, text/plain)
2011-05-25 20:46 UTC, William Demchick
Details
Magical Extension (rv3) (2.69 KB, text/plain)
2011-05-25 23:57 UTC, William Demchick
Details
Magical Extension (rv4) (3.09 KB, text/plain)
2011-05-26 22:57 UTC, William Demchick
Details
Disable magic linking in core (temporary) (497 bytes, patch)
2011-05-27 02:56 UTC, William Demchick
Details
Magical Extension (rv5) with 30360 (4.37 KB, patch)
2011-08-14 14:41 UTC, User:Docu
Details

Description MZMcBride 2010-12-02 18:24:47 UTC
The ISBN/RFC/whatever magic linking voodoo really ought not be in core. It's a goofy feature that dates way, way back to r40.

I suggest splitting the magic linking voodoo out into an extension and adding a genericized functionality that could resolve bugs like bug 224.
Comment 1 Bryan Tong Minh 2010-12-02 18:43:44 UTC
I don't think the magic linking in core is necessarily evil. However, if it currently isn't, then it should be made configurable and disableble.
Comment 2 Phil Boswell 2010-12-04 18:17:57 UTC
Either you would have to manage to keep the syntax identical or you will have to come up with some transition plan for updating what must by now be millions of instances of the old syntax to whatever new syntax you choose.
Comment 3 Niklas Laxström 2010-12-04 18:30:49 UTC
(In reply to comment #2)
Nobody has proposed to change the syntax. Only that the current usage might be moved into an extension, where also new things similar to this could be added (although I would recommended against it).
Comment 4 MZMcBride 2010-12-04 19:04:13 UTC
(In reply to comment #3)
> (In reply to comment #2)
> Nobody has proposed to change the syntax. Only that the current usage might be
> moved into an extension, where also new things similar to this could be added
> (although I would recommended against it).

Actually, Aryeh has suggested eliminating the syntax in favor of templates (e.g. {{ISBN|XXXXX}}). A few other bugs in this tracker have suggested using ParserFunction-type syntax (e.g. {{#ISBN:XXXX}}). An extension has the advantage of being able to be installed across all wikis easily, rather than relying on users to create and maintain templates (there are still no global templates). An extension also has an inherent disable/enable feature (in reply to comment #1). If you were creating this feature today, you'd probably do it in JavaScript (or at least I would).

Whether Wikimedia or any individual wiki would install this extension to replace the functionality is not as important as the current system degrades gracefully. There are a lot of options to move forward here, but I think the first fundamental question to answer is whether this functionality is appropriate for core.

I'm not sure what you're recommending against: adding more cases of magic link syntax or moving this functionality into an extension.
Comment 5 Niklas Laxström 2010-12-04 19:06:29 UTC
The first, adding more magic link syntax. It's also open question if we even can move this to an extension without causing too big disruption.
Comment 6 Platonides 2010-12-04 19:12:43 UTC
We could provide a way to hook into that piece of the parser and add the current ones to something called CoreMagicLinking, just as was done with parserfunctions.

Such thing is purposefully not hookable for being evil.
Comment 7 William Demchick 2011-05-25 04:33:21 UTC
Created attachment 8582 [details]
Magical Extension

It does seem to me that having magic linking in the core is inappropriate for a number of reasons, many noted above.  Attached is a simple extension that replaces the core functionality.  I have tested it with a modified MediaWiki that lacks the magic linking in the core.

A few notes about the extension:
    - it easily supports "legacy" magic linking (i.e. ISBN, RFC, PMID) via $wgMagicalClassic
    - it can be easily extended (example below)
    - the code is a little more general, and less hackey :)

An example of a custom linker (I have tested this in my LocalSettings.php):

function myCustomLinker( $matches ) {
        $num = ltrim($matches[1], '0');
        $url = 'http://bugzilla.wikimedia.org/show_bug.cgi?id=' . $num;
        return '<a href=\'' . $url . '\'>' . $matches[0] . '</a>';
}

$wgMagicalLinkers[] = array(
        'linker' => 'myCustomLinker',
        'pattern' => '@bug ([0-9]+)@'
);

The name of the extension is "Magical".
Comment 8 Platonides 2011-05-25 15:31:33 UTC
Seems to do the job. Although I'm not convinced about the efficiency of using preg_match_all.

Thanks for your contribution.

Some notes:

Do not use  == null to test if the entry is present, that will produce a notice. Use isset() instead.

The name of the main function (Magical::pbt) should be much clear.

ISBN is not normalised.

You are not providing the css classes.

The callbacks are safe thanks to the limited scope of the regex. Maybe add some redundant htmlspecialchars?


If you want to consider a design change, I would have used per-parser state added in ParserFirstCallInit, like parser functions and tag hooks, probably making a single combined regex there (so you don't need to scan the text N times).
Comment 9 William Demchick 2011-05-25 20:46:09 UTC
Created attachment 8584 [details]
Magical Extension (rv2)

Thanks for the feedback, Platonides.

I fixed many of the things you mentioned.  (I wasn't quite sure what you meant by "redundant htmlspecialchars".)

Regarding performance: I did a quick and dirty test, and my version seems to go approximately two times as fast as the core version.  While I am pleased with this result, I have no idea why it is the case...

If you get a chance, it would be great if some could double check the performance.
Comment 10 William Demchick 2011-05-25 22:16:48 UTC
I did some additional testing, and my version is significantly worse for certain types of input.  I'll do some more hacking...
Comment 11 William Demchick 2011-05-25 23:57:46 UTC
Created attachment 8585 [details]
Magical Extension (rv3)

This should fix performance problems considerably.  It is now comparable performance wise with core magic linking.
Comment 12 Platonides 2011-05-26 21:48:07 UTC
You are not doing it in the structured way of placing extensions into files (no problem, it's easy to add later), but things like
 if( ! isset( $wgMagicalLinkers ) )
are since they produce a vulnerability if the host is configured with register globals.

Also, you should check that MEDIAWIKI is defined before executing anything.

It makes no sense having this line:
$wgMagicalLinkers = array_reverse( $wgMagicalLinkers );

Magical::createPattern() should be lazy-loaded, or done by the function if there's a function for registering magic words.

Using Xml::escapeTagsOnly is worse than doing no escaping at all.

With no escaping, it looks like a html injection vector but when looking at the regex it is actually secure (my previous mention to redundant calls to htmlspecialchars) http://www.mediawiki.org/wiki/Security_for_developers#Demonstrable_security

Escaping with Xml::escapeTagsOnly it looks like it is secure, so that would encourage eg. changing the regex for a more insecure one, as it is escaped, but the delimiter you use is ' but escape for ".

A comment on the $unsafePattern would be appropiate. It's quite clever.

Note that you no longer need to use the x modifier. Also, I would use / instead of @ as it is more common. The fact that @ has to be escaped on those regex may bite some people.
Comment 13 William Demchick 2011-05-26 22:57:54 UTC
Created attachment 8588 [details]
Magical Extension (rv4)

I think this fixes most of your concerns.  Many thanks for your extensive feedback. :)
Comment 14 William Demchick 2011-05-27 02:56:57 UTC
Created attachment 8589 [details]
Disable magic linking in core (temporary)

In case anyone is interested in testing this extension out, here's a quick patch to disable magic linking in Parser.php.
Comment 15 p858snake 2011-06-18 10:41:05 UTC
Removing 1378, 2340, 2391, 3663, 4595, 6535, 8160, 8758, 10866, 10867, 12900, 13195, 13873, 18813, 18814, 19439, 20441, 26644, 28950, 29025 as depends on... they have nothing to do with moving magic linker to a extension.
Comment 16 p858snake 2011-06-18 10:43:20 UTC
-tracking, there is nothing tracking about this bug. +patch.
Comment 17 p858snake 2011-06-18 11:09:54 UTC
@William Demchick: do you currently have commit access? it might be handy (and get a few more eyes on it) if you work on it directly in svn.
Comment 18 User:Docu 2011-08-14 14:41:39 UTC
Created attachment 8918 [details]
Magical Extension (rv5) with 30360

Sample patch based on rv4 to include [[Bugzilla:30360]] (please double check).
Comment 19 Brion Vibber 2011-08-15 17:39:02 UTC
*** Bug 30360 has been marked as a duplicate of this bug. ***
Comment 20 Sumana Harihareswara 2011-11-16 01:26:39 UTC
Docu, could you please attach your patch as a plaintext diff per https://www.mediawiki.org/wiki/Patch#Posting_a_patch ?  Thank you.
Comment 21 Roan Kattouw 2011-11-16 10:50:21 UTC
(In reply to comment #20)
> Docu, could you please attach your patch as a plaintext diff per
> https://www.mediawiki.org/wiki/Patch#Posting_a_patch ?  Thank you.
Since it's a new file / new extension that isn't really necessary.
Comment 23 au 2012-06-17 18:41:08 UTC
Hi Docu, thank you for the patch!

As you may already know, MediaWiki is currently revamping its PHP-based parser
into a "Parsoid" prototype component, to support the rich-text Visual Editor
project:

   https://www.mediawiki.org/wiki/Parsoid
   https://www.mediawiki.org/wiki/Visual_editor

Folks interested in enhancing the parser's capabilities are very much welcome
to join the Parsoid project, and contribute patches as Git branches:

   https://www.mediawiki.org/wiki/Git/Tutorial#How_to_submit_a_patch

Compared to .diff attachments in Bugzilla tickets, Git branches are much easier
for us to review, refine and merge features together.

Each change set has a distinct URL generated by the "git review" tool, which
can be referenced in Bugzilla by pasting its gerrit.wikimedia.org URL as a
comment.

If you run into any issues with the patch process, please feel free to ask on
irc.freenode.net #wikimedia-dev and the wikitext-l mailing list. Thank you!
Comment 24 MZMcBride 2013-03-11 05:38:08 UTC
(In reply to comment #1)
> I don't think the magic linking in core is necessarily evil. However, if it
> currently isn't, then it should be made configurable and disableble.

This got split out to bug 45942 ("Magic links" RFC, PMID and ISBN should be gated by configuration).

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links