Last modified: 2011-08-04 22:21:36 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T19119, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 17119 - class Parser: senseless use of non-existing regexp back reference
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: Parser
Version: unspecified
Hardware: All All
Importance: Low minor
Target Milestone: ---
Assigned To: Dan Collins
URL: http://svn.wikimedia.org/viewvc/media...
Keywords: patch, patch-need-review
Depends on:
Blocks:
Reported: 2009-01-22 21:35 UTC by seth
Modified: 2011-08-04 22:21 UTC (History)

Attachments
simple patch that applies the suggested change (612 bytes, patch)
2011-08-04 22:16 UTC, Dan Collins

Description seth 2009-01-22 21:35:28 UTC
There's a small error in line
  '/(.) (?=\\?|:|;|!|%|\\302\\273)/' => '\\1 \\2',
in file 
  http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/includes/parser/Parser.php?view=markup

$2 (or \2 respectively) is not defined here, because zero-width look-ahead assertions (?=...) are non-capturing patterns.
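
A minimal sketch of the point above, not taken from Parser.php itself: the sample string 'Quoi ?' and the variable names are made up for illustration, and the replacement uses the &nbsp; form quoted in the later comments. It shows that the look-ahead contributes no group 2, and that preg_replace() substitutes an empty string for the undefined reference, so the stray \\2 is harmless but pointless.

```php
<?php
// Pattern as quoted in the report. The look-ahead (?=...) consumes nothing
// and defines no capture group, so only group 1 exists.
$pattern = '/(.) (?=\\?|:|;|!|%|\\302\\273)/';
$subject = 'Quoi ?'; // hypothetical sample input

preg_match($pattern, $subject, $m);
var_dump(count($m));    // int(2): the full match plus group 1
var_dump(isset($m[2])); // bool(false): there is no group 2

// A reference to a group that does not exist is replaced by an empty string,
// so both replacements should give the same result.
$withRef    = preg_replace($pattern, '\\1&nbsp;\\2', $subject);
$withoutRef = preg_replace($pattern, '\\1&nbsp;', $subject);
var_dump($withRef === $withoutRef);
```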

I suggest one of the following solutions:

  '/(.) (\\?|:|;|!|%|\\302\\273)/' => '\\1 \\2',
  '/(.) (?=\\?|:|;|!|%|\\302\\273)/' => '\\1 ',
  '/(?<=.) (?=\\?|:|;|!|%|\\302\\273)/' => ' ',

I don't know which of those is the best. I guess the first one is the slowest.

Btw. for uniformity you could modify the lines
  '/(\\302\\253) /' => '\\1&nbsp;',
  '/&nbsp;(!\s*important)/' => ' \\1', #Beware of CSS magic word !important, bug #11874.
analogously.
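
One possible reading of "analogously", sketched under assumptions: the two look-around patterns below are not from the report or from Parser.php, they are a guess at the rewrite being suggested, and the sample strings are invented. The sketch only checks that each rewritten rule gives the same output as the original on those samples. The array destructuring in foreach assumes PHP 7.1 or later.

```php
<?php
// Each entry: [original rule, look-around rewrite (assumed), sample input].
$pairs = [
    'guillemet' => [
        ['/(\\302\\253) /',    '\\1&nbsp;'],
        ['/(?<=\\302\\253) /', '&nbsp;'],
        "\xC2\xAB mot \xC2\xBB",
    ],
    'important' => [
        ['/&nbsp;(!\s*important)/',   ' \\1'],
        ['/&nbsp;(?=!\s*important)/', ' '],
        'color: red&nbsp;! important',
    ],
];

$out = [];
foreach ($pairs as $name => [$original, $rewritten, $subject]) {
    $a = preg_replace($original[0], $original[1], $subject);
    $b = preg_replace($rewritten[0], $rewritten[1], $subject);
    $out[$name] = [$a, $b];
    printf("%s: %s\n", $name, $a === $b ? 'same output' : 'different output');
}
```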
Comment 1 seth 2009-01-22 22:13:25 UTC
1. Oh, &amp;nbsp; is parsed here in bugzilla..., sorry I did not know that.
2. In addition, you could gain a little speed by grouping the single-character alternatives into a character class. So the possible solutions would be

  '/(.) ([?:;!%]|\\302\\273)/' => '\\1&amp;nbsp;\\2',
or
  '/(.) (?=[?:;!%]|\\302\\273)/' => '\\1&amp;nbsp;',
or
  '/(?<=.) (?=[?:;!%]|\\302\\273)/' => '&amp;nbsp;',
Comment 2 seth 2009-01-22 22:17:02 UTC
Oops, sorry again, &nbsp; is not parsed, it was just a copy&paste error.
Third try:

A solution would be

  '/(.) ([?:;!%]|\\302\\273)/' => '\\1&nbsp;\\2',
or
  '/(.) (?=[?:;!%]|\\302\\273)/' => '\\1&nbsp;',
or
  '/(?<=.) (?=[?:;!%]|\\302\\273)/' => '&nbsp;',
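
A small comparison of the three variants above, as a sketch rather than a benchmark: the labels and sample strings are made up for illustration. On ordinary input all three should agree. The capturing variant consumes the punctuation character it matches, though, so with back-to-back punctuation such as 'x ! ?' it is expected to leave the second space untouched, while the two look-ahead variants replace both. The array destructuring in foreach assumes PHP 7.1 or later.

```php
<?php
// The three suggested rules, keyed by hypothetical labels.
$variants = [
    'capturing'   => ['/(.) ([?:;!%]|\\302\\273)/',      '\\1&nbsp;\\2'],
    'look-ahead'  => ['/(.) (?=[?:;!%]|\\302\\273)/',    '\\1&nbsp;'],
    'look-around' => ['/(?<=.) (?=[?:;!%]|\\302\\273)/', '&nbsp;'],
];

// Invented samples; "\xC2\xBB" is the UTF-8 encoding of the closing guillemet.
$samples = ['Quoi ?', "Il dit \xC2\xBB", 'x ! ?'];

$results = [];
foreach ($variants as $name => [$pattern, $replacement]) {
    foreach ($samples as $s) {
        $results[$name][] = preg_replace($pattern, $replacement, $s);
    }
}
print_r($results);
```

If the difference on the last sample matters, that would be an argument for one of the look-ahead forms independent of speed.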
Comment 3 Dan Collins 2011-08-04 22:16:47 UTC
Created attachment 8886 [details]
simple patch that applies the suggested change

The attached patch is very simple: it removes the redundant \\2. I checked the rest of Parser.php; there are no other instances of this bug. This changes no behaviour whatsoever. I tested it anyway, and all parsertests still pass.
Comment 4 Dan Collins 2011-08-04 22:21:36 UTC
Fixed in r93925.
