Last modified: 2007-06-10 02:06:01 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T3887, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 1887 - <math>, <$ext> etc. and ISBN in thumbnails/frames is broken.
<math>, <$ext> etc. and ISBN in thumbnails/frames is broken.
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Parser (Other open bugs)
unspecified
All All
: High major with 2 votes (vote)
: ---
Assigned To: Nobody - You can work on this!
: parser, patch-reviewed
: 834 1442 2440 (view as bug list)
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2005-04-13 17:10 UTC by Ævar Arnfjörð Bjarmason
Modified: 2007-06-10 02:06 UTC (History)
5 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
Patch for math attached. ;) (700 bytes, patch)
2005-04-22 00:49 UTC, David Majnemer
Details
Pwned ISBN (1.44 KB, patch)
2005-04-22 01:32 UTC, David Majnemer
Details
Fixed my math patch (622 bytes, patch)
2005-04-22 01:47 UTC, David Majnemer
Details
parserTest (2.69 KB, patch)
2005-04-22 02:39 UTC, David Majnemer
Details
ISBN AGAIN (1.49 KB, patch)
2005-04-23 02:37 UTC, David Majnemer
Details
Rewrote the whole function (2.48 KB, patch)
2005-04-26 03:00 UTC, David Majnemer
Details
Fixed uglyness (1.79 KB, patch)
2005-04-26 19:23 UTC, David Majnemer
Details
fixed my ugly test patch (1.84 KB, patch)
2005-04-26 19:41 UTC, David Majnemer
Details
RAWR! (1.78 KB, patch)
2005-04-26 20:21 UTC, David Majnemer
Details

Description Ævar Arnfjörð Bjarmason 2005-04-13 17:10:55 UTC
The problem is that Parser::magicISBN() is called after Linker::makeThumbLinkObj
has been run on the text, which means that the magicISBN function will expand on
any ISBN number it recognizes, including those inside XHTML tags like alt="" or
title="", the same thing happens with <math>.


For example, "[[Image:Badimg.jpeg|thumb|<math>2+2</math>]]" is turned into:

<div class="thumb tright"><div style="width:202px;"><a
href="/mw/HEAD/wiki/Image:Badimg.jpeg" class="internal" title="<span
class="texhtml>2 + 2</span>"><img
src="/mw/HEAD/images/thumb/9/98/200px-Badimg.jpeg" alt="<span class="texhtml>2 +
2</span>" width="200" height="300" longdesc="/mw/HEAD/wiki/Image:Badimg.jpeg"
/></a>  <div class="thumbcaption" ><div class="magnify" style="float:right"><a
href="/mw/HEAD/wiki/Image:Badimg.jpeg" class="internal" title="Enlarge"><img
src="/mw/HEAD/skins/common/images/magnify-clip.png" width="15" height="11"
alt="Enlarge" /></a></div><span class="texhtml">2 + 2</span></div></div></div>
Comment 1 Ævar Arnfjörð Bjarmason 2005-04-13 17:11:11 UTC
*** Bug 1442 has been marked as a duplicate of this bug. ***
Comment 2 Ævar Arnfjörð Bjarmason 2005-04-13 17:11:19 UTC
*** Bug 834 has been marked as a duplicate of this bug. ***
Comment 3 David Majnemer 2005-04-21 01:47:02 UTC
What should [[Image:Badimg.jpeg|thumb|<math>2+2</math>]] properly turn into?
That should make it easier to fix.
Comment 4 David Majnemer 2005-04-21 02:20:15 UTC
Checked my own copy, I don't see this bug.
Comment 5 Ævar Arnfjörð Bjarmason 2005-04-21 04:20:40 UTC
David Majnemer: The same as it does now, except things in title="" should always
be stripped of [<>"]
Comment 6 Rowan Collins [IMSoP] 2005-04-21 14:22:47 UTC
Copying my comment from bug 1442, since you've marked the duplicates this way round:

> > The problem is that Parser::magicISBN() is called after Linker::makeThumbLinkObj
> > has been run on the text, which means that the magicISBN function will expand on
> > any ISBN number it recognizes, including those inside XHTML tags like alt="" or
> > title="".
> 
> In that case, it should be possible to do things the other way round, since the
> image code should strip the HTML tags back out of the alt/title attributes. Kind
> of ugly, but I think it might work.
Comment 7 David Majnemer 2005-04-21 20:32:11 UTC
Ok, now that I know what should be done, I can fix it.
Comment 8 David Majnemer 2005-04-22 00:17:36 UTC
The only way to get this stupid bug gone is we set it up so it omited the value.
That.. can be solved by restructuring some code (the proper way to do it) or the
hacky way of adding an omit to the unstip function. I will try both and give
patches so y'all got a choice ;)
Comment 9 David Majnemer 2005-04-22 00:33:17 UTC
Fixed Math :)
Comment 10 David Majnemer 2005-04-22 00:49:16 UTC
Created attachment 442 [details]
Patch for math attached. ;)

Patch for math attached. ;)
This patch is ONLY for math, I have not done ISBN yet, I will do so when
somebody tells me what is the correct behaviour for ISBN. :)
Comment 11 ABCD 2005-04-22 00:56:43 UTC
For ISBN:
input:
[[Image:Example.jpg|thumb|ISBN 0123456789]]

current output:

<div class="thumb tright">
<div style="width: 182px;"><a href="/wiki/Image:Example.jpg" title="&lt;a href="
class="internal">ISBN
0123456789</a>"&gt;<img
src="http://upload.wikimedia.org/wikipedia/en/thumb/f/fd/180px-Example.jpg"
alt="&lt;a href=" class="internal">ISBN 0123456789" width="180" height="171"
longdesc="/wiki/Image:Example.jpg" /&gt;

<div class="thumbcaption">
<div class="magnify" style="float: right;"><a href="/wiki/Image:Example.jpg"
class="internal" title="Enlarge"><img
src="/skins/common/images/magnify-clip.png" alt="Enlarge" height="11"
width="15"></a></div>
<a href="/w/index.php?title=Special:Booksources&amp;isbn=0123456789"
class="internal">ISBN 0123456789</a></div>
</div>
</div>

correct output:

<div class="thumb tright">
<div style="width: 182px;"><a href="/wiki/Image:Example.jpg" class="internal"
title="ISBN 0123456789"><img
src="http://upload.wikimedia.org/wikipedia/en/thumb/f/fd/180px-Example.jpg"
alt="ISBN 0123456789" longdesc="/wiki/Image:Example.jpg" height="171"
width="180"></a>
<div class="thumbcaption">
<div class="magnify" style="float: right;"><a href="/wiki/Image:Example.jpg"
class="internal" title="Enlarge"><img
src="/skins/common/images/magnify-clip.png" alt="Enlarge" height="11"
width="15"></a></div>
<a href="/w/index.php?title=Special:Booksources&amp;isbn=0123456789"
class="internal">ISBN 0123456789</a></div>
</div>
</div>
Comment 12 David Majnemer 2005-04-22 00:59:44 UTC
I am on it!
Comment 13 Rowan Collins [IMSoP] 2005-04-22 01:17:30 UTC
(In reply to comment #10)
> This patch is ONLY for math, I have not done ISBN yet, I will do so when
> somebody tells me what is the correct behaviour for ISBN. :)

Unfortunately, although only small, there are 2 errors in this patch: first,
note that there is both $label *and* $alt; we don't mind <math></math> in the
caption (i.e. $label), only in the attributes, so we only want to extract it in
$alt; secondly (and I don't blame you for missing this one), there is duplicated
code between makeImageLinkObj() and makeThumblinkObj() - really, code like this
should be in both, or rather, should be in a part of code that's not spread
across both. [I started rewriting that whole slab of code, but never quite finished]


The "correct behaviour for ISBN" is surely just nothing - that is, within the
alt/title attributes, the text "ISBN 0123456789X" should stay exactly as it is;
but obviously there's no real way of doing that, because it will be picked up by
the magicISBN() function later whether we like it or not. I guess you could use
a placeholder of some sort, but these have problems of their own.

In fact, at one stage the whole image was being stripped out, presumably causing
certain things not to happen even in the *caption* part, and causing bug 1217
and bug 1219

The other alternative, as I say, is just to parse such things before the
internalLink code has run, so that they will be HTML, and stripped from the
alt/title attributes anyway. I've stayed up too late, so my mind's whirling too
much to work out if that would cause any problems of its own or not...
Comment 14 David Majnemer 2005-04-22 01:19:07 UTC
Rowan, thanks for the tips. However, I will fix the ISBN function :)
Comment 15 Rowan Collins [IMSoP] 2005-04-22 01:21:50 UTC
(In reply to comment #14)
> Rowan, thanks for the tips. However, I will fix the ISBN function :)

I'll be interested to see what approach you take.
Comment 16 David Majnemer 2005-04-22 01:32:31 UTC
Created attachment 443 [details]
Pwned ISBN

FIXED!
ISBN is fixed with two simple regexs. :)
Comment 17 David Majnemer 2005-04-22 01:47:34 UTC
Created attachment 444 [details]
Fixed my math patch

Fixed my math patch, to the best of my knowledge, the effects of this bug are
gone.
Comment 18 David Majnemer 2005-04-22 02:39:35 UTC
Created attachment 445 [details]
parserTest

This patches the parser test file. Have fun.
Comment 19 Rowan Collins [IMSoP] 2005-04-22 12:56:48 UTC
(In reply to comment #16)
> Created an attachment (id=443) [edit]
> Pwned ISBN
> 
> FIXED!
> ISBN is fixed with two simple regexs. :)

I might be wrong, but won't this only fix captions whose entire content is an
ISBN string? i.e. it looks to be matching things like <<alt="ISBN
0123456789X">>, but not things like <<alt="This is a book which has the ISBN
0123456789X; which is nice">> Unfortunately, my 1.5 test install is in a state
of development where it can't actually display images, so I've only got the code
to go from. It also seems to me that this could potentially trigger false
positives in the actual text, but I guess it's not a piece of text that people
are very likely to use (extending the regex to cover foo="phrase with an ISBN
etc in it" *might* run the risk of picking up too much, though, I'm not sure)
Comment 20 David Majnemer 2005-04-23 02:37:10 UTC
Created attachment 449 [details]
ISBN AGAIN

Fixed ISBN again. It is totally lazy and works great. It is designed to be
smart enough to to mistake user input and an actual ISBN. Still two regexs.
This should fix the problem with other words.
Comment 21 David Majnemer 2005-04-26 03:00:23 UTC
Created attachment 456 [details]
Rewrote the whole function

I rewrote the whole function, I thought the way it did things was silly and
over complicated so I wrote a new function.
Comment 22 Ævar Arnfjörð Bjarmason 2005-04-26 17:19:28 UTC
Regarding the "parserTest" attachment:

* You made some minor formatting change to "Interwiki link encoding conversion
(bug 1636)"
* You damaged the "Inter-language links" test by changing "zh:食品" to "zh:??"
(two question marks), please make sure your editor doesn't make changes like
that to characters it doesn't recognize.

Regarding the "Rewrote the whole function" attachment:

* Please don't make minor whitespace changes, it makes the diff hard to read.

Comment 23 David Majnemer 2005-04-26 19:23:45 UTC
Created attachment 460 [details]
Fixed uglyness

It is nolonger ugly as hell.
Comment 24 David Majnemer 2005-04-26 19:41:24 UTC
Created attachment 461 [details]
fixed my ugly test patch

Fixed my horrible patch, this one is nicer.
Comment 25 David Majnemer 2005-04-26 20:21:06 UTC
Created attachment 462 [details]
RAWR!

Fixed the stupid thing again! Damn you world!
Comment 26 Ævar Arnfjörð Bjarmason 2005-04-26 21:04:18 UTC
It doesn't work for more than one ISBN link per line, the following fails:

"""
[[Image:Foo.png|thumb|ISBN 213 ISBN 213 ISBN 213 ... ]]
ISBN 123-213 ISBN 123-213  ISBN 123-213 ...
"""
Comment 27 Ævar Arnfjörð Bjarmason 2005-06-17 22:48:59 UTC
*** Bug 2440 has been marked as a duplicate of this bug. ***
Comment 28 Christian Thiele 2005-06-18 02:21:39 UTC
The problem occurs also when using external links starting with "mailto:" for mail 
addresses.
Comment 29 Ævar Arnfjörð Bjarmason 2005-06-21 13:48:37 UTC
(In reply to comment #24)
> Created an attachment (id=461) [edit]
> fixed my ugly test patch

Applied to HEAD.
Comment 30 Ævar Arnfjörð Bjarmason 2005-06-21 13:50:46 UTC
Comment on attachment 462 [details]
RAWR!

Marking attachment 462 [details] as obsolete since it won't work for the reasons I stated
in comment #26.
Comment 31 Ævar Arnfjörð Bjarmason 2006-01-03 06:38:51 UTC
This problem also affects parser hooks, like <ref> from the Cite.php extension.
Comment 32 Gabriel Wicke 2006-03-19 14:27:31 UTC
Currently looking into this- i've got a small regexp that fixes the ISBN and
RFC/PMID case, but it doesn't help the math/plugin one. Another case of missing
nesting state, and probably another case of adding a brittle hack.
Comment 33 Gabriel Wicke 2006-03-19 16:25:46 UTC
I've fixed this for stripped things like math and extensions now, unstripping
the $alt text in the parser before constructing the thumb does the trick. All
html tags are removed right after the unstrip, leaving only the text.

This is not committed yet, i'm collecting a bunch of parser fixes for review
sometime next week.
Comment 34 Gabriel Wicke 2006-03-24 17:39:11 UTC
In CVS now, closing as fixed.

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links