Last modified: 2008-03-25 05:22:18 UTC
Try writing “[[Image:Example.png|thumb|1000px|Some caption ending with px]]”.
You will receive a default-sized thumbnail without a caption, because the title ending with “px” is interpreted as a width parameter, even though it is invalid (the text in place of the expected number is interpreted as 0, overriding the previous valid 1000px parameter, and rendering at default width), and even though the width has already been specified.
The same is true for captions starting with “upright”, or (for djvu files only) with “page”, because those are the magic words containing “$1” placeholders. Note that while this might not be a big problem in English (who ends a title with “px” or starts a title with a lowercase letter?), it can be a bigger issue for some other languages with different magic words (it happened to me on w:cs: today when trying to write a normal image caption).
The problem is that MagicWord::matchVariableStartToEnd eats anything (.*? – see MagicWord::initRegex), no matter only a number is expected, and the parser does no checks on the extracted value, either. The simplest solution could be to test $value after the “isset( $paramMap[$magicName] )” condition against some basic constraints (at least a simple “&& !preg_match( '/[^0-9.]/', $value )” might do; what parameters are possible there, anyway?).
Created attachment 4730 [details]
Patch, simple addition
(In reply to comment #1)
> Patch, simple addition
Well, that is _almost_ exactly what I wrote, isn’t it? And your patch omits the decimal dot I wrote, which is used with upright (upright=1.2). (Anyway, I am not completely sure this is the best solution, I considered this a bit hacky.)
+ adding patch keyword
Created attachment 4732 [details]
Addition of a dot to the regex. Now works with more options
This will break dubious but currently-tolerated constructs like "1000 px" (unless I missed a trim() somewhere, which is probable). More pedantically, it will also break constructs like "upright=1.35E2" and "+750px". That's probably okay: this is going to have to be a heuristic and therefore messy, but it's worth pointing that out. Overall, though, it might be simpler to just use is_numeric instead of rolling your own regex. Did you test the patch?
Yes. Fully tested.
Committed a version using is_numeric in r32391.
The patch was incorrect, see my fix in r32394.