Last modified: 2005-07-23 04:38:19 UTC
Ext. links with forms such as "http:/foo" and "ftp:/foo" (rather than "http://foo" and "ftp://foo") are currently treated as valid by Parser.php, but then not styled by the CSS (i.e. they show up light blue in monobook, but receive no icon). The relevant RFCs appear to define the double-slash as mandatory (e.g. ftp://ftp.rfc-editor.org/in-notes/rfc1738.txt section 3.1), so the parser should probably simply pass over these as invalid syntax. Of course, things like "mailto:" will *never* have a double-slash, so it needs to be a scheme-specific check. Perhaps it should be treated as part of the scheme, so the list (configurable, per bug 431) would be something like $wgValidURISchemes=array('http://', 'ftp://', 'mailto:') etc.
The parser handle the "http:/foo" things correctly and even build a correct link. This way reader will at least be able to follow the link. The fact that the icon is missing is good way to alert the reader that something is wrong an need some minor tweaking. Let it as this.
I'm sorry, but I disagree. I'm happy to have this marked as a low-severity, low-priority bug, but a bug it is. The current behaviour does *not* handle such links "correctly", by any definition of "correctness" we choose. * If we want to follow standards, "http:/example.com" should *not* be treated as a valid URL, and therefore should *not* generate a clickable link; if there's no link at all, somebody will soon spot that something needs fixing. * If we want to ignore the standards, and go for an easy life, we can continue to create clickable links for such invalid URLs, and hope that browsers generally treat them sanely. In that case, however, we need to fix the CSS to use the same rule as the parser, so we don't get these anomalously un-labelled links. Saying that this gives a clue that something needs fixing is, frankly, rather silly; either the links are valid, and should be displayed properly, or they are invalid, and should not be rendered at all. It's not like it would even be a complex fix, especially if someone were playing with that code anyway, for bug 431 (like I say, just make the protocols array contain "http://", "mailto:", etc.).
So I will prefer that the parser stop rendering URL with only one / . That will be even better to trigger the error ;o)
(In reply to comment #3) > So I will prefer that the parser stop rendering URL with only one / . > That will be even better to trigger the error ;o) I'm not sure if you have misunderstood me, or if I am now misunderstanding you, but just to clarify: not rendering URLs with only one '/' would in fact be correct behaviour (in accordance with the definitions of valid URLs in the relevant RFCs). It would also make the error easier to spot than having the PHP treat it as OK, but the CSS treat it as 'something unknown', which is perhaps what you meant. As I say, either way, our code should treat such links *consistently*.
I added a parsertest for this in REL1_4 and HEAD.
parser engine got a trouble cause http:foobar is atm considered valid probably to handle the mailto:foo@b.ar case. Need to rewrite some pieces so that the protocols are defined with the // Eg: define( 'URL_PROTOCOLS', 'http|https|ftp|irc|gopher|news|mailto' ); define( 'HTTP_PROTOCOLS', 'http|https' ); Should become something like: define( 'URL_PROTOCOLS', 'http:\/\/|https:\/\/|ftp:\/\/|irc:\/\/|gopher:\/\/|news:\/\/|mailto:' ); define( 'HTTP_PROTOCOLS', 'http:\/\/|https:\/\/' );
> define( 'URL_PROTOCOLS', > 'http:\/\/|https:\/\/|ftp:\/\/|irc:\/\/|gopher:\/\/|news:\/\/|mailto:' ); Or, as suggested in bug 431, make the whole thing configurable, as $wgUrlProtocols or some such. BTW, do you really need to escape all those slashes? What would be the side-effect of having the following, much more pleasant definition? $wgUrlProtocols='http://|https://|ftp://|irc://|gopher://|news:|mailto:'; [oh, and note that 'news:' doesn't have a '//']
(In reply to comment #7) > BTW, do you really need to escape all those > slashes? Sorry, I was being dim there - '/' would mark the end of the regex; however, PHP uses the same regex system as Perl, so you can actually choose your own regex delimiters. Thus my regex fragment would work, as long as any regex built from used, say, "%regex%" instead of "/regex/".
> Sorry, I was being dim there - '/' would mark the end of the regex; That's why I keep escaping slashes. We could use another delimiter though. http://twenkill.dyndns.org/wiki/787 http://test.leuksman.com/index.php/787 I fixed the bug. Malformed URLs are no more converted.
Fixed the fix that stopped autonumbering :o)
There is some broken links floating around which need to be fixed now.