Last modified: 2013-04-08 17:32:15 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T36939, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 34939 - url parsing does not recognize mixed case protocols
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: Parser
Version: 1.19
Hardware: All All
Importance: Low normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: easy, need-parsertest
Depends on: 34956
Blocks:
Reported: 2012-03-03 16:49 UTC by Liangent
Modified: 2013-04-08 17:32 UTC (History)
CC List: 9 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---



Description Liangent 2012-03-03 16:49:19 UTC
But it works if I type it in the address bar.
Comment 1 Antoine "hashar" Musso (WMF) 2012-03-04 13:41:24 UTC
Original summary:

[HttP://en.WikiPedia.org/Wiki/page_name] is not recognized as a valid external link
Comment 2 Antoine "hashar" Musso (WMF) 2012-03-04 13:42:37 UTC
We have wfUrlProtocols() / mUrlProtocols to list all possible protocols. Wherever we use them, maybe we should make the regex case-insensitive with the i modifier.
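
A minimal sketch of the case-insensitive matching suggested here, written in Python rather than MediaWiki's PHP. The protocol list and the names `PROTOCOLS`, `PROTOCOL_RE`, and `starts_with_protocol` are assumptions for illustration only; the real list comes from wfUrlProtocols().

```python
import re

# Hypothetical protocol list for illustration; not MediaWiki's actual list.
PROTOCOLS = ["http://", "https://", "ftp://", "mailto:"]

# Join the escaped protocols into one alternation and compile it with
# re.IGNORECASE, the Python equivalent of the regex "i" modifier.
PROTOCOL_RE = re.compile(
    "(?:" + "|".join(re.escape(p) for p in PROTOCOLS) + ")",
    re.IGNORECASE,
)


def starts_with_protocol(url: str) -> bool:
    """Return True if url begins with a listed protocol, in any letter case."""
    return PROTOCOL_RE.match(url) is not None
```

With the flag set, `starts_with_protocol("HttP://en.WikiPedia.org/Wiki/page_name")` matches the same way as the all-lowercase form.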
Comment 3 Antoine "hashar" Musso (WMF) 2012-03-04 13:43:36 UTC
Please note we probably want to remove all wfUrlProtocols() calls from the parser except the one in the constructor. See bug 34956.
Comment 4 Fran Rogers 2012-07-10 20:15:53 UTC
I've submitted a patch that should make all URL scheme handling case-insensitive.

https://gerrit.wikimedia.org/r/#/c/15224/
Comment 5 Mark Holmquist 2012-09-04 14:30:45 UTC
Patch merged, we should be good now! Thanks all.
Comment 6 Tyler Romeo 2012-09-04 15:54:05 UTC
It should be noted that this patch violates RFC 3986, which states that applications should accept mixed case schemes as a matter of robustness, but only output lower-case schemes in the actual output.
Comment 7 Mark Holmquist 2012-09-04 15:58:36 UTC
Tyler, maybe we should open a new bug (or bug and tracking bug) for RFC 3986 compliance? I don't pretend to know how much effort it would take, but it seems like a separate issue.
Comment 8 MZMcBride 2012-09-05 00:41:45 UTC
(In reply to comment #7)
> Tyler, maybe we should open a new bug (or bug and tracking bug) for RFC 3986
> compliance? I don't pretend to know how much effort it would take, but it seems
> like a separate issue.

Yep, sounds good to me. Thanks for your work on this bug!

(In reply to comment #6)
> It should be noted that this patch violates RFC 3986, which states that
> applications should accept mixed case schemes as a matter of robustness, but
> only output lower-case schemes in the actual output.

Kind of trippy that the RFC would specify that output should be case sensitive while also calling for every input to be case insensitive. That is, no browser is going to ever get confused by href="Http://...". Anyway, if it's a real concern, a separate bug (or separate bugs) should be filed.
Comment 9 Liangent 2012-09-05 01:55:44 UTC
(In reply to comment #6)
> It should be noted that this patch violates RFC 3986, which states that
> applications should accept mixed case schemes as a matter of robustness, but
> only output lower-case schemes in the actual output.

Does this mean MediaWiki should show the following links in the same way?

* HttP://www.google.com/
* http://www.google.com/
* [HttP://www.google.com/ http://www.google.com/]
* [http://www.google.com/ http://www.google.com/]
Comment 10 Tyler Romeo 2012-09-05 22:20:22 UTC
(In reply to comment #9)
> (In reply to comment #6)
> > It should be noted that this patch violates RFC 3986, which states that
> > applications should accept mixed case schemes as a matter of robustness, but
> > only output lower-case schemes in the actual output.
> 
> Does this mean MediaWiki should show the following links in the same way?
> 
> * HttP://www.google.com/
> * http://www.google.com/
> * [HttP://www.google.com/ http://www.google.com/]
> * [http://www.google.com/ http://www.google.com/]

Yes, HTTP, HttP, HTtp, etc. should all be rendered in the canonical form of "http" in the actual links. I'll open a separate bug for this when I have time. Should be an easy fix.
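
A rough sketch of that canonicalization, in Python rather than MediaWiki's PHP: accept a scheme in any case on input, and emit it in lowercase. The names `SCHEME_RE` and `lowercase_scheme` are hypothetical, and the pattern follows the scheme grammar in RFC 3986 section 3.1; this is not the code from any MediaWiki patch.

```python
import re

# RFC 3986 scheme: a letter followed by letters, digits, "+", "-" or ".",
# terminated by ":". Only the scheme itself is captured.
SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.\-]*):")


def lowercase_scheme(url: str) -> str:
    """Lowercase only the scheme; leave the rest of the URL untouched."""
    m = SCHEME_RE.match(url)
    if m is None:
        return url  # no scheme found, return the input unchanged
    return m.group(1).lower() + url[m.end(1):]
```

Under this sketch, `HttP://www.google.com/` and `http://www.google.com/` both produce the same link target, while the host and path keep their original case.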
Comment 11 Dan Jacobson 2012-09-28 18:07:10 UTC
Please fix the spelling error in RELEASE-NOTES-1.20:
* (bug 34939) made link parsking insensitive ([HttP://])
                        ^^^^^^^^
Comment 12 Sam Reed (reedy) 2012-09-29 00:10:25 UTC
(In reply to comment #11)
> Please fix the spelling error in RELEASE-NOTES-1.20:
> * (bug 34939) made link parsking insensitive ([HttP://])
>                         ^^^^^^^^

Already fixed on 12 September:

https://gerrit.wikimedia.org/r/#/c/23496/
Comment 13 Dan Jacobson 2012-09-29 00:22:09 UTC
All I know is that it is not in RELEASE-NOTES-1.20.
Maybe you mean it will eventually show up there.
Comment 14 Krinkle 2012-09-29 01:18:57 UTC
It is in there, make sure your clone is up to date (or use the online viewer):

https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/core.git;a=blob;f=RELEASE-NOTES-1.20;hb=HEAD
Comment 15 Dan Jacobson 2012-09-29 01:38:18 UTC
You are right.

The problem is due to how I check what got changed.


        git fetch origin
        git diff master..origin/master \
            RELEASE-NOTES-* includes/installer/LocalSettingsGenerator.php|wdiff -d -3 > /tmp/mediawikiDiff$$||:
        ${PAGER-less} /tmp/mediawikiDiff$$


It somehow shows me old stuff.

But I don't know how to improve it.
Comment 16 Antoine "hashar" Musso (WMF) 2013-04-08 17:32:15 UTC
This also solved the duplicate, bug 27913.
