Last modified: 2009-03-18 22:56:58 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T19680, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 17680 - colon sometimes URL encoded "%3A" right there in the browser URL entry area
Status: RESOLVED INVALID
Product: MediaWiki
Classification: Unclassified
Component: Special pages
Version: 1.15.x
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
URL: http://zh.wikipedia.org/wiki/Special:...
Keywords:
Depends on:
Blocks:
 
Reported: 2009-02-26 17:07 UTC by Dan Jacobson
Modified: 2009-03-18 22:56 UTC (History)

Description Dan Jacobson 2009-02-26 17:07:17 UTC
Let's say I'm on Special:Allpages,
http://radioscanningtw.jidanni.org/index.php?title=特殊:所有页面
and in the drop-down menu I choose Talk:
http://radioscanningtw.jidanni.org/index.php?title=特殊%3A所有页面&from=保全&to=高雄縣警察局鳳山分局&namespace=1
Well, why must the colon suddenly become "%3A"?
Comment 1 Dan Jacobson 2009-02-28 20:24:56 UTC
Related to #17681, #17712.
Comment 2 Dan Jacobson 2009-02-28 20:25:41 UTC
Related to Bug #17681, Bug #17712.
Comment 3 Aryeh Gregor (not reading bugmail, please e-mail directly) 2009-03-01 00:31:59 UTC
This is because people are using urlencode() instead of wfUrlencode(), probably.  It should be easy enough to fix on a case-by-case basis.
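
To illustrate the distinction drawn here, a minimal Python sketch (not MediaWiki's PHP code): a strict encoder escapes every reserved character, the colon included, while a lenient one leaves the colon alone, which is the behaviour this comment attributes to wfUrlencode(). The names `strict_encode` and `lenient_encode` are hypothetical, and the exact set of characters that wfUrlencode() preserves is an assumption.

```python
from urllib.parse import quote


def strict_encode(text: str) -> str:
    """Percent-encode everything except unreserved characters."""
    return quote(text, safe="")


def lenient_encode(text: str) -> str:
    """Same, but keep ':' and '/' readable (assumed to approximate wfUrlencode)."""
    return quote(text, safe=":/")


title = "Special:Allpages"
print(strict_encode(title))   # the colon is escaped as %3A
print(lenient_encode(title))  # the colon is left as ':'
```

Both results decode to the same title; only the visible form of the URL differs.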
Comment 4 Dan Jacobson 2009-03-14 02:05:57 UTC
I'm sorry, but I cannot find even one spot in the code that does this.
The urlencode() calls I saw act on what comes after the colon.
I have replaced the URL with zh.wikipedia's to make it easier for you to test.
Comment 6 Dan Jacobson 2009-03-16 03:46:06 UTC
Please confirm this is a Firefox bug.
Visit http://transgender-taiwan.org/index.php?title=Special:Allpages
Now hit the Go [提交] button.

Why is there the ugly %3A instead of the colon in Firefox:
http://transgender-taiwan.org/index.php?title=特殊%3A所有頁面&from=English&to=首頁&namespace=0
http://transgender-taiwan.org/index.php?title=特殊:__所有頁面&from=English&to=首頁&namespace=0
vs. the latter (lined up by me) seen in emacs-w3m. What does IE show?

If it is a Firefox bug, somebody please report it, because I can't get a word in edgewise there.
Comment 7 Roan Kattouw 2009-03-17 15:33:31 UTC
(In reply to comment #6)
> Please confirm this is a Firefox bug.
> Visit http://transgender-taiwan.org/index.php?title=Special:Allpages
> Now hit the Go [提交] button.
> 
> Why is there the ugly %3A instead of the colon in Firefox:
> http://transgender-taiwan.org/index.php?title=特殊%3A所有頁面&from=English&to=首頁&namespace=0
> http://transgender-taiwan.org/index.php?title=特殊:__所有頁面&from=English&to=首頁&namespace=0
> vs. the latter (lined up by me) seen in emacs-w3m. What does IE show?
> 
> If it is a Firefox bug, somebody please report it, because I can't get a word
> in edgewise there.
> 

This happens in IE for me as well. Even worse, IE converts all the Chinese (?) characters to %xx escapes, while Firefox only encodes the : as %3A. This seems to be a MW bug; I'll see if I can track it down.
Comment 8 Roan Kattouw 2009-03-17 16:09:36 UTC
Doesn't seem to be a MW bug after all. The HTML for the quick search form is:
<form action="/t/index.php" id="searchform"><div>
				<input type='hidden' name="title" value="Special:Search"/>
...
So MW's not at fault here.
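
A rough sketch of how the served HTML can contain a literal colon while the address bar ends up showing %3A: when a form is submitted, the browser builds the query string itself from the field values, and form encoding escapes most punctuation. Python's `urllib.parse.urlencode` follows similar rules and is used here only as a stand-in for the browser; the `search` field and its value are invented for the example.

```python
from urllib.parse import urlencode

# "title" is the hidden field from the form quoted above;
# "search" is a made-up visible field for illustration.
fields = {"title": "Special:Search", "search": "test"}

# Serialize the fields the way a form submission would.
query = urlencode(fields)
print(query)
```

The colon in the hidden field's value is escaped during serialization, even though the HTML itself contains it unescaped.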
Comment 9 Dan Jacobson 2009-03-17 21:03:37 UTC
Testing http://zh.wikipedia.org/wiki/Special:Allpages :
Firefox, and emacs-w3m show
http://zh.wikipedia.org/wiki/Special:所有页面
but then hitting the [提交] button gives:
Firefox:  http://zh.wikipedia.org/w/index.php?title=Special%3A所有页面&from=&to=&namespace=0
emacs-w3m http://zh.wikipedia.org/w/index.php?title=Special:所有页面&from=&to=&namespace=0

The weird thing is one can do
$ firefox 'http://zh.wikipedia.org/w/index.php?title=Special:所有页面&from=&to=&namespace=0'
and the colon stays put.

(Midori acts like you mention IE does.)
Comment 10 Roan Kattouw 2009-03-17 21:57:06 UTC
I suspect this is related to the fact that the search form uses a POST request. Either way, it's a browser bug (a widespread one, it seems), not a MediaWiki bug, so any discussion about it shouldn't happen here.
Comment 11 Aryeh Gregor (not reading bugmail, please e-mail directly) 2009-03-17 21:59:54 UTC
It's not a *bug*, it just looks a little ugly.  The two URLs in question are equivalent, there's no standard that requires the browser to prefer one to the other for display.
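
The equivalence claimed here can be checked with a small sketch, assuming simplified ASCII versions of the URLs discussed in this thread (the `query_params` helper is hypothetical): once the query string is decoded, both spellings yield the same parameters.

```python
from urllib.parse import parse_qs, urlsplit

# Simplified stand-ins for the two URL forms compared in this bug.
encoded = "http://zh.wikipedia.org/w/index.php?title=Special%3AAllpages&namespace=0"
literal = "http://zh.wikipedia.org/w/index.php?title=Special:Allpages&namespace=0"


def query_params(url: str) -> dict:
    """Return the decoded query parameters of a URL."""
    return parse_qs(urlsplit(url).query)


print(query_params(encoded) == query_params(literal))
```

A server that decodes the query string before using it therefore sees the same title either way.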
Comment 12 John Mark Vandenberg 2009-03-17 22:23:09 UTC
":" is a reserved character and will be percent-encoded by any browser that
follows the specification.  See http://tools.ietf.org/html/rfc3986#section-2.2
and [[Percent-encoding#Types_of_URI_characters]]
Comment 13 Dan Jacobson 2009-03-17 22:37:58 UTC
OK, but what about e.g.,
http://en.wikipedia.org/w/index.php?title=User_talk:Jidanni&action=history
Here the colon is also in the query string, but do we ever see it end
up as %3A in any browser's URL bar, unless we type it in there
ourselves?
Comment 14 Aryeh Gregor (not reading bugmail, please e-mail directly) 2009-03-18 11:23:51 UTC
(In reply to comment #12)
> ":" is a reserved character and will be percent-encoded by any browser that
> follows the specification.  See http://tools.ietf.org/html/rfc3986#section-2.2
> and [[Percent-encoding#Types_of_URI_characters]]

If anything, the contrary is true: agents are not supposed to encode or decode colons, specifically because they're reserved:

   URIs that differ in the replacement of a reserved character with its
   corresponding percent-encoded octet are not equivalent.  Percent-
   encoding a reserved character, or decoding a percent-encoded octet
   that corresponds to a reserved character, will change how the URI is
   interpreted by most applications.

If : is indeed considered reserved here, then encoding it would change the meaning of the URL.  Compare:

   URI producing applications should percent-encode data octets that
   correspond to characters in the reserved set unless these characters
   are specifically allowed by the URI scheme to represent data in that
   component.

Firefox is not a "URI producing application" here -- it's just reproducing the URI provided by MediaWiki, and so this doesn't apply to it.  It might apply to MediaWiki (which is why I'm addressing this argument at all), but a) it says "should", so we can always ignore it if it seems to be safe and makes the URLs prettier ;), and b) there's the exception "unless these characters are specifically allowed by the URI scheme to represent data in that component."

If we consult RFC 2616, which defines the http: scheme, we find in sections 3.2.1 and 3.2.2[1] that its production for the path part of the URI is that of abs_path from RFC 2396.  If we look there, we find[2] that an abs_path can contain any pchar, with pchar being defined as

      pchar         = unreserved | escaped |
                      ":" | "@" | "&" | "=" | "+" | "$" | ","

Therefore I conclude that in "http:" URIs specifically, colons are not reserved in the path part, and it's perfectly legitimate for us to emit them unencoded, and for clients to encode and decode them freely (which is what Firefox seems to do).

[1] http://tools.ietf.org/html/rfc2616#section-3.2.1
[2] http://tools.ietf.org/html/rfc2396 (search for abs_path and follow the productions)
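
As a sketch of the grammar argument above, the pchar production can be written as a regular expression. The `unreserved`, `mark`, and `escaped` productions are transcribed from RFC 2396 rather than from this thread, and `is_pchar_string` is a hypothetical helper. It only covers ASCII input, so the Chinese titles discussed here would have to be percent-encoded first.

```python
import re

# RFC 2396 productions, transcribed by hand:
#   unreserved = alphanum | mark
#   mark       = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
#   escaped    = "%" hex hex
#   pchar      = unreserved | escaped |
#                ":" | "@" | "&" | "=" | "+" | "$" | ","
PCHAR = r"(?:[A-Za-z0-9\-_.!~*'()]|%[0-9A-Fa-f]{2}|[:@&=+$,])"
PCHARS_ONLY = re.compile(rf"{PCHAR}*")


def is_pchar_string(text: str) -> bool:
    """True if every character of text is allowed by the pchar production."""
    return PCHARS_ONLY.fullmatch(text) is not None


print(is_pchar_string("Special:Allpages"))    # literal colon
print(is_pchar_string("Special%3AAllpages"))  # escaped colon
print(is_pchar_string("a/b"))                 # "/" is not a pchar
```

Under this grammar both the literal and the escaped colon are valid inside a path segment, which is the point being made above.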

(In reply to comment #13)
> OK, but what about e.g.,
> http://en.wikipedia.org/w/index.php?title=User_talk:Jidanni&action=history
> Here the colon is also in the query string, but do we ever see it end
> up as %3A in any browser's URL bar, unless we type it in there
> ourselves?

This is not a MediaWiki problem.  Complain to Mozilla.
Comment 15 Dan Jacobson 2009-03-18 20:39:35 UTC
> This is not a MediaWiki problem.  Complain to Mozilla.
"Dear Mozilla, you made my colon pretty. I want it ugly." ????

I don't think you understand me.

I'm trying to say that I don't like %3A's when they could just be :'s.

I am still curious:
Could MediaWiki be adjusted to stop the phenomenon?


Comment 16 Roan Kattouw 2009-03-18 22:56:58 UTC
(In reply to comment #15)
> > This is not a MediaWiki problem.  Complain to Mozilla.
> "Dear Mozilla, you made my colon pretty. I want it ugly." ????
> 
> I don't think you understand me.
> 
> I'm trying to say that I don't like %3A's when they could just be :'s.
> 
> I am still curious:
> Could MediaWiki be adjusted to stop the phenomenon?
> 

As people have repeatedly said on this bug: the problem is not with MediaWiki, and no, there is nothing MediaWiki could do to stop browsers from mangling colons. All it could really do is serve colons rather than %3A's in links and forms, and it's already doing that.
