Last modified: 2014-11-18 18:07:36 UTC

Bug 11547 - Use only ASCII characters in email confirmation links
Status: RESOLVED DUPLICATE of bug 6957
Product: MediaWiki
Classification: Unclassified
Component: Email
Version: unspecified
Hardware/OS: All All
Importance: Normal trivial with 9 votes
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: i18n, utf8
Reported: 2007-10-02 23:24 UTC by Tisza Gergő
Modified: 2014-11-18 18:07 UTC (History)

Description Tisza Gergő 2007-10-02 23:24:04 UTC
Localized versions of Special:Confirmemail might contain non-ASCII characters, and not all email clients and/or browsers handle such characters reliably. When this happens, the link will open a new article in the main namespace, and the user will be unable to register (and might even end up writing junk articles while trying to do so). Thus either use the unlocalized name of Special:Confirmemail in the link in the email, or use proper urlencoding. (The latter seems less reliable, because a misconfigured client may still decode it, and the browser may then interpret it as something other than UTF-8.)
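
To illustrate the two options named above, here is a minimal Python sketch (not MediaWiki's actual code). The host `wiki.example.org`, the token, the helper name `confirmation_url`, and the localized page title are all hypothetical and used only for illustration. It shows that both the unlocalized name and a percent-encoded localized name produce a link made of ASCII characters only.

```python
from urllib.parse import quote

def confirmation_url(base, title, token):
    """Build a confirmation link whose path contains only ASCII characters.

    Non-ASCII characters are percent-encoded as UTF-8 bytes; "/" and ":"
    are left as-is so the page name stays readable.
    """
    path = quote(f"{title}/{token}", safe="/:")
    return f"{base}/{path}"

BASE = "https://wiki.example.org/wiki"   # hypothetical wiki
TOKEN = "abc123"                          # hypothetical confirmation token

# Option 1: always use the unlocalized (canonical) special page name.
canonical = confirmation_url(BASE, "Special:Confirmemail", TOKEN)

# Option 2: percent-encode a localized name (this title is made up).
localized = confirmation_url(BASE, "Speciális:Emailcím_megerősítése", TOKEN)

print(canonical)
print(localized)
```

Option 1 avoids the problem entirely, while option 2 still depends on the mail client and browser leaving the percent-encoding intact.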
Comment 1 Tisza Gergő 2007-11-23 16:51:09 UTC
Got another mail from a user unable to register his email just now. Please fix this; it should be trivial.
Comment 2 Henry Edward Hardy 2008-04-18 19:05:44 UTC
Repost from OLPC rt bug 1632:

I tried webmaster, but that didn't work.  The confirmation message came from 
you so I'll try that.

I gave the wiki signup screen an email address of:
  hgm+olpc@ip-64-139-1-69.sjc.megapath.net
It sent the confirmation to:
  olpc@ip-64-139-1-69.sjc.megapath.net

That name might have been too long, but I expect some parser chopped things 
off after the "+" rather than understanding that it's a valid character in 
email names.

I also tried "-" rather than "+".  Same results.

A warning message might avoid some confusion.  Most people probably won't 
know enough (or have access to) their mail server's log files.

I assume you are familiar with using "+" to make tagged addresses.  If not, 
I'll say more.
 


-- 
These are my opinions, not necessarily my employer's.  I hate spam.
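
The thread does not establish what actually altered the address in this report. As a hedged illustration only, the Python sketch below shows one common pitfall with "+" in addresses: form-style URL decoding turns an unencoded "+" into a space, so the address must be encoded before it travels in a URL or form body. The variable names are chosen for this example.

```python
from urllib.parse import quote_plus, unquote_plus

address = "hgm+olpc@ip-64-139-1-69.sjc.megapath.net"

# Split on the LAST "@" and leave the local part untouched: "+" is a
# legitimate character there and must not be treated as a separator.
local_part, _, domain = address.rpartition("@")

# Encoding before transport protects the "+" ...
encoded = quote_plus(address)
round_tripped = unquote_plus(encoded)

# ... whereas decoding a value that was never encoded corrupts it.
mangled = unquote_plus(address)

print(local_part)
print(encoded)
print(mangled)
```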

Comment 3 Philippe Verdy 2009-11-14 17:43:32 UTC
URL encoding is definitely NOT the correct way to make the "user@" part of email addresses valid. Read the RFCs:
URL encoding applies only to the hierarchical page name within a domain space (and under a hierarchical protocol like "http(s):" and "ftp(s):"),
as well as to query parameters (when they are supported in those protocols).
 
Valid user names in email addresses also use a "safe" alphabet different from that for domain names (which also DO NOT use URL encoding but the encodings supported in IDNA, if they are internationalized, and DNS specifications otherwise).

For example, the underscore character "_" (which is part of my own email address and cannot be substituted with a "+" or "-", nor even with "%7E") or the exclamation mark "!" is perfectly safe (and standard) in the "user@" part. That part is in fact not really described as a user name, but as an identity specifier whose internal syntax may contain a user name and some other authorization data that cannot be safely stripped out or separated (some sites will use the colon ":" instead of the exclamation mark).

Mapping any Unicode characters with UTF-8 or other representations into a valid "user@" part of an email address is completely unspecified (there's absolutely no reliable algorithm to do this, as the mapping is completely domain-dependent and may even be different from the mapping used for encoding usernames in URI schemes other than "mailto:"). All that can be done is to check that the "user@" part provided uses the valid ASCII subset which is specific to the "mailto:" URI scheme (and distinct from the ASCII subsets used either in the DNS protocol for domain names or in the server-local address part of HTTP/FTP URLs).

Note also that "user@" parts in email addresses are normally CASE-SIGNIFICANT (even if most target SMTP servers will accept emails using any case, and even if some RFCs require that users provide an email address containing a user name that can be used as a valid label in a DNS subdomain, in order to activate some functionality). SMTP relay agents (as well as senders) MUST NOT change the letter case in a pseudo-canonicalization (because they can't reliably know if the recipient server makes the case distinction): this could simply break the authorization data which is part of the "user@" part (for example it could contain Base64-encoded binary data, in addition to representing the user identity on the target server where it will be delivered to the target POP3/IMAP/WebMail user's mailbox).
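
The points in this comment can be sketched in Python as follows. This is an illustration under stated assumptions, not what MediaWiki does: the function name `normalize_address` is made up, the local-part check covers only the unquoted "dot-atom" form from RFC 5322 (quoted local parts are rejected), and Python's built-in `idna` codec implements the older IDNA 2003 rules. The local part is validated but never re-cased or re-encoded; only the domain is lowercased and converted with IDNA.

```python
import re

# RFC 5322 "dot-atom": runs of atext characters separated by single dots.
_ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+"
_DOT_ATOM = re.compile(rf"{_ATEXT}(?:\.{_ATEXT})*")

def normalize_address(address):
    """Return the address with an ASCII, lowercased domain.

    The local part is only checked against the ASCII dot-atom alphabet;
    its characters and letter case are preserved exactly.
    """
    local, sep, domain = address.rpartition("@")
    if not sep or not domain or not _DOT_ATOM.fullmatch(local):
        raise ValueError(f"unsupported address: {address!r}")
    # Domains are case-insensitive; non-ASCII labels go through IDNA,
    # not URL encoding.
    ascii_domain = domain.lower().encode("idna").decode("ascii")
    return f"{local}@{ascii_domain}"

print(normalize_address("Some_User!tag@Bücher.Example"))
print(normalize_address("hgm+olpc@ip-64-139-1-69.sjc.megapath.net"))
```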
Comment 4 Tisza Gergő 2009-11-14 17:54:55 UTC
(In reply to comment #2)
(In reply to comment #3)

You should probably open a new bug to discuss that; this one is about the lack of urlencoding in the confirmation link, which is a wholly unrelated issue.
Comment 5 Karun 2009-12-20 22:36:27 UTC
I do not think we should be using ASCII only. Rather, we should use UTF-8, since MediaWiki needs to support more than just English.
Comment 6 Brion Vibber 2009-12-20 22:44:23 UTC
Please do not abuse the bug tracking system by changing the summary to subvert the entire point of a bug.
Comment 7 Karun 2009-12-20 22:53:50 UTC
This looks like an upstream problem, if browsers and email clients cannot support these characters.
What browsers and email clients does this occur with?
Comment 8 Alex Z. 2009-12-21 05:32:21 UTC
AFAICT, the actual issue behind this bug was fixed way back in r35505, and this is actually a dupe of bug 6957.

*** This bug has been marked as a duplicate of bug 6957 ***
