Last modified: 2014-10-15 11:33:32 UTC
It's likely that many Wikipedia accounts have a validated email address that once worked but is out of date. We do not currently unsubscribe users who trigger multiple non-transient failures, and some addresses might be 10+ years old. We should not keep sending email that is just going to bounce. It's a waste of resources and might trigger spam heuristics. I'd propose adding two API calls. One to generate a VERP address to use when sending mail from MediaWiki. One that records a non-transient failure. That API call would record the current incident and, if some threshold had been met, e.g. at least 3 bounces with the oldest at least 7 days ago, it would un-confirm the user's address so mail will stop going to it. For at least the second call, authentication will be needed so fake bounces are not a DoS vector or a mechanism for hiding password reset requests. The reason for the threshold is that some failure scenarios will resolve themselves, e.g. mailbox over quota, so we don't want to react to one bounce. We want a history of consecutive mails bouncing. There would be a MediaWiki development component to this task, to build the API and to add VERP request calls wherever email is sent, and an Ops component to route VERP bounces to a script (taking the mail as stdin, and optionally e.g. the e-mail address as arguments), which can then call the (authenticated) MediaWiki API method to remove the mail address.
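The threshold rule proposed above (at least 3 bounces, with the oldest at least 7 days ago) could be sketched roughly as follows. This is only an illustration in Python, not MediaWiki code; `should_unconfirm`, `MIN_BOUNCES` and `MIN_AGE` are hypothetical names, and the sketch assumes the list of bounce times is cleared elsewhere whenever a delivery succeeds.

```python
from datetime import datetime, timedelta, timezone

# Threshold values taken from the proposal; the names are hypothetical.
MIN_BOUNCES = 3
MIN_AGE = timedelta(days=7)


def should_unconfirm(bounce_times, now):
    """Return True when the recorded non-transient bounces meet the threshold.

    bounce_times: datetimes of bounces recorded since the last successful
    delivery (resetting that history is outside this sketch).
    """
    if len(bounce_times) < MIN_BOUNCES:
        return False
    oldest = min(bounce_times)
    return now - oldest >= MIN_AGE


now = datetime(2014, 3, 1, tzinfo=timezone.utc)
recent = [now - timedelta(days=d) for d in (1, 2, 3)]  # oldest is 3 days old
spread = [now - timedelta(days=d) for d in (1, 4, 8)]  # oldest is 8 days old
print(should_unconfirm(recent, now))
print(should_unconfirm(spread, now))
```

Requiring both a count and a minimum age means a short burst of failures (say, a mailbox over quota for a day) does not un-confirm the address on its own.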
Luke, I think you just volunteered to create the API. :-) When it's ready, I can open a ticket on RT.
Luke, thanks for filing this (see also Brion's bug 12767 comment 2). The issue makes the successful delivery rate of MediaWiki emails lower due to spam countermeasures and may block bug 56414, so I think it should be considered high priority now that we're sending more notifications compared to a few years ago.
(In reply to comment #2) > Luke, thanks for filing this (see also Brion's bug 12767 comment 2). > The issue makes the successful delivery rate of MediaWiki emails lower due to > spam countermeasures and may block bug 56414, so I think it should be > considered high priority now that we're sending more notifications compared > to > a few years ago. Brion, do you think this bug could become a GSoC project? Who could mentor it?
Could be feasible, might want to check in with ops first -- I don't know what it would take to plug into the mail system and properly process bounces. (There's also the danger of fake bounces being used to disable someone's account, so that'll be fun to figure out. :D)
So, this bug report was filed right after a conversation about Echo (RT #4785) in which I was the one to propose VERP, so we've actually come full circle now :) Yes, we need to do VERP and we'll make that happen. We're not ready yet for making that change (= making API calls from the mailservers) as the mail infrastructure is about to be rebuilt, but I think catching up again in, say, 1-2 months time would be the right time for this. From the mailserver side, we can run a script (preferably one that doesn't need a full MediaWiki install on the system ;)) when emails to bounce-XXX addresses arrive. The way to avoid fake bounces DoSing a user would be to use a bounce-<hash>@wikimedia.org return path address with <hash> either being a random, stored token or one that is the output of a symmetrical encryption function, encrypt(email, secret). I'm sure Chris Steipp will have multiple good ideas about that :)
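A minimal sketch of the first option (a random, stored token) might look like this in Python. It is only an illustration: the in-memory dictionary stands in for whatever persistent storage would really be used, and `make_return_path` / `resolve_return_path` are hypothetical names, not an existing API.

```python
import secrets

# Hypothetical in-memory store; a real deployment would persist this mapping.
_token_to_email = {}


def make_return_path(email, domain="wikimedia.org"):
    """Create a bounce-<token>@domain return path and remember who it is for."""
    token = secrets.token_hex(16)  # 128 random bits, not derivable from the email
    _token_to_email[token] = email
    return f"bounce-{token}@{domain}"


def resolve_return_path(address):
    """Return the original recipient for a bounce address, or None if unknown."""
    local, _, _domain = address.partition("@")
    if not local.startswith("bounce-"):
        return None
    return _token_to_email.get(local[len("bounce-"):])


rp = make_return_path("user@example.org")
print(rp)
print(resolve_return_path(rp))
```

Because the token is random, someone who knows only the user's email address cannot construct a bounce address that resolves to that user; the cost is that every outgoing mail needs a stored record.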
(In reply to Faidon Liambotis from comment #5) > So, this bug report was filed right after a conversation about Echo (RT > #4785) in which I was the one to propose VERP, so we've actually come full > circle now :) Hah! Good thing it was still on someone's radar then. Thanks for the reply. > > Yes, we need to do VERP and we'll make that happen. We're not ready yet for > making that change (= making API calls from the mailservers) as the mail > infrastructure is about to be rebuilt, but I think catching up again in, > say, 1-2 months time would be the right time for this. 1-2 months from now would be great for GSoC (it's the deadline for student applications), if someone thinks this suits GSoC and unless you want this bug to be fixed *before* the summer by ops/platform. > > From the mailserver side, we can run a script (preferably one that doesn't > need a full MediaWiki install on the system ;)) when emails to bounce-XXX > addresses arrive. > > The way to avoid fake bounces DoSing a user would be to use a > bounce-<hash>@wikimedia.org return path address with <hash> either being a > random, stored token or one that is the output of a symmetrical encryption > function, encrypt(email, secret). I'm sure Chris Steipp will have multiple > good ideas about that :) Sounds like an endorsement/proposal for Chris to be a (co-)mentor? ;-) Chris, are you interested in soliciting student work in this area? If yes, who could be interested in mentoring?
Considering it's been almost a year since we've worked on it, I'm okay with this waiting until a summer GSoC. VERP is one of those reasonable but weird co-ordination issues that falls between the cracks in our organizational structure here. Should we mention this to Quim and co?
If this project requires coding and has mentors, and you want to see it moving forward with a GSoC student, then the time to push for a GSoC proposal is now. Please create a separate section for this proposal at https://www.mediawiki.org/wiki/Mentorship_programs/Possible_projects#Featured_project_ideas Thank you!
I have gone through the comments and understood that the mailserver needs to implement VERP (http://en.wikipedia.org/wiki/Variable_envelope_return_path). I would like to work on this feature as a GSoC project, if Nemo, Chris, Faidon or anyone is ready to mentor me. I have fixed a few bugs (https://github.com/wikimedia/mediawiki-core/commits?author=tonythomas01), for MW core and am willing to dedicate my time to get this done.
(In reply to Faidon Liambotis from comment #5) > The way to avoid fake bounces DoSing a user would be to use a > bounce-<hash>@wikimedia.org return path address with <hash> either being a > random, stored token or one that is the output of a symmetrical encryption > function, encrypt(email, secret). I'm sure Chris Steipp will have multiple > good ideas about that :) You would need a random IV, nonce/timestamp (prevent replay), and some sort of checksum (prevent tampering), but yeah, it's doable. (In reply to Nemo from comment #6) > Sounds like an endorsement/proposal for Chris to be a (co-)mentor? ;-) > Chris, are you interested in soliciting students work in this area? If yes, > who could be interested mentoring? Sadly, I probably don't have time to co-mentor this year. I'm fine advising of the design for security, but I've got too many things going on right now.
(In reply to Terry Chay from comment #7) > Considering it's been almost a year since we've worked on it, I'm okay with > this waiting until a summer GSoC. You're "okay with this waiting"? Literally nobody was awaiting your approval. As I understand it, your only involvement with this issue is having hired the guy who filed this bug report (and who rather quickly thereafter left the Wikimedia Foundation...). Good grief.
Please let's focus on what matters here and now: (In reply to Tony Thomas from comment #9) > I have gone through the comments and understood that the mailserver needs to > implement VERP (http://en.wikipedia.org/wiki/Variable_envelope_return_path). > > I would like to work on this feature as a GSoC project, if Nemo, Chris, > Faidon or anyone is ready to mentor me. > > I have fixed a few bugs > (https://github.com/wikimedia/mediawiki-core/commits?author=tonythomas01), > for MW core and am willing to dedicate my time to get this done. Thank you Tony for your interest in fixing this problem. Tony needs two mentors, one familiar with development and one familiar with ops. Terry, Faidon, if you want this to become a GSoC project proposal, could you please find the right names in your teams? When it comes to outreach programs, it is now or in 6 months (soonest). Thank you.
Thanks Quim, Legoktm has told me he is ready to mentor me in the development part, and yesterday Faidon agreed to help me with the Sys-Ops part. That would mean a green light to go forward, right?
(In reply to Tony Thomas from comment #13) > Thanks Quim, Legoktm has told me he is ready to mentor me in the > development part, and yesterday Faidon agreed to help me with the Sys-Ops > part. That would mean a green light to go forward, right? Just update the "possible projects" page (if you want; it's not mandatory) and submit your application when it's time. As of now WMF is not even officially part of GSoC yet, as far as I remember, but you'll have your official answers in due time.
Also see https://rt.wikimedia.org/Ticket/Display.html?id=6933. Jeff Green volunteered to work with Tony on the ops part of this.
MZMcBride: talked to Tony, reset RT pass for him, made sure he can read his own ticket and login, added Bugzilla ticket link, added Jeff Green, added Quim Gil, commented on BZ, hope that helped:)
(In reply to Daniel Zahn from comment #16) Thanks Daniel, MZMcBride. I will talk with Jeff about moving the proposal forward. Since Wikimedia is currently migrating the mail server to a new data center, this would be the right time to get VERP implemented.
A discussion about the VERP scheme started in an email thread, and it seems to make sense to move that discussion here. So here is the gist of the discussion so far. A VERP address generally looks something like this: bounce-{$key}@wikimedia.org The prefix /^bounce-/ is used by the incoming MTA as a hook to route messages to the bounce processor, and $key is used by the bounce processor to figure out which wiki user is having delivery issues. We need to prevent an attacker from spoofing bounce messages and causing mass unsubscribes. We can accomplish this by making $key secret, and not a simple hash that can be reversed or guessed: "something like an HMAC, with a secret key"
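One way to read "something like an HMAC, with a secret key" is sketched below in Python. It is an assumption-laden illustration, not the scheme that was adopted: the key layout `<user_id>-<timestamp>-<tag>`, the names `make_key` / `verify_key`, the `SECRET` value and the 30-day `MAX_AGE` window are all hypothetical. The timestamp only narrows the window in which a captured address could be replayed; it is not a true nonce.

```python
import hashlib
import hmac
import time

SECRET = b"replace-with-a-real-secret"  # hypothetical; must stay server-side
MAX_AGE = 30 * 24 * 3600  # assumed validity window, in seconds


def _tag(payload):
    """HMAC-SHA256 over the payload, truncated to 128 bits of hex."""
    return hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()[:32]


def make_key(user_id, now=None):
    """Build "<user_id>-<timestamp>-<tag>" for use as $key in bounce-{$key}."""
    ts = int(time.time() if now is None else now)
    payload = f"{user_id}-{ts}"
    return f"{payload}-{_tag(payload)}"


def verify_key(key, now=None):
    """Return the user id if the key is authentic and not expired, else None."""
    parts = key.split("-")
    if len(parts) != 3:
        return None
    user_id, ts, tag = parts
    if not (user_id.isdigit() and ts.isdigit()):
        return None
    expected = _tag(f"{user_id}-{ts}")
    # Constant-time comparison; bytes avoid errors on non-ASCII input.
    if not hmac.compare_digest(tag.encode(), expected.encode()):
        return None
    current = time.time() if now is None else now
    if current - int(ts) > MAX_AGE:
        return None
    return int(user_id)


key = make_key(42)
print(f"bounce-{key}@wikimedia.org")
print(verify_key(key))
```

Unlike a stored random token, this needs no per-message database row: the bounce processor can check the key with only the shared secret, and an attacker who guesses a user id still cannot produce a valid tag.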
Tony, your proposal is still missing in Google Melange. Please submit it there as a draft linking to your wiki page. In any case, we will evaluate your proposal in mediawiki.org. Thank you!
(In reply to Quim Gil from comment #19) > Tony, your proposal is still missing in Google Melange. Please submit it > there as a draft linking to your wiki page. Done. Public link: http://www.google-melange.com/gsoc/proposal/public/google/gsoc2014/tonythomas01/5629499534213120
Labs project mediawiki-verp was created.
Change 138655 had a related patch set uploaded by 01tonythomas: Implementing VERP functionality to alter Return Path https://gerrit.wikimedia.org/r/138655
Change 138655 had a related patch set uploaded by 01tonythomas: Added VERP functionality hook to core https://gerrit.wikimedia.org/r/138655
Change 138655 merged by jenkins-bot: Added VERP functionality hook to core https://gerrit.wikimedia.org/r/138655
No more open patches associated with this bug; resetting status. It seems that most of the work has moved to the BounceHandler extension: https://gerrit.wikimedia.org/r/#/q/project:mediawiki/extensions/BounceHandler+-owner:l10n,n,z If that's where you now want to get this done, please move this to the "MediaWiki extensions" product. Otherwise, split whatever is not going to happen in core off from this report into a new bug.
Updates:
*) http://deployment.wikimedia.beta.wmflabs.org/wiki/Special:EmailUser is sending VERPed emails after https://gerrit.wikimedia.org/r/#/c/141287/.
*) BounceHandler router and transport added to beta/prod to handle incoming bounces, after patch https://gerrit.wikimedia.org/r/#/c/155753/. The new router fetches and routes all VERP bounces via 2 steps of checks (a regex for the VERP pattern, and a check on the domain), and POSTs to the 'bouncehandler' API in beta.
*) Currently, the configuration is suspended in prod, as we need to test for at least 2 weeks, watching the 'bounce_records' table in beta, where the extension is installed, to confirm that bounces are handled properly. We plan to take it to loginwiki as the next step, i.e. deploying into prod.
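The two-step check described above (a regex for the VERP pattern, then a check on the domain) is done in the mail router itself; the Python below only sketches the same logic for illustration. The pattern, the accepted domain list, and the `email` parameter name in the POST data are assumptions, not taken from the actual router configuration or the extension.

```python
import re
from email import message_from_string
from email.utils import parseaddr

VERP_LOCAL_PART = re.compile(r"bounce-[A-Za-z0-9._-]+")  # assumed pattern
ACCEPTED_DOMAINS = {"wikimedia.org"}  # assumed domain list


def is_verp_bounce(recipient):
    """Apply the two checks: VERP pattern on the local part, then the domain."""
    local, sep, domain = recipient.rpartition("@")
    if not sep or not VERP_LOCAL_PART.fullmatch(local):
        return False
    return domain.lower() in ACCEPTED_DOMAINS


def build_api_params(raw_mail):
    """Return POST parameters for a bounce, or None if the mail is not one."""
    msg = message_from_string(raw_mail)
    _name, recipient = parseaddr(msg.get("To", ""))
    if not is_verp_bounce(recipient):
        return None
    # 'bouncehandler' is the API named in the thread; 'email' is a guess.
    return {"action": "bouncehandler", "email": raw_mail, "format": "json"}


raw = "To: bounce-abc123@wikimedia.org\nSubject: Undelivered\n\nbody\n"
print(build_api_params(raw) is not None)
print(build_api_params("To: user@wikimedia.org\n\nbody"))
```

A script wired to the mail server would read the message from stdin, call something like `build_api_params`, and POST the result to the wiki's API endpoint only when it is not `None`.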