Last modified: 2014-11-05 18:49:24 UTC
Sub part of Bug: https://bugzilla.wikimedia.org/show_bug.cgi?id=69019 Quoting: https://bugzilla.wikimedia.org/show_bug.cgi?id=69019#c0 >> The BounceHandler extension generates a unique VERP address on every sent email, and has the bouncehandler API that can handle incoming email bounces once the bounce is HTTP POSTed to it via curl from exim. The extension was built as part of the VERP project to properly handle email bounces. Pipe incoming bounces to the bouncehandler API:- You need to add the following to your incoming mails to *@wikimedia.com receving PIPE transport: command = /usr/bin/curl "action=bouncehandler" --data-urlencode "email@-" http://$IP/api.php The genrated VERP address is of the form $wikiId-base36( $UserID )-base36( $Timestamp )-hash( $algorithm,$key, $prefix )@$email_domain where $wikiId - The wiki database name, to support multiple wikis $userID - The user_id from table 'user' - to uniquely identify a recipient. $Timestamp - The unix timestamp. $prefix = $wikiId. '-'. base_convert( $uid, 10, 36). '-'. base_convert( $timeNow, 10, 36); It use the Plancake mail parser external library to extract the headers as an addition to the optional inbuilt regex functions. Gerrit: https://github.com/wikimedia/mediawiki-extensions-BounceHandler
The extension is currently in beta, and in the testing phase. Can we give this a bit speed-up ? You can test a sample mail at deployment.wikimedia.beta.wmflabs.org/wiki/Special:EmailUser.
Ori/Aaron have decided to stop doing pre-deployment performance reviews. I've unblocked this from the deployment bug and moved it to the BounceHandler component (feel free to close or leave open if you ever do want a perf review).
Some bits: * getOriginalEmail() doesn't call reuseConnection() and the return value is not well defined * processBounceHeaders()/handleFailingRecipient() doesn't call reuseConnection() * The query in handleFailingRecipient() is unindexed, how large is the table expected to get? * Also, it would be nice of the table could go in extension-1 (e.g. a custom cluster). Echo and Flow do this. Where is the code for the broker that takes in the bounces and does a POST to the API?
(In reply to Aaron Schulz from comment #3) > Where is the code for the broker that takes in the bounces and does a POST > to the API? Here is the code that would go to polonium, that would HTTP POST to the API https://github.com/wikimedia/operations-puppet/blob/production/templates/exim/exim4.conf.SMTP_IMAP_MM.erb#L670 We will have it point to en-wiki after this change - https://gerrit.wikimedia.org/r/#/c/168622/
Change 169654 had a related patch set uploaded by 01tonythomas: Handle the return value of getOriginalEmail efficiently https://gerrit.wikimedia.org/r/169654
Change 169654 merged by jenkins-bot: Various performance fixed for the BounceHandler extension https://gerrit.wikimedia.org/r/169654
(In reply to Aaron Schulz from comment #3) > * The query in handleFailingRecipient() is unindexed, how large is the table > expected to get? I was about to file a bug about that I guess it should probably be a partial index on the email field, and then the timestamp. Something like: CREATE INDEX /*i*/br_mail_timestamp ON /*_*/user (br_user_email(50), br_timestamp); In core we only index the first 50 chars of the email on the user table
I'd second having an index as above.
(In reply to Sam Reed (reedy) from comment #7) > CREATE INDEX /*i*/br_mail_timestamp ON /*_*/user (br_user_email(50), > br_timestamp); Using the right table would help :) CREATE INDEX /*i*/br_mail_timestamp ON /*_*/bounce_records(br_user_email(50), br_timestamp);
Change 170759 had a related patch set uploaded by 01tonythomas: Create table index for 'bounce_records' table https://gerrit.wikimedia.org/r/170759
Change 170759 merged by jenkins-bot: Create table index for 'bounce_records' table https://gerrit.wikimedia.org/r/170759
Change 170835 had a related patch set uploaded by Aaron Schulz: Added wgBounceHandlerCluster setting https://gerrit.wikimedia.org/r/170835
Change 170835 merged by jenkins-bot: Added wgBounceHandlerCluster setting https://gerrit.wikimedia.org/r/170835
LGTM