Last modified: 2014-07-24 13:13:28 UTC
https://www.mediawiki.org/wiki/Extension:Petition Note that this only needs to be deployed on https://wikimediafoundation.org, not on any other Wikimedia sites.
Assigning to Ori. Ori: feel free to punt to someone you pick.
A few questions: a) Roughly how many rows per petition name might we expect? b) Ditto for rows in the whole table (the CSV dumps this). c) Will the pages with the tags to include these be high-traffic (they are uncached and do a COUNT(*) each time)? If yes, it might help to cache the count or even use a Redis HyperLogLog (provided there was some de-duplication). d) Actually, why is there no de-duplication? Right now someone can keep signing it with similar names. Maybe some criteria could be established for that. That would bound how many rows one person could put into the table. e) Maybe some IP restrictions and rate limiting could be applied. If there is no de-duplication, then that at least bounds the rate at which someone can spam rows into the table (which should be done anyway). I'd suggest using User::pingLimiter. Limits must also be set in wmf-config in a separate patch. f) Relatedly, should blocked users be able to sign petitions? Also, I see there is a $wgPetitionDatabase. It would be nice if there was a $wgPetitionCluster so it could use the same DB that Flow uses.
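To make point e) concrete, here is a minimal sketch of a fixed-window rate limiter keyed by IP or user. It is only loosely modeled on the idea behind User::pingLimiter (record an attempt, return true once the limit is tripped) and is not MediaWiki's implementation; the class name, the parameters, and the in-memory dict standing in for a shared cache are all assumptions made for illustration.

```python
import time


class RateLimiter:
    """Hypothetical fixed-window limiter; a dict stands in for a shared cache."""

    def __init__(self, max_hits, window_seconds, clock=time.time):
        self.max_hits = max_hits
        self.window_seconds = window_seconds
        self._clock = clock          # injectable so the window can be tested
        self._counters = {}          # key -> (window_start, hits)

    def ping(self, key):
        """Record one attempt for `key`; return True if the limit is exceeded."""
        now = self._clock()
        start, hits = self._counters.get(key, (now, 0))
        if now - start >= self.window_seconds:
            # The previous window has expired, so start a fresh one.
            start, hits = now, 0
        hits += 1
        self._counters[key] = (start, hits)
        return hits > self.max_hits
```

A signing handler would call `ping()` with the client IP (or user name) before inserting a row and reject the signature when it returns True. Even without de-duplication, that bounds how fast a single client can add rows.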
a) Hard to predict, but we should be prepared for tens, maybe hundreds of thousands (I hope!). b) Pretty much the same as a). We only have this one petition planned, and I don't think we'll be doing any others. c) Yes, quite high traffic. I'm going to look into ways to cache the count. d) This would add quite a bit of complexity and overhead. I think e) is a better solution, especially if we have ways of doing that in MediaWiki already. Will look into this. f) I don't see why not. We're deliberately not showing the "personal messages" anywhere without manual intervention, so that won't be an avenue for trolling. Also, this deployment is going to be on the restricted wikimediafoundation.org, so blocked users shouldn't be a problem there.
a/b) Sounds about like what I expected, OK. c/d) I guess you can denormalize the count in the DB (maintaining atomicity) or in memcached too (INCR on INSERT, refill from COUNT(*) if the key is missing). As for Varnish/parser cache, maybe $parser->getOutput()->updateCacheExpiry() can be explored (using a low TTL) instead of just disableCache(). Another option would be a moderate or regular TTL but with cache purging done on POST. Since the POST is always on the title with the form, you can assume that the page with that title needs to be purged; I assume no pages transclude pages that have petitions/counts (and even if some did, one could just push an HTMLCacheUpdate job anyway). f) As long as good rate limiting is in place and the list is not easily seen, it doesn't matter. Who will have the right to view the data? Depending on how open that is, there is the matter of oversight/suppression (though I guess that's a topic for the security review).
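The "INCR on INSERT, refill from COUNT(*) if the key is missing" idea can be sketched roughly as follows. This is an illustration only, not the extension's code: the table and column names, the cache key format, and the `FakeCache` class (an in-memory stand-in that mimics memcached's get/add/incr semantics) are all hypothetical, and SQLite is used purely to keep the example self-contained.

```python
import sqlite3


class FakeCache:
    """Hypothetical in-memory stand-in for memcached."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def add(self, key, value):
        # Like memcached "add": store only if the key is absent.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def incr(self, key, delta=1):
        # Like memcached "incr": does nothing if the key is missing.
        if key not in self._data:
            return None
        self._data[key] += delta
        return self._data[key]

    def delete(self, key):
        self._data.pop(key, None)


def make_db():
    # Hypothetical schema, for illustration only.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE petition_signatures (petition TEXT, name TEXT)")
    return db


def count_key(petition):
    return f"petition:count:{petition}"


def get_count(db, cache, petition):
    """Serve the count from cache; on a miss, refill it from COUNT(*)."""
    key = count_key(petition)
    cached = cache.get(key)
    if cached is not None:
        return cached
    (count,) = db.execute(
        "SELECT COUNT(*) FROM petition_signatures WHERE petition = ?",
        (petition,),
    ).fetchone()
    cache.add(key, count)
    return count


def sign(db, cache, petition, name):
    """Insert a signature, then bump the cached count if it exists."""
    db.execute(
        "INSERT INTO petition_signatures (petition, name) VALUES (?, ?)",
        (petition, name),
    )
    db.commit()
    # If the key was evicted, incr is a no-op and the next read refills it.
    cache.incr(count_key(petition))
```

With this shape, the high-traffic read path only touches the cache, and the COUNT(*) query runs only after an eviction or a cold start.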
The data will only be viewable by a few WMF employees (probably within Fundraising, where we've all already signed a strict "Protection of Donor and User Information" agreement).
Change 144174 had a related patch set uploaded by Pcoombe: Store number of signatures count in memcached https://gerrit.wikimedia.org/r/144174
I believe all the issues here have been addressed. As suggested by Chad on https://bugzilla.wikimedia.org/show_bug.cgi?id=65849, I have changed from using a parser tag to an includable special page, so disableCache() is no longer used. There are patches awaiting review for storing the number of signatures in memcached (https://gerrit.wikimedia.org/r/144174), rate limiting (https://gerrit.wikimedia.org/r/144074), and preventing blocked users from signing (https://gerrit.wikimedia.org/r/#/c/142737/).
Change 144174 merged by jenkins-bot: Store number of signatures count in memcached https://gerrit.wikimedia.org/r/144174