Last modified: 2014-07-24 13:13:28 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T67851, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 65851 - Performance review of Petition extension before WMF deployment
Status: RESOLVED FIXED
Product: MediaWiki extensions
Classification: Unclassified
Component: Petition
Version: unspecified
Hardware: All All
Importance: High normal
Target Milestone: ---
Assigned To: Ori Livneh
Depends on:
Blocks: 65849
Reported: 2014-05-28 12:38 UTC by Peter Coombe
Modified: 2014-07-24 13:13 UTC (History)

Description Peter Coombe 2014-05-28 12:38:58 UTC
https://www.mediawiki.org/wiki/Extension:Petition

Note that this only needs to be deployed on https://wikimediafoundation.org not any other Wikimedia sites.
Comment 1 Greg Grossmeier 2014-05-28 15:23:30 UTC
Assigning to Ori. Ori: feel free to punt to someone you pick.
Comment 2 Aaron Schulz 2014-05-28 21:52:37 UTC
A few questions:

a) Roughly how many rows per petition name might we expect?

b) Ditto for rows in the whole table (the CSV dumps this)?

c) Will the pages with the tags to include these be high-traffic (they are uncached and do a COUNT(*) each time)? If yes, it might help to cache the count or even use a Redis HyperLogLog (provided there was some de-duplication).

d) Actually, why is there no de-duplication? Right now someone can keep signing it with similar names. Maybe some criteria could be established for that. That would bound how many rows one person could put into the table.

e) Maybe some IP restrictions and rate-limiting could be applied. If there is no de-duplication, then that at least bounds the rate that someone can spam rows into the table (which should be done anyway). I'd suggest using User::pingLimiter. Limits must also be set in wmf-config in a separate patch.

f) Relatedly, should blocked users be able to sign petitions?

I see there is a $wgPetitionDatabase. It would be nice if there was a $wgPetitionCluster so it could use the same DB that Flow uses.
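
The rate limiting suggested in (e) can be illustrated with a minimal fixed-window counter. This is only a sketch of the general idea in Python, not how User::pingLimiter is implemented: the class name, the in-process dict standing in for a shared cache, and the example limit values are all assumptions made for illustration.

```python
import time


class FixedWindowLimiter:
    """Allow at most `max_hits` actions per `window` seconds for each key.

    Hypothetical illustration only; a real deployment would keep the
    counters in a shared cache rather than in a process-local dict.
    """

    def __init__(self, max_hits, window, clock=time.monotonic):
        self.max_hits = max_hits
        self.window = window
        self.clock = clock
        self._counters = {}  # key -> (window_start, hits)

    def ping(self, key):
        """Record one action for `key`; return True if it should be throttled."""
        now = self.clock()
        start, hits = self._counters.get(key, (now, 0))
        if now - start >= self.window:
            # The previous window has expired, so start counting afresh.
            start, hits = now, 0
        hits += 1
        self._counters[key] = (start, hits)
        return hits > self.max_hits


if __name__ == "__main__":
    # Example limit (an assumption): 2 signatures per 60 seconds per IP.
    limiter = FixedWindowLimiter(max_hits=2, window=60)
    for attempt in range(3):
        print(attempt, limiter.ping("ip:203.0.113.7"))
```

Keying the counter by IP address (or by user) bounds how fast any one client can add rows, even when there is no de-duplication.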
Comment 3 Peter Coombe 2014-05-29 10:22:21 UTC
a) Hard to predict, but we should be prepared for tens, maybe hundreds of thousands (I hope!)

b) Pretty much the same as a). We only have this one petition planned, and I don't think we'll be doing any others.

c) Yes, quite high traffic. I'm going to look into ways to cache the count.

d) This would add quite some complexity and overhead. I think e) is a better solution, especially if we have ways of doing that in MediaWiki already. Will look into this.

f) I don't see why not. We're deliberately not showing the "personal messages" anywhere without manual intervention, so that won't be an avenue for trolling. Also this deployment is going to be on the restricted wikimediafoundation.org, so blocked users shouldn't be a problem there.
Comment 4 Aaron Schulz 2014-05-29 19:40:40 UTC
a/b) Sounds about like what I expected, OK

c/d) I guess you can denormalize the count in the DB (maintaining atomicity) or in memcached too (INCR on INSERT, refill from COUNT(*) if the key is missing).

As for Varnish/parser cache, maybe $parser->getOutput()->updateCacheExpiry() can be explored (using a low TTL) instead of just disableCache(). Another option would be a moderate or regular TTL with cache purging done on POST. Since the POST is always on the title with the form, you can assume that the page with that title needs to be purged. I assume no pages transclude pages that have petitions/counts (and even if some did, one could just push an HTMLCacheUpdate job anyway).

f) As long as good rate limiting is in place and the list is not easily seen, it doesn't matter. Who will have the right to view the data? Depending on how open that is, there is the matter of oversight/suppression (though I guess that's a topic for the security review).
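
The "INCR on INSERT, refill from COUNT(*) if the key is missing" pattern from c/d) can be sketched as follows. This is a hedged Python illustration, not the Petition extension's code: the `signatures` table, its columns, the cache key format, and the plain dict standing in for memcached are all hypothetical.

```python
import sqlite3


class SignatureCounter:
    """Keep a cached signature count alongside the authoritative table."""

    def __init__(self, db, cache):
        self.db = db
        self.cache = cache  # dict used as a stand-in for memcached

    def _key(self, petition):
        return "petition:count:" + petition

    def count(self, petition):
        """Return the cached count, refilling from COUNT(*) on a miss."""
        key = self._key(petition)
        if key not in self.cache:
            row = self.db.execute(
                "SELECT COUNT(*) FROM signatures WHERE petition = ?",
                (petition,),
            ).fetchone()
            self.cache[key] = row[0]
        return self.cache[key]

    def sign(self, petition, name):
        """Insert a signature, then increment the cached count if present."""
        with self.db:  # commits the INSERT on success
            self.db.execute(
                "INSERT INTO signatures (petition, name) VALUES (?, ?)",
                (petition, name),
            )
        key = self._key(petition)
        if key in self.cache:
            # Like memcached INCR, do nothing when the key is absent;
            # the next read refills it from the database.
            self.cache[key] += 1


def make_db():
    """Create an in-memory database with the hypothetical schema."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE signatures (petition TEXT, name TEXT)")
    return db
```

With this arrangement the expensive COUNT(*) only runs after a cache miss or eviction, rather than on every page view.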
Comment 5 Peter Coombe 2014-05-30 14:28:51 UTC
The data will only be viewable by a few WMF employees (probably within Fundraising, where we've all already signed a strict "Protection of Donor and User Information" agreement).
Comment 6 Gerrit Notification Bot 2014-07-04 18:24:16 UTC
Change 144174 had a related patch set uploaded by Pcoombe:
Store number of signatures count in memcached

https://gerrit.wikimedia.org/r/144174
Comment 7 Peter Coombe 2014-07-10 21:33:19 UTC
I believe all the issues here have been addressed. As suggested by Chad on https://bugzilla.wikimedia.org/show_bug.cgi?id=65849, I have changed from using a parser tag to an includable special page, so disableCache() is no longer used.

There are patches awaiting review for storing the number of signatures in memcached (https://gerrit.wikimedia.org/r/144174), rate limiting (https://gerrit.wikimedia.org/r/144074), and preventing blocked users from signing (https://gerrit.wikimedia.org/r/#/c/142737/).
Comment 8 Gerrit Notification Bot 2014-07-15 23:37:20 UTC
Change 144174 merged by jenkins-bot:
Store number of signatures count in memcached

https://gerrit.wikimedia.org/r/144174
