Last modified: 2014-03-30 04:01:02 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and links other than those displaying bug reports and their history might be broken. See T51872, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 49872 - max_user_connections too low on Wikimedia Labs
Status: RESOLVED FIXED
Product: Wikimedia Labs
Classification: Unclassified
Component: tools
Version: unspecified
Hardware: All All
Importance: Normal blocker
Target Milestone: ---
Assigned To: Sean Pringle
Depends on:
Blocks: labs-replication 58798
Reported: 2013-06-20 12:40 UTC by Johannes Kroll (WMDE)
Modified: 2014-03-30 04:01 UTC (History)
CC List: 9 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Description Johannes Kroll (WMDE) 2013-06-20 12:40:29 UTC
max_user_connections for tools on labs is 10 by default. This is too low for the RENDER tools. Please increase the limit to something like 512 or remove it for SQL users 50380g50613 and p50380g50454, i.e. render and render-tests.

We need to publish the tools for the evaluation phase asap so this is urgent.


A limit of 10 will be too low for many projects. It should be increased for all tools. On the toolserver, there was no limit for MMPs.
Comment 1 Antoine "hashar" Musso (WMF) 2013-06-20 12:45:21 UTC
That is hardcoded in puppet:


modules/mysql_multi_instance/manifests/instance.pp
  'max_user_connections'        => 10,

Adding in CC a few ops people that edited that file (Peter, Asher and Marc-André).
Comment 2 Antoine "hashar" Musso (WMF) 2013-06-20 12:48:29 UTC
The role::db::labsdb class calls:


  mysql_multi_instance::instance { $instances_keys :
    instances => $instances
  }

mysql_multi_instance::instance should have a new 'max_user_connections' parameter defaulting to '10' that would let one override it in role::db::labsdb.
Comment 3 Marc A. Pelletier 2013-06-20 14:04:09 UTC
Permanent fix in puppet, runtime directive in place (that will last until the next DB restart)
Comment 4 Johannes Kroll (WMDE) 2013-06-20 14:27:02 UTC
Thanks.
Comment 5 Antoine "hashar" Musso (WMF) 2013-06-20 15:28:55 UTC
Puppet change is https://gerrit.wikimedia.org/r/#/c/69648/ (unmerged right now).
Comment 6 Marc A. Pelletier 2013-06-20 20:33:50 UTC
Merged.
Comment 7 Asher Feldman 2013-06-21 17:06:05 UTC
For the record, I think the new limit of 500 is unwise, and will likely impact performance and availability at times.  50 could be ok, instead of the 10 I originally chose.  Any labs project that needs more connections to a single db should probably use mysql proxy with a limited connection pool or something similar.
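The "limited connection pool" idea in the comment above can be sketched on the client side roughly as follows. This is only an illustration, not something taken from this bug: the class name `BoundedConnectionPool` and the `connect` factory are hypothetical, and a real tool would pass in whatever function opens its database connection.

```python
import threading
from contextlib import contextmanager


class BoundedConnectionPool:
    """Cap how many connections one tool holds open at the same time."""

    def __init__(self, connect, max_connections):
        self._connect = connect  # hypothetical factory returning a DB connection
        self._slots = threading.BoundedSemaphore(max_connections)
        self._idle = []          # open connections waiting to be reused
        self._lock = threading.Lock()

    @contextmanager
    def connection(self, timeout=None):
        # Wait (or time out) instead of opening one connection too many.
        if not self._slots.acquire(timeout=timeout):
            raise TimeoutError("no free connection slot")
        try:
            with self._lock:
                conn = self._idle.pop() if self._idle else None
            if conn is None:
                conn = self._connect()
            try:
                yield conn
            finally:
                # Sketch only: the connection is reused without a health check.
                with self._lock:
                    self._idle.append(conn)
        finally:
            self._slots.release()
```

With `max_connections` set below the server-side max_user_connections value, requests that arrive during a burst wait for a free slot instead of being refused by the database.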
Comment 8 Johannes Kroll (WMDE) 2013-06-27 16:52:54 UTC
You need to take into account that the connection limit also affects web requests, and several users might make requests at the same time. The Article List Generator opens several connections, as do several programs in the Toolkit. We need to be able to have several people use our tools, so the raised limit for render and render-tests isn't really optional. And I am pretty sure it is the same for several other tools.

If you don't want to raise the limit for all tools, I suggest you set a default value of, say, 20, and create a simple process to raise the limit per-tool if necessary. 

In any case, I never requested to raise the limit for all accounts by default, so don't blame us if performance is affected by somebody abusing the new default limit!
Comment 9 Johannes Kroll (WMDE) 2013-06-27 16:57:31 UTC
> In any case, I never requested to raise the limit for all accounts by
> default
*** to something as high as 512
> so don't blame us if performance is affected by somebody abusing the new
> default limit!
Comment 10 Marc A. Pelletier 2013-06-27 19:57:09 UTC
I put in an arbitrarily high limit so that (a) users are generally not limited in the number of connections but (b) no runaway user can consume all of them.

I believe that tools using too many connections is a human problem not a technical one.  :-)
Comment 11 Johannes Kroll (WMDE) 2013-11-27 13:43:02 UTC
Something must have reset the variable to its default value. But only for some wikis:

local-render@tools-login:~$ sql enwiki_p 'show variables like "max_user_connections"'
Variable_name	Value
max_user_connections	512
local-render@tools-login:~$ sql dewiki_p 'show variables like "max_user_connections"'
Variable_name	Value
max_user_connections	10
local-render@tools-login:~$ sql frwiki_p 'show variables like "max_user_connections"'
Variable_name	Value
max_user_connections	10

Can somebody please set it to 512 for all wikis permanently?
Comment 12 Marc A. Pelletier 2014-01-07 14:38:48 UTC
Sean, I agree with the request in principle; given the use patterns it'd be reasonable to up the limit by a few orders of magnitude (just low enough that no single tool can use up everything) so that bursts of activity don't hit the limit.

It was set (by hand) to 512 in the past with no issues.
Comment 13 Johannes Kroll (WMDE) 2014-01-08 14:10:08 UTC
This has been un-fixed for more than a month now. We are getting angry emails from users because the article list generator doesn't work for anything except enwiki. Please fix it.
Comment 14 Sean Pringle 2014-01-08 15:54:45 UTC
Should now be re-fixed back to 512. https://gerrit.wikimedia.org/r/#/c/106254/

Need to monitor for a while to see peak connection usage per user. From that we can pick a sane cut-off.
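One way to turn monitored peaks into a cut-off is to take a high percentile of each user's samples and add headroom. The sketch below only illustrates that idea. The function names, the 95th-percentile default, the headroom factor and the floor of 10 are all assumptions rather than values decided in this bug.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def suggest_limit(peaks_by_user, pct=95, headroom=2.0, floor=10):
    """Suggest one shared cut-off from per-user peak connection samples.

    peaks_by_user maps a user name to a list of observed peak connection
    counts (hypothetical input; how it is collected is not specified here).
    """
    per_user = [percentile(peaks, pct) for peaks in peaks_by_user.values()]
    # Size the limit for the busiest user, then leave room for bursts.
    return max(floor, math.ceil(max(per_user) * headroom))
```

Using a percentile instead of the absolute maximum keeps a single runaway sample from dictating the limit for everyone.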
Comment 15 Marc A. Pelletier 2014-01-08 15:57:05 UTC
I wish mariadb had a "95th percentile max_connections".  /That/ would be useful.  :-)
Comment 16 Tim Landscheidt 2014-02-03 07:53:19 UTC
(In reply to comment #14)
> [...]
> Need to monitor for a while to see peak connection usage per user. From that
> we can pick a sane cut-off.

Do we now have some data to close this bug?
Comment 17 Andre Klapper 2014-03-19 14:01:20 UTC
Johannes: Any recent complaints from users because the article list generator doesn't work?

Is there any work left to do here or can this be closed (see comment 14)?
Comment 18 Sean Pringle 2014-03-30 04:01:02 UTC
Regarding monitoring:

We have userstat=ON on the labsdb instances to maintain user_statistics and client_statistics tables in information_schema. The data is also being logged on db1044 for time-series reporting, but isn't exposed anywhere yet.

http://www.percona.com/doc/percona-server/5.5/diagnostics/user_stats.html

Unfortunately, the user_statistics.concurrent_connections field isn't updated due to a bug. However, by logging user_statistics.total_connections we can still identify spikes.
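As a rough illustration of the spike detection described above: total_connections is a cumulative counter, so the difference between consecutive samples gives the number of new connections in each interval. The sample layout, function name and threshold below are assumptions, not the actual logging setup on db1044.

```python
def connection_spikes(samples, threshold):
    """Return (timestamp, new_connections) for intervals above threshold.

    samples: list of (timestamp, total_connections) pairs in time order,
    where total_connections is the cumulative counter for one user.
    """
    spikes = []
    for (_, prev), (ts, cur) in zip(samples, samples[1:]):
        # A counter that goes backwards means the statistics were reset
        # (for example by a server restart), so count from zero.
        delta = cur - prev if cur >= prev else cur
        if delta > threshold:
            spikes.append((ts, delta))
    return spikes
```

This measures the rate of new connections rather than concurrency, so it is only a proxy for how close a user gets to max_user_connections.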

So far nobody has abused max_user_connections=512, and, as Coren implied earlier, we can continue to be lenient until forced to do otherwise. Closing.
