Last modified: 2007-07-07 04:54:21 UTC
By using the API it is very easy to overload the servers even if you don't want to do anything harmful. While designing an application I accidentally created too much load, visible as the database lag shown on the watchlist and contributions pages - I could watch the lag increase nearly in real time. Two things should be added:

1. a query for the lag of the database servers, to give applications a chance to recognize heavy load and react
2. throttling of API queries inside the API itself (or an overload error) if the lag exceeds a given value - this is needed for applications that do not examine (1) or that want to harm the servers

If you need my demo queries, I'll send them by mail, but I don't want to present them to the public at the moment.
Note the existence of "maxlag" - http://www.mediawiki.org/wiki/Maxlag_parameter - which is aimed more at people making write queries, but which might also be useful guidance for scripts. In general it might be wise to consider throttling API requests, but at the same time we don't want to make the API useless. An additional help might be directing general queries to a separate query group, which could later be configured to use a distinct set of slave database servers. Storing certain, more expensive API results in the object cache might also help, as might encouraging clients and shared caches to store results for a short period of time in some cases.
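As a rough illustration (not part of the original report), a client could pass maxlag with every read and back off when the servers say they are lagged. The endpoint URL, the handling of both the JSON "maxlag" error code and the older 503/Retry-After response, and the retry limits below are assumptions for the sketch, not a prescribed implementation:

    import time
    import requests

    API_URL = "https://en.wikipedia.org/w/api.php"  # assumed endpoint for this sketch

    def api_get(params, maxlag=5, max_retries=5):
        """Read query that retries with a pause whenever maxlag trips."""
        params = dict(params, format="json", maxlag=maxlag)
        for _ in range(max_retries):
            resp = requests.get(API_URL, params=params)
            # Depending on the MediaWiki version, an exceeded maxlag shows up as a
            # JSON error with code "maxlag" or as an HTTP 503 with a Retry-After header.
            try:
                data = resp.json()
            except ValueError:
                data = {}
            if resp.status_code == 503 or data.get("error", {}).get("code") == "maxlag":
                time.sleep(int(resp.headers.get("Retry-After", 5)))
                continue
            resp.raise_for_status()
            return data
        raise RuntimeError("servers stayed lagged; giving up")

    # Example use: an ordinary query issued politely.
    print(api_get({"action": "query", "meta": "siteinfo", "siprop": "general"}))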
In r23823 I added database server replication lag information to meta=siteinfo. I will close the bug for now; if that is not enough, please reopen and describe what else is needed.
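For reference, a minimal sketch of reading that lag information from meta=siteinfo; in current MediaWiki the property is siprop=dbrepllag, and the exact parameter and field names here should be treated as assumptions rather than a quote of r23823:

    import requests

    API_URL = "https://en.wikipedia.org/w/api.php"  # assumed endpoint for this sketch

    def current_replication_lag():
        """Return the highest replica lag, in seconds, that the wiki reports."""
        resp = requests.get(API_URL, params={
            "action": "query",
            "meta": "siteinfo",
            "siprop": "dbrepllag",
            "sishowalldb": 1,   # list every replica, not only the most lagged one
            "format": "json",
        })
        resp.raise_for_status()
        entries = resp.json().get("query", {}).get("dbrepllag", [])
        # A wiki without replication may report a lag of 0 or -1.
        return max((e.get("lag", 0) for e in entries), default=0)

    # An application can poll this and throttle itself when lag climbs, which is
    # the kind of self-limiting behaviour request (1) in this bug asks for.
    if current_replication_lag() > 5:
        print("replicas are lagging; backing off bulk queries")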