Last modified: 2009-09-25 00:09:25 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T22774, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 20774 - Enabling LocalisationUpdate vastly increases CPU activity
Status: RESOLVED FIXED
Product: MediaWiki extensions
Classification: Unclassified
Component: LocalisationUpdate
Version: unspecified
Hardware: All All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Roan Kattouw
URL: http://techblog.wikimedia.org/wp-cont...
Depends on:
Blocks: 18604 19312
Reported: 2009-09-22 22:02 UTC by Brion Vibber
Modified: 2009-09-25 00:09 UTC (History)

Description Brion Vibber 2009-09-22 22:02:10 UTC
CPU usage went wayyy up when enabling LocalisationUpdate:
http://techblog.wikimedia.org/wp-content/uploads/2009/09/broke.png

The empty space in the middle is where bug 20773 killed the site after disabling the extension; after fixing that, CPU usage went back up, then back to normal as caches were rebuilt.

Not deployable in this state; CPU usage needs to be tracked down and cleared up. Is it breaking the cache infrastructure, or is pulling extra stuff from DBs when we've got main localization in CDB files inefficient?
Comment 1 Roan Kattouw 2009-09-23 07:12:38 UTC
Note that l10nupdate's installation triggers invalidation of the l10ncache, causing it to be rebuilt from scratch. Try making a whitespace change to MessagesEn.php and syncing that, then see what the resulting CPU spike looks like. Also, please investigate how well-synchronized the Apaches' clocks are.


(In reply to comment #0)
> Not deployable in this state; CPU usage needs to be tracked down and cleared
> up. Is it breaking the cache infrastructure, or is pulling extra stuff from DBs
> when we've got main localization in CDB files inefficient?
> 
I'll look into the CPU usage; debug logs from earlier local test runs show that l10nupdate is not pulling localizations from the DB once all its stuff is in the l10ncache, however. Offhand, I think the dependency check may hit the DB, but that shouldn't double CPU usage AFAICT.
Comment 2 Brion Vibber 2009-09-23 17:44:31 UTC
Reassigning to Roan.
Comment 3 Roan Kattouw 2009-09-23 20:13:28 UTC
Hopefully fixed with the rewrite in r56831.

Basically, the two major culprits were:
1. the code checking the timestamp of the last update.php run (to determine whether to rebuild the l10ncache) pulled stale data from the slaves, and wasn't smart enough to use queriedTimestamp > expectedTimestamp instead of !=
2. the initial update.php run inserted about a million rows in each of the 5 per-cluster localisation tables, using a separate REPLACE statement for each row; this presumably slowed down replication and worsened #1.
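
The two culprits above can be illustrated with a minimal sketch. This is not the code from r56831 or from LocalisationUpdate itself: the function names, the `localisation` table and its columns are hypothetical, and an in-memory SQLite database stands in for the replicated MySQL setup. It only shows why a `!=` comparison misfires on stale replica data, and how many rows can be written with a few multi-row REPLACE statements instead of one statement per row.

```python
import sqlite3


def needs_rebuild_strict(queried_ts, expected_ts):
    # Fragile check: any mismatch triggers a rebuild, including the case
    # where a lagging replica returns an OLDER timestamp than expected.
    return queried_ts != expected_ts


def needs_rebuild_monotonic(queried_ts, expected_ts):
    # Rebuild only when the queried timestamp is newer than the one the
    # cache was built against; stale (older) reads are ignored.
    return queried_ts > expected_ts


def make_demo_db():
    # Hypothetical schema, used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE localisation ("
        "lang TEXT, msg_key TEXT, msg_text TEXT, "
        "PRIMARY KEY (lang, msg_key))"
    )
    return conn


def replace_batched(conn, rows, chunk_size=100):
    # Write rows in chunks, one multi-row REPLACE per chunk, rather than
    # one REPLACE statement per row. Returns the number of statements sent.
    statements = 0
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        placeholders = ", ".join(["(?, ?, ?)"] * len(chunk))
        params = [value for row in chunk for value in row]
        conn.execute(
            "REPLACE INTO localisation (lang, msg_key, msg_text) VALUES "
            + placeholders,
            params,
        )
        statements += 1
    conn.commit()
    return statements


if __name__ == "__main__":
    # A replica lagging behind returns an older timestamp than expected.
    print(needs_rebuild_strict(100, 200), needs_rebuild_monotonic(100, 200))
    conn = make_demo_db()
    rows = [("en", f"key{i}", f"text{i}") for i in range(250)]
    print(replace_batched(conn, rows))
```

With a stale read (queried timestamp older than expected), the strict check requests a rebuild while the monotonic check does not. The batched writer issues one statement per chunk, which on a replicated database would mean far fewer statements to replay; the chunk size of 100 is an arbitrary choice for the sketch.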

LU now uses a file-based storage system.
Comment 4 Brion Vibber 2009-09-24 22:19:55 UTC
This is now believed to be fixed :)

Doing a more conservative progressive production rollout to confirm this...
Comment 5 Brion Vibber 2009-09-25 00:09:25 UTC
Yay! System is much happier now :D
