Last modified: 2012-09-19 20:52:41 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T36104, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 34104 - enable revision.rev_sha1 on enwiki
enable revision.rev_sha1 on enwiki
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
General/Unknown (Other open bugs)
unspecified
All All
: Normal normal (vote)
: ---
Assigned To: Aaron Schulz
:
Depends on:
Blocks: 36464
  Show dependency treegraph
 
Reported: 2012-01-31 21:53 UTC by Rob Lanphier
Modified: 2012-09-19 20:52 UTC (History)
5 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Rob Lanphier 2012-01-31 21:53:36 UTC
Full list of schema changes being built here:
http://etherpad.wikimedia.org/119deployment
Comment 1 Asher Feldman 2012-02-01 19:10:47 UTC
How complete is the list of schema changes on the etherpad?  Is there a target date prior to the start of deployments to freeze the schema?

The sha1 field addition may be the most time consuming change ever done at wmf.  As it's backwards compatible, I'd prefer to make the schema changes sooner than later.

* array( 'addField', 'revision',      'rev_sha1',         'patch-rev_sha1.sql' ),
* array( 'addField', 'archive',       'ar_sha1',          'patch-ar_sha1.sql' ),


The other two currently listed should be backwards compatible as well. 

* array( 'addIndex', 'page', 'page_redirect_namespace_len', 'patch-page_redirect_namespace_len.sql' ),
* array( 'addIndex', 'logging',       'type_action',      'patch-logging-type-action-index.sql'),

While patch-drop-user_options.sql can be delayed til 1.20.
Comment 2 Sam Reed (reedy) 2012-02-02 16:22:36 UTC
(In reply to comment #1)
> How complete is the list of schema changes on the etherpad?  Is there a target
> date prior to the start of deployments to freeze the schema?

That's it for core, I've still got to go through the extensions, but I don't seem to recall there being too many (any?) changes, or if there is, it's certainly not going to be as bad

Do need to get people to confirm whether they want the 1.18wmf1 (hence, no changes at all) versions of extensions bringing into 1.19wmf1, or they want the trunk version of extensions (which there might be changes in)

I'll see about getting the list done for trunk generating, and we can cross stuff out if they aren't needed

(In reply to comment #1)
> The sha1 field addition may be the most time consuming change ever done at wmf.
>  As it's backwards compatible, I'd prefer to make the schema changes sooner
> than later.
> 
> * array( 'addField', 'revision',      'rev_sha1',         'patch-rev_sha1.sql'
> ),
> * array( 'addField', 'archive',       'ar_sha1',          'patch-ar_sha1.sql'
> ),

At least it's only adding the column, not populating it at the same time!
Comment 3 Sam Reed (reedy) 2012-02-02 19:30:02 UTC
Other than an index to be added on the mediawikiwiki code review table (CREATE INDEX /*i*/repo_path ON /*_*/code_paths (cp_repo_id, cp_path); -- 1.4M rows), there are no other database updates that I can see that need anything doing.

Obviously, adding the index is backwards compatible and can just be added, maybe even without master switching

Translate for example has a new table, that's fine to be just done ad-hoc

Rest of features stuff is deployed pretty much at trunk level, so Roan/whoever will have applied the schema updates needed

Maintenance script for CentralAuth needs running somewhere near deploy time, but that's trivial
Comment 4 Asher Feldman 2012-02-08 01:53:41 UTC
The revision.rev_sha1 column addition will be problematic on enwiki in our current production environment.

I took the enwiki slave db38 out of the production and ran:

mysql -e "SET SQL_LOG_BIN = 0; ALTER TABLE /*$wgDBprefix*/revision ADD rev_sha1 varbinary(32) NOT NULL default ''" enwiki

It took over 27 hours to complete, and may take another ~8 hours for replication to catch up.  We can go ahead and do this on every db individually, but it will take quite a while, and may require pushing back the release cycle.  It also increased ibdata1 by 200GB.

Due to the huge amount of time required, we may want to complete the other migrations on db38 and then rebuild all of the other enwiki db's from a hotbackup of it, except for analytics systems where this isn't an option.  We could also take the opportunity to redo db38 from a full sql dump of enwiki in order to switch to file-per-table mode, but this could take an additional week.

Or we can scuttle this change for a future release and until then, examine ways to partition the revision table or store it differently, obtain SSD's for enwiki db's, or store sha1's in their own table and push the cost to queries that join against it.  As its currently unindexed, it doesn't appear it will be of any use except in offline cases anyways.


FYI, I'm also experimenting with Percona's pt-online-schema-change tool, since our revision table meets its requirement of having a single column unique index.  

On db1043, an enwiki slave in eqiad, I am running :

pt-online-schema-change t=enwiki.revision --alter "ADD rev_sha1 varbinary(32) NOT NULL default ''" --sleep 0.1

This writes 1000 rows at a time to a __tmp_revision table and then sleeps 100ms between chunks.  Replication couldn't keep up at all with more aggressive settings.  With these settings, replication kept up for around 12 hours, but is now 3.5 hours behind.  It's been running for 25 hours and the tool estimates it will require 29 more hours to complete, by which time it will be further behind on replication.  We'd need a larger sleep and/or smaller chunks to avoid seriously backing up replication, which may not even work.
Comment 5 Sam Reed (reedy) 2012-02-08 14:40:42 UTC
Revision rows:
enwiki -  460014306
dewiki -  87901921
commons - 42765689

So, dewiki is the next biggest wiki, with enwiki being over 5 times larger (in respect of the number of revision rows)

I do believe the sha1 is basically populated and such throughout the code, but it's only real use is in the xml dumps

We would need to comment out explicit reads/writes on the field, but shouldn't be too painful code wise.

Would this mean the smaller wikis should be doable somewhat easily? ie dewiki should be under 10 hours (giving some margin for guestimation)

Enwiki is obviously will be one of the latter wikis to be upgraded to 1.19

archive rows:
enwiki -  35621907
dewiki -  13702308
commons - 5344901

I'm presuming doing archives ar_sha1 is going to be quite a bit quicker? revision is nearly 13 times larger than archive is on enwiki


Could we get away with just doing a delayed release of this. So the first XML dumps after 1.19 don't have any revision sha1 hashes? Oh well. Obviously they're not greedily populated, so it's not going to be much of an issue.
Comment 6 Asher Feldman 2012-02-08 22:05:06 UTC
rev_sha1 is doable on all wikis other than en in a reasonable time frame, including as an online change for all wikis < de.  ar_sha1 is fine on all wikis off the bat. 

Can use of the sha1 tables be a wfConf setting that is set to false for enwiki?  If so, default could likely be true.

<insert additional grumpy remark about bloating production oltp databases with data only of use for offline analysis>
Comment 7 Sam Reed (reedy) 2012-02-08 22:38:50 UTC
If rev_sha1 is the main issue.. For WMF deployment, there are only 9 uses that might need some work...

Anything doing actual database queries needs making optional, but a lot of the uses in code fall back fairly nicely. I'd guessing only 4 or 5 of them will need actively patching, so that's more than doable

e.g.

			if ( !isset( $row->rev_sha1 ) ) {
				$this->mSha1 = null;
			} else {
				$this->mSha1 = $row->rev_sha1;
			}
Comment 8 Rob Lanphier 2012-02-17 22:59:05 UTC
Asher is going to be out of the office until Feb 26.  Here's the latest update from him: 

"Everything is done except for the s1, s5, and s6 masters.  (s1 is not getting the revision sha1 table pre-1.19, so that alter isn't included in "everything")

"I was planning to swap those three masters and re-run the migration script early next week but if someone else would like to do it, let me know. If so, make sure the eqiad secondary in the replication tree gets switched to the new master as well along with the dns and puppet changes noted at http://wikitech.wikimedia.org/view/Switch_master. "
Comment 9 Asher Feldman 2012-02-28 19:58:37 UTC
All migrations are done, except for revision.rev_sha1 on enwiki.  That alter has only been applied to the host promoted to master yesterday - I had previously run it on one host as a timing benchmark and chose that host to become master to avoid the need for a second switch.

I'd like to start that last enwiki migrations after the 1.19 deployment is complete but its important that the logic on whether to use revision.rev_sha1 is not based on the existence of that column on the master.  It exists but any write to it will break replication to all slaves and likely bring down the site.
Comment 10 Sam Reed (reedy) 2012-02-28 20:29:01 UTC
(In reply to comment #9)
> All migrations are done, except for revision.rev_sha1 on enwiki.  That alter
> has only been applied to the host promoted to master yesterday - I had
> previously run it on one host as a timing benchmark and chose that host to
> become master to avoid the need for a second switch.
> 
> I'd like to start that last enwiki migrations after the 1.19 deployment is
> complete but its important that the logic on whether to use revision.rev_sha1
> is not based on the existence of that column on the master.  It exists but any
> write to it will break replication to all slaves and likely bring down the
> site.

We can just leave the whole feature disabled on enwiki for the time being, it's going to make very little difference! :)
Comment 11 Rob Lanphier 2012-02-29 00:27:30 UTC
Leaving this open since we still need to get the sha1 stuff done for enwiki, but bumping down the priority and taking this off as a 1.19wmf1 deployment blocker.
Comment 12 Mark A. Hershberger 2012-03-01 21:35:10 UTC
tightening summary and moving to 1.20 deployment
Comment 13 Asher Feldman 2012-04-10 18:26:01 UTC
With the last enwiki analytic db completed today, this feature may now be enabled.
Comment 14 Aaron Schulz 2012-04-10 18:26:32 UTC
(In reply to comment #13)
> With the last enwiki analytic db completed today, this feature may now be
> enabled.

After running a population script :)
Comment 15 Sam Reed (reedy) 2012-04-10 19:21:20 UTC
(In reply to comment #14)
> (In reply to comment #13)
> > With the last enwiki analytic db completed today, this feature may now be
> > enabled.
> 
> After running a population script :)

We didn't run it for any of the other wikis, did we?
Comment 16 Rob Lanphier 2012-04-18 16:34:09 UTC
Aaron, will this automatically happen as a result of deploying 1.20wmf1 to enwiki?  If so, are we ready for it to happen automatically happen?
Comment 17 Sam Reed (reedy) 2012-04-18 16:35:46 UTC
(In reply to comment #16)
> Aaron, will this automatically happen as a result of deploying 1.20wmf1 to
> enwiki?  If so, are we ready for it to happen automatically happen?

Indeed, we haven't backported the changes.

We might as well just remove the enwiki entry in wmfUseRevSha1Columns and let it get on with it

No point trying to back the code out of 1.19wmf1
Comment 18 Sam Reed (reedy) 2012-04-18 16:49:41 UTC
Done

It's going to be lazy populated, unless we find a reason to do it via a maintenance script. Which for obvious reasons, on enwiki especially, it's not going to be quick
Comment 19 MZMcBride 2012-04-18 22:33:26 UTC
(In reply to comment #18)
> It's going to be lazy populated, unless we find a reason to do it via a
> maintenance script. Which for obvious reasons, on enwiki especially, it's not
> going to be quick

What does "lazy populated" mean here?
Comment 20 Sam Reed (reedy) 2012-04-18 23:08:01 UTC
(In reply to comment #19)
> (In reply to comment #18)
> > It's going to be lazy populated, unless we find a reason to do it via a
> > maintenance script. Which for obvious reasons, on enwiki especially, it's not
> > going to be quick
> 
> What does "lazy populated" mean here?

Hmm. It's not even lazy populated. I thought it was updated (if null) on load of revisions or something, it's a while since I'd looked at the code.

It's only going to be set on creation of new revisions. This is no different to any of the other wikis. If we run the maintenance script to populate these (on enwiki especially), it's going to take a long time.

It's a case of whether "we" (the community, analysts, whoever) need the information, in which case, it's worth

That should be opened as a new bug request if desired.
Comment 21 MZMcBride 2012-04-18 23:17:33 UTC
(In reply to comment #20)
> It's only going to be set on creation of new revisions. This is no different to
> any of the other wikis. If we run the maintenance script to populate these (on
> enwiki especially), it's going to take a long time.

You've mentioned that it's going to take a long time twice now. It's not a race. Nobody's going to require that you hold your breath while the script runs. It doesn't matter how long it takes to populate (and the sooner someone starts it, the sooner it'll be finished!).

> It's a case of whether "we" (the community, analysts, whoever) need the
> information, in which case, it's worth
> 
> That should be opened as a new bug request if desired.

I can think of few ideas more daft than spending all of this time and energy adding the columns only to not populate them. A separate bug makes sense. Filed as bug 36081.
Comment 22 Aaron Schulz 2012-04-19 02:47:41 UTC
There is a script to populate this which should be run soon.

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links