Last modified: 2011-03-13 18:05:11 UTC
If suitable, please merge the live compressOld.php and
compressOld.inc from
/home/wikipedia/common/php-1.4/maintenance
with the 1.4 and 1.5 CVS versions.
I received a request to exclude categories and their talk
pages, which are currently in considerable flux, from the
concatenated compression to make them easier to delete. I
implemented that by adding support for arbitrary SQL
restrictions in the query that selects which articles to
compress. There are no safety checks: it's a raw SQL
inclusion into the query, which seems OK for a maintenance
script.
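A minimal sketch of the idea, written in Python for illustration rather than taken from the PHP of compressOld.php. The function name, parameters and quoting are assumptions; only the column names and the resulting condition string come from this report. Collecting the conditions in a list and joining them means an AND appears only between two real conditions.

```python
def build_selection_where(start_title=None, extra_condition=None):
    """Return the WHERE clause body for the article selection query.

    Hypothetical helper: names are illustrative, not from compressOld.php.
    """
    conditions = []
    if start_title is not None:
        # Naive quoting for illustration only; real code should use the
        # database layer's own escaping.
        escaped = start_title.replace("'", "''")
        conditions.append(f"cur_title >= '{escaped}'")
    if extra_condition and extra_condition.strip():
        # Raw SQL inclusion with no safety checks, as described above.
        conditions.append(extra_condition.strip())
    # Joining the list puts "AND" only between conditions that exist.
    return " AND ".join(conditions)


if __name__ == "__main__":
    print(build_selection_where("Burke", " cur_namespace not in (10,11,14,15) "))
```

With a start title of Burke and the namespace restriction, this yields the same condition string that the script prints on startup; with neither argument it returns an empty string, so the caller can omit the WHERE clause entirely.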
It's currently running live on the site, most recently
started like this:
nice php compressOld.php en wikipedia -e 20050108000000 -q " cur_namespace not in (10,11,14,15) " -a Burke | tee -
The script now shows the query when starting, partly because
the query can take 700 seconds to run and partly so that any
problem with it is visible:
Starting article selection query cur_title >= 'Burke' AND
cur_namespace not in (10,11,14,15) ...
This one excludes the template and category namespaces and
their talk pages.
EXPLAIN /* compressWithConcat */ SELECT
cur_namespace,cur_title FROM `cur` WHERE cur_title
>= 'Burke' AND cur_namespace not in (10,11,14,15) ORDER BY
*** row 1 ***
Extra: Using where
No problems with the explain result.
Priority set to high because someone is going to hit a
conflict for this when pushing CVS to the live site if it's
not merged first.
Bug fix for the change in the live version: it included an AND for the
extra condition when one wasn't necessary.
Now includes a partial fix for the case where the concatenated
version would stop with a "disconnected from database server" error
after processing a large number of old record updates (15,000+ seen
in one case): slaves are now checked for lag and pinged after every
500 old record examinations/updates. The same check is made before
starting any case with a threshold number (currently 200) of old
records to consider.
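The periodic check can be sketched as follows, in Python for illustration and not as the script's actual PHP. The function and callback names are hypothetical, and the sketch assumes the 200 figure means "at least 200 records"; the 500 and 200 values are the ones quoted in this comment.

```python
CHECK_INTERVAL = 500       # records between lag checks/pings
PRECHECK_THRESHOLD = 200   # assumed to mean "at least 200 old records"


def process_old_records(records, update_record, check_slaves):
    """Process old records, calling check_slaves() periodically.

    update_record and check_slaves are hypothetical callbacks standing in
    for the per-record work and the slave lag check/ping.
    """
    # Check once up front when there is a lot of work queued.
    if len(records) >= PRECHECK_THRESHOLD:
        check_slaves()
    for count, record in enumerate(records, start=1):
        update_record(record)
        # Check again after every CHECK_INTERVAL records.
        if count % CHECK_INTERVAL == 0:
            check_slaves()


if __name__ == "__main__":
    checks = []
    process_old_records(list(range(1200)), lambda r: None,
                        lambda: checks.append(1))
    print(len(checks))  # one up-front check plus checks at 500 and 1000
```

This only covers checks made between record updates; it does nothing while a single large result set is still being loaded.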
It's still possible for the script to be disconnected from the master
when the script gets a large number of old records and takes many
minutes loading the results.
Seems fixed; reducing priority.
Never merged, but the patch is no longer required since the deletion bug is fixed.