Last modified: 2011-03-13 18:05:11 UTC

Bug 1518 - compressOld.php and live now support arbitrary conditions
Product: MediaWiki
Classification: Unclassified
Component: General/Unknown
Hardware: All All
Importance: Lowest normal
Assigned To: Nobody - You can work on this!
Reported: 2005-02-13 00:20 UTC by Jamesday
Modified: 2011-03-13 18:05 UTC


Description Jamesday 2005-02-13 00:20:47 UTC
If suitable, please merge the live compressOld.php and from
/home/wikipedia/common/php-1.4/maintenance with the 1.4 and 1.5 CVS versions.

I received a request to exclude categories and their talk 
pages, which are currently in considerable flux, from the 
concatenated compression to make it easier to delete them. I 
implemented that by adding support for arbitrary SQL 
restrictions in the query that selects which articles to 
compress. There are no safety checks: it's a raw SQL inclusion 
into the query, which seems OK for a maintenance script.

It's currently running live on the site, most recently 
started like this:

nice php compressOld.php en wikipedia -e 20050108000000 -q " cur_namespace not in (10,11,14,15) " -a Burke | tee -a /home/wikipedia/logs/compressOld/20050108enwiki

The script now shows the query when starting, in part because 
the query can take 700 seconds to run and in part so that it 
is visible in case there's a problem with it:

Starting article selection query cur_title >= 'Burke' AND  
cur_namespace not in (10,11,14,15)  ...
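
As an illustration only, here is a minimal Python sketch of how a selection condition like the one above could be assembled. It is not the script's actual PHP code; the function name build_where and its parameters are hypothetical, and the quote-doubling is a simplification standing in for proper database-layer escaping. The extra condition is included raw, as the report describes, and the AND is added only when both parts are present.

```python
def build_where(start_title=None, extra_condition=None):
    """Join the optional selection conditions with AND, skipping empty ones.

    Hypothetical helper for illustration; not taken from compressOld.php.
    """
    conditions = []
    if start_title:
        # Simplified quoting (doubling single quotes); real code should
        # rely on the database layer's own escaping.
        escaped = start_title.replace("'", "''")
        conditions.append("cur_title >= '%s'" % escaped)
    if extra_condition and extra_condition.strip():
        # Raw SQL inclusion with no safety checks, as in the report.
        conditions.append(extra_condition.strip())
    # " AND " appears only between conditions, never on its own.
    return " AND ".join(conditions)


if __name__ == "__main__":
    print(build_where("Burke", " cur_namespace not in (10,11,14,15) "))
```

Collecting the conditions in a list and joining them avoids emitting a stray AND when only one of the two conditions is supplied.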

This one excludes the template and category namespaces and their 
talk namespaces (10, 11, 14 and 15).

EXPLAIN /* compressWithConcat */ SELECT 
cur_namespace,cur_title FROM `cur` WHERE cur_title 
>= 'Burke' AND cur_namespace not in (10,11,14,15) ORDER BY 

*** row 1 ***
          table:  cur
           type:  index
  possible_keys:  cur_title
            key:  cur_title
        key_len:  255
            ref:  NULL
           rows:  1420880
          Extra:  Using where

No problems with the explain result.

Priority set to high because someone is going to hit a 
conflict for this when pushing CVS to the live site if it's 
not merged first.
Comment 1 Jamesday 2005-02-14 09:26:56 UTC
Bug fix for the change in the live version: it included an AND for 
the extra condition when it wasn't necessary.
Comment 2 Jamesday 2005-02-15 12:54:04 UTC
Now includes a partial fix for the case where the concatenated 
version would stop with a "disconnected from database server" error 
after processing a large number of old record updates (15,000+ seen 
in one case): slaves are now checked for lag/pinged after every 500 
old record examinations/updates. They are also checked before 
starting for any case with (currently 200) old records to consider.

It's still possible for the script to be disconnected from the master 
when the script gets a large number of old records and takes many 
minutes loading the results.
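
The periodic check described in this comment can be sketched roughly as follows. This is an illustrative Python sketch, not the script's PHP code; process_in_batches, handle and check_slaves are hypothetical names, and what the check itself does (waiting for lag, pinging connections) is left to the caller.

```python
def process_in_batches(records, handle, check_slaves, batch_size=500):
    """Process records one by one, calling check_slaves after every
    batch_size records. All names here are hypothetical."""
    processed = 0
    for record in records:
        handle(record)
        processed += 1
        if processed % batch_size == 0:
            # Give the caller a chance to wait for slave lag and keep
            # database connections alive.
            check_slaves()
    return processed


if __name__ == "__main__":
    checks = []
    total = process_in_batches(range(1200), lambda r: None,
                               lambda: checks.append(1))
    print(total, len(checks))
```

Note that a check inside the loop does not help while a single large result set is still loading, which matches the remaining limitation mentioned above.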
Comment 3 SJ 2005-05-13 06:01:45 UTC
seems fixed, reducing priority.
Comment 4 Tim Starling 2008-12-30 03:22:13 UTC
Never merged, but the patch is no longer required since the deletion bug is fixed. 
