Last modified: 2011-03-13 18:05:11 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from bug reports and their history, links might be broken. See T3518, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 1518 - compressOld.php and compressOld.inc live now support arbitrary conditions
Status: RESOLVED WONTFIX
Product: MediaWiki
Classification: Unclassified
Component: General/Unknown
Version: 1.4.x
Hardware: All All
Importance: Lowest normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2005-02-13 00:20 UTC by Jamesday
Modified: 2011-03-13 18:05 UTC (History)


Description Jamesday 2005-02-13 00:20:47 UTC
If suitable, please merge the live compressOld.php and
compressOld.inc from /home/wikipedia/common/php-1.4/maintenance
with the 1.4 and 1.5 CVS versions.

I received a request to exclude categories and their talk
pages, which are currently in considerable flux, from the
concatenated compression to make it easier to delete them. I
implemented that by adding support for arbitrary SQL
restrictions in the query which selects which articles to
compress. No safety checks - it's a raw SQL inclusion into
the query, which seems OK for a maintenance script.

It's currently running live on the site, most recently 
started like this:

nice php compressOld.php en wikipedia -e 20050108000000 -q " cur_namespace not in (10,11,14,15) " -a Burke | tee -a /home/wikipedia/logs/compressOld/20050108enwiki

The script now shows the query when starting it, in part
because the query can take 700 seconds to run and in part to
show the query in case there's a problem with it:

Starting article selection query cur_title >= 'Burke' AND  cur_namespace not in (10,11,14,15)  ...

This one is excluding template, category and their talk 
pages.
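
For illustration only, here is a minimal Python sketch of how a selection condition like the one above could be assembled. It is not the live PHP code: the function names, the dictionary and the naive quoting are all assumptions made for this example. The namespace numbers are the standard MediaWiki ones (10 Template, 11 Template talk, 14 Category, 15 Category talk). Joining only the non-empty parts avoids a dangling AND when either the start title or the extra condition is missing.

```python
# Hypothetical sketch; names are illustrative and not taken from compressOld.inc.

# Standard MediaWiki namespace numbers excluded in the run shown above.
EXCLUDED_NAMESPACES = {
    10: "Template",
    11: "Template talk",
    14: "Category",
    15: "Category talk",
}


def build_condition(start_title="", extra_condition=""):
    """Join the optional parts of the WHERE clause with AND."""
    parts = []
    if start_title:
        # Naive quoting, adequate only for this illustration.
        parts.append("cur_title >= '%s'" % start_title.replace("'", "''"))
    if extra_condition.strip():
        # Raw SQL from the operator, included without safety checks.
        parts.append(extra_condition.strip())
    return " AND ".join(parts)


def build_query(start_title="", extra_condition=""):
    """Build the article selection query, omitting WHERE when unrestricted."""
    sql = "SELECT cur_namespace,cur_title FROM `cur`"
    condition = build_condition(start_title, extra_condition)
    if condition:
        sql += " WHERE " + condition
    return sql + " ORDER BY cur_title"


if __name__ == "__main__":
    namespaces = ",".join(str(n) for n in sorted(EXCLUDED_NAMESPACES))
    print(build_query("Burke", "cur_namespace not in (%s)" % namespaces))
```

Because the extra condition is pasted in unchecked, a sketch like this is only reasonable for a script run by trusted operators, as noted above.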

EXPLAIN /* compressWithConcat */ SELECT cur_namespace,cur_title FROM `cur`
WHERE cur_title >= 'Burke' AND cur_namespace not in (10,11,14,15)
ORDER BY cur_title:

*** row 1 ***
          table:  cur
           type:  index
  possible_keys:  cur_title
            key:  cur_title
        key_len:  255
            ref:  NULL
           rows:  1420880
          Extra:  Using where

No problems with the explain result.

Priority set to high because someone is going to hit a 
conflict for this when pushing CVS to the live site if it's 
not merged first.

Comment 1 Jamesday 2005-02-14 09:26:56 UTC
Bug fix for the change in the live version - it included an AND for the
extra condition when it wasn't necessary.

Comment 2 Jamesday 2005-02-15 12:54:04 UTC
Now includes a partial fix for the case where the concatenated
version would stop with a "disconnected from database server" error
after processing a large number of old record updates (15,000+ seen
in one case) - slaves are now checked for lag/pinged after every 500
old record examinations/updates. The same check also runs before
starting on any case with (currently) 200 old records to consider.

It's still possible for the script to be disconnected from the master 
when the script gets a large number of old records and takes many 
minutes loading the results.
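
As a rough illustration of the periodic check described above, the following Python sketch calls a lag check after every 500 processed records. It is not the live PHP code: `process_old_records`, `update_record` and `wait_for_slaves` are hypothetical names, and the two callbacks stand in for whatever the script really does to update a record and to check/ping the database servers. It does not address the remaining problem of being disconnected while a large result set is loading.

```python
# Hypothetical sketch; names are illustrative and not taken from compressOld.inc.

LAG_CHECK_INTERVAL = 500  # interval stated in the comment above


def process_old_records(records, update_record, wait_for_slaves):
    """Update each record, running the lag check every LAG_CHECK_INTERVAL records.

    Returns the number of lag checks performed.
    """
    checks = 0
    for count, record in enumerate(records, start=1):
        update_record(record)
        if count % LAG_CHECK_INTERVAL == 0:
            # Stand-in for checking slave lag and pinging the connection.
            wait_for_slaves()
            checks += 1
    return checks


if __name__ == "__main__":
    # Dummy callbacks: no real database work is done here.
    n = process_old_records(range(15000), lambda r: None, lambda: None)
    print("lag checks performed:", n)
```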

Comment 3 SJ 2005-05-13 06:01:45 UTC
Seems fixed, reducing priority.

Comment 4 Tim Starling 2008-12-30 03:22:13 UTC
Never merged, but the patch is no longer required since the deletion bug is fixed.
