Last modified: 2014-08-27 22:43:16 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T69602, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 67602 - Performance problem on database server s5 using commonswiki
Performance problem on database server s5 using commonswiki
Status: ASSIGNED
Product: Wikimedia Labs
Classification: Unclassified
tools (Other open bugs)
unspecified
All All
: Unprioritized normal
: ---
Assigned To: Marc A. Pelletier
: performance
Depends on:
Blocks: labs-replication tool-missing-ts-feat merl-tools
  Show dependency treegraph
 
Reported: 2014-07-07 16:19 UTC by merl
Modified: 2014-08-27 22:43 UTC (History)
3 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description merl 2014-07-07 16:19:51 UTC
Some queries are currently not runnable on s5.labsdb because of performance problems. Probably this is caused by federated tables.

Either these queries have not finished after hours and i kill them or i am getting errors caused by out of memory (lost connection, Table './mysql/proc' is marked as crashed)

I have many many complex queries in my scripts using commonswiki.
Here is one example query my bot runs every four hours:
It returns a list of files requested for deletions on commons with a link to deletion talk page section which are used on dewiki presentation namespace and the deletion request page was modified within the last two weeks by a human.
(the "used on dewiki" is not 100% correct because of performance reason i am using the big globalimagelinks in a second query, but that shouldn't matter here - only if sb. wants to copy the query for his own tools)
----------
on WMDE toolserver

 SELECT 'commonswiki', cp.page_id,
   CONCAT('[[:Datei:',REPLACE(cp.page_title,'_',' '),']]<small> [[:commons:Commons:Deletion requests/',REPLACE(SUBSTRING(cl_sortkey_prefix,LOCATE('#',cl_sortkey_prefix)+1),'_',' '),'|CDR]]</small>')
  FROM commonswiki_p.templatelinks
   INNER JOIN commonswiki_p.page cp ON tl_from = cp.page_id
   INNER JOIN commonswiki_p.recentchanges ON rc_cur_id = cp.page_id AND rc_bot=0
   INNER JOIN dewiki_p.imagelinks ON il_to = cp.page_title
   INNER JOIN dewiki_p.page dp ON il_from = dp.page_id
   INNER JOIN commonswiki_p.categorylinks ON cp.page_id=cl_from AND cl_to LIKE 'Deletion_requests_%'
  WHERE tl_namespace=10 AND tl_title='Delete' AND cp.page_namespace=6
   AND DATEDIFF(NOW(), rc_timestamp) < 14
   AND dp.page_namespace IN (0,6,10,14,100) AND cl_sortkey_prefix LIKE '%#%'
  GROUP BY cp.page_id
  ORDER BY NULL;

Result ON TS: 85 rows in set (0.25 sec)
----------
Query for wmf Labs:

 SELECT 'commonswiki', cp.page_id,
   CONCAT('[[:Datei:',REPLACE(cp.page_title,'_',' '),']]<small> [[:commons:Commons:Deletion requests/',REPLACE(SUBSTRING(cl_sortkey_prefix,LOCATE('#',cl_sortkey_prefix)+1),'_',' '),'|CDR]]</small>')
  FROM commonswiki_f_p.templatelinks
   INNER JOIN commonswiki_f_p.page cp ON tl_from = cp.page_id
   INNER JOIN commonswiki_f_p.recentchanges ON rc_cur_id = cp.page_id AND rc_bot=0
   INNER JOIN dewiki_p.imagelinks ON il_to = cp.page_title
   INNER JOIN dewiki_p.page dp ON il_from = dp.page_id
   INNER JOIN commonswiki_f_p.categorylinks ON cp.page_id=cl_from AND cl_to LIKE 'Deletion_requests_%'
  WHERE tl_namespace=10 AND tl_title='Delete' AND cp.page_namespace=6
   AND DATEDIFF(NOW(), rc_timestamp) < 14
   AND dp.page_namespace IN (0,6,10,14,100) AND cl_sortkey_prefix LIKE '%#%'
  GROUP BY cp.page_id
  ORDER BY NULL;

aborted by me after 3 hours.

Please optimize database performance for queries joining with commonswiki.
Comment 1 Marc A. Pelletier 2014-07-07 17:31:14 UTC
That query is unlikely to be usable as-in with federation, at least not in its current form (using a federated table requires that the rows selected during a join are strictly constrained on indexed columns).  In particular, the way the joins are structured will require several full table scans because the relations are not visible to federation, only the presence or absence of indices.

Arguably, doing this at the application level is the only correct solution as the load on the database would be orders of magnitude smaller and the queries much more manageable.

I will try to coordinate resources to help you adapt those bots as quickly as possible.
Comment 2 Sam Reed (reedy) 2014-07-07 17:49:47 UTC
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: templatelinks
         type: ref
possible_keys: tl_from,tl_namespace,tl_backlinks_namespace
          key: tl_namespace
      key_len: 261
          ref: const,const
         rows: 1
        Extra: Using where; Using index; Using temporary
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: categorylinks
         type: ref
possible_keys: cl_from,cl_timestamp,cl_sortkey
          key: cl_from
      key_len: 4
          ref: dewiki.templatelinks.tl_from
         rows: 1
        Extra: Using where
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: cp
         type: eq_ref
possible_keys: PRIMARY,name_title
          key: PRIMARY
      key_len: 4
          ref: dewiki.templatelinks.tl_from
         rows: 1
        Extra: Using where
*************************** 4. row ***************************
           id: 1
  select_type: SIMPLE
        table: recentchanges
         type: ref
possible_keys: rc_cur_id
          key: rc_cur_id
      key_len: 4
          ref: dewiki.templatelinks.tl_from
         rows: 1
        Extra: Using where
*************************** 5. row ***************************
           id: 1
  select_type: SIMPLE
        table: imagelinks
         type: ref
possible_keys: il_from,il_to
          key: il_to
      key_len: 257
          ref: dewiki.cp.page_title
         rows: 5
        Extra: Using index
*************************** 6. row ***************************
           id: 1
  select_type: SIMPLE
        table: dp
         type: eq_ref
possible_keys: PRIMARY,name_title
          key: PRIMARY
      key_len: 4
          ref: dewiki.imagelinks.il_from
         rows: 1
        Extra: Using where
6 rows in set (0.01 sec)
Comment 3 merl 2014-07-12 09:03:11 UTC
I just divided another complex query into multiple queries for a performance test. This searches for included not existing images of a single article (there is only one image (Icon tools.svg) embedded in this article 8188489, so DF_temp has only one row):

CREATE TEMPORARY TABLE IF NOT EXISTS DF_temp (it VARCHAR(255) PRIMARY KEY);
DELETE FROM DF_temp;
INSERT IGNORE INTO DF_temp (it)
 SELECT STRAIGHT_JOIN il_to
  FROM dewiki_p.page p1
   INNER JOIN dewiki_p.imagelinks ON p1.page_id=il_from
   WHERE p1.page_id = 8188489;	
DELETE DF_temp
 FROM DF_temp
  INNER JOIN dewiki_p.image ON it=img_name;
DELETE DF_temp
 FROM DF_temp
  INNER JOIN commonswiki_f_p.image ON it=img_name;

The last query fails after the timeout of 10 minutes it set on labs. img_name is primary key, so also your condition "rows selected during a join are strictly constrained on indexed columns" is true here.

There are also two additional queries checking for redirects to images. With these additional queries runtime on TS is about 0.08 seconds.
Comment 4 Marc A. Pelletier 2014-07-12 13:31:16 UTC
Ah, there's clearly something stupid in the way the query is planned, because:

select * FROM commonswiki_f_p.image where img_name in('Icon_tools.svg', 'Qsicon_Fokus2.svg');

instead of the third statement returns:

1 row in set (0.00 sec)

which is as expected, and the planner explains that when doing your third query it is specifically _not_ using a key(!)

Likewise:

select * FROM commonswiki_f_p.image where img_name in(select it from DF_temp);

times out (explain shows no index used).  Where the problem lies might be in that the query planner somehow things there are no rows in the federated table.  For that matter, even forcing the index fails to use it:

select img_bits FROM DF_temp  inner JOIN commonswiki_f_p.image force index (primary) ON it=img_name;

also times out.

This is clearly incorrect behaviour; I'll be looking into it.
Comment 5 Marc A. Pelletier 2014-07-17 12:50:50 UTC
No further efforts trying to fix federation will be spent as we are upgrading to MariaDB 10 which will provide commons locally and do away with federation entirely.

The first database being transitioned is s5 (currently in progress), solving this for the problem at hand.
Comment 6 Marc A. Pelletier 2014-07-24 22:32:23 UTC
@merl; how did the transition to MariaDB 10 go?
Comment 7 merl 2014-07-24 22:52:39 UTC
After the change to mariadb10 nearly all of my script always failed with one of two different errors:
ERROR 2013 (HY000): Lost connection to MySQL server during query
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

I solved this by prepending all sql scripts with
SET SESSION tx_isolation='READ-UNCOMMITTED';

Sean also raised the lock timeout in between. I haven't checked if this makes my session based transaction isolation level change unnecessary.

All scripts/queries using joins with commonswiki are running successful now.
But i have not tested all scripts today. I'll make a complete report within the next two days.
There seems at least a problem with two scripts querying wikidata only (wikidatawiki_p.wb_terms). I am still investigating in this problem. These had been running on s5.labsdb with mariadb5 without problems.
Comment 8 Marc A. Pelletier 2014-08-27 22:43:16 UTC
Any news on this that isn't otherwise tracked with Sean regarding performance issues with the new database?

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links