Last modified: 2014-03-14 01:00:57 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T64620, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 62620 - Add index on revision user data for contributions lookup
Add index on revision user data for contributions lookup
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
General/Unknown (Other open bugs)
unspecified
All All
: Unprioritized normal (vote)
: ---
Assigned To: Sean Pringle
:
Depends on:
Blocks: 49188
  Show dependency treegraph
 
Reported: 2014-03-13 20:58 UTC by Erik Bernhardson
Modified: 2014-03-14 01:00 UTC (History)
0 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Erik Bernhardson 2014-03-13 20:58:49 UTC
Short description:

Add an index for answering contributions queries.

Example queries:

All queries below may also have a condition of

    rev_id > :someBinaryUid
or
    rev_id < :someBinaryUid

for pagination

The FROM/JOIN statement can also switch to:

    FROM flow_revision
    INNER JOIN flow_header_revision ON header_rev_id = rev_id
    INNER JOIN flow_workflow ON workflow_id = header_workflow_id

SELECT *
  FROM flow_revision
  INNER JOIN flow_tree_revision ON tree_rev_id = rev_id
  INNER JOIN tree_descendant_id = tree_rev_descendant_id
  INNER JOIN flow_workflow ON workflow_id = tree_ancestor_id
 WHERE rev_user_id > :someId
   AND rev_user_ip IS NULL 
   AND rev_user_wiki = :someWiki
   AND workflow-wiki = :someWiki
 ORDER BY rev_id DESC
 LIMIT :limit

SELECT *
  FROM flow_revision
  INNER JOIN flow_tree_revision ON tree_rev_id = rev_id
  INNER JOIN tree_descendant_id = tree_rev_descendant_id
  INNER JOIN flow_workflow ON workflow_id = tree_ancestor_id
 WHERE rev_user_wiki = :someWiki
   AND rev_user_id = :someId
   AND rev_user_ip IS NULL;
 ORDER BY rev_id DESC
 LIMIT :limit

SELECT *
  FROM flow_revision
  INNER JOIN flow_tree_revision ON tree_rev_id = rev_id
  INNER JOIN tree_descendant_id = tree_rev_descendant_id
  INNER JOIN flow_workflow ON workflow_id = tree_ancestor_id
 WHERE rev_user_wiki = :someWiki
   AND rev_user_id IS NULL
   AND rev_user_ip = :someIp;
 ORDER BY rev_id DESC
 LIMIT :limit

We are aware these queries are well less than efficient, to many joins especially for the ones that join against flow_tree_revision.  We are revisting
the data modeling to figure out how more efficiently model the data.

Which wikis are affected:

flowdb on the external(non-wiki) db cluster

Which tables:

flow_revision

What is the change to those tables:

CREATE INDEX /*i*/flow_revision_user 
    ON /*_*/flow_revision (rev_user_id, rev_user_ip, rev_user_wiki);

Links to gerrit changes and/or other related bug reports.

https://gerrit.wikimedia.org/r/#/c/116115/

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links