Last modified: 2013-07-19 04:44:21 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T49045, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 47045 - Special:AbuseLog?wpSearchUser=${ip} takes 30-90 seconds to load, sometimes is 504
Special:AbuseLog?wpSearchUser=${ip} takes 30-90 seconds to load, sometimes is...
Status: RESOLVED FIXED
Product: MediaWiki extensions
Classification: Unclassified
AbuseFilter (Other open bugs)
unspecified
All All
: High normal with 2 votes (vote)
: ---
Assigned To: Nobody - You can work on this!
: performance
Depends on:
Blocks: SWMT
  Show dependency treegraph
 
Reported: 2013-04-09 14:23 UTC by Kunal Mehta (Legoktm)
Modified: 2013-07-19 04:44 UTC (History)
11 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Kunal Mehta (Legoktm) 2013-04-09 14:23:34 UTC
Quick example: https://en.wikipedia.org/w/index.php?title=Special:AbuseLog&wpSearchUser=90.221.128.138

<!-- Served by mw1059 in 36.766 secs. -->

This was originally reported in #wikimedia-tech by Cyberpower and then confirmed by myself and jeremyb.
Comment 1 jeremyb 2013-04-09 14:29:16 UTC
09 14:26:43 < Cyberpower678> So what's causing the problem?
09 14:27:06  * jeremyb has to assume full table scan
09 14:27:22 < jeremyb> the timing is too consistent

See also Ie331ec7ef3b461ae10
Comment 2 Sam Reed (reedy) 2013-04-09 15:53:48 UTC
mysql:wikiadmin@db1017 [enwiki]> explain SELECT * FROM abuse_filter_log LEFT JOIN abuse_filter on (af_id=afl_filter) WHERE afl_user = 0 AND afl_user_text='90.221.128.138';
+----+-------------+------------------+--------+---------------+----------+---------+------------------------------------+---------+-------------+
| id | select_type | table            | type   | possible_keys | key      | key_len | ref                                | rows    | Extra       |
+----+-------------+------------------+--------+---------------+----------+---------+------------------------------------+---------+-------------+
|  1 | SIMPLE      | abuse_filter_log | ref    | afl_user      | afl_user | 8       | const                              | 3893344 | Using where |
|  1 | SIMPLE      | abuse_filter     | eq_ref | PRIMARY       | PRIMARY  | 8       | enwiki.abuse_filter_log.afl_filter |       1 |             |
+----+-------------+------------------+--------+---------------+----------+---------+------------------------------------+---------+-------------+
2 rows in set (0.06 sec)
Comment 3 Sam Reed (reedy) 2013-04-09 16:07:56 UTC
mysql:wikiadmin@db1017 [enwiki]> explain SELECT * FROM abuse_filter_log LEFT JOIN abuse_filter on (af_id=afl_filter) WHERE afl_user = 0 AND afl_user_text='90.221.128.138'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: abuse_filter_log
         type: ref
possible_keys: afl_user
          key: afl_user
      key_len: 8
          ref: const
         rows: 3959592
        Extra: Using where
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: abuse_filter
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 8
          ref: enwiki.abuse_filter_log.afl_filter
         rows: 1
        Extra:
2 rows in set (0.02 sec)




Running the query on enwiki (not going to paste results):
4 rows in set (47.63 sec)
Comment 4 Sam Reed (reedy) 2013-04-09 16:08:22 UTC
Missing indexes are missing:

http://p.defau.lt/?83qAX_9Lk44zHB1B7CKHbQ

vs

	PRIMARY KEY (afl_id),
	KEY filter_timestamp (afl_filter,afl_timestamp),
	KEY user_timestamp (afl_user,afl_user_text,afl_timestamp),
	KEY (afl_timestamp),
	KEY page_timestamp (afl_namespace, afl_title, afl_timestamp),
	KEY ip_timestamp (afl_ip, afl_timestamp),
	KEY (afl_rev_id),
	KEY (afl_log_id),
	KEY wiki_timestamp (afl_wiki, afl_timestamp)
Comment 5 Chris Steipp 2013-04-09 17:32:30 UTC
Talked with hoo about this. It looks like we probably want to modify the query so that  $conds['afl_ip'] = IP::prettifyIP( $this->mSearchUser ); when searching for an IP.

That should make good use of the (enwiki's) afl_ip index, which we probably want to add back into the extension's sql definition.
Comment 6 Aaron Schulz 2013-04-09 17:35:19 UTC
(In reply to comment #5)
> Talked with hoo about this. It looks like we probably want to modify the
> query
> so that  $conds['afl_ip'] = IP::prettifyIP( $this->mSearchUser ); when
> searching for an IP.
> 
> That should make good use of the (enwiki's) afl_ip index, which we probably
> want to add back into the extension's sql definition.

Why would a system store IPs using IP::prettifyIP() and not IP::sanitizeIP() (or a hex function preferrably)?
Comment 7 Marius Hoch 2013-04-09 18:06:54 UTC
(In reply to comment #6)
> Why would a system store IPs using IP::prettifyIP() and not IP::sanitizeIP()
> (or a hex function preferrably)?

Because it's bad? Someone wrote that back then IPv4 was up to date and it simply didn't matter what you use... now we have a lot of inconsistent data and a lot of trouble.
Comment 8 Aaron Schulz 2013-04-10 01:08:32 UTC
IP::prettifyIP() is really meant for display only, not storage. It would be disappointing to have use that.
Comment 9 Chris Steipp 2013-04-11 19:35:04 UTC
AbuseFilter stores the result of $wgRequest->getIP(), which when I look in the db of enwiki, seems to always look like the rfc5952 format, which seems to be what prettifyIP also gives.

For example, try: select afl_user, afl_user_text, afl_ip, afl_timestamp from abuse_filter_log where afl_timestamp > '20130401000000' and afl_user_text != afl_ip and afl_user = 0 AND afl_user_text like '%:%' limit 100;


Maybe we need an rfc5952 formatter in the IP class, to handle any pieces that prettifyIP doesn't?

The better, long term fix is to store all addresses as numbers, and then translate the format when we display them, but I think it would be good to get this bug fixed a little sooner than it would take to do the schema and code changes for that.
Comment 10 jeremyb 2013-04-18 15:37:07 UTC
This page is also updating user touched time for the user fetching the page. Got a lock timeout on the s1 master for that user touched update.

@ Special:AbuseLog?wpSearchUser=${ip}
Comment 11 Cyberpower678 2013-04-18 15:39:53 UTC
Load times have gotten longer to the point where I get a 504 a lot of the times now.
Comment 12 MZMcBride 2013-04-19 02:43:23 UTC
Thanks for filing this bug, Legoktm.
Comment 13 king.of.hearts.wiki 2013-04-19 03:33:00 UTC
I am also getting 504 whenever I query AbuseLog for IPs, both v4 and v6. I do not get an error for user accounts or articles.
Comment 14 Gerrit Notification Bot 2013-04-19 15:56:31 UTC
Related URL: https://gerrit.wikimedia.org/r/60021 (Gerrit Change I003a609251cf922939d8c2e49aff025d726ccf78)
Comment 15 Chris Steipp 2013-04-19 16:03:50 UTC
I just added https://gerrit.wikimedia.org/r/#/c/60021/

That *can* be deployed, which will revert the query change from Ie331ec7ef3b461ae10 and allow IPv4 queries to be returned quickly.

Comparison of query performance on enwiki below.

However, this will break querying for IPv6 ip addresses. For that, we need a slightly more involved fix. I'll start working on that right now. But if getting back functionality is worth breaking IPv6 lookups (which have been broken until Ie331ec7ef3b461ae10), then the patch can be merged and deployed, if the Stewards give an ok.




mysql:wikiadmin@db1056 [enwiki]> explain SELECT * FROM abuse_filter_log LEFT
    -> JOIN abuse_filter on (af_id=afl_filter) WHERE afl_user = 0 AND
    -> afl_user_text='90.221.128.138';
+------+-------------+------------------+--------+---------------+----------+---------+------------------------------------+---------+-------------+
| id   | select_type | table            | type   | possible_keys | key      | key_len | ref                                | rows    | Extra       |
+------+-------------+------------------+--------+---------------+----------+---------+------------------------------------+---------+-------------+
|    1 | SIMPLE      | abuse_filter_log | ref    | afl_user      | afl_user | 8       | const                              | 4343552 | Using where |
|    1 | SIMPLE      | abuse_filter     | eq_ref | PRIMARY       | PRIMARY  | 8       | enwiki.abuse_filter_log.afl_filter |       1 | Using where |
+------+-------------+------------------+--------+---------------+----------+---------+------------------------------------+---------+-------------+
2 rows in set (0.08 sec)

mysql:wikiadmin@db1056 [enwiki]> explain SELECT * FROM abuse_filter_log LEFT JOIN abuse_filter on (af_id=afl_filter) WHERE afl_ip=afl_user_text AND afl_user = 0 AND afl_user_text='90.221.128.138';
+------+-------------+------------------+--------+-----------------+---------+---------+------------------------------------+------+------------------------------------+
| id   | select_type | table            | type   | possible_keys   | key     | key_len | ref                                | rows | Extra                              |
+------+-------------+------------------+--------+-----------------+---------+---------+------------------------------------+------+------------------------------------+
|    1 | SIMPLE      | abuse_filter_log | ref    | afl_user,afl_ip | afl_ip  | 257     | const                              |    4 | Using index condition; Using where |
|    1 | SIMPLE      | abuse_filter     | eq_ref | PRIMARY         | PRIMARY | 8       | enwiki.abuse_filter_log.afl_filter |    1 | Using where                        |
+------+-------------+------------------+--------+-----------------+---------+---------+------------------------------------+------+------------------------------------+
2 rows in set (0.04 sec)
Comment 16 Asher Feldman 2013-04-19 18:06:02 UTC
I've added the user_timestamp index to enwiki, the original query is now fast:

mysql:wikiadmin@db1056 [enwiki]> explain SELECT * FROM abuse_filter_log LEFT     JOIN abuse_filter on (af_id=afl_filter) WHERE afl_user = 0 AND     afl_user_text='90.221.128.138'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: abuse_filter_log
         type: ref
possible_keys: user_timestamp
          key: user_timestamp
      key_len: 265
          ref: const,const
         rows: 4
        Extra: Using index condition
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: abuse_filter
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 8
          ref: enwiki.abuse_filter_log.afl_filter
         rows: 1
        Extra: Using where
2 rows in set (0.03 sec)

Probably still missing on most other projects.
Comment 17 Kunal Mehta (Legoktm) 2013-04-20 23:20:06 UTC
(In reply to comment #0)
> Quick example:
> https://en.wikipedia.org/w/index.php?title=Special:AbuseLog&wpSearchUser=90.
> 221.128.138
> 
> <!-- Served by mw1059 in 36.766 secs. -->
> 
Same link now is <!-- Served by mw1062 in 0.254 secs. --> so it looks fixed for enwiki.

Should a new bug be filed to get the index created on all the other projects using the AbuseFilter?
Comment 18 Chris Steipp 2013-04-22 17:14:28 UTC
Wikis created after 2010 have the index already, so it would probably be easiest if you add specific wikis where this problem is showing up. Either here or a new bug.
Comment 19 MZMcBride 2013-04-23 01:46:56 UTC
(In reply to comment #17)
> Should a new bug be filed to get the index created on all the other projects
> using the AbuseFilter?

I think a new bug should be filed about getting this index on all Wikimedia wikis (all Wikimedia wikis have the AbuseFilter extension enabled). I don't think we want to burden ourselves with an inconsistent schema for this extension across wikis. I think it'll just end up making debugging future issues more difficult and annoying.

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links