Last modified: 2011-03-13 18:06:08 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links may be broken. See T5149, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 3149 - Increase the cached set size for query pages
Status: RESOLVED WONTFIX
Product: Wikimedia
Classification: Unclassified
Component: General/Unknown (Other open bugs)
Version: unspecified
Hardware/OS: All / All
Importance: Lowest enhancement with 2 votes
Target Milestone: ---
Assigned To: Nobody - You can work on this!
URL: http://en.wikipedia.org/w/index.php?t...
Keywords: shell
Duplicates: bug 8450
Depends on: (none)
Blocks: (none)
Reported: 2005-08-14 22:18 UTC by Kellen
Modified: 2011-03-13 18:06 UTC
CC: 3 users


Comment 1 SJ 2006-01-21 02:08:11 UTC
See also bug 2415, re: lonelypages. The same is also true for ancientpages... static dumps of the full lists, along with a quick count of the number of matches for those queries, would be very helpful; the latter on the Special:Statistics page.
Comment 2 Kellen 2006-01-21 10:36:33 UTC
On the lonelypages bug, Ashar says that the 1000-page limit is "set to make it
faster", which doesn't make any sense. How is retrieving records 1-1000 somehow
faster than 1001-2000 out of a single table? And even if it is faster, I am
willing to wait around for a special page if that helps me get some actual work
done.

Also, for wikibooks it's not as simple as for WP. On wikibooks you're working on
a specific subject area/book and you should not be categorizing other books'
pages without knowledge of their conventions, etc. So, if I want to find the
_cookbook_ pages that are uncategorized, I just can't, because they are further
down than 1000. 
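
The question above about whether the LIMIT makes the query cheaper can be
sanity-checked outside MediaWiki. A minimal sketch in Python with SQLite
follows (a toy table standing in for the real page schema, so the names are
illustrative, not Wikimedia's): when the ORDER BY cannot use an index, the
engine scans and sorts everything before LIMIT or OFFSET is applied, so
producing rows 1001-2000 costs about the same as producing rows 1-1000.

    # Illustrative sketch only: a toy SQLite table, not MediaWiki's real
    # page schema. With an ORDER BY that no index can satisfy, the engine
    # scans and sorts the whole table before applying LIMIT or OFFSET, so
    # the second thousand rows cost about as much to produce as the first.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT)")
    conn.executemany("INSERT INTO page (page_title) VALUES (?)",
                     [(f"Title_{i}",) for i in range(20000)])

    for offset in (0, 1000):
        plan = conn.execute(
            "EXPLAIN QUERY PLAN "
            "SELECT page_title FROM page ORDER BY length(page_title) "
            "LIMIT 1000 OFFSET ?", (offset,)).fetchall()
        # Both offsets produce the same plan: a full scan plus a temp
        # B-tree sort; only the slice handed back differs.
        print(offset, plan)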
Comment 3 Kellen 2006-06-19 13:19:33 UTC
Is anybody going to take this one on or at least comment on the bug? This seems
like something that would be easy to fix.
Comment 4 Rob Church 2006-06-19 15:34:04 UTC
The issue is about raising the default limit on cached special page queries to
increase the size of the set. While it's trivial to tweak, the question is - do
we want to, and how much of a performance hit (we've got to run the queries
periodically, remember, and they take time) are we looking at? So someone
involved in Wikimedia server administration has to make a decision.
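
For context, the mechanism under discussion follows a refresh pattern like the
sketch below (Python with SQLite; the schema and the uncategorized-pages query
are simplified stand-ins, not MediaWiki's actual code, which lives in its
maintenance scripts and the querycache table). As I understand it, the knob
being debated corresponds to MediaWiki's $wgQueryCacheLimit setting. Raising
the limit grows both the expensive SELECT that must be materialised and the
INSERT that rewrites the cached set.

    # A simplified sketch of the periodic cache-refresh pattern at issue.
    # The table names echo MediaWiki's, but the schema is hypothetical.
    import sqlite3

    def refresh_query_cache(conn, qc_type, limit):
        # Hypothetical "uncategorized pages" query: pages with no category
        # link. This is the expensive part that runs periodically.
        rows = conn.execute(
            """SELECT page.page_id, page.page_title
               FROM page
               LEFT JOIN categorylinks ON cl_from = page_id
               WHERE cl_from IS NULL
               ORDER BY page_title
               LIMIT ?""", (limit,)).fetchall()
        with conn:  # one transaction: drop the stale set, write the new one
            conn.execute("DELETE FROM querycache WHERE qc_type = ?", (qc_type,))
            conn.executemany(
                "INSERT INTO querycache (qc_type, qc_value, qc_title)"
                " VALUES (?, ?, ?)",
                [(qc_type, pid, title) for pid, title in rows])

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT);
        CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT);
        CREATE TABLE querycache (qc_type TEXT, qc_value INTEGER, qc_title TEXT);
    """)
    refresh_query_cache(conn, "Uncategorizedpages", limit=1000)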
Comment 5 Kellen 2006-06-19 17:44:25 UTC
For wikibooks, could we just turn off the limiting completely? The special pages
aren't heavily used and we only have ~15,000 modules. Alternatively, could we
turn off caching? With the relatively small number of modules and categories I
doubt it would be a huge performance hit. Right now Uncategorizedpages is
basically useless for wikibooks. 
Comment 6 Rob Church 2007-01-01 09:57:03 UTC
*** Bug 8450 has been marked as a duplicate of this bug. ***
Comment 7 Judson (enwiki:cohesion) 2007-01-01 15:11:16 UTC
This is somewhat important for images also: since the toolserver is down, we
have no way of knowing which images are completely untagged, and the
uncategorized list can usually act as a proxy for this. We haven't really been
keeping up with it, and the number of them is probably pretty high now, seeing
as 1000 barely gets through the b's. For this application, though, we wouldn't
need it biweekly; bimonthly or even monthly would be fine.
Comment 8 Riley Lynch 2007-01-22 21:50:54 UTC
I submitted a patch for Bug 2415 which could also fix this problem if the LIMIT
is turned off for this query.

The LIMIT may not substantially reduce the cost of the cache-building queries,
since it is only applied after the "heavy lifting" (full table scans, joins,
etc.). This can be confirmed by comparing the stats after running the queries
with and without the LIMITs. (Note that the cache-building query's LIMIT does
indirectly make reads on the querycache less expensive, because it keeps that
table small.)

(See also Bug 4699 for a discussion of the problems of using LIMIT.)
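
The point above can be illustrated with a lonelypages-style query in Python
with SQLite (toy data, not Wikimedia's schema; the timings are rough
indications, not a benchmark): the LEFT JOIN scan dominates, so dropping the
LIMIT mainly changes how many rows come back and have to be cached, not how
long the query takes.

    # Rough check of the claim that LIMIT is applied after the heavy
    # lifting. Toy SQLite schema, illustrative only.
    import sqlite3
    import time

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT);
        CREATE TABLE pagelinks (pl_from INTEGER, pl_title TEXT);
    """)
    conn.executemany("INSERT INTO page (page_title) VALUES (?)",
                     [(f"Title_{i}",) for i in range(50000)])
    # Link to every other page, leaving half of them "lonely".
    conn.executemany("INSERT INTO pagelinks (pl_from, pl_title) VALUES (?, ?)",
                     [(1, f"Title_{i}") for i in range(0, 50000, 2)])
    conn.execute("CREATE INDEX pl_title_idx ON pagelinks (pl_title)")

    for limit_clause in ("LIMIT 1000", ""):
        start = time.perf_counter()
        rows = conn.execute(
            "SELECT page_title FROM page "
            "LEFT JOIN pagelinks ON pl_title = page_title "
            "WHERE pl_from IS NULL ORDER BY page_title " + limit_clause
        ).fetchall()
        # The scan and sort run either way; the LIMIT only truncates
        # the result set that would be written to the cache.
        print(f"{limit_clause or 'no LIMIT':>10}: {len(rows)} rows, "
              f"{time.perf_counter() - start:.3f}s")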
Comment 9 JeLuF 2007-09-01 08:08:04 UTC
No, the size can't be increased. It's not only the query that's expensive, but the insert is expensive, too. Use the toolserver for such requests.


