Last modified: 2012-01-01 22:58:05 UTC
I don't believe this: http://en.wikipedia.org/w/index.php?title=Special:WantedFiles&limit=500&offset=1000 which claims "There are no results for this report." :-) http://en.wikipedia.org/w/index.php?title=Special:WantedFiles&limit=500&offset=500 works ok. It seems that 1000 is presently not the max for the browsing page size (how many on one page) but for the total number of results. Since all files shown are false positives (i.e. files wanted only locally, but present on Commons), wanted files currently claims the English Wikipedia uses only 1000 files from Commons.
This is because those pages take a long time to generate, so we save only the top 1000 (or so) results. Anything past that point is simply not stored, which is why offset=1000 comes back empty. That's pretty unlikely to change (although we could potentially change the message to make it clearer what is actually going on).
I did not realize this. But we agree that this makes WantedFiles completely unusable? I understand that the queries are expensive. I also guess that mediawiki, due to its excellent page caching has no mechanism for caching database query results, correct? Such consistency queries could, from a user standpoint, easily be updated only once a day, but then fully. Provided it is shown transparently how old the results are. But I understand that this would need a new infrastructure.
(In reply to comment #2)
> I did not realize this. But we agree that this makes WantedFiles completely
> unusable?
Heck, just the sheer number of false positives makes it totally useless.
> I understand that the queries are expensive. I also guess that mediawiki, due
> to its excellent page caching has no mechanism for caching database query
> results, correct? Such consistency queries could, from a user standpoint,
> easily be updated only once a day, but then fully. Provided it is shown
> transparently how old the results are. But I understand that this would need a
> new infrastructure.
We actually do cache the query results themselves, not just the output on the page (in the querycache table). The total number of results cached is controlled by [[mw:manual:$wgQueryCacheLimit]]. Special:WantedFiles should also have this line on it: "The following information is cached, and was last updated 16:04, 31 December 2011." That indicates how outdated the info is.
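To illustrate the behaviour described above, here is a rough sketch in Python of a capped result cache. It is not MediaWiki's code: the class CachedReport, its methods, and the sample file names are made up for illustration, and the value 1000 for the limit is only what this thread implies, not a confirmed setting. It shows why offset=500 still returns rows while offset=1000 returns nothing.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from itertools import islice
from typing import Iterable, List, Optional

# Stand-in for $wgQueryCacheLimit; 1000 is assumed from the thread.
QUERY_CACHE_LIMIT = 1000


@dataclass
class CachedReport:
    """Hypothetical model of a cached special-page report."""
    rows: List[str] = field(default_factory=list)
    updated: Optional[datetime] = None

    def refresh(self, results: Iterable[str], limit: int = QUERY_CACHE_LIMIT) -> None:
        # Run the expensive query once and keep only the first `limit` rows.
        self.rows = list(islice(results, limit))
        self.updated = datetime.now(timezone.utc)

    def page(self, offset: int, limit: int) -> List[str]:
        # Paging reads from the cache only, never from the live data.
        return self.rows[offset:offset + limit]


# Pretend the real query would match 2500 files.
report = CachedReport()
report.refresh(f"File:Example_{i}.jpg" for i in range(2500))

second_page = report.page(offset=500, limit=500)   # still inside the cache
third_page = report.page(offset=1000, limit=500)   # past the cached rows

print(len(second_page), len(third_page))
```

In this sketch the rows beyond the limit are dropped at refresh time, so no amount of paging can reach them; only raising the limit and refreshing again would.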
OK. I better understand now why the solution to just browse through (and hide the false positives) does not work. Thanks for your patience!
It does; at the top it says: "The following information is cached, and was last updated 14:37, 1 January 2012."