Last modified: 2007-09-07 10:34:53 UTC
Upon installation, Special:Statistics gave correct information for the number of "probably legitimate content pages" (0). During the course of customization, I renamed a few of the namespaces, and (probably more importantly) moved the Main Page to the Project: namespace (here renamed Meta:) and altered MediaWiki:Mainpage to point there as well. I then deleted the redirect created by the move.

After these customizations, the Special:Statistics page gave the following:

There are 1,279 total pages in the database. This includes "talk" pages, pages about Æ, minimal "stub" pages, redirects, and others that probably don't qualify as content pages. Excluding those, there are 18,446,744,073,709,551,615 pages that are probably legitimate content pages.

Obviously I didn't have 18,446,744,073,709,551,615 pages to be legitimate content. I created a page in the primary namespace and checked the statistics again:

There are 1,280 total pages in the database. This includes "talk" pages, pages about Æ, minimal "stub" pages, redirects, and others that probably don't qualify as content pages. Excluding those, there are 0 pages that are probably legitimate content pages.

Fine. Now I deleted the page and re-checked; I got the first message again, verbatim. I undeleted the page and the message did not change. That is to say, after the delete I received . . .

There are 1,279 total pages in the database. This includes "talk" pages, pages about Æ, minimal "stub" pages, redirects, and others that probably don't qualify as content pages. Excluding those, there are 18,446,744,073,709,551,615 pages that are probably legitimate content pages.

. . . regardless of an undelete.

I suppose that qualifies as two bugs, actually; let me know if you want me to re-file this report so that they can be separated.

In case it matters, my PHP and MySQL versions are listed by Special:Version as the following:

PHP: 4.3.2 (apache2filter)
MySQL: 3.23.58
The total of 1,279 pages is correct, as it includes pages in other namespaces, including the MediaWiki: namespace, which is populated with all the interface text when the system is installed. Presumably the very large article count occurs because the code subtracts one from the total for the 'Main Page', as this does not count as an article. As there is neither a main page nor any other article in the main namespace, the code subtracts one from zero, causing the counter to wrap round to the highest number it can hold, hence the huge number being displayed. I would suggest that the software be modified so it doesn't subtract 1 from the count, as the main page would be considered legitimate content in a lot of projects, and in the current WM projects, where there are hundreds of thousands of articles, an extra one makes little difference.
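The wraparound described above can be reproduced with plain arithmetic. The following is only a sketch, assuming the counter behaves like a 64-bit unsigned integer (which matches the figure reported); the helper name `unsigned_subtract` is made up for illustration and is not MediaWiki code.

```python
# Sketch only: models a fixed-width unsigned counter, assumed here to be 64 bits.
# unsigned_subtract is a hypothetical helper, not part of MediaWiki.

def unsigned_subtract(value: int, amount: int, bits: int = 64) -> int:
    """Subtract the way a fixed-width unsigned counter would, wrapping on underflow."""
    return (value - amount) % (2 ** bits)

# Zero articles, minus one for the Main Page, wraps to the largest 64-bit value.
wrapped = unsigned_subtract(0, 1)
print(f"{wrapped:,}")  # 18,446,744,073,709,551,615

# One article, minus one, gives 0, matching the second statistics message.
print(unsigned_subtract(1, 1))  # 0
```

Both results match the numbers quoted in the original report, which supports the subtract-one-from-zero explanation.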
*** Bug 7378 has been marked as a duplicate of this bug. ***
*** Bug 9192 has been marked as a duplicate of this bug. ***
Recommended fixes: a) Change the field from unsigned to signed b) Fix the UPDATE to prevent underflows below 0 when the page counters are decremented. (Perhaps having it regenerate the stats when an invalid value is detected wouldn't hurt.)
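Fix (b) could look roughly like the guarded UPDATE below. This is a sketch under assumptions, not the actual patch: it uses SQLite through Python's `sqlite3` module as a stand-in for MySQL, a cut-down `site_stats` table with only the `ss_row_id` and `ss_good_articles` columns, and a hypothetical helper named `decrement_good_articles`.

```python
import sqlite3

def decrement_good_articles(conn: sqlite3.Connection) -> bool:
    """Decrement the article counter, refusing to go below zero.

    Returns True if the counter was decremented, False if it was already 0.
    (Hypothetical helper; the real update code may differ.)
    """
    cur = conn.execute(
        "UPDATE site_stats "
        "SET ss_good_articles = ss_good_articles - 1 "
        "WHERE ss_row_id = 1 AND ss_good_articles > 0"  # guard against underflow
    )
    conn.commit()
    return cur.rowcount == 1

# Minimal in-memory table for demonstration (schema is an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE site_stats ("
    "ss_row_id INTEGER PRIMARY KEY, "
    "ss_good_articles INTEGER NOT NULL)"
)
conn.execute("INSERT INTO site_stats VALUES (1, 1)")

first = decrement_good_articles(conn)   # counter goes 1 -> 0
second = decrement_good_articles(conn)  # counter stays at 0, nothing updated
count = conn.execute(
    "SELECT ss_good_articles FROM site_stats WHERE ss_row_id = 1"
).fetchone()[0]
print(first, second, count)
```

Putting the guard in the WHERE clause keeps the check and the decrement in one statement, so no separate read is needed before the write.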
Doesn't (b) make (a) unnecessary? It should never be negative, so it seems wasteful to have it signed.
(a) does not make (b) unnecessary, it would just have to detect overflows instead of underflows.
I said (b) makes (a) unnecessary, not the other way around.
Er, right, but you confused me by saying it's wasteful to have it signed. Doesn't that imply you agree with (a) too? (b) does make (a) unnecessary, strictly speaking, but there's no reason not to allow people to have more than 2,000,000,000 articles (see http://www.gaiaonline.com/, with over a billion posts . . . yikes).
I think you're reading (a) backwards as well. It is currently unsigned; Brion is suggesting making it signed.
Hah, you're right there too. Never mind, I give up. :P
Has this been fixed yet?
The sensible thing is probably to add some sanity checking into the SiteStats class lazy initialization. Currently it checks for an empty or missing row and recounts the data; checking for invalid data (negative counts or impossibly high and thus wraparound counts) could do the same. The UPDATEs for decrementing could also do a check there, maybe.
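A rough sketch of that kind of check follows. It is not the SiteStats code: the names `is_sane`, `load_count`, and `MAX_PLAUSIBLE_COUNT`, and the threshold value itself, are all assumptions chosen for illustration.

```python
from typing import Callable, Optional

# Hypothetical ceiling: anything above this is treated as a wrapped-around value.
MAX_PLAUSIBLE_COUNT = 10**12

def is_sane(count: Optional[int]) -> bool:
    """A stored count is usable if it exists, is non-negative, and is not absurdly large."""
    return count is not None and 0 <= count <= MAX_PLAUSIBLE_COUNT

def load_count(stored: Optional[int], recount: Callable[[], int]) -> int:
    """Return the stored count if it looks valid; otherwise recount from scratch."""
    if is_sane(stored):
        return stored
    return recount()

# A missing row, a negative value, and a wrapped value all trigger a recount.
print(load_count(None, lambda: 3))         # 3
print(load_count(-1396, lambda: 7))        # 7
print(load_count(2**64 - 1, lambda: 0))    # 0
print(load_count(42, lambda: 0))           # 42 (stored value kept)
```

The recount is passed in as a callable so the expensive query only runs when the stored value fails the check.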
*** Bug 10600 has been marked as a duplicate of this bug. ***
(Let's keep the summary human-readable, here.)
I don't think bug 10600 is the same issue as the one described in the summary. Where does the number -1396 (the number of images) on es.wikipedia.org come from?
Underflow. It's the same issue.
The number of images starts at zero. It increments when an image is uploaded and decrements when one is deleted, so that number cannot be less than zero. Why is there an underflow down to -1396 images?
Beats me, but who cares? It should be caught at display time even if you manually alter the database. Fixed in r24176.
Sanity checking is great, but we should try and keep things sane in the first place.
Reasonable, but at a superficial look I don't see any paths in the code that could lead to image creation without triggering site_stats updates. If the problem is discovered, well and good, but until then it's best to just make sure the stats get regenerated occasionally (which I'm not sure we do, but if we don't we probably should).
*** Bug 11220 has been marked as a duplicate of this bug. ***