Last modified: 2013-04-26 08:49:36 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T33732, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 31732 - Category count of file numbers is wrong on first page
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: Categories
Version: 1.18.x
Hardware: All All
Importance: Normal normal
Target Milestone: Future release
Assigned To: Nobody - You can work on this!
Keywords: testme
Depends on:
Blocks:
 
Reported: 2011-10-15 16:48 UTC by Raimond Spekking
Modified: 2013-04-26 08:49 UTC (History)
CC List: 5 users

Description Raimond Spekking 2011-10-15 16:48:49 UTC
The category count of file numbers is wrong on the first page:

https://commons.wikimedia.org/wiki/Category:GFDL :

 "The following 201 files are in the current category. "

The subsequent pages show the correct number:

https://commons.wikimedia.org/w/index.php?title=Category:GFDL&uselang=en&filefrom=%22X%22.JPG#mw-category-media

 "The following 201 files are in this category, out of 3,078,868 total."

I think this is a code regression.
Comment 1 Bawolff (Brian Wolff) 2011-10-15 20:09:47 UTC
Also, that should say "following 200", not 201, files.

"The following 201 files are in the current category. "

is the correct thing to say if we don't know how many files are in the cat in total; however, we clearly must know the total, since the second page displays it.
Comment 2 John Kettovsky 2011-11-02 20:32:03 UTC
Confirmed for other Commons categories with > 200 members:
* https://commons.wikimedia.org/wiki/Category:GFDL
* https://commons.wikimedia.org/wiki/Category:GFDL-1.2
* https://commons.wikimedia.org/wiki/Category:GFDL-GMT

This is probably not a 1.18 issue.
Comment 3 Bawolff (Brian Wolff) 2011-11-03 04:14:02 UTC
Ok, so what's happening:

For some reason we are returning 201 results for the image gallery section of the category instead of 200 like we should. (We fetch 201 images so we know whether to make the continue link, but we normally shouldn't display image number 201.) This causes MediaWiki to detect an inconsistency in the category table (aka the total counts) and not display the total number of images in that cat. Totals are displayed on the next page, as having an offset disables many of the consistency checks (since they don't make sense in that case).
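
The "fetch one more than you show" idea mentioned above can be sketched in Python roughly as follows. This is only an illustration of the general technique, not MediaWiki's code; fetch_page and the file names are made up for the example, and the list slice stands in for a database query with LIMIT limit + 1.

```python
def fetch_page(rows, limit):
    """Return (rows_to_display, has_more) using the fetch-one-extra trick."""
    fetched = rows[: limit + 1]      # stands in for a query with LIMIT limit + 1
    has_more = len(fetched) > limit  # the extra row only signals that a next page exists
    return fetched[:limit], has_more  # the extra row itself is never displayed


# Hypothetical category with 450 files and a paging limit of 200.
files = [f"File_{i:04d}.jpg" for i in range(450)]
shown, has_more = fetch_page(files, 200)
print(len(shown), has_more)
```

With this pattern the displayed count can never exceed the limit, which is why seeing 201 on the page points at rows arriving from somewhere else.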

The issue is not present in __NOGALLERY__ cats (e.g. [[commons:Category:Polish_pronunciation]]). At first glance it does not seem to be CategoryTree related (the &notree=true URL parameter didn't affect it). I'm also having trouble reproducing locally (even on my 1.18wmf1 checkout, but I am using a much lower $wgCategoryPagingLimit on my local checkout).


Where this gets really weird is that the issue is not present on enwikipedia - [[Category:Diagram_images_that_should_be_in_SVG_format]] (only 200 images are returned per page, as expected). But then again, neither is [[commons:category:Unidentified_sunset_locations]] affected, so beats me.
Comment 4 Bawolff (Brian Wolff) 2011-11-03 04:32:48 UTC
Based on toolserver db, looks to be caused by inconsistencies in the commons db.

File:Fresco with Trompe l'oeuil - Andrea Pozzo -Jesuit Church Vienna.jpg

has page_namespace of 6 (NS_FILE) and a page_id of 2602773, but several of its categorylinks have a cl_type of "page" instead of "file":

mysql> select cl_to, cl_type from categorylinks where cl_from=2602773;
+---------------------------------------+---------+
| cl_to                                 | cl_type |
+---------------------------------------+---------+
| Andrea_Pozzo                          | page    |
| CC-BY-2.5                             | page    |
| CC-BY-SA-3.0-migrated                 | page    |
| GFDL                                  | page    |
| Jesuit_Church,_Vienna                 | page    |
| License_migration_redundant           | page    |
| Media_with_locations                  | file    |
| Quality_images                        | page    |
| Quality_images_of_Austria             | page    |
| Quality_images_of_churches_in_Austria | file    |
| Self-published_work                   | page    |
| Trompe_l'oeil_in_Austria              | page    |
+---------------------------------------+---------+
12 rows in set (0.00 sec)
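
A rough way to look for other rows with the same kind of mismatch is to join categorylinks against page and pick out file-namespace pages whose cl_type is not "file". The Python sketch below runs that query against an in-memory SQLite database with heavily simplified versions of the two tables; the sample rows and the Example.jpg title are made up for illustration, and the same SELECT would need checking against the real schema before being run on a replica.

```python
import sqlite3

NS_FILE = 6  # MediaWiki's namespace number for File: pages

conn = sqlite3.connect(":memory:")
# Simplified stand-ins for the real tables: only the columns the query needs.
conn.executescript("""
    CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_namespace INTEGER, page_title TEXT);
    CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT, cl_type TEXT);
""")
conn.execute("INSERT INTO page VALUES (2602773, 6, 'Example.jpg')")  # hypothetical title
conn.executemany(
    "INSERT INTO categorylinks VALUES (?, ?, ?)",
    [(2602773, "GFDL", "page"), (2602773, "Media_with_locations", "file")],
)

# File pages whose category links are not typed as 'file'.
mismatched = conn.execute(
    """
    SELECT cl_from, cl_to, cl_type
    FROM categorylinks
    JOIN page ON page_id = cl_from
    WHERE page_namespace = ? AND cl_type <> 'file'
    ORDER BY cl_to
    """,
    (NS_FILE,),
).fetchall()
print(mismatched)
```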


Thus, when MediaWiki does the query, it gets this image as part of the query for normal pages, but then sorts it into the image section, since it uses page_namespace for dividing results between sections. This results in the image section having more than 200 images. The code that checks whether the counts in the category table are consistent sees that the number of images returned does not equal $wgCategoryPagingLimit (the gist of the code seems to suggest that < instead of != is the true condition being looked for), but that there should be more images in total than the paging limit, and that no offset has been specified, so it thinks that the category counts are wrong.
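
To make the chain of events concrete, here is a small Python model of the behaviour as described in this comment. It is a sketch under assumptions, not a copy of MediaWiki's code: files_shown, count_message and the "trusted" rule are my reading of the description above, and the row data is invented.

```python
NS_MAIN, NS_FILE = 0, 6
LIMIT = 200  # stands in for $wgCategoryPagingLimit


def files_shown(rows_by_cl_type, limit):
    """Count rows landing in the file section when sectioning by namespace."""
    shown = 0
    for rows in rows_by_cl_type.values():
        for namespace in rows[:limit]:  # each per-type query is capped at the paging limit
            if namespace == NS_FILE:    # section chosen by page_namespace, not by cl_type
                shown += 1
    return shown


def count_message(shown, total, limit, has_offset):
    """Assumed rule: trust the stored total only if it looks consistent."""
    trusted = total == shown or ((shown == limit or has_offset) and total > shown)
    if trusted:
        return f"The following {shown} files are in this category, out of {total:,} total."
    return f"The following {shown} files are in the current category."


# One file row mis-typed as cl_type "page", plus a full page of correctly typed files.
rows = {"page": [NS_FILE] + [NS_MAIN] * 3, "file": [NS_FILE] * 250}
shown = files_shown(rows, LIMIT)
first = count_message(shown, 3_078_868, LIMIT, has_offset=False)
later = count_message(shown, 3_078_868, LIMIT, has_offset=True)
print(shown, first, later, sep="\n")
```

In this model the single stray row pushes the file section to 201, the first page falls back to the message without a total, and any page with an offset shows the total again, which matches what the report describes.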

The code should possibly handle this situation better, but I'm not entirely sure what the right way to handle it is.
Comment 5 Sumana Harihareswara 2011-11-04 22:26:28 UTC
Per IRC discussion today, removing the 1.18 milestone as this does not seem urgent enough for us to aim to fix this by the 1.18 release.
Comment 6 Aaron Schulz 2011-11-09 19:20:42 UTC
(In reply to comment #4)
> File:Fresco with Trompe l'oeuil - Andrea Pozzo -Jesuit Church Vienna.jpg
> 
> has page_namespace of 6 (NS_FILE) and a page_id of 2602773, but several of its
> categorylinks have a cl_type of "page" instead of "file":
> 

Are those affected pages using Media: links by any chance?
Comment 7 Bawolff (Brian Wolff) 2011-11-09 22:03:28 UTC
(In reply to comment #6)
> (In reply to comment #4)
> > File:Fresco with Trompe l'oeuil - Andrea Pozzo -Jesuit Church Vienna.jpg
> > 
> > has page_namespace of 6 (NS_FILE) and a page_id of 2602773, but several of its
> > categorylinks have a cl_type of "page" instead of "file":
> > 
> 
> Are the those affected pages using Media: links by any chance?

I'm unclear how you can have a category link in the media namespace(?).

It looks almost like a failure of the schema update script ("page" is the default cl_type if not set otherwise, the categorylinks rows didn't have a cl_collation entry, cl_sortkey still has the namespace in it, etc.).


For reference, this is what a relevant entry in the categorylinks table looked like (taken from toolserver):

          cl_from: 2602773
            cl_to: Andrea_Pozzo
       cl_sortkey: File:Fresco with Trompe l'oeuil - Andrea Pozzo -Jesuit Church Vienna.jpg
     cl_timestamp: 2011-07-07 15:01:48
cl_sortkey_prefix: 
     cl_collation: 
          cl_type: page

That said, 2011-07-07 15:01:48 is after the categorylinks schema update. On the other hand, the image wasn't edited anywhere remotely near that date, so I still think it's just an artifact of updateCollations.php doing something wrong for one individual file.

------

Note, I did some dummy edits to [[commons:File:Fresco with Trompe l'oeuil - Andrea Pozzo -Jesuit Church Vienna.jpg]] since there's no sense in forcing commons to live with the bug while we think of what to do about it (I still feel MediaWiki should handle the situation more gracefully). To reproduce you simply have to manually change the cl_type from "file" to "page" on some categorylink entry on your test wiki.
Comment 8 Aaron Schulz 2011-11-09 22:20:25 UTC
(In reply to comment #7)
> I'm unclear how you can have a category link in the media namespace(?).
> 

Nvm, I was confusing this with two other bugs.
Comment 9 Krinkle 2012-05-05 17:19:02 UTC
(mass change)
* 1.18.0 and 1.19.0 have been released already.
* Moving open bugs targeted for 1.18.0 or 1.19.0 to Mysterious future.
* Please re-target them to 1.19.x or 1.20.0 if needed.
Comment 10 Bawolff (Brian Wolff) 2013-04-24 23:01:05 UTC
Are there any known instances of this bug? If not I suggest closing as a one time db referential integrity issue.
Comment 11 Raimond Spekking 2013-04-26 08:49:36 UTC
(In reply to comment #10)
> Are there any known instances of this bug? If not I suggest closing as a one
> time db referential integrity issue.

Looks fixed somehow now. Closing.
