Last modified: 2013-09-03 18:58:41 UTC
As a follow-up to bug 22939, I'm seeing some strange behavior on the English Wikipedia:

MariaDB [enwiki_p]> select * from page where page_namespace = 2 and page_title = 'Ɑʀʇʉʀɵ/SmallCaps.charset'\G
*************************** 1. row ***************************
page_id: 40422349
page_namespace: 2
page_title: Ɑʀʇʉʀɵ/SmallCaps.charset
page_restrictions:
page_counter: 0
page_is_redirect: 0
page_is_new: 1
page_random: 0.582265889095
page_touched: 20130902000030
page_latest: 571150285
page_len: 1107
1 row in set (0.09 sec)

MariaDB [enwiki_p]> select * from page where page_namespace = 2 and page_title = 'ɑʀʇʉʀɵ/SmallCaps.charset'\G
*************************** 1. row ***************************
page_id: 8610689
page_namespace: 2
page_title: ɑʀʇʉʀɵ/SmallCaps.charset
page_restrictions:
page_counter: 0
page_is_redirect: 0
page_is_new: 0
page_random: 0.6887380353870001
page_touched: 20061226044311
page_latest: 96503668
page_len: 1107
1 row in set (0.10 sec)

"ɑʀʇʉʀɵ/SmallCaps.charset" is an inaccessible page title. It gets normalized to "Ɑʀʇʉʀɵ/SmallCaps.charset".

Presumably the previous cleanupTitles.php run would have caught this, so... I'm not sure what's up.
The majuscule form of 'ɑ' is U+2C6D Ɑ LATIN CAPITAL LETTER ALPHA. It was added to the Unicode standard in version 5.1, released in 2008. Up until version 5.3.3, PHP was using Unicode tables based on version 3.2 of the standard, released in 2002. When we last ran cleanupTitles.php (May 2012), we were still on PHP 5.3.2, which did not include the update. See <https://bugs.php.net/bug.php?id=52981> for more details.

We should re-run cleanupTitles.php.
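To illustrate the version gap described above, here is a minimal Python sketch (not the PHP or MediaWiki code path). It assumes a Python 3 interpreter whose bundled Unicode tables are at version 5.1 or later, and it uses the standard library's frozen Unicode 3.2 database (`unicodedata.ucd_3_2_0`) as a stand-in for the old tables.

```python
import unicodedata

lower_alpha = "\u0251"  # ɑ LATIN SMALL LETTER ALPHA

# With current case-mapping tables, the uppercase form is a distinct code point.
upper_alpha = lower_alpha.upper()
print(f"U+{ord(upper_alpha):04X}", unicodedata.name(upper_alpha))

# General category of U+2C6D under the current tables and under the
# frozen Unicode 3.2 tables shipped with Python for legacy protocols.
current_category = unicodedata.category("\u2c6d")
legacy_category = unicodedata.ucd_3_2_0.category("\u2c6d")
print("current:", current_category, "| Unicode 3.2:", legacy_category)
```

Under Unicode 3.2 the code point is unassigned, so a case table built from that version has nothing to map 'ɑ' to, and software using it would leave the lowercase title unchanged.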
http://noc.wikimedia.org/~reedy/53670.log.gz
Just for the record, these pages should now exist under "Broken/". The relevant results were:

$ grep "rows updated" 53670.log | grep -v "page... 0 of "
arwiki: Finished page... 1 of 1416284 rows updated
bewikisource: Finished page... 57 of 5972 rows updated
bgwiki: Finished page... 3 of 343456 rows updated
brwiki: Finished page... 3 of 96511 rows updated
bswiki: Finished page... 34 of 222943 rows updated
bxrwiki: Finished page... 2 of 4060 rows updated
cawiki: Finished page... 1 of 1012795 rows updated
cewiki: Finished page... 2 of 10910 rows updated
ckbwiki: Finished page... 5 of 69631 rows updated
commonswiki: Finished page... 4 of 25223432 rows updated
cswiki: Finished page... 3 of 706918 rows updated
cuwiki: Finished page... 1 of 4008 rows updated
cywikisource: Finished page... 19 of 1104 rows updated
dawiki: Finished page... 1 of 593711 rows updated
dewiki: Finished page... 41 of 4537363 rows updated
dewikivoyage: Finished page... 7 of 39145 rows updated
diqwiki: Finished page... 17 of 18456 rows updated
dvwiktionary: Finished page... 2 of 960 rows updated
elwiki: Finished page... 2 of 241860 rows updated
enwiki: Finished page... 157 of 31116095 rows updated
enwikinews: Finished page... 1 of 731634 rows updated
enwikisource: Finished page... 18 of 1447625 rows updated
eowiki: Finished page... 1 of 401302 rows updated
eowikisource: Finished page... 3 of 5680 rows updated
eowiktionary: Finished page... 1 of 38034 rows updated
eswiki: Finished page... 8 of 4325732 rows updated
etwiki: Finished page... 1 of 293643 rows updated
fawiki: Finished page... 3 of 1801439 rows updated
fiwiki: Finished page... 3 of 886219 rows updated
fiwikisource: Finished page... 310 of 12150 rows updated
frwiki: Finished page... 44 of 5975233 rows updated
frwikibooks: Finished page... 2 of 39266 rows updated
gdwiki: Finished page... 1 of 19077 rows updated
glwiki: Finished page... 1 of 230557 rows updated
guwiki: Finished page... 1 of 42638 rows updated
hewiki: Finished page... 1 of 629806 rows updated
hsbwiktionary: Finished page... 4 of 5331 rows updated
huwiki: Finished page... 3 of 835123 rows updated
hywiki: Finished page... 1 of 280600 rows updated
idwiki: Finished page... 3 of 1037501 rows updated
idwiktionary: Finished page... 2 of 194297 rows updated
incubatorwiki: Finished page... 1 of 563575 rows updated
itwiki: Finished page... 4 of 3444211 rows updated
jawiki: Finished page... 3 of 2418512 rows updated
kbdwiki: Finished page... 3 of 3095 rows updated
kowiki: Finished page... 2 of 809683 rows updated
kuwiki: Finished page... 4 of 47462 rows updated
kuwikibooks: Finished page... 2 of 531 rows updated
kuwikiquote: Finished page... 5 of 1050 rows updated
kywiki: Finished page... 1 of 36304 rows updated
lawiki: Finished page... 4 of 179872 rows updated
metawiki: Finished page... 3 of 2251097 rows updated
mhrwiki: Finished page... 1 of 12486 rows updated
minwiki: Finished page... 1 of 14535 rows updated
mlwikiquote: Finished page... 1 of 3176 rows updated
nowiki: Finished page... 10 of 943043 rows updated
plwiki: Finished page... 6 of 1957700 rows updated
ptwiki: Finished page... 5 of 3298730 rows updated
ruwiki: Finished page... 13 of 3495752 rows updated
sawikisource: Finished page... 1 of 11121 rows updated
skwiki: Finished page... 1 of 395761 rows updated
sourceswiki: Finished page... 1 of 37720 rows updated
srwiki: Finished page... 2 of 689781 rows updated
svwiki: Finished page... 3 of 3454807 rows updated
tawikisource: Finished page... 11 of 4720 rows updated
test2wiki: Finished page... 2 of 9689 rows updated
tewiktionary: Finished page... 2 of 100105 rows updated
thwiki: Finished page... 2 of 430332 rows updated
trwiki: Finished page... 1 of 1085798 rows updated
ttwiki: Finished page... 1 of 101891 rows updated
ukwiki: Finished page... 6 of 1386835 rows updated
ukwikisource: Finished page... 1 of 11063 rows updated
urwiki: Finished page... 26 of 102498 rows updated
uzwiki: Finished page... 14 of 635677 rows updated
zh_yuewiki: Finished page... 1 of 78606 rows updated
zhwiki: Finished page... 12 of 3086902 rows updated
zhwikibooks: Finished page... 1 of 7431 rows updated
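For anyone who wants these numbers as data rather than text, the grep filter above can be approximated in Python. This is only a sketch: it assumes every relevant log line has exactly the "wiki: Finished page... N of M rows updated" shape shown in the results. The helper name `updated_counts` and the zero-update sample line are made up for illustration.

```python
import re

# Matches lines such as "enwiki: Finished page... 157 of 31116095 rows updated".
LINE_RE = re.compile(
    r"^(?P<wiki>\w+): Finished page\.\.\. "
    r"(?P<updated>\d+) of (?P<total>\d+) rows updated$"
)


def updated_counts(lines):
    """Map each wiki to its number of updated rows, skipping wikis with none."""
    counts = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and int(match["updated"]) > 0:
            counts[match["wiki"]] = int(match["updated"])
    return counts


sample = [
    "enwiki: Finished page... 157 of 31116095 rows updated",
    "fiwikisource: Finished page... 310 of 12150 rows updated",
    "aawiki: Finished page... 0 of 100 rows updated",  # hypothetical zero-update line
]
counts = updated_counts(sample)
print(counts)
print("total rows updated:", sum(counts.values()))
```

To run it against the real log, pass an open file object for 53670.log instead of `sample`.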
(In reply to comment #3)
> Just for the record, these pages should now exist under "Broken/".

Note: pages retain their namespace. For example:

* (0,'ӷ') to (0,'Broken/Ӷ')
* (3,'ɑʀʇʉʀɵ') to (3,'Ɑʀʇʉʀɵ')

So the pages will exist under "Broken/", but it requires checking every namespace if you're using Special:PrefixIndex.