Last modified: 2014-09-24 01:22:22 UTC
If all articles in a given category have sort keys that begin with the same letter, only this letter is displayed when the category is rendered. It would be better to display more letters: as many as are needed to tell the sort keys apart.

The following patch determines the length of the longest common prefix of the sort keys and displays that many characters plus one. It was tested with MediaWiki 1.4.

Proposed patch: in file CategoryPage.php, around line 137, replace the existing code with:

                // Page in this category
                array_push( $articles, $sk->makeKnownLinkObj( $title ) ) ;
                array_push( $articles_start_char, $x->cl_sortkey ) ;
            }
        }
        $root_length = 0;
        for ($index = 0 ; $index < count($articles) - 1; $index++){
            if ($articles_start_char[$index][$root_length]!=$articles_start_char[$index+1][$root_length]) {
                break;
            }
            else{
                if($index == count($articles) - 2) {
                    $root_length = $root_length+1;
                    $index=0;
                }
            }
        }
        for ($index = 0 ; $index < count($articles) ; $index++){
            $articles_start_char[$index] = substr($articles_start_char[$index],0,$root_length+1);
        }
        $dbr->freeResult( $res );
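For readers who want the idea without the surrounding CategoryPage.php context, here is a minimal standalone sketch of the approach described above. It is not the attached patch: the function names `commonPrefixLength` and `headingsFor` are hypothetical, the scan is written as a plain column-by-column comparison rather than the self-restarting loop, and, like the original, it works on bytes rather than characters.

```php
<?php
// Hypothetical helper (not from the patch): length in bytes of the
// longest prefix shared by every sort key in the list.
function commonPrefixLength(array $sortKeys): int {
    if (count($sortKeys) === 0) {
        return 0;
    }
    $first = $sortKeys[0];
    $length = strlen($first);
    foreach ($sortKeys as $key) {
        // Never look past the end of the shorter string.
        $limit = min($length, strlen($key));
        $i = 0;
        while ($i < $limit && $key[$i] === $first[$i]) {
            $i++;
        }
        $length = $i;
    }
    return $length;
}

// Hypothetical helper: heading shown for each article, that is, the
// common prefix plus one more byte (byte-wise, so not UTF-8 safe).
function headingsFor(array $sortKeys): array {
    $length = commonPrefixLength($sortKeys) + 1;
    return array_map(
        function (string $key) use ($length): string {
            return substr($key, 0, $length);
        },
        $sortKeys
    );
}
```

With the keys "Paris, France", "Paris, Texas" and "Parma", the shared prefix is "Par", so the headings become "Pari", "Pari" and "Parm" instead of a single "P".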
Created attachment 700
Here is an actual patch. It works with 1.5beta3.
Created attachment 701
Patch generated with diff -u, for 1.5beta3. Also, a minor fix: $index has to be reset to -1 inside the loop (was 0).
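The reason for resetting to -1 rather than 0 is that the for loop's $index++ still runs after the reset, so a reset to 0 resumes at index 1 and never compares the first two sort keys on later columns. The sketch below isolates that loop to show the difference. The function name `rootLength`, the `$resetTo` parameter and the `$minLen` guard are illustrative assumptions; the guard is not in the patch and is only there so the example never reads past the end of a string.

```php
<?php
// Illustrative reconstruction of the self-restarting loop.
// $resetTo is the value assigned to the loop index when a column matches
// across all keys: 0 reproduces the first version, -1 the corrected one.
function rootLength(array $keys, int $resetTo): int {
    $n = count($keys);
    // Assumption: added guard, not part of the original patch.
    $minLen = $n > 0 ? min(array_map('strlen', $keys)) : 0;
    $root = 0;
    for ($i = 0; $i < $n - 1; $i++) {
        if ($root >= $minLen || $keys[$i][$root] !== $keys[$i + 1][$root]) {
            break;
        }
        if ($i === $n - 2) {
            // Whole column matched: move to the next column and restart.
            $root++;
            $i = $resetTo; // the loop's $i++ runs right after this
        }
    }
    return $root;
}
```

For the keys "ab", "ac", "ac", a reset to 0 skips the "ab"/"ac" comparison on the second column and reports a common prefix of 2, while a reset to -1 reports the correct value of 1.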
(In reply to comment #2)
> Created an attachment (id=701)
> patch generated with diff -u.
>
> patch for 1.5beta3.
>
> also, minor fix: $index has to be reset to -1 inside the loop (was 0)

Applied to HEAD after some fixes, namely using $wgContLang->truncate() instead of substr(), plus formatting. Marking this as FIXED.
This fix caused bug 2835, so I've backed out the change (revision 1.24 of CategoryPage.php). Reopening.
Created attachment 727
New patch for UTF-8 strings.

The regression was caused by incorrect handling of UTF-8 strings. It should be fixed with this new patch. (I *know* this code is ugly. Please bear with me.)
It might be better to use the firstChar() and trim() methods on Language, to avoid duplicating ugly UTF-8 code.
Created attachment 731
OK, new patch using firstChar(). I hope the ugliness went down to an acceptable level, although I don't see how to use trim() here.
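To illustrate the kind of UTF-8 problem discussed in the comments above, here is a rough sketch of a character-aware version of the prefix computation. It is not the attached patch and does not use MediaWiki's Language class: the helper names `utf8Chars`, `utf8CommonPrefixLength` and `utf8Heading` are hypothetical, and splitting with preg_split() and the /u modifier is a swapped-in technique that assumes the sort keys are valid UTF-8.

```php
<?php
// Hypothetical helper: split a UTF-8 string into an array of characters.
// preg_split() returns false on invalid UTF-8; treat that as no characters.
function utf8Chars(string $text): array {
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    return $chars === false ? [] : $chars;
}

// Hypothetical helper: number of leading characters (not bytes)
// shared by every sort key.
function utf8CommonPrefixLength(array $sortKeys): int {
    if (count($sortKeys) === 0) {
        return 0;
    }
    $split = array_map('utf8Chars', $sortKeys);
    $first = $split[0];
    $length = count($first);
    foreach ($split as $chars) {
        $limit = min($length, count($chars));
        $i = 0;
        while ($i < $limit && $chars[$i] === $first[$i]) {
            $i++;
        }
        $length = $i;
    }
    return $length;
}

// Hypothetical helper: first $charCount characters of a sort key,
// never cutting a multi-byte character in half.
function utf8Heading(string $sortKey, int $charCount): string {
    return implode('', array_slice(utf8Chars($sortKey), 0, $charCount));
}
```

The difference shows up with keys such as "Éa" and "Èb": both start with the same lead byte, so a byte-wise comparison sees a one-byte common prefix and substr() would cut the heading in the middle of a character, whereas the character-wise version finds no common prefix and yields the headings "É" and "È".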
Removing need-review; the patch is obsolete.
Is this rendered obsolete by the 1.17 collation rewrite?
The patch might not apply anymore, but the bug is certainly still relevant.
ThomasV, if you're interested in updating your patch to work with current trunk, please join us in the #mediawiki channel on freenode IRC to ask for feedback and discuss your approach. Thanks!
By the way, a while back I made an extension that does something similar to (but not quite the same as) what this bug requests: [[mw:extension:CategorySortHeaders]], in case anyone is interested.