Last modified: 2006-08-17 23:57:01 UTC
See test suite at link. The functions (uc, ucfirst, lc, lcfirst) are defined in languages\Language.php, from line 715 onward; they basically just call mb_strtolower <http://us2.php.net/manual/en/function.mb-strtolower.php> or mb_strtoupper on the appropriate characters. Many, and probably all, of the characters in question are defined as uppercase/lowercase equivalents in the Unicode standard <http://www.unicode.org/Public/UNIDATA/CaseFolding.txt>, so the conversion should work correctly. I would have to install the mbstring extension to test locally whether the functions themselves are the problem. Can someone confirm or disconfirm whether mb_strtolower("\xC5\x98","UTF-8"); works correctly on PHP 5.1.2?
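For anyone with mbstring installed, a minimal standalone check along these lines might answer the question. This is only a sketch: checkLowercase() is a hypothetical helper, not anything in MediaWiki, and the expected bytes assume that U+0158 lowercases to U+0159 as listed in the Unicode data.

```php
<?php
// Sketch only: checks how mb_strtolower handles U+0158 (UTF-8 bytes C5 98).
// checkLowercase() is a hypothetical helper, not part of MediaWiki.
function checkLowercase($bytes) {
    if (!function_exists('mb_strtolower')) {
        return null; // mbstring is not installed
    }
    // Return the result as hex so the bytes are easy to compare.
    return bin2hex(mb_strtolower($bytes, 'UTF-8'));
}

$result = checkLowercase("\xC5\x98");
if ($result === null) {
    echo "mbstring not available\n";
} else {
    // U+0159, the lowercase counterpart, is encoded as C5 99 in UTF-8.
    echo $result, $result === 'c599' ? " (expected)\n" : " (unexpected)\n";
}
```

If this prints "c599 (expected)", mb_strtolower itself is probably not the culprit and the problem lies elsewhere in Language.php.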
Created attachment 2241: Untested one-line patch
Problem clarified (thanks to Danny_B, who asked me to post this bug in the first place): the error only seems to occur when the first character of the string is ASCII. Since I can't reproduce the bug locally, I can't be sure of the solution, but it appears to me that there's an extraneous ^ in the regex of Language::isMultibyte. As written, the character class is negated, so it matches any byte outside the range 0x80-0xFF, i.e. any ASCII byte, which is the opposite of what the function name implies. "return (bool)preg_match( '/[^\x80-\xff]/', $str );" should be "return (bool)preg_match( '/[\x80-\xff]/', $str );". The patch needs to be tested on an install with mbstring.
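To illustrate the difference between the two patterns, here is a small sketch that needs only PCRE, not mbstring. The two function names are hypothetical stand-ins for the current and proposed bodies of Language::isMultibyte, and the sample strings are my own; this has not been run against the actual MediaWiki code.

```php
<?php
// Hypothetical stand-in for the current check: the negated class matches
// any byte outside 0x80-0xFF, so it is true whenever an ASCII byte occurs.
function isMultibyteOriginal($str) {
    return (bool)preg_match('/[^\x80-\xff]/', $str);
}

// Hypothetical stand-in for the proposed check: true only when the string
// contains at least one byte in 0x80-0xFF, i.e. part of a multibyte sequence.
function isMultibyteProposed($str) {
    return (bool)preg_match('/[\x80-\xff]/', $str);
}

$samples = array(
    'ASCII only'      => 'abc',
    'multibyte only'  => "\xC5\x98",   // U+0158 in UTF-8
    'ASCII then U+0158' => "a\xC5\x98",
);

foreach ($samples as $label => $str) {
    printf(
        "%-18s original=%s proposed=%s\n",
        $label,
        var_export(isMultibyteOriginal($str), true),
        var_export(isMultibyteProposed($str), true)
    );
}
```

With the negated class, a pure-ASCII string is reported as multibyte and a string consisting only of U+0158 is not; the proposed pattern reverses both answers, and the two agree only on mixed strings.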
Created attachment 2242: Untested one-line patch
Sigh, I need to find a better text editor...
Fixed in r16114