Last modified: 2013-11-21 14:29:23 UTC
Names.php includes RTL control characters in the names of some languages. This is needed for correct display of punctuation; see for example Bug 26208.
But this also introduces some problems. Most importantly, this affects the sorting order of language names. Also:
* The code of Names.php is ugly.
* Mixing Unicode invisible control characters with HTML bidi markup is a bad practice according to W3C.
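The sorting problem can be illustrated with a minimal sketch. The sample names and the placement of U+200F (RIGHT-TO-LEFT MARK) are hypothetical, not copied from Names.php, and `sort_key` is an illustrative helper rather than MediaWiki code. The sketch assumes a plain code point comparison; a collation-aware sort may behave differently.

```python
import unicodedata

RLM = "\u200f"  # RIGHT-TO-LEFT MARK, an invisible bidi control character

HEBREW = "\u05e2\u05d1\u05e8\u05d9\u05ea"              # "Hebrew" in Hebrew script
ARABIC = "\u0627\u0644\u0639\u0631\u0628\u064a\u0629"  # "Arabic" in Arabic script

# Hypothetical sample: only the Arabic name carries a leading RLM.
names = [HEBREW, RLM + ARABIC, "English"]

def sort_key(name: str) -> str:
    """Drop format characters (Unicode category Cf) before comparing."""
    return "".join(ch for ch in name if unicodedata.category(ch) != "Cf")

# Raw comparison: the RLM (U+200F) sorts after every Hebrew letter,
# so the Arabic name lands last.
naive = sorted(names)

# Comparison on the cleaned key: Arabic (U+06xx) sorts after Hebrew (U+05xx)
# only by its real first letter, so here it moves before Hebrew (U+0627 < U+05E2 is false;
# see the resulting order rather than assuming it).
cleaned = sorted(names, key=sort_key)

print([sort_key(n) for n in naive])
print([sort_key(n) for n in cleaned])
```

Sorting on a cleaned key is only a workaround for this one symptom; the names themselves would still contain the control characters.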
Unfortunately, immediately removing these control characters is problematic, because the correct display of language names in many parts of the code depends on them. One example is language selection dropdowns. This will probably serve as a kind of tracking bug for removing such problems.
This bug is valid, but it is lacking steps to go forward.
What needs to be done so that this bug can be fixed? New interfaces? Auditing all existing code?
Unbelievable! There's a <bdi> element if you want to safely embed a foreign language name in an unknown script, so that punctuation placed after the element ends up on the right side.
There are also CSS "unicode-bidi:" values for directional control if you want full embedding.
There are also quirks possible in HTML4 and earlier (they require absolute positioning inside a relatively positioned container element).
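As a rough sketch of the two markup approaches mentioned above, the following Python helpers build the HTML for a language name. The function names `isolate` and `embed_css` are hypothetical and only illustrate the output markup; they are not part of MediaWiki.

```python
from html import escape

def isolate(name: str) -> str:
    """Wrap a name in <bdi> so neighbouring punctuation is not reordered."""
    return "<bdi>" + escape(name) + "</bdi>"

def embed_css(name: str, direction: str) -> str:
    """Wrap a name in a span using CSS 'direction' plus 'unicode-bidi: embed'."""
    if direction not in ("ltr", "rtl"):
        raise ValueError("direction must be 'ltr' or 'rtl'")
    style = "direction: " + direction + "; unicode-bidi: embed"
    return '<span style="' + style + '">' + escape(name) + "</span>"

# Hypothetical usage: a name followed by its language code in parentheses.
label = isolate("English") + " (en)"
```

With `<bdi>` the direction does not have to be known in advance, while the CSS variant needs it passed in explicitly.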
Philippe Verdy: The problem was mostly that we simply didn't know where all of these language names were being used (and we needed a quick fix). It wasn't known whether we could safely add HTML to them. So we used direction characters as a 'patch' and opened this ticket at the same time.
1: Split the table of names into fragments + direction info. Use a toHtml method to build an HTML fragment, and rework the code to safely use it.
2: Simply add <bdi> tags to the names and clean up all uses to make sure this is possible.
3: Identify all places where the names are used and do individual markup/styling for each of those locations to fix the problem.
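Option 1 could look roughly like the sketch below, written in Python for brevity rather than PHP. The `LanguageName` class, its `to_html` method, and the `NAMES` table are all hypothetical; the real table layout and method name would be decided in the actual patch.

```python
from dataclasses import dataclass
from html import escape

@dataclass(frozen=True)
class LanguageName:
    text: str       # plain name, with no bidi control characters
    direction: str  # "ltr" or "rtl"

    def __post_init__(self) -> None:
        if self.direction not in ("ltr", "rtl"):
            raise ValueError("direction must be 'ltr' or 'rtl'")

    def to_html(self) -> str:
        """Build an HTML fragment that isolates the name with an explicit direction."""
        return '<bdi dir="' + self.direction + '">' + escape(self.text) + "</bdi>"

# Hypothetical excerpt of the split table: code -> (name, direction).
NAMES = {
    "en": LanguageName("English", "ltr"),
    "he": LanguageName("\u05e2\u05d1\u05e8\u05d9\u05ea", "rtl"),
}

# Plain-text contexts (sorting, <option> labels) use .text;
# HTML contexts use .to_html().
by_name = sorted(NAMES.values(), key=lambda n: n.text)
```

Keeping the plain text and the direction separate means sorting works on clean strings, and only callers that can emit HTML get markup.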