Last modified: 2013-11-21 14:29:23 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T40674, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 38674 - Names.php includes RTL control characters
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: Internationalization
Hardware: All All
Importance: Low normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: i18n
Depends on: 41103
Blocks: rtl
Reported: 2012-07-25 15:37 UTC by Amir E. Aharoni
Modified: 2013-11-21 14:29 UTC (History)



Description Amir E. Aharoni 2012-07-25 15:37:05 UTC
Names.php includes RTL control characters in the names of some languages. This is needed for correct display of punctuation; see for example Bug 26208.

But this also introduces some problems. Most importantly, this affects the sorting order of language names. Also:

* The code of Names.php is ugly.
* Mixing invisible Unicode control characters with HTML bidi markup is bad practice according to the W3C.
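
To illustrate the sorting problem, here is a minimal Python sketch. The three entries are hypothetical samples, not copied from Names.php, and the sort is a plain code-point sort, which is only an assumption about how a caller might order the names. A single leading U+200F (RIGHT-TO-LEFT MARK) moves the Hebrew name from first to last, because U+200F sorts after all Hebrew and Arabic letters.

```python
RLM = "\u200f"  # RIGHT-TO-LEFT MARK, an invisible bidi control character
LRM = "\u200e"  # LEFT-TO-RIGHT MARK

# Hypothetical sample entries (language code -> autonym), for illustration only.
clean = {"ar": "العربية", "fa": "فارسی", "he": "עברית"}

# Same names, but the Hebrew entry carries a leading RLM.
marked = dict(clean, he=RLM + clean["he"])


def codes_sorted_by_name(names):
    """Return language codes ordered by a plain code-point sort of their names."""
    return sorted(names, key=names.get)


def strip_marks(name):
    """Remove LRM/RLM so they do not take part in comparisons."""
    return name.replace(LRM, "").replace(RLM, "")


def codes_sorted_ignoring_marks(names):
    """Same as above, but compare the names with the marks stripped."""
    return sorted(names, key=lambda code: strip_marks(names[code]))


print(codes_sorted_by_name(clean))          # ['he', 'ar', 'fa']
print(codes_sorted_by_name(marked))         # ['ar', 'fa', 'he']
print(codes_sorted_ignoring_marks(marked))  # ['he', 'ar', 'fa']
```

Stripping the marks inside the sort key restores the expected order without touching the stored names, but every caller that sorts would have to remember to do it, which is part of why the embedded characters are a problem.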

Unfortunately, removing these control characters immediately is problematic, because the correct display of language names in many parts of the code depends on them. One example is language selection dropdowns. This will probably be a kind of tracking bug for removing such problems.
Comment 1 Niklas Laxström 2013-08-28 17:30:02 UTC
This bug is valid, but it is lacking steps to go forward.

What needs to be done so that this bug can be fixed? New interfaces? Auditing all existing code?
Comment 2 Philippe Verdy 2013-11-21 06:09:01 UTC
Unbelievable! There's a <bdi> element if you want to safely embed a foreign language in an unknown script, so that punctuation placed after the element appears on the correct side.
There are also values of the CSS "unicode-bidi" property for directional control if you want full embedding.
There are also quirks possible in previous versions of HTML4 (they require using absolute positioning inside a relatively positioned container element).
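A rough sketch of the <bdi> approach in Python, for illustration only. The helper names `bdi` and `labelled` are hypothetical and are not part of MediaWiki; the point is just that the punctuation stays outside the isolated element instead of relying on invisible marks inside the name.

```python
from html import escape


def bdi(name: str) -> str:
    """Wrap a name in <bdi> so its direction is isolated from the surrounding text."""
    return f"<bdi>{escape(name)}</bdi>"


def labelled(name: str, code: str) -> str:
    """Build 'name (code)' with the parentheses outside the isolated name."""
    return f"{bdi(name)} ({escape(code)})"


print(labelled("עברית", "he"))
```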
Comment 3 Derk-Jan Hartman 2013-11-21 10:16:23 UTC
Philippe Verdy: The problem was mostly that we simply didn't know where all of these language names were being used (and we needed a quick fix). It wasn't known if we could safely add HTML to them. So we used direction chars as a 'patch' and opened this ticket at the same time.

1: Split the table of names into fragments + direction info. Use a toHtml method to build an HTML fragment and rework the code to use it safely.
2: Simply add <bdi> tags to the names and clean up all uses to make sure this is possible.
3: Identify all places where the names are used and do individual markup/styling for all of those locations to fix the problem.
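
Option 1 could look roughly like the Python sketch below. Everything in it is an assumption made for illustration: the `LanguageName` class, the `split_entry` helper, and the rule of taking the direction from the first strongly directional character are not existing MediaWiki code, and the sample string is hypothetical rather than taken from Names.php.

```python
import unicodedata
from dataclasses import dataclass
from html import escape

# Bidi control characters to strip: LRM, RLM, LRE, RLE, PDF, LRO, RLO.
# Mapping each code point to None makes str.translate delete it.
BIDI_CONTROLS = dict.fromkeys(map(ord, "\u200e\u200f\u202a\u202b\u202c\u202d\u202e"))


@dataclass(frozen=True)
class LanguageName:
    text: str       # the name with all bidi control characters removed
    direction: str  # "ltr" or "rtl"

    def to_html(self) -> str:
        """Build an HTML fragment that carries the direction as markup."""
        return f'<bdi dir="{self.direction}">{escape(self.text)}</bdi>'


def split_entry(raw: str) -> LanguageName:
    """Split a raw table entry into clean text plus direction info."""
    text = raw.translate(BIDI_CONTROLS)
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # Hebrew-type or Arabic-type letter
            return LanguageName(text, "rtl")
        if bidi_class == "L":
            return LanguageName(text, "ltr")
    return LanguageName(text, "ltr")  # no strong character found


entry = split_entry("\u200fעברית")  # hypothetical entry with a leading RLM
print(entry.direction)
print(entry.to_html())
```

With the text and the direction stored separately, sorting can use `text` directly, and only the code paths that emit HTML need to call `to_html`.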
