Last modified: 2013-06-24 16:02:05 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T51093, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 49093 - Page title appears inside <span dir="auto">. it shouldn't (at least in RTL wikis)
Status: ASSIGNED
Product: MediaWiki
Classification: Unclassified
Component: Interface
Version: 1.22.0
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Amir E. Aharoni
Keywords: i18n
Depends on:
Blocks: rtl
Reported: 2013-06-03 23:32 UTC by kipod
Modified: 2013-06-24 16:02 UTC
CC List: 6 users

Description kipod 2013-06-03 23:32:42 UTC
The renderer encloses the main heading of pages inside <span dir="auto">.
this causes display problems on RTL wikis for pages whose name begins with an LTR-script character and ends with RTL characters. such names are pretty common: for instance, in hewiki, many articles about films, albums, singles or songs begin with the movie/song/album name in its original language, followed, in cases of ambiguity, by (film) or (album) in parens. of course, this will not be (film) but rather (סרט), and so on.

the dir="auto" causes the name to be displayed wrong if the browser understands this attribute, e.g., open the page [[he:Californication (אלבום)]]
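
To illustrate why the title resolves the wrong way, here is a rough Python sketch of the "first strong character" rule that dir="auto" is generally understood to apply. The function name and the fallback default are assumptions made for illustration; this is not MediaWiki or browser code.

```python
import unicodedata


def first_strong_direction(text, default="ltr"):
    """Guess the direction dir="auto" would pick for `text`.

    Hypothetical helper: the first strongly directional character
    decides the direction of the whole run. `default` is an assumed
    fallback for text that has no strong character at all.
    """
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi == "L":            # strong left-to-right (e.g. Latin)
            return "ltr"
        if bidi in ("R", "AL"):    # strong right-to-left (Hebrew, Arabic)
            return "rtl"
    return default


print(first_strong_direction("Californication (אלבום)"))  # ltr
print(first_strong_direction("אלבום (Californication)"))  # rtl
```

Because the Latin "C" comes first, the whole span is laid out as LTR even though the surrounding wiki is RTL, which is the symptom reported here.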

the page title displays correctly in IE, which is too stupid to understand the dir="auto" attribute, but it displays wrong in chrome and FF (FF also did not understand this attribute for a long time, but its latest version as i write this, version 21, does).

we should get rid of this dir="auto" in page title. it does not do any good, and it *does* have some bad consequences.

for a long time i thought of this misbehavior as a chrome bug, and probably other people thought so too, which is why this very old bug was not reported until now. however, when firefox became smarter and started obeying the dir="auto" attribute, it became clear that chrome was right all this time, and the bug is in mediawiki page rendering.

please remove it.

peace.
Comment 1 kipod 2013-06-03 23:43:16 UTC
i added to [[he:Mediawiki:Common.js]] a line that mitigates this bug, at least for readers who have JS enabled, so in order to see the problem in the page linked above, you'll need to temporarily disable JS in your browser.

reminder: the page name will display correctly in IE, but incorrectly in chrome (any version) and FF (you need a relatively new FF: methinks 3.6, for instance, still does not understand dir=auto, and will display the page title correctly even without JS. I know for a fact that the problem is visible with FF 21, though i do not know how far back you need to go for it to disappear).

peace.
Comment 2 kipod 2013-06-04 00:06:36 UTC
further explanation:
since the dir="auto" attribute is given to a span inside the heading instead of being assigned to the h1 element itself, the title is still flushed to the right (in RTL wikis, or to the left in LTR wikis). this means that the reader does not know that the words written in latin script are "first" and the words written in hebrew script are "last": the hebrew word appears to the right of the latin-script word, but in RTL script this really means that the hebrew words come before the latin ones.

now, if the dir=auto were assigned to the h1 element instead of to a span inside it, the whole title would be flushed to the left, and the reader would at least stand a chance of understanding that the latin-script words come first.

however, since IE still does not support this attribute, this would create a serious discrepancy between the way IE users see the article and the way sane-browsers'-users see it.

so please just remove this dir="auto", and everyone will be happy.

peace.
Comment 3 Matthew Flaschen 2013-06-04 00:24:53 UTC
Amir Aharoni added this in 2fde32bed64e6dec115018a5f056f2c2af2fa883; I just added him as a CC. kipod's report seems right, but I don't understand the issues well enough.
Comment 4 Daniel Friesen 2013-06-04 00:44:03 UTC
The dir=auto was added to fix bug 32403.

Maybe Gerrit change #63378 can fix both of these bugs if I omit the dir auto in it and let the new title lang dir apply.
Comment 5 Amir E. Aharoni 2013-06-04 08:01:38 UTC
I thought about it and decided to remove it. It indeed causes more problems than it solves.
Comment 6 kipod 2013-06-04 14:50:32 UTC
dir=auto may still work, but in order for it to do so, the attribute needs to be assigned to the h1 element rather than to a span within it. this way, if the page name begins with RTL char it will be flushed to the right, and if it begins with an LTR char it will be flushed to the left, making it clear to the reader how it should be read.

When the attribute is assigned to the span, the title is flushed based on contentLanguage, and when it's a mixed-script name it displays wrong.

Alternatively, we can actually look at the name in the renderer code and change the behavior "manually", based on the name: if the name consists purely of LTR (and "neutral") chars on an RTL wiki, or vice versa, we can then add "direction: ltr/rtl" to the style (or better yet, use some appropriate class, such as mw-content-ltr/mw-content-rtl).

This will solve the problem with "neutral" characters (such as parens, punctuation, etc.) acting badly when the name in its entirety is in a script with opposite directionality to the content language, without screwing up the mixed-script names like the ones i pointed out above.

setting the direction "manually", by explicitly assigning style or class to heading will handle the issue for IE users, for whom the "dir=auto" does nothing anyway, because IE does not understand it.
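
The "manual" approach described above could look roughly like the following Python sketch. The function name, its parameters, and the choice to return None for mixed-script or neutral-only titles are assumptions made for illustration; only the class names mw-content-ltr and mw-content-rtl come from this report, and nothing here is actual MediaWiki renderer code.

```python
import unicodedata


def title_direction_class(title, content_dir):
    """Pick an explicit direction class for the page heading.

    Hypothetical helper. `content_dir` is the wiki's content direction
    ("ltr" or "rtl"). Returns a class name only when the title is written
    purely in the opposite direction (plus neutrals); returns None when
    the content-language default should be left alone.
    """
    has_ltr = has_rtl = False
    for ch in title:
        bidi = unicodedata.bidirectional(ch)
        if bidi == "L":
            has_ltr = True
        elif bidi in ("R", "AL"):
            has_rtl = True

    if content_dir == "rtl" and has_ltr and not has_rtl:
        return "mw-content-ltr"
    if content_dir == "ltr" and has_rtl and not has_ltr:
        return "mw-content-rtl"
    return None  # mixed-script or neutral-only: keep the wiki default


print(title_direction_class("Californication (album)", "rtl"))  # mw-content-ltr
print(title_direction_class("Californication (אלבום)", "rtl"))  # None
```

Under these assumptions a pure-Latin title on an RTL wiki gets an explicit LTR class, so trailing punctuation stays where it belongs, while a mixed-script title such as "Californication (אלבום)" is left to the content direction.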

personally, i think the "dir=auto" property was always half assed and should be avoided: direction is handled by class or at least style, using the contemporary "direction" property, and not by the old-fashioned "dir=" attribute. the fact that the "direction" property does not support "auto" is a strong hint this should not be used.

peace.
Comment 7 Amir E. Aharoni 2013-06-04 15:19:24 UTC
(In reply to comment #6)
> personally, i think the "dir=auto" property was always half assed and should
> be avoided:

Of course it's half-assed and should be avoided. Explicit direction must be assigned according to the language. The problem is that we don't know the language of the title, which can be different from the language of the wiki and even from the language of the page, as you demonstrated well. One title can also have two languages, like "Californication (אלבום)"; I didn't think of this possibility when I added dir="auto", and this is the example that convinced me to remove it.

A really proper solution for the problem must involve defining the language of the page, the title and of a bunch of other things in a fine-grained way. See Bug 9360, for example.

> direction is handled by class or at least style, using the
> contemporary "direction" property, and not by the old-fashioned "dir="
> attribute. the fact that the "direction" property does not support "auto" is
> a strong hint this should not be used.

It is not, though I understand why you think that it is.

Rather, it's a hint that the CSS direction property shouldn't be used. The "auto" value is not available in CSS, because it doesn't make sense for it to be a computed style value: the computed value must be ltr or rtl. The W3C regrets making the "direction" CSS property available to developers, because it was supposed to be an internal-only property, and keeps it only for backwards compatibility. Though the W3C is well aware of this property's usefulness in conjunction with CSS selectors, it recommends using only the HTML dir attribute for setting the direction and letting the browser compute the CSS direction value.
Comment 8 Daniel Friesen 2013-06-04 22:04:52 UTC
(In reply to comment #6)
> dir=auto may still work, but in order for it to do so, the attribute needs to
> be assigned to the h1 element rather than to a span within it. this way, if
> the page name begins with RTL char it will be flushed to the right, and if it
> begins with an LTR char it will be flushed to the left, making it clear to
> the reader how it should be read.

Language information shouldn't be assigned directly on the h1. We've got other code that likes to add additional things to the h1. These things can be in the user language. Adding the lang info to the h1 will apply content language to these things that could be in user language.


(In reply to comment #7)
> A really proper solution for the problem must involve defining the language
> of the page, the title and of a bunch of other things in a fine-grained way.
> See Bug 9360, for example.

I0ff707d5f04218bef5721e6fc162c6359bb7538a.
Comment 9 Matthew Flaschen 2013-06-04 23:59:04 UTC
(In reply to comment #7)
> A really proper solution for the problem must involve defining the language
> of the page, the title and of a bunch of other things in a fine-grained way.
> See Bug 9360, for example.

Is it enough for the title to have one language?  What would it be for 'Californication (אלבום)'?
Comment 10 Daniel Friesen 2013-06-05 01:08:32 UTC
(In reply to comment #9)
> (In reply to comment #7)
> > A really proper solution for the problem must involve defining the language
> > of the page, the title and of a bunch of other things in a fine-grained way.
> > See Bug 9360, for example.
>
> Is it enough for the title to have one language?  What would it be for
> 'Californication (אלבום)'?

As long as we keep something like DISPLAYTITLE with partial html control around, it should be, since you can embed a span with the lang using DISPLAYTITLE.

For example:
{{DISPLAYTITLE:<span lang="en" dir="ltr">Californication</span> (אלבום)}}

Which on the he wiki would result in (assuming my gerrit commit is modified to not override with dir=auto and then committed):
<span lang="he" dir="rtl"><span lang="en" dir="ltr">Californication</span> (אלבום)</span>

That'll get you the host language's rtl directionality, and the English portion will be marked up as English with an ltr directionality.
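
For titles that follow the "foreign name plus Hebrew disambiguator" pattern, the DISPLAYTITLE wikitext above could be generated mechanically. The Python sketch below is only an illustration under stated assumptions: the function name is hypothetical, the language code of the leading part must be supplied by the caller because it cannot be detected reliably, and no HTML escaping is attempted.

```python
import unicodedata


def displaytitle_for(title, foreign_lang):
    """Build DISPLAYTITLE wikitext that marks the leading LTR part of a title.

    Hypothetical helper. Wraps the text up to the last LTR letter that
    precedes the first RTL letter in a span with an explicit lang and dir.
    Returns None when the title has no leading LTR part.
    """
    end = 0
    for i, ch in enumerate(title):
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):   # the RTL part starts here
            break
        if bidi == "L":           # extend the leading LTR run
            end = i + 1
    if end == 0:
        return None

    head, tail = title[:end], title[end:]
    span = '<span lang="' + foreign_lang + '" dir="ltr">' + head + "</span>"
    return "{{DISPLAYTITLE:" + span + tail + "}}"


print(displaytitle_for("Californication (אלבום)", "en"))
```

For the example title this yields the same wikitext as shown above; neutral characters such as the space and the opening parenthesis stay outside the span, so they follow the wiki's own direction.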
Comment 11 kipod 2013-06-05 18:01:38 UTC
you are correct that this can be overcome by using DISPLAYTITLE, and this is how we did (and do) use DISPLAYTITLE in hewiki, including for direction-related fixes.

dir=auto does fix one specific issue: basically, the problem happens when a page name is in a script whose directionality is opposite to that of contentLanguage (afaik, this has nothing to do with uselang), and the name begins or ends with a "neutral" character such as punctuation (most article names do not end with a period or comma, but quite a few end with other punctuation, notably question and exclamation marks), parentheses and the like.

so in those cases, the "dir=auto" _did_ solve the problem, except IE does not respect this attribute, so we end up having to use DISPLAYTITLE on such pages _anyway_. the net result is that dir=auto is an insufficient solution for the problem it tries to solve, and at the same time it causes a whole new problem, the one with mixed-script names described in comment #1.

Since Amir agrees we'll be better off without it, i think we have a clear path to a solution.


> Is it enough for the title to have one language?  What would it be for
> 'Californication (אלבום)'?

so here is the deal: some songs, albums, movies, books etc. are usually referred to by their original names rather than their hebrew name.
however, when in-name disambiguation is required, such as "Californication (album)" vs. "Californication (song)", we do not use "album" and "song", but rather their hebrew counterparts, אלבום and שיר.

peace.
Comment 12 kipod 2013-06-05 18:09:59 UTC
ooops - reading some more i now do think that it depends on uselang rather than contentLanguage. 

this is nitpicking really - i wish the worst bug we ever had was funky display when uselang != contentLanguage... heck, i'm willing to live with this problem forever...

if we solve the problem(s) in the common case, where uselang and contentLanguage are the same, i'll be content.

peace.
Comment 13 Daniel Friesen 2013-06-08 18:21:23 UTC
I've updated Gerrit change #63378 to use a real dir based on the title lang instead of dir=auto.
