Last modified: 2011-03-13 18:04:45 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T4263, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 2263 - ln.wikipedia.org is using UTF-8 encoding which lacks some characters
Status: RESOLVED WONTFIX
Product: Wikimedia
Classification: Unclassified
Component: Language setup
Version: unspecified
Hardware: PC Linux
Importance: Lowest normal with 1 vote
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Reported: 2005-05-29 17:53 UTC by Denis Jacquerye
Modified: 2011-03-13 18:04 UTC

Description Denis Jacquerye 2005-05-29 17:53:09 UTC
Unicode has only approximations of some accented characters used in Lingala.
An encoding such as AFRLIN-104-BPI_OCIL fits the needs of the Lingala language
perfectly.
See http://www.progiciels-bpi.ca/tcao/apercu.html#h-an3

Unfortunately, AFRLIN-104-BPI_OCIL has very little support; only Free recode
currently supports it. Some people have mentioned Windows and Linux support for
it, but I haven't had access to it yet.
Comment 1 River Tarnell 2005-05-29 18:07:40 UTC
this appears to be a bug in Unicode rather than the website :-)

has the Unicode standards body refused to integrate support for this
language, or has it not been proposed, or...?
Comment 2 Denis Jacquerye 2005-05-29 18:11:03 UTC
I don't know what the status of those characters in Unicode is, but I know for
sure that they aren't included.
I'll look into asking Unicode to be compatible with that encoding.
Comment 3 Domas Mituzas 2005-05-29 18:13:44 UTC
please notify the Unicode working groups about the missing characters. they should be making sure Unicode covers all charsets.
Comment 4 Denis Jacquerye 2005-05-29 18:17:49 UTC
Ok, I'll try to push this to Unicode.

In the meantime, ln.wikipedia.org is using an approximation of some accented
characters.
Comment 5 Denis Jacquerye 2005-12-05 13:51:28 UTC
Unicode will not add precomposed characters, for several reasons. The first is
that they are supposed to be composed, as we do now. It is Unicode's
policy to only add characters that cannot be composed from existing characters.
This means that fonts and applications should be able to display composed
characters correctly, and applications should allow keyboard entry to be similar
to that for precomposed characters.
The other reason is simply that the encoding previously mentioned is not really
in use, so it does not qualify for inclusion as a legacy encoding. For such
precomposed characters to be included, the countries needing them would have to
develop a standardized encoding and push it into Unicode. I'm not in a position
to do so.

Composing characters with Unicode is the best thing around, as long as the apps
support it. Pango should support it as long as the font does (it's in CVS), so
any GTK+-based application should do that too. The new Qt is pretty decent with
diacritics regardless of the font. Mac OS X handles it very well, and Windows can
too with Uniscribe.
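
As a rough illustration of the composition point above, here is a minimal Python sketch using the standard unicodedata module. The choice of open e with an acute accent as the sample letter is an assumption made for illustration; the report does not name the specific characters affected. The helper describe() is likewise hypothetical. The sketch shows that NFC normalization merges "e" plus a combining acute into one precomposed code point, while an open e plus the same combining mark stays as two code points because Unicode defines no precomposed form for it.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# "e" + COMBINING ACUTE ACCENT: a precomposed form exists,
# so NFC merges the pair into a single code point.
e_acute = unicodedata.normalize("NFC", "e\u0301")

# Open e (U+025B) + COMBINING ACUTE ACCENT: sample letter assumed for
# illustration. No precomposed form exists, so NFC leaves the base
# letter and the combining mark as two separate code points.
open_e_acute = unicodedata.normalize("NFC", "\u025B\u0301")

print(len(e_acute), describe(e_acute))
print(len(open_e_acute), describe(open_e_acute))
```

Both strings are valid Unicode text; whether the second one renders as a single accented glyph depends on the font and the text layout engine, which is the point made about Pango, Qt, Mac OS X and Uniscribe.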

This can be left as WONTFIX or even closed, since it's not a MediaWiki
issue, unless we want to support the AFRLIN-104-BPI_OCIL encoding. But I'd
rather see applications and fonts handling Unicode properly.
