Last modified: 2011-03-13 18:04:45 UTC

Bug 2263 - ln.wikipedia.org is using UTF-8 encoding which lacks some characters
Status: RESOLVED WONTFIX
Product: Wikimedia
Classification: Unclassified
Component: Language setup
Version: unspecified
Hardware: PC Linux
Importance: Lowest normal with 1 vote
Assigned To: Nobody
 
Reported: 2005-05-29 17:53 UTC by Denis Jacquerye
Modified: 2011-03-13 18:04 UTC

Description Denis Jacquerye 2005-05-29 17:53:09 UTC
Unicode has only approximations of some accented characters used in Lingala.
An encoding such as AFRLIN-104-BPI_OCIL fits the needs of the Lingala language
perfectly.
See http://www.progiciels-bpi.ca/tcao/apercu.html#h-an3

Unfortunately, AFRLIN-104-BPI_OCIL has very little support; only Free recode
currently supports it. Some people have mentioned Windows and Linux support for
it, but I haven't had access to it yet.
Comment 1 River Tarnell 2005-05-29 18:07:40 UTC
this appears to be a bug in Unicode rather than the website :-) 
 
has the Unicode standards body refused to integrate support for this 
language, or has it not been proposed, or...? 
Comment 2 Denis Jacquerye 2005-05-29 18:11:03 UTC
I don't know what the status of those characters is in Unicode, but I know for
sure that they aren't included.
I'll look into asking Unicode to be compatible with that encoding.
Comment 3 Domas Mituzas 2005-05-29 18:13:44 UTC
please notify unicode working groups about missing characters. they should be dealing with unicode covering all charsets. 
Comment 4 Denis Jacquerye 2005-05-29 18:17:49 UTC
Ok, I'll try to push this to Unicode.

In the meantime, ln.wikipedia.org is using an approximation of some accented
characters.
Comment 5 Denis Jacquerye 2005-12-05 13:51:28 UTC
Unicode will not add precomposed characters, for several reasons. The first is
that they are supposed to be composed, as we do now. It is Unicode's policy to
add only characters that cannot be composed from existing characters.
This means that fonts and applications should be able to display composed
characters correctly, and applications should allow keyboard entry to be similar
to that for precomposed characters.
The other reason is simply that the encoding previously mentioned is not really
in use, so it does not qualify for inclusion as a legacy encoding. For such
precomposed characters to be included, the countries needing them would have to
develop a standardized encoding and push it to Unicode. I'm not in a position to
do so.
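
To illustrate the composition policy described above, here is a minimal Python sketch using the standard `unicodedata` module. The choice of the open e (U+025B) with an acute accent is an assumption made for illustration; the report does not list which Lingala characters are affected. The `codepoints` helper is hypothetical and exists only to make the result visible.

```python
import unicodedata


def codepoints(text):
    """Hypothetical helper: list the code points of a string as U+XXXX."""
    return [f"U+{ord(ch):04X}" for ch in text]


# "e" + COMBINING ACUTE ACCENT (U+0301) has a precomposed form in Unicode,
# so NFC normalization merges the pair into the single code point U+00E9.
e_acute = unicodedata.normalize("NFC", "e\u0301")

# Open e (U+025B) + COMBINING ACUTE ACCENT has no precomposed form
# (assumed here to be one of the Lingala letters in question), so NFC
# leaves it as a base letter followed by a combining mark.
open_e_acute = unicodedata.normalize("NFC", "\u025b\u0301")

print(codepoints(e_acute))       # ['U+00E9']
print(codepoints(open_e_acute))  # ['U+025B', 'U+0301']
```

Both strings are valid Unicode text; whether the second one is displayed as a single accented letter depends on the font and the rendering engine.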

Composing characters with Unicode is the best thing around, as long as the apps
support it. Pango should support it as long as the font does (it's in CVS), so
any GTK+ based application should do that too. The new Qt is pretty decent with
diacritics regardless of the font. Mac OS X handles it very well, and Windows can
too with Uniscribe.

This can be left as a wontfix or even closed, since it's not a MediaWiki
issue, unless we want to support the AFRLIN-104-BPI_OCIL encoding. But I'd
rather see applications and fonts handling Unicode properly.
