Last modified: 2011-03-13 18:04:45 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T4263, the corresponding Phabricator task, for complete and up-to-date bug report information.

Bug 2263 - ln.wikipedia.org is using UTF-8 encoding which lacks some characters


Summary:	ln.wikipedia.org is using UTF-8 encoding which lacks some characters

Status:	RESOLVED WONTFIX

Product:	Wikimedia
Classification:	Unclassified
Component:	Language setup
Version:	unspecified
Hardware:	PC Linux

Importance:	Lowest normal with 1 vote
Target Milestone:	---
Assigned To:	Nobody - You can work on this!

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2005-05-29 17:53 UTC by Denis Jacquerye
Modified:	2011-03-13 18:04 UTC
CC List:	2 users

See Also:
Web browser:	---
Mobile Platform:	---
Assignee Huggle Beta Tester:	---

Description Denis Jacquerye 2005-05-29 17:53:09 UTC

Unicode has only approximations of some accented characters used in Lingala.
An encoding such as AFRLIN-104-BPI_OCIL fits the needs of the Lingala language
perfectly.
See http://www.progiciels-bpi.ca/tcao/apercu.html#h-an3

Unfortunately, AFRLIN-104-BPI_OCIL has very little support; currently only Free
recode supports it. Some people have mentioned Windows and Linux support for
it, but I haven't had access to it yet.

Comment 1 River Tarnell 2005-05-29 18:07:40 UTC

this appears to be a bug in Unicode rather than the website :-) 
 
has the Unicode standards body refused to integrate support for this
language, or has it not been proposed, or...?

Comment 2 Denis Jacquerye 2005-05-29 18:11:03 UTC

I don't know what the status of those characters is in Unicode, but I know for
sure that they aren't included.
I'll look into asking Unicode to be compatible with that encoding.

Comment 3 Domas Mituzas 2005-05-29 18:13:44 UTC

please notify unicode working groups about missing characters. they should be dealing with unicode covering all charsets.

Comment 4 Denis Jacquerye 2005-05-29 18:17:49 UTC

Ok, I'll try to push this to Unicode.

In the meantime, ln.wikipedia.org is using an approximation of some accented
characters.

Comment 5 Denis Jacquerye 2005-12-05 13:51:28 UTC

Unicode will not add precomposed characters, for several reasons. The first is
that they are supposed to be composed, as we do now. It is Unicode's
policy to only add characters that cannot be composed from existing characters.
This means that fonts and applications should be able to display composed
characters correctly, and applications should allow keyboard entry to be similar
to that for precomposed characters.
The other reason is simply that the encoding previously mentioned is not really
in use, so it does not qualify for inclusion as a legacy encoding. For such
precomposed characters to be included, the countries needing them would have to
develop a standardized encoding and push it into Unicode. I'm not in a position to do so.

Composing characters with Unicode is the best thing around, as long as the apps
support it. Pango should support it as long as the font does (it's in CVS), so
any GTK+ based application should do that too. The new Qt is pretty decent with
diacritics regardless of the font. Mac OS X handles it very well and Windows can
too with Uniscribe.
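
To make the composition policy concrete, here is a minimal sketch using Python's standard unicodedata module. The report does not name the affected characters, so the open e with acute accent (U+025B followed by U+0301) is only an assumed example of a Lingala letter with no precomposed code point; the helper name nfc_length is hypothetical.

```python
import unicodedata

# Assumed example: open e (U+025B) plus combining acute accent (U+0301).
# Unicode has no precomposed code point for this pair.
open_e_acute = "\u025B\u0301"

# For comparison: plain e plus combining acute does have a precomposed
# form, U+00E9.
e_acute = "e\u0301"


def nfc_length(text: str) -> int:
    """Return the number of code points left after NFC normalization."""
    return len(unicodedata.normalize("NFC", text))


print(nfc_length(open_e_acute))  # 2: stays as base letter + combining mark
print(nfc_length(e_acute))       # 1: collapses to the precomposed U+00E9
```

Both sequences are valid Unicode text; whether the two-code-point form is drawn as a single accented letter depends on the font and the rendering engine, which is the point made above.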

This can be left as a wontfix or even as closed since it's not a MediaWiki
issue, unless we want to support the AFRLIN-104-BPI_OCIL encoding. But I'd
rather see applications and fonts handling Unicode properly.
