Last modified: 2014-09-20 01:14:47 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T49029, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 47029 - '幺' => '么' in ZhConversion.php is wrong.
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: Language converter
Version: 1.22.0
Hardware: All All
Importance: Low minor
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Reported: 2013-04-09 01:40 UTC by zoglun
Modified: 2014-09-20 01:14 UTC

Description zoglun 2013-04-09 01:40:39 UTC
http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/includes/ZhConversion.php?view=markup&pathrev=100226

At line 10160:

'幺' => '么' is a wrong conversion rule. Zh Wikipedia fixes this by using http://zh.wikipedia.org/wiki/MediaWiki:Conversiontable/zh-hans#.E5.85.B6.E4.BB.96_2 . I am not sure why they don't update ZhConversion.php and instead keep using the less efficient database conversion.

By the way, how can I fix these problems myself? I mean, is it possible for me to change MediaWiki's git source code?
Comment 1 Sam Reed (reedy) 2013-04-09 01:45:45 UTC
(In reply to comment #0)
> By the way, how can I fix these problems myself? I mean, is it possible for
> me to change MediaWiki's git source code?

Yes, you can create an account so you can submit code updates. Start at https://www.mediawiki.org/wiki/Developer_access
Comment 2 Liangent 2013-04-09 10:04:09 UTC
https://www.google.com/search?q=%22%E4%BB%80%E5%B9%BA%22+site%3Atw&ie=utf-8&oe=utf-8&

It seems this rule is still needed in some cases. On Wikipedia we can just ask editors from Taiwan "don't use '幺' in this way", but the situation differs from wiki to wiki. Also, '幺' is much less used than '么', so we keep a list (see below) of words containing '幺' to which the '幺' => '么' conversion is not applied.

Here's the list:

simpphrases.manual:幺厮
simpphrases.manual:幺半群
simpphrases.manual:幺元
simpphrases.manual:幺爹
simpphrases.manual:幺叔
simpphrases.manual:幺舅
simpphrases.manual:幺爸
simpphrases.manual:幺妈
simpphrases.manual:幺姨
simpphrases.manual:幺娘
simpphrases.manual:幺妹
simpphrases.manual:幺小
simpphrases.manual:幺姓
simpphrases.manual:姓幺
simpphrases.manual:幺氏
simpphrases.manual:幺蛾子
simpphrases.manual:幺麽
simpphrases.manual:幺麽小丑
simpphrases.manual:幺凤
simpphrases.manual:幺二三
simpphrases.manual:幺篇
simpphrases.manual:幺谦
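
The exception-list approach described in this comment can be illustrated with a minimal sketch. It assumes the converter picks the longest matching key at each position, so a phrase mapped to itself shields its '幺' from the single-character rule. The `convert` function and the `TO_HANS` table are hypothetical miniatures for illustration, not MediaWiki's actual code or data.

```python
def convert(text, table):
    """Greedy longest-match replacement; unmatched characters pass through."""
    max_len = max(len(key) for key in table)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so phrase rules beat character rules.
        for n in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + n]
            if chunk in table:
                out.append(table[chunk])
                i += n
                break
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


# Hypothetical miniature of a zh-hant -> zh-hans table.
TO_HANS = {
    "幺": "么",          # general single-character rule
    "幺半群": "幺半群",  # exceptions: the phrase maps to itself
    "幺元": "幺元",
    "幺妹": "幺妹",
}

print(convert("什幺是幺半群", TO_HANS))  # 什么是幺半群
```

Under this assumption, every word that must keep '幺' needs its own entry, which is why the exception list grows long.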
Comment 3 zoglun 2013-04-09 21:51:40 UTC
I think that for an unusual Simplified-to-Traditional conversion, it is better to write out the few affected words than to maintain a long conversion list.

For example:
幺: do not convert
simpphrases.manual:什么 => 什幺

This is the same solution you use for 发. 发 in Simplified can mean either 發 or 髪 in Traditional. ZhConversion.php has 发 => 發 and then 头发 => 頭髪, because people use 發 more than 髪.
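
A rough sketch of the pattern described here: a default character rule plus a phrase rule that overrides it, and the same idea applied to 幺 with only a phrase rule. The `make_converter` helper and both tables are hypothetical and contain only the rules mentioned in this comment; longest-key-first matching is an assumption.

```python
import re


def make_converter(table):
    # Sort longer keys first so phrase rules win over single-character rules.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return lambda text: pattern.sub(lambda m: table[m.group(0)], text)


# zh-hans -> zh-hant: 发 defaults to 發, but 头发 is overridden to 頭髪.
to_hant = make_converter({"发": "發", "头发": "頭髪"})
print(to_hant("出发前洗头发"))  # 出發前洗頭髪

# zh-hant -> zh-hans as proposed: no rule for 幺 alone, only for 什幺.
to_hans = make_converter({"什幺": "什么"})
print(to_hans("什幺是幺元"))  # 什么是幺元
```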
Comment 4 Liangent 2013-04-10 05:18:24 UTC
(In reply to comment #3)
> I think that for an unusual Simplified-to-Traditional conversion, it is
> better to write out the few affected words than to maintain a long
> conversion list.
> 
> For example:
> 幺: do not convert
> simpphrases.manual:什么 => 什幺
> 
> This is the same solution you use for 发. 发 in Simplified can mean either 發
> or 髪 in Traditional. ZhConversion.php has 发 => 發 and then 头发 => 頭髪,
> because people use 發 more than 髪.

Are you talking about zh-hans to zh-hant conversion or the other way?

'幺' => '么' exists in $zh2Hans conversion in ZhConversion.php.

Anyway, can you give a sentence that fails with the current rules?
Comment 5 zoglun 2013-04-10 05:46:20 UTC
I am talking about zh-hant to zh-hans for '幺' => '么'. I used 发 in zh-hans to zh-hant as an example.

Currently, '幺' in the sense of '么' is very rare in Traditional Chinese; even zh Wikipedia forbids the '幺' => '么' conversion. So this character should not be converted to '么' in the MediaWiki source code.

什么 => 什幺 and 什幺 => 什么 should be added for both zh-hant to zh-hans and zh-hans to zh-hant, instead of the long list of words where 幺 is left unchanged.
Comment 6 Liangent 2013-04-10 07:19:56 UTC
(In reply to comment #5)
> I am talking about zh-hant to zh-hans for '幺' => '么'. I used 发 in zh-hans
> to zh-hant as an example.
> 
> Currently, '幺' in the sense of '么' is very rare in Traditional Chinese;
> even zh Wikipedia forbids the '幺' => '么' conversion. So this character
> should not be converted to '么' in the MediaWiki source code.
> 
> 什么 => 什幺 and 什幺 => 什么 should be added for both zh-hant to zh-hans and
> zh-hans to zh-hant, instead of the long list of words where 幺 is left
> unchanged.

幺 looks like just a 异体字 ("variant character") of 么 in zh-hant, so it's not often seen (correct me if I'm wrong). However, it's difficult to list every usage of 么 in Chinese. How would you add a rule for the following sentence: 你认识zoglun么? ("Do you know zoglun?"), when it's written in zh-hant with 么 written as 幺?
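
The objection can be made concrete with a small sketch. It assumes a table containing only the proposed phrase rule 什幺 => 什么; the `apply_phrases` helper is hypothetical, and the zh-hant rendering of the sentence (认识 written as 認識, 么 written as 幺) is an illustrative assumption.

```python
# Phrase-only table, as proposed: no rule for 幺 on its own.
PHRASE_ONLY = {"什幺": "什么"}


def apply_phrases(text, table):
    """Replace each listed phrase; anything not listed is left untouched."""
    for src, dst in table.items():
        text = text.replace(src, dst)
    return text


# The example sentence written in zh-hant with 么 written as 幺.
sentence = "你認識zoglun幺?"

# The sentence-final 幺 is not part of any listed phrase, so it survives.
print(apply_phrases(sentence, PHRASE_ONLY) == sentence)  # True
```

A particle like this can follow almost any word, so covering it with phrase rules alone would require an open-ended list.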
