Last modified: 2014-07-02 14:49:45 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T15712, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 13712 - Install Josa extension parser function on Korean Wikipedia
Status: NEW
Product: Wikimedia
Classification: Unclassified
Component: Extension setup
Version: unspecified
Hardware: All All
Importance: Normal enhancement with 3 votes
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: i18n
Depends on:
Blocks: 31235
Reported: 2008-04-12 13:37 UTC by Ficell
Modified: 2014-07-02 14:49 UTC (History)

Attachments
Patch for Hanp.body.php by Ficell (4.65 KB, patch)
2008-09-22 14:33 UTC, Ficell

Description Ficell 2008-04-12 13:37:33 UTC
I request a new parser function, #hangul, for Korean Wikipedia.

In Korean, a particle takes a different form depending on whether the preceding letter has a jongseong (batchim, a final consonant). (For example, 를 (reul) is used only after a word ending in a vowel. If the preceding word ends in a consonant, 을 (eul) is used instead. For more information, see [[w:Hangul#Syllabic blocks]].) To solve this problem, we need a new parser function.


Detail:

{{#hangul:AB|CD|EF}}

If the last letter of "AB" has a jongseong, "CD" is returned.
If it has no jongseong, or is not Hangul, "EF" is returned.
Comment 1 Ficell 2008-04-15 10:38:40 UTC
More technical information:

(in Unicode)
Hangul: U+AC00 ~ U+D7A3
Hangul without jongseong: U+AC00 + 28(0x1C)*n (U+AC00, U+AC1C, U+AC38, U+AC54, ..., U+D76C, U+D788)
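
The code point arithmetic above can be sketched in Python as follows. This is only an illustration, not code from any extension: the names `has_jongseong` and `hangul_choice` are hypothetical, and `hangul_choice` mirrors the proposed {{#hangul:AB|CD|EF}} behaviour from the description.

```python
HANGUL_FIRST = 0xAC00  # first precomposed Hangul syllable
HANGUL_LAST = 0xD7A3   # last precomposed Hangul syllable
JONGSEONG_COUNT = 28   # 27 final consonants plus "no final consonant"


def has_jongseong(char: str) -> bool:
    """Return True if char is a Hangul syllable with a final consonant."""
    code = ord(char)
    if not HANGUL_FIRST <= code <= HANGUL_LAST:
        return False  # not a precomposed Hangul syllable
    # Syllables without jongseong sit at U+AC00 + 28*n.
    return (code - HANGUL_FIRST) % JONGSEONG_COUNT != 0


def hangul_choice(word: str, with_jongseong: str, without_jongseong: str) -> str:
    """Hypothetical equivalent of {{#hangul:AB|CD|EF}}."""
    if word and has_jongseong(word[-1]):
        return with_jongseong
    return without_jongseong
```

For example, `hangul_choice("책", "을", "를")` picks 을 because 책 ends in a consonant, while `hangul_choice("사과", "을", "를")` picks 를.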
Comment 2 Kyungjoon Lee 2008-04-21 18:42:38 UTC
This is a case of morphophonology.

It's as if you had to type "a/an 'insert noun here'" all the time because you can never know beforehand whether the noun will start with a vowel or a consonant.

So {{#a or an:noun|a|an}} is what Ficell is proposing, I guess.
Comment 3 Kyungjoon Lee 2008-04-21 18:53:28 UTC
According to [[w:en:Korean language#Morphophonemics]], we will need to test for an additional case.

* Preceding syllable ends with a consonant
* Preceding syllable ends with a rieul consonant
* Preceding syllable ends with a vowel (no consonant)
Comment 4 Kyungjoon Lee 2008-04-21 19:12:02 UTC
Oops, that's [[Korean language#Morphophonemics]].

IMHO it would be nicer if {{#hangul:AB|CD|EF|GH}} returned ABCD, ABEF or ABGH.
Comment 5 Ficell 2008-04-30 05:51:04 UTC
I discussed this briefly with Kyungjoon Lee on a Korean Wikipedia user talk page, and I suggest the following:

(Cf [[Korean language#Morphophonemics]])

(in Unicode)
Hangul: U+AC00 ~ U+D7A3
Hangul ending with a vowel: U+AC00 + 28(0x1C)*n (U+AC00, U+AC1C, U+AC38, U+AC54, ..., U+D76C, U+D788)
Hangul ending with rieul: U+AC08 + 28(0x1C)*n (U+AC08, U+AC24, U+AC40, U+AC5C, ..., U+D774, U+D790)

{{#hanp:AB|CD}} (hanp is an abbreviation of "hangul particle")

* When CD is '로'(ro) or '으로'(euro)
** if the last syllable of AB ends with a consonant (jongseong) other than rieul, 'AB으로'(ABeuro) is returned
** if the last syllable of AB ends with a vowel or rieul, 'AB로'(ABro) is returned
** if the last character of AB is not Hangul, 'AB로'(ABro) is returned
* When CD is '을'(eul), '이'(i), '와'(wa), '은'(eun) or '를'(reul), '가'(ga), '과'(gwa), '는'(neun)
** if the last syllable of AB ends with a consonant, 'AB을'(ABeul), 'AB이'(ABi), 'AB와'(ABwa), 'AB은'(ABeun) is returned
** if the last syllable of AB ends with a vowel, 'AB를'(ABreul), 'AB가'(ABga), 'AB과'(ABgwa), 'AB는'(ABneun) is returned
** if the last character of AB is not Hangul, 'AB를'(ABreul), 'AB가'(ABga), 'AB과'(ABgwa), 'AB는'(ABneun) is returned
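
The selection rules above could look roughly like this in Python. It is a sketch under assumptions, not the code of any extension: `jongseong_index` and `hanp` are hypothetical names, the 과/와 pair uses the standard order (과 after a consonant, 와 after a vowel, as comment 6 points out), and appending an unrecognised particle unchanged is my own choice.

```python
HANGUL_FIRST = 0xAC00
HANGUL_LAST = 0xD7A3
RIEUL = 8  # jongseong index of rieul: U+AC08 + 28*n

# (form after a consonant, form after a vowel)
PAIRS = [("을", "를"), ("이", "가"), ("과", "와"), ("은", "는")]


def jongseong_index(char):
    """Return 0 for no final consonant, 1..27 otherwise, None if not Hangul."""
    code = ord(char)
    if not HANGUL_FIRST <= code <= HANGUL_LAST:
        return None
    return (code - HANGUL_FIRST) % 28


def hanp(word, particle):
    """Hypothetical equivalent of {{#hanp:AB|CD}}: return word plus the fitting particle."""
    index = jongseong_index(word[-1]) if word else None
    if particle in ("로", "으로"):
        # Vowel, rieul, and non-Hangul endings all take 로.
        closed = index is not None and index not in (0, RIEUL)
        return word + ("으로" if closed else "로")
    for after_consonant, after_vowel in PAIRS:
        if particle in (after_consonant, after_vowel):
            # Non-Hangul endings fall back to the vowel form.
            closed = index is not None and index != 0
            return word + (after_consonant if closed else after_vowel)
    return word + particle  # unknown particle: append unchanged (assumption)
```

With this sketch, `hanp("서울", "으로")` gives 서울로 because 울 ends in rieul, and `hanp("집", "로")` gives 집으로.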
Comment 6 Kyungjoon Lee 2008-05-01 09:41:47 UTC
Yeah, this is how Korean LaTeX macros handle "automatic particle handling" as well.

I think Ficell has wa/gwa switched; the Wikipedia table has the correct choices.
Comment 7 Siebrand Mazeland 2008-08-18 21:53:54 UTC
Isn't this something that could (should?) be added to language/classes/LanguageKo.php?

CC-ing Niklas in.
Domain: MediaWiki extensions/ParserFunctions -> MediaWiki/i18n
Comment 8 Niklas Laxström 2008-08-19 05:55:06 UTC
Could use grammar functionality here, with syntax something like {{GRAMMAR:hanp|AB,CD,EF,GH}} or {{GRAMMAR:hanp:CD,EF,GH|AB}}.
Comment 9 Ficell 2008-08-23 15:28:09 UTC
(In reply to comment #7)
> Isn't this something that could (should?) be added to
> language/classes/LanguageKo.php?
> 
> CC-ing Niklas in.
> Domain: MediaWiki extensions/ParserFunctions -> MediaWiki/i18n
> 

Yes, I think so.

(In reply to comment #8)
> Could use grammar functionality here, with syntax something like
> {{GRAMMAR:hanp|AB,CD,EF,GH}} or {{GRAMMAR:hanp:CD,EF,GH|AB}}.
> 

That is also a good idea, but I think {{#hanp:}} would be better to use.
Comment 10 Niklas Laxström 2008-08-23 16:28:27 UTC
If not grammar, would this new tag be in MediaWiki proper, piggyback an existing extension or be in a new extension?
Comment 11 Ficell 2008-08-30 11:20:08 UTC
(In reply to comment #10)
> If not grammar, would this new tag be in MediaWiki proper, piggyback an
> existing extension or be in a new extension?
> 

A new extension seems better, although I don't know the details of the MediaWiki software.
Comment 12 Niklas Laxström 2008-09-21 07:33:33 UTC
I've committed an extension that should work as described in comment #5 and comment #6 as r41088. It should be easy to review because it is very small. It might be a good idea to file a new bug specifically for enabling that extension on Korean projects.
Comment 13 Ficell 2008-09-22 14:33:06 UTC
Created attachment 5358 [details]
Patch for Hanp.body.php by Ficell

Thanks for your work, Niklas Laxström. Unfortunately, I found a problem: if $word contains symbols, it doesn't work well. If we want to know whether '[[A]]' + 'eul' is correct, the current #HANP function can't give a result, because $word ends with a ']' character, which it cannot read. To solve this, I suggest adding a new parameter named "output". I made a patch for Hanp.body.php. Please consider it.
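
A rough Python sketch of the problem and of one way an "output" parameter could address it. The patch itself is not shown in this thread, so the behaviour here (choose the particle from `word`, then append it to `output`) is an assumption, and `has_jongseong` and `eul_reul` are hypothetical names. Only the 을/를 pair is handled.

```python
def has_jongseong(char):
    """True if char is a precomposed Hangul syllable with a final consonant."""
    code = ord(char)
    return 0xAC00 <= code <= 0xD7A3 and (code - 0xAC00) % 28 != 0


def eul_reul(word, output=None):
    """Pick 을 or 를 from `word`, then append it to `output` (defaults to `word`)."""
    particle = "을" if word and has_jongseong(word[-1]) else "를"
    return (word if output is None else output) + particle


# Without a separate output, the trailing "]" hides the final consonant of 책.
print(eul_reul("[[책]]"))
# With it, the check uses the bare word and the markup is kept in the result.
print(eul_reul("책", output="[[책]]"))
```

The first call falls back to 를 because "]" is not Hangul; the second call returns the link text followed by 을.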
Comment 14 Siebrand Mazeland 2008-10-27 23:47:33 UTC
(In reply to comment #13)
> Created an attachment (id=5358) [details]
> Patch for Hanp.body.php by Ficell

Wow, that was an extremely crappy patch. I had to merge that manually, line by line. Please create a proper patch next time.

Applied in r42700. How does it work now?
Comment 15 Ficell 2008-11-01 11:33:20 UTC
Sorry. I didn't know how to make a proper diff file. It works well now. Thanks.
Comment 16 Siebrand Mazeland 2008-11-01 12:06:21 UTC
Changed topic and added keywords to request installation of this extension. Should this be installed for all Korean Wikimedia projects?
Comment 17 Ficell 2008-11-02 10:49:33 UTC
Yes. Please install the extension.
Comment 18 Kyungjoon Lee 2008-11-02 11:14:25 UTC
Hang on, please.

Has this extension been tested anywhere? Would it be OK to put it on a "production" server?
Comment 19 Siebrand Mazeland 2008-11-02 11:38:15 UTC
(In reply to comment #18)
> Has this extension been tested anywhere? Would it be OK to put it on a
> "production" server?

Well, that is why it had a need-review keyword. *You* can be a reviewer, but a Wikimedia developer will also audit it before it will ever go live.
Comment 20 Ficell 2008-11-07 16:29:49 UTC
(In reply to comment #18)

I tested it on my personal wiki. It works well. Actually, I found some problems when using it in system messages, but they are not a problem with the function itself. There seems to be no problem so far.
Comment 21 Siebrand Mazeland 2008-11-07 16:31:16 UTC
(In reply to comment #20)
> (In reply to comment #18)
> 
> It works well. Actually I found some problems when
> using in system message, but it isn't the problem of the function itself.

Please provide details so Niklas can assess if it can be fixed.
Comment 22 Ficell 2008-12-11 16:49:42 UTC
(In reply to comment #21)

Sorry for the late reply. I was busy in real life. I'll post it on Betawiki ASAP.
Comment 23 Ficell 2009-02-14 14:38:23 UTC
Including this feature in MediaWiki core seems better. If this feature were used in the default MediaWiki system messages, Korean translations would be more precise.

Also see http://translatewiki.net/w/i.php?title=Support&oldid=926697#Parameter_on_log_message
Comment 24 Siebrand Mazeland 2009-02-14 14:42:03 UTC
(In reply to comment #23)
> Including this feature among MediaWiki core seems better. If this feature used
> in default MediaWiki system message, Korean translation will be more precise.
> 
> Also refer
> http://translatewiki.net/w/i.php?title=Support&oldid=926697#Parameter_on_log_message
> 
Well, that is interesting. In comment 11 you stated the opposite. What's it gonna be and why exactly?
Comment 25 Ficell 2009-02-15 04:35:51 UTC
(In reply to comment #24)

I didn't know the difference at that time. Sorry for that.
Comment 26 Mike.lifeguard 2009-03-19 18:31:09 UTC
Removed shell keyword since there's nothing to do on shell.

Removed need-review keyword since this has been applied in SVN already and/or should be implemented as a localization in betawiki.

I don't even think there is anything left in this bug to do. If there is, please point it out, otherwise it will get closed as FIXED.
Comment 27 Niklas Laxström 2009-03-19 18:36:01 UTC
{{#HANP:}} is not in core. It is currently an extension.
Comment 28 Brion Vibber 2009-03-19 18:36:53 UTC
Assigning to myself for review.

(I can't imagine what betawiki would have to do with this... an extension couldn't be used for core localizations since it wouldn't be available in default installations.)
Comment 29 Ficell 2009-03-22 13:49:58 UTC
(In reply to comment #28)

I meant this should be a function in core, like {{plural:}}, not an extension.
Comment 30 p858snake 2011-04-30 00:10:17 UTC
*Bulk BZ Change: +Patch to open bugs with patches attached that are missing the keyword*
Comment 31 Niklas Laxström 2011-09-09 11:59:50 UTC
If there is interest, I can easily port this to core. Please let me know that you need this.
Comment 32 Siebrand Mazeland 2011-09-09 12:08:13 UTC
It's been dragging for 2.5 years now. Whatever leads to an acceptable resolution, I'd say.
Comment 33 Ficell 2011-09-20 11:02:39 UTC
We need a function like this when translating MediaWiki messages. When English words are translated into Korean, the last letter (in Korean) may be a consonant or a vowel. English does not make this distinction, but Korean does; the particle changes because of this...

Without such a function, we must write out all the possible particles (and that is what we do now...). It is inefficient and ugly.

Sorry for my poor English ;)
Comment 34 matanya 2012-07-23 07:37:52 UTC
Niklas/Brion, as I understand it, this is needed. Can one of you port it, please?
Comment 35 Chong-Dae Park 2013-04-12 05:08:37 UTC
FYI: This function is implemented in Lua on ko.wikipedia.

https://ko.wikipedia.org/wiki/Module:Hangul
Comment 36 Andre Klapper 2013-11-12 13:10:45 UTC
(In reply to comment #28 by Brion)
> Assigning to myself for review.

Brion: As you wrote this in 2009, is that still the case, or would you like to reset the assignee to default?
Comment 37 Brion Vibber 2013-11-12 16:09:50 UTC
Presumably this is no longer active, no. :) Reassigning to default.
Comment 38 Nemo 2014-02-17 18:38:02 UTC
The extension is currently at https://git.wikimedia.org/summary/mediawiki%2Fextensions%2FHanp (I come from there).

(In reply to Niklas Laxström from comment #31)
> If there is interest, I can easily port this to core. Please let me know
> that you need this.

Should this bug be moved to core then? Seems so.
Comment 39 JuneHyeon Bae (devunt) 2014-06-12 10:58:41 UTC
There is also https://git.wikimedia.org/summary/mediawiki%2Fextensions%2FJosa. It is an improved version of Extension:Hanp, maintained by a native Korean speaker.
We also have some consensus in the local wiki community: https://ko.wikipedia.org/wiki/%EC%9C%84%ED%82%A4%EB%B0%B1%EA%B3%BC:%EC%82%AC%EB%9E%91%EB%B0%A9_%28%EA%B8%B0%EC%88%A0%29/2014%EB%85%84_6%EC%9B%94#.EC.A1.B0.EC.82.AC_.ED.99.95.EC.9E.A5.EA.B8.B0.EB.8A.A5_.EB.8F.84.EC.9E.85

btw, I agree with integrating this feature with core.
Comment 40 Nemo 2014-06-12 11:21:01 UTC
(In reply to JuneHyeon Bae (devunt) from comment #39)
> There is a https://git.wikimedia.org/summary/mediawiki%2Fextensions%2FJosa
> too. 
> And we also have some consensus in local wiki community:

Ok, updated bug.
Comment 41 JuneHyeon Bae (devunt) 2014-06-12 11:48:47 UTC
However I think this feature should be integrated into core.
Comment 42 Sam Reed (reedy) 2014-06-30 20:47:11 UTC
-shell, this needs reviewing for deployment etc...
Comment 43 Bawolff (Brian Wolff) 2014-06-30 20:56:24 UTC
This could also easily be accomplished with lua, no extension required
Comment 44 Sam Reed (reedy) 2014-06-30 20:58:58 UTC
Can someone clarify what actually needs doing here?

Do we want both hanp and josa installing? Just Josa? One into core? Both into core?

Either way, Josa needs some major cleanup. There's a lot of code duplication, and it's all in global functions (for starters). That'd need doing as part of moving it to core too...
Comment 45 Sam Reed (reedy) 2014-06-30 20:59:28 UTC
(In reply to Bawolff (Brian Wolff) from comment #43)
> This could also easily be accomplished with lua, no extension required

(In reply to Chong-Dae Park from comment #35)
> FYI: This function is implemented as lua in ko.wikipedia.
> 
> https://ko.wikipedia.org/wiki/Module:Hangul

RESOLVED FIXED? ;)
Comment 46 Nemo 2014-06-30 21:19:16 UTC
Only Josa, not Hanp. Modules are not very portable, but it's ko.wiki's call whether they're satisfied or not. Core or not doesn't matter so much, the code refactoring needed per above would be the same wouldn't it?
Comment 47 Revi 2014-06-30 22:06:36 UTC
(In reply to Sam Reed (reedy) from comment #45)
> (In reply to Bawolff (Brian Wolff) from comment #43)
> > This could also easily be accomplished with lua, no extension required
> 
> (In reply to Chong-Dae Park from comment #35)
> > FYI: This function is implemented as lua in ko.wikipedia.
> > 
> > https://ko.wikipedia.org/wiki/Module:Hangul
> 
> RESOLVED FIXED? ;)

ko.wiki's consensus in comment 39 was to use the extension, not Lua. And it looks like the Lua module is not used much.
