Last modified: 2012-03-01 04:18:58 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T32149, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 30149 - Handling of parentheses for Korean, Chinese and Japanese
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: Parser
Version: 1.20.x
Hardware: All All
Importance: Unprioritized enhancement
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: i18n, patch, patch-need-review
Depends on:
Blocks:
Reported: 2011-07-31 10:53 UTC by yes0song
Modified: 2012-03-01 04:18 UTC (History)

Attachments
A patch to make the space before the parens in the pipe trick optional. Parsertests included, patch against /phase3/ (1.98 KB, patch)
2011-07-31 17:27 UTC, Dan Collins
First one had unicode mangled, trying from a different PC (2.78 KB, patch)
2011-07-31 18:04 UTC, Dan Collins

Description yes0song 2011-07-31 10:53:57 UTC
== Summary ==

Introduce new pipe trick handling for East Asian languages. Examples:
* Writing "[[Foo(bar)|]]" in the edit box and saving the page replaces it with "[[Foo(bar)|Foo]]". (half-width parentheses without a space)
* Writing "[[Foo（bar）|]]" in the edit box and saving the page replaces it with "[[Foo（bar）|Foo]]". (full-width parentheses without a space)


== Explanation ==

Currently, the expression below is automatically changed to provide convenience in editing.

* [[Android (operating system)|]] -> [[Android (operating system)|Android]]
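
The rewrite above can be sketched in Python as follows. This is a minimal illustration, not MediaWiki's parser code: the character class TITLE_CHARS, the pattern, and the function name are assumptions. The sketch only shows that a pattern requiring a space before the opening parenthesis leaves a title without that space untrimmed.

```python
import re

# Assumption: a simplified stand-in for the set of characters allowed in titles.
TITLE_CHARS = r"[^\[\]\|{}<>#]"

# "[[page (context)|]]": the parenthetical is recognized only when it is
# preceded by a space; otherwise the second group matches nothing.
PIPE_TRICK = re.compile(
    r"\[\[(" + TITLE_CHARS + r"+?)( \(" + TITLE_CHARS + r"+\)|)\|\]\]"
)

def expand_pipe_trick(text):
    """Fill in the empty link text, dropping a recognized parenthetical."""
    return PIPE_TRICK.sub(r"[[\1\2|\1]]", text)

print(expand_pipe_trick("[[Android (operating system)|]]"))
print(expand_pipe_trick("[[Foo(bar)|]]"))
```

With the space present the link text becomes "Android"; without it, the whole title "Foo(bar)" is used as the link text.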

However, in East Asian languages, this isn't convenient.

In Chinese and Japanese, full-width parentheses （） are usually used (and without a space), such as "戦国時代（中国）".

In Korean, half-width parentheses () are overwhelmingly used, as in English, but without a space, such as "허브(식물)".

Because of this technical limitation in MediaWiki, the notations above can't be used in Wikimedia projects in those languages. Titles in those projects are written like "戦国時代 (中国)" or "허브 (식물)" (half-width parentheses with a half-width space). This is not convenient for speakers of those languages.

Therefore, I suggest adding the handling below:

* [[Foo(bar)|]] -> [[Foo(bar)|Foo]] (for Korean)
* [[Foo（bar）|]] -> [[Foo（bar）|Foo]] (for Chinese and Japanese)
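
One way the suggested handling could look, sketched in Python under stated assumptions: TITLE_CHARS, both patterns, and the function name are hypothetical and are not taken from any attached patch. The space before the parenthesis is made optional, and a second pattern accepts the full-width parentheses U+FF08 and U+FF09.

```python
import re

# Assumption: a simplified stand-in for the set of characters allowed in titles.
TITLE_CHARS = r"[^\[\]\|{}<>#]"

# Half-width parentheses, with the preceding space now optional.
HALF_WIDTH = re.compile(
    r"\[\[(" + TITLE_CHARS + r"+?)( ?\(" + TITLE_CHARS + r"+\))\|\]\]"
)
# Full-width parentheses (U+FF08 and U+FF09), space also optional.
FULL_WIDTH = re.compile(
    r"\[\[(" + TITLE_CHARS + r"+?)( ?\uFF08" + TITLE_CHARS + r"+\uFF09)\|\]\]"
)

def expand_pipe_trick_cjk(text):
    """Fill in empty link text, dropping a trailing parenthetical of either width."""
    text = HALF_WIDTH.sub(r"[[\1\2|\1]]", text)
    return FULL_WIDTH.sub(r"[[\1\2|\1]]", text)

print(expand_pipe_trick_cjk("[[허브(식물)|]]"))
print(expand_pipe_trick_cjk("[[戦国時代\uFF08中国\uFF09|]]"))
```

Keeping the two widths in separate patterns means an opening half-width parenthesis is never paired with a closing full-width one.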

=== PS ===

Similarly, MediaWiki already supports this kind of handling for a comma. Examples:

* [[Albany, New York|]] -> [[Albany, New York|Albany]]
* [[Francis II, Holy Roman Emperor|]] -> [[Francis II, Holy Roman Emperor|Francis II]]
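
The comma case can be sketched the same way. Again this is only an illustration in Python, with TITLE_CHARS, the pattern, and the function name assumed: everything from the first ", " onward is dropped from the link text.

```python
import re

# Assumption: a simplified stand-in for the set of characters allowed in titles.
TITLE_CHARS = r"[^\[\]\|{}<>#]"

# "[[page, context|]]": a comma followed by a space starts the part to drop.
COMMA_TRICK = re.compile(
    r"\[\[(" + TITLE_CHARS + r"+?)(, " + TITLE_CHARS + r"+)\|\]\]"
)

def expand_comma_trick(text):
    """Fill in empty link text with the part of the title before the comma."""
    return COMMA_TRICK.sub(r"[[\1\2|\1]]", text)

print(expand_comma_trick("[[Albany, New York|]]"))
print(expand_comma_trick("[[Francis II, Holy Roman Emperor|]]"))
```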

The comma in "Albany, New York" separates the parts of a geographical reference, and the comma in "Francis II, Holy Roman Emperor" indicates identity.

There are equivalents of the comma in East Asian languages (， and 、), but they aren't used in the ways described above. Thus, no special treatment for East Asian commas is needed.
Comment 1 Dan Collins 2011-07-31 17:27:25 UTC
Created attachment 8856
A patch to make the space before the parens in the pipe trick optional. Parsertests included, patch against /phase3/
Comment 2 Dan Collins 2011-07-31 17:29:47 UTC
"[[Foo(bar)|]]" appears to correctly parse to "[[Foo(bar)|Foo]]" in latest SVN. However, "[[Foo(bar)|]]" parses to "[[Foo(bar)|Foo(bar)]]". I have attached a patch which should correct that, but the machine I wrote it on isn't actually running a wiki, so I have to pop over to linux to test it. This patch also adds two parsertests, one to test that the full width parens work, and one to test that standard parens with no space will also work. I have not added anything regarding the comma, since as the reporter said, they are used differently in those languages. I wasn't sure what to do with the [[Foo (bar), baz|]] option, should [[Foo(bar), baz|]] be allowed?
Comment 3 Dan Collins 2011-07-31 18:04:49 UTC
Created attachment 8857
First one had unicode mangled, trying from a different PC

For some reason my patch got mangled on upload. Trying again. This version also fixes some spacing in the parsertests and makes it legal to have [[Foo （bar）|]] (full-width parens with a space). It is tested: the parsertests pass and it looks good on my test wiki.
Comment 4 yes0song 2011-08-01 16:31:06 UTC
I'm not a developer, so I can't test it. Sorry. Could somebody test it, please?

I think [[Foo(bar), baz|]] -> [[Foo(bar), baz|Foo]] is not needed in East Asian languages as I wrote above.

(In reply to comment #2)
> "[[Foo(bar)|]]" appears to correctly parse to "[[Foo(bar)|Foo]]" in latest SVN.
> However, "[[Foo(bar)|]]" parses to "[[Foo(bar)|Foo(bar)]]". I have attached a
> patch which should correct that, but the machine I wrote it on isn't actually
> running a wiki, so I have to pop over to linux to test it. This patch also adds
> two parsertests, one to test that the full width parens work, and one to test
> that standard parens with no space will also work. I have not added anything
> regarding the comma, since as the reporter said, they are used differently in
> those languages. I wasn't sure what to do with the [[Foo (bar), baz|]] option,
> should [[Foo(bar), baz|]] be allowed?
Comment 5 Mark A. Hershberger 2011-08-01 17:38:41 UTC
r93633
