Last modified: 2007-10-30 19:12:42 UTC

Bug 10130 - Need more appropriate link trail for Farsi language
Status: RESOLVED DUPLICATE of bug 11813
Product: MediaWiki
Classification: Unclassified
Component: Internationalization
Hardware: All All
Importance: Normal enhancement
Assigned To: Nobody - You can work on this!
Reported: 2007-06-04 13:21 UTC by Huji
Modified: 2007-10-30 19:12 UTC (History)

Description Huji 2007-06-04 13:21:02 UTC
In an English-language MediaWiki environment (such as the English Wikipedia), one can create a link to an article titled "example" with the link text "examples" using this code:

[[example]]s

In a Farsi environment, though, this does not work correctly. The following code:

[[فلان]]ی

creates a link to the article titled "فلان", but the "ی" is not included in the link text; it is displayed separately, right after the link, and the link text remains "فلان".

I guess the parser uses a regular expression (I haven't checked the code, though) to test whether the characters immediately following the closing brackets (]]) are in a-z, A-Z, or 0-9. If they are not, it treats them like a space and leaves them out of the link text. I'm not sure about this assumption, though.

Your help is appreciated.
Comment 1 Aryeh Gregor (not reading bugmail, please e-mail directly) 2007-06-04 17:52:56 UTC
Your guess is basically correct, except that this *is* localized.  The regex for English is:

/^([a-z]+)(.*)$/sD

which is the 'linktrail' message in MessagesEn.php.  Try providing a correct variant of that (amending the a-z character class) for Farsi.  You probably want to use the u modifier, i.e.,

/^([a-zlots of farsi characters]+)(.*)$/sDu
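
To make the effect of the character class concrete, here is a minimal Python sketch of how a link-trail pattern of this shape splits the text that follows the closing brackets. It is an illustration only, not MediaWiki's code: `split_trail` is a hypothetical helper, `re.DOTALL` and `\Z` stand in for the `s` and `D` modifiers, and the Farsi character range (the Arabic block plus the zero-width non-joiner) is an assumption rather than the class a real Farsi message file would use.

```python
import re

# English-style link trail: only ASCII lowercase letters extend the link text.
# Same shape as the pattern quoted above, restricted to a-z.
ENGLISH_TRAIL = re.compile(r"^([a-z]+)(.*)\Z", re.DOTALL)

# Hypothetical Farsi-aware variant: a-z plus the Arabic block U+0600-U+06FF
# and the zero-width non-joiner U+200C. This range is an assumption made for
# illustration, not the character class MediaWiki ships.
FARSI_TRAIL = re.compile(r"^([a-z\u0600-\u06FF\u200C]+)(.*)\Z", re.DOTALL)


def split_trail(pattern, text_after_link):
    """Return (trail, rest): the part absorbed into the link, then the remainder."""
    match = pattern.match(text_after_link)
    if match is None:
        # No trail characters directly after "]]": nothing joins the link.
        return "", text_after_link
    return match.group(1), match.group(2)


print(split_trail(ENGLISH_TRAIL, "s are useful"))  # trail is "s"
print(split_trail(ENGLISH_TRAIL, "ی است"))         # trail is empty
print(split_trail(FARSI_TRAIL, "ی است"))           # trail is "ی"
```

With the a-z class the "ی" stays outside the link, which matches the behaviour reported in the description; widening the class is what pulls it into the link text.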
Comment 2 Huji 2007-06-04 18:07:28 UTC
What is wrong with:


I think this method should work with all non-space characters.
Comment 3 Aryeh Gregor (not reading bugmail, please e-mail directly) 2007-06-04 22:11:02 UTC
\s matches only ASCII space characters, I believe, so \S will match things like non-breaking space, zero-width space, etc.  Some languages also don't use spaces, so if you embed a link in a block of Chinese text (which may happen even on fa-wiki or wherever on user talk pages or something) it will link the whole block.  Also, languages with a lot of agglutination may frequently want to link only part of a word.  Then you get into weirdness with control characters and such.

A more general solution would definitely be nice, but for now best to just do something Farsi-specific.
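
The first two concerns can be reproduced with a short Python sketch. The catch-all pattern here is a hypothetical one built on `\S`, and `re.ASCII` is used to imitate the ASCII-only `\s` behaviour assumed in this comment; neither is taken from MediaWiki.

```python
import re

# Hypothetical catch-all trail: any run of non-whitespace extends the link.
# re.ASCII restricts \s (and therefore \S) to ASCII whitespace, imitating the
# engine behaviour assumed in the comment above.
CATCH_ALL_TRAIL = re.compile(r"^(\S+)(.*)\Z", re.DOTALL | re.ASCII)

NBSP = "\u00a0"  # non-breaking space
ZWSP = "\u200b"  # zero-width space


def trail_of(text_after_link):
    """Return the text that would be absorbed into the link."""
    match = CATCH_ALL_TRAIL.match(text_after_link)
    return match.group(1) if match else ""


# The trail runs straight through NBSP and ZWSP and stops only at the
# ordinary ASCII space.
print(repr(trail_of("s" + NBSP + "next word")))
print(repr(trail_of("s" + ZWSP + "next word")))

# A block of Chinese text contains no spaces, so all of it joins the link.
print(repr(trail_of("的例子很多。后面还有文字")))
```

In each case the link swallows more than a reader would expect, which is why a per-language character class is the safer short-term fix.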
Comment 4 Huji 2007-10-30 19:12:42 UTC

*** This bug has been marked as a duplicate of bug 11813 ***
