Last modified: 2014-09-23 23:08:02 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T26409, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 24409 - A brand new conversion core for Language Converter
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: Language converter
Version: unspecified
Hardware: All All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: i18n, need-unittest, patch, patch-need-review
Depends on:
Blocks:
Reported: 2010-07-16 20:52 UTC by Philip Tzou
Modified: 2014-09-23 23:08 UTC
CC List: 3 users


Attachments
An initial patch (6.39 KB, patch)
2010-07-16 20:52 UTC, Philip Tzou

Description Philip Tzou 2010-07-16 20:52:18 UTC
Created attachment 7571
An initial patch

I have just rewritten the core algorithm of Language Converter. The conversion core uses the "forward maximum matching" algorithm, which is implemented in PHP by the function strtr(). The original strtr() (php_strtr_array in the C source) is slow because it simply finds the minimum and maximum length (minlen and maxlen) of all keys, then walks the text to be converted from start to end and, at each position, tries every length from longest to shortest without any distinction.
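
For readers who want to see the baseline behaviour described above, here is a minimal Python sketch of forward maximum matching as the paragraph describes it. It is not the PHP source: the function name `convert_naive` and the sample table are hypothetical, and it works on Python characters rather than on bytes as php_strtr_array does.

```python
def convert_naive(text, table):
    """Forward maximum matching, in the style the report attributes to strtr().

    At every position, every length between the global maxlen and minlen is
    tried, longest first, regardless of which character starts the match.
    """
    # Ignore empty keys so the scan always makes progress.
    table = {k: v for k, v in table.items() if k}
    if not table:
        return text

    lengths = [len(k) for k in table]
    minlen, maxlen = min(lengths), max(lengths)

    out = []
    pos = 0
    while pos < len(text):
        longest = min(maxlen, len(text) - pos)
        for length in range(longest, minlen - 1, -1):
            candidate = text[pos:pos + length]
            if candidate in table:
                out.append(table[candidate])
                pos += length
                break
        else:
            # No key matches here: copy one character and move on.
            out.append(text[pos])
            pos += 1
    return "".join(out)


# Hypothetical conversion table, for illustration only.
sample_table = {"a": "1", "ab": "2", "abcd": "3"}
result = convert_naive("abcab a", sample_table)
```

Note that the inner loop probes lengths 4, 3, 2 and 1 at every position, even for characters that never begin a key.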

I improved this algorithm. First, I create a "quick table" that stores, for the first character of each key, all possible lengths of the keys starting with that character. The quick table can be cached. Then I only need to look up the first character of the remaining text in the quick table and test the key lengths recorded for that character. As a result, performance improves.
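
The quick-table idea can be sketched roughly as follows. This is an illustration in Python, not the attached patch: the names `build_quick_table` and `convert_quick` are hypothetical, and the sketch indexes by Python character, whereas the PHP code presumably works on bytes.

```python
def build_quick_table(table):
    """Map each first character to the key lengths that start with it.

    Lengths are sorted longest first so the longest match wins.
    The result depends only on the table, so it can be cached.
    """
    quick = {}
    for key in table:
        if key:  # ignore empty keys
            quick.setdefault(key[0], set()).add(len(key))
    return {ch: sorted(lens, reverse=True) for ch, lens in quick.items()}


def convert_quick(text, table, quick=None):
    """Forward maximum matching that only tries lengths listed in the quick table."""
    if quick is None:
        quick = build_quick_table(table)

    out = []
    pos = 0
    while pos < len(text):
        # Only lengths of keys that begin with text[pos] are tested.
        for length in quick.get(text[pos], ()):
            candidate = text[pos:pos + length]
            # The length check rejects slices truncated by the end of the text.
            if len(candidate) == length and candidate in table:
                out.append(table[candidate])
                pos += length
                break
        else:
            # No key starts here: copy one character and move on.
            out.append(text[pos])
            pos += 1
    return "".join(out)


# Hypothetical conversion table, for illustration only.
sample_table = {"a": "1", "ab": "2", "abcd": "3"}
sample_quick = build_quick_table(sample_table)
converted = convert_quick("abcab a", sample_table, sample_quick)
```

With this table, a character such as "c" that starts no key costs a single dictionary lookup, and "a" is tested only against lengths 4, 2 and 1 instead of every length between minlen and maxlen.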

Here I submit an initial patch for further testing.
Comment 1 p858snake 2011-04-30 00:09:36 UTC
*Bulk BZ Change: +Patch to open bugs with patches attached that are missing the keyword*
Comment 2 Bugmeister Bot 2011-08-19 19:12:14 UTC
Unassigning default assignments. http://article.gmane.org/gmane.science.linguistics.wikipedia.technical/54734
Comment 3 Niklas Laxström 2011-09-21 08:19:08 UTC
Can you also provide test cases?
Comment 4 Sumana Harihareswara 2012-08-17 10:52:04 UTC
Philip, just for future reference, in case you want to submit patches to MediaWiki, https://www.mediawiki.org/wiki/Git/Tutorial shows you how to submit them directly into our source control system and bypass the Bugzilla step.
