Last modified: 2010-05-15 15:42:50 UTC

Bug 8470 - Fulltext search in PostgreSQL for non latin keywords is broken
Product: MediaWiki
Classification: Unclassified
Component: Database
Hardware: PC Linux
Importance: High normal
Assigned To: Nobody
Reported: 2007-01-03 12:26 UTC by Evgueni
Modified: 2010-05-15 15:42 UTC (History)

Description Evgueni 2007-01-03 12:26:08 UTC
Fulltext search for non-Latin keywords (for example, Russian) is broken. Searching
for Russian words returns no hits, but searching for English words works.

PostgreSQL version is 8.1.4 (backport for GNU/Linux Debian stable (Sarge))
MediaWiki 1.8.2.

In includes/SearchPostgres.php, the parseQuery function uses this statement:

$searchon .= $terms[1] . $wgContLang->stripForSearch( $terms[2] );

The stripForSearch() function is located in languages/Language.php.

The comment in the function reads:

# MySQL fulltext index doesn't grok utf-8, so we need to fold cases and convert to hex

But if I rewrite this function in a simple way:

function stripForSearch( $string ) {
                return $string;
}

then Russian fulltext search works. So I think this is a solution for PostgreSQL.

P.S. (btw) PostgreSQL 8.2 comes with the tsearch2 extension, which has full multibyte
(UTF-8) support. So it's possible to init the database with Unicode locales. I
checked it with the ru_RU.UTF-8 locale.
Comment 1 Greg Sabino Mullane 2007-01-03 16:36:55 UTC
Thanks, applied a simple db check and quick return in r18791.
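
For readers following along, a "db check and quick return" of this kind might look roughly like the sketch below. This is not the code committed in r18791; it is a standalone illustration under stated assumptions. The function name stripForSearchSketch and the $dbType parameter are hypothetical (they stand in for stripForSearch() and MediaWiki's configured database type), and the MySQL branch is only a guess at what "fold cases and convert to hex" means, using ASCII-only case folding for simplicity.

```php
<?php
// Hypothetical standalone sketch; NOT the actual MediaWiki code from r18791.
// $dbType is an assumed parameter standing in for the configured database type.
function stripForSearchSketch( string $string, string $dbType ): string {
    // Quick return: non-MySQL backends (e.g. PostgreSQL) get the text
    // unchanged, so UTF-8 search terms reach the fulltext engine intact.
    if ( $dbType !== 'mysql' ) {
        return $string;
    }

    // MySQL path, loosely following the quoted comment: fold case
    // (ASCII-only here, an assumption), then replace each multibyte
    // UTF-8 sequence with an ASCII token made of "U8" plus its hex bytes.
    $folded = strtolower( $string );
    return preg_replace_callback(
        '/[\xc0-\xff][\x80-\xbf]*/',
        function ( array $m ): string {
            return 'U8' . bin2hex( $m[0] );
        },
        $folded
    );
}
```

With a check like this, a call such as stripForSearchSketch( 'Привет', 'postgres' ) hands the Russian keyword to the search backend unchanged, which matches the behaviour the reporter obtained by making the function return its input.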
