Last modified: 2012-12-30 21:11:22 UTC

Bug 40821 - PostgreSQL searches do not treat Unicode full width characters as their normal counterparts
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: Search
Hardware: All All
Importance: Normal enhancement
Assigned To: Nobody
Blocks: postgres
Reported: 2012-10-06 17:06 UTC by Tim Landscheidt
Modified: 2012-12-30 21:11 UTC (History)


Description Tim Landscheidt 2012-10-06 17:06:25 UTC
The search engines for MySQL and SQLite treat "ＡＺ" (that's #xff21 and #xff3a) as "AZ" (cf. [[Halfwidth and fullwidth forms]]). PostgreSQL does not, and thus fails testFullWidth().

One idea would be to TRANSLATE() them in ts2_page_text() and ts2_page_title() and use a similar technique in SearchPostgres::parseQuery(). If so, we would need either to describe in the release notes how to regenerate the tsvectors after an update, or to detect whether ts2_page_text() or ts2_page_title() has changed and then regenerate them ourselves (I prefer the former).
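
To illustrate the character mapping such a TRANSLATE() call would need, here is a minimal Python sketch. It assumes we fold the whole fullwidth block #xff01 to #xff5e onto ASCII #x21 to #x7e (a fixed offset of #xfee0); whether the search code should fold the punctuation as well, or only letters and digits, is an open question. The names FULLWIDTH_CHARS, ASCII_CHARS and normalize_fullwidth() are made up for this example and do not exist in MediaWiki.

```python
# Hypothetical sketch: fold fullwidth forms onto their ASCII counterparts.
# Assumption: the whole range U+FF01..U+FF5E is folded, not just letters/digits.
FULLWIDTH_START = 0xFF01
FULLWIDTH_END = 0xFF5E
OFFSET = 0xFEE0  # fullwidth code point minus its ASCII counterpart

# These two strings correspond to the "from" and "to" arguments that a
# character-by-character TRANSLATE(text, from, to) call would take.
FULLWIDTH_CHARS = "".join(
    chr(cp) for cp in range(FULLWIDTH_START, FULLWIDTH_END + 1)
)
ASCII_CHARS = "".join(
    chr(cp - OFFSET) for cp in range(FULLWIDTH_START, FULLWIDTH_END + 1)
)

_TABLE = str.maketrans(FULLWIDTH_CHARS, ASCII_CHARS)


def normalize_fullwidth(text: str) -> str:
    """Replace each fullwidth character with its ASCII counterpart."""
    return text.translate(_TABLE)
```

The same pair of strings would have to be applied both when building the tsvectors and when parsing the query, otherwise indexed text and search terms would not match.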

Of course, another conceivable approach would be to try to push this normalization into a text search configuration for to_tsvector(), but I don't know whether this is even possible.
