Last modified: 2013-08-29 16:56:07 UTC
It would be desirable for Search, and especially for the "Go" functionality (resolving a page title to an actual page without an intermediate search), to apply all sensible Unicode foldings to the searched titles. Unicode character foldings [1] define string transformations that make two strings search-equivalent; they differ from Unicode normalizations, which make strings content-equivalent.

The folded title should be stored in addition to the original title, not instead of it. When searching, the comparison should be made between the folded search string and the folded title.

We already do certain forms of folding, such as case insensitivity, but we could benefit from the full set of foldings, for example treating the minus sign and the various dashes as equivalent.

[1] http://www.unicode.org/unicode/reports/tr30/
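A minimal sketch of the scheme described above, in Python for illustration only (it is not MediaWiki code). The names `DASH_FOLDS`, `fold_title`, `build_index` and `go` are hypothetical, and the dash table is a small hand-picked subset, not the full set of foldings from the Unicode report. The point is only that the folded form is kept as a lookup key next to the original title, and that the query is folded the same way before comparing.

```python
# Hypothetical, deliberately small folding table: a few dash-like
# characters mapped to ASCII hyphen-minus. A real implementation would
# take its mappings from the Unicode folding data, not from this list.
DASH_FOLDS = {
    0x2010: "-",  # HYPHEN
    0x2011: "-",  # NON-BREAKING HYPHEN
    0x2012: "-",  # FIGURE DASH
    0x2013: "-",  # EN DASH
    0x2014: "-",  # EM DASH
    0x2212: "-",  # MINUS SIGN
}


def fold_title(title):
    """Return a search key for a title: dashes unified, case folded."""
    return title.translate(DASH_FOLDS).casefold()


def build_index(titles):
    """Map each folded key to the original titles that produce it.

    The original titles are kept untouched; the folded form is stored
    in addition to them, as the lookup key.
    """
    index = {}
    for title in titles:
        index.setdefault(fold_title(title), []).append(title)
    return index


def go(index, query):
    """Resolve a query by comparing folded query against folded titles."""
    return index.get(fold_title(query), [])


index = build_index(["Michelson\u2013Morley experiment", "Unicode"])
matches = go(index, "michelson-morley EXPERIMENT")
```

Here `matches` holds the original title with its en dash, even though the query used a plain hyphen and different case. Several titles can fold to the same key, which is why each key maps to a list.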
*** Bug 14180 has been marked as a duplicate of this bug. ***
De-assigning, as there has been no activity in 3 years. Still a good idea, though! :) K-form (compatibility) normalization would be easy to apply, since the UtfNormal class already implements it; other foldings may require more coding.
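To illustrate what K-form normalization would and would not cover, here is a small sketch using Python's standard `unicodedata` module as a stand-in for UtfNormal (an assumption made for illustration; the helper name `fold_compat` is hypothetical). Compatibility characters such as ligatures and fullwidth letters are unified, but the en dash and the minus sign have no compatibility decomposition, so they would still need a separate folding step.

```python
import unicodedata


def fold_compat(text):
    """Apply compatibility (K-form) composition, i.e. NFKC."""
    return unicodedata.normalize("NFKC", text)


# Compatibility characters are replaced by their plain equivalents.
ligature = fold_compat("\ufb01le")               # U+FB01 LATIN SMALL LIGATURE FI
fullwidth = fold_compat("\uff37\uff49\uff4b\uff49")  # fullwidth "Wiki"

# Dashes and the minus sign are left unchanged by NFKC.
en_dash = fold_compat("\u2013")
minus = fold_compat("\u2212")
```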
*** Bug 4379 has been marked as a duplicate of this bug. ***
*** Bug 20529 has been marked as a duplicate of this bug. ***
This is the same bug as in <https://translatewiki.net/wiki/Thread:Support/Search_index_should_ignore_punctuation>, isn't it?