Last modified: 2007-07-13 18:49:45 UTC

Bug 920 - Transliterated umlauts in the search field won't resolve
Product: MediaWiki
Classification: Unclassified
Component: Search
Hardware: PC All
Importance: Normal normal with 14 votes
Assigned To: Nobody
Duplicates: 7002
Reported: 2004-11-20 17:28 UTC by Denis Grelich
Modified: 2007-07-13 18:49 UTC
CC List: 6 users



Description Denis Grelich 2004-11-20 17:28:57 UTC
If I enter a term containing umlauts in the search field on the left, but transliterate the umlauts, the lookup fails and I am shown the search page unless a redirect exists for that title. On the English Wikipedia such redirects exist most of the time. For the German Wikipedia they would not make much sense.

Goedel (for Gödel) fails on the German Wikipedia; on the English Wikipedia it resolves correctly.
Godel resolves on the English Wikipedia too.
Comment 1 Denis Grelich 2004-11-20 17:31:40 UTC
Would it be possible to resolve transliterated umlauts automatically to the correct character? It surely wouldn't break anything.
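To make the request concrete, here is a minimal Python sketch of reverse transliteration on the query side. It is not MediaWiki code: the `DIGRAPHS` table and the `umlaut_candidates` function are hypothetical names chosen for illustration. The sketch returns the original query alongside the umlaut variants instead of replacing it, because a digraph such as "oe" is not always a transliterated umlaut.

```python
from itertools import product

# Assumed mapping (illustrative): German digraphs and the umlauts they may stand for.
DIGRAPHS = {"ae": "ä", "oe": "ö", "ue": "ü"}


def umlaut_candidates(query: str) -> list[str]:
    """Return the query plus every variant with digraphs turned back into umlauts."""
    pieces = []  # each piece is a list of alternative spellings for one segment
    i = 0
    while i < len(query):
        pair = query[i:i + 2]
        umlaut = DIGRAPHS.get(pair.lower())
        if umlaut is not None:
            # Preserve the case of the first letter, e.g. "Ae" -> "Ä".
            if query[i].isupper():
                umlaut = umlaut.upper()
            pieces.append([pair, umlaut])
            i += 2
        else:
            pieces.append([query[i]])
            i += 1
    # Combine the alternatives; the untouched query always comes first.
    return ["".join(combo) for combo in product(*pieces)]


print(umlaut_candidates("Goedel"))
print(umlaut_candidates("Poet"))
```

The second call also yields "Pöt", which is why the variants are treated here as additional lookup candidates and not as a replacement for what the user typed.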
Comment 2 Andreas Franke 2005-12-16 15:30:57 UTC
Automatically adding the reverse-transliterated umlauts to the search results would be desirable in my opinion, in particular on the German Wikipedia.
For example, entering "kuenstliche intelligenz" in the search box there came up with the movie "A.I. – Künstliche Intelligenz", but not with the main entry, which I was only able to find via the entry for the "AI" acronym.
Comment 3 Ibn Battuta@WP 2007-05-27 00:10:10 UTC
It would be nice to cover more than just the umlauts and more than just the German Wikipedia: the same (or worse) problem occurs on any Wikipedia that uses the Latin alphabet with special characters: the Spanish, Portuguese, French, Scandinavian (...), Slavic (... ... ...), and Turkish languages, to name just the largest groups (with obviously many subgroups).
Comment 4 Hendrik Lönngren 2007-05-27 22:10:47 UTC
I agree with #3, and would still add to it. It would be desirable to treat both the transliterated form of a special character and the plain Latin character without accents or other features from which it is derived as possible occurrences of that special character. For example oe (common in Germany) or o (common in Sweden) for ö, or aa / a for å. I would even extend this mechanism to treat some groups of punctuation characters as one character in search, for example different quotation marks " „ “ ” « », different dashes - – —, different apostrophes ' ’ (see German article "Germany’s next topmodel"; there is a redirect from the simple version, though) etc.
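The punctuation part of this suggestion can be sketched as a simple character fold applied to both titles and queries. This is a rough illustration only: the equivalence classes below are taken from the characters listed in this comment, and the names `PUNCTUATION_FOLDS` and `fold_punctuation` are hypothetical.

```python
# Assumed equivalence classes: canonical ASCII character -> variants folded into it.
PUNCTUATION_FOLDS = {
    '"': "\u201e\u201c\u201d\u00ab\u00bb",  # low-9, left and right double quotes, guillemets
    "-": "\u2013\u2014",                    # en dash, em dash
    "'": "\u2019",                          # right single quotation mark (typographic apostrophe)
}

# Translation table mapping every variant to its canonical character.
FOLD_TABLE = str.maketrans(
    {
        variant: canonical
        for canonical, variants in PUNCTUATION_FOLDS.items()
        for variant in variants
    }
)


def fold_punctuation(text: str) -> str:
    """Replace typographic quotes, dashes and apostrophes with plain ASCII ones."""
    return text.translate(FOLD_TABLE)


print(fold_punctuation("Germany\u2019s next topmodel"))
```

Applying the same fold when a title is indexed and when a query is parsed would let either spelling of the apostrophe match, without needing a redirect from the simple version.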
Comment 5 Hendrik Lönngren 2007-05-27 22:30:23 UTC
*** Bug 7002 has been marked as a duplicate of this bug. ***
Comment 6 longthinker 2007-05-28 08:36:36 UTC
This also applies to pinyin (the Latinization of Chinese characters): for example "wuji" will not find "wújí" (as in the German Wikipedia's article "Taiji"). Both notations are common, the former especially in printed books.
Comment 7 Robert Stojnic 2007-07-13 18:49:45 UTC
Fixed in Lucene Search 2. Accents are always stripped, and common transliterations are added as aliases (see Bug 7002). 

So, searching for Goedel should find Kurt Gödel as the first hit on both en and de wiki.
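
The behaviour described above (accents stripped, common transliterations added as aliases) can be illustrated with a toy word index in Python. This is only a sketch of the idea, not the Lucene Search 2 implementation: the `TRANSLITERATIONS` table and the `strip_accents`, `index_keys`, `build_index` and `search` helpers are assumptions made for the example.

```python
import unicodedata

# Assumed alias table (illustrative, not the list used by Lucene Search 2).
TRANSLITERATIONS = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss", "å": "aa"}


def strip_accents(text: str) -> str:
    """Decompose characters and drop the combining marks (ö -> o, ú -> u)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def index_keys(word: str) -> set[str]:
    """Keys under which a word is indexed: accent-stripped form plus alias form."""
    lowered = unicodedata.normalize("NFC", word).lower()
    transliterated = "".join(TRANSLITERATIONS.get(ch, ch) for ch in lowered)
    return {strip_accents(lowered), strip_accents(transliterated)}


def build_index(titles: list[str]) -> dict[str, set[str]]:
    """Map every key of every word in every title to the titles containing it."""
    index: dict[str, set[str]] = {}
    for title in titles:
        for word in title.split():
            for key in index_keys(word):
                index.setdefault(key, set()).add(title)
    return index


def search(index: dict[str, set[str]], query: str) -> list[str]:
    """Return titles that contain every word of the query, ignoring accents."""
    hits = None
    for word in query.split():
        matches = index.get(strip_accents(word.lower()), set())
        hits = matches if hits is None else hits & matches
    return sorted(hits or [])


index = build_index(["Kurt Gödel", "Wújí", "Künstliche Intelligenz"])
print(search(index, "Goedel"))
print(search(index, "wuji"))
```

Because the aliases are added at index time, the query only needs accent stripping: "Goedel", "Godel" and "Gödel" all reach the same title in this sketch, and the pinyin case from comment 6 is covered by the same step.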
