Last modified: 2005-11-10 07:49:14 UTC
On August 10th, 2005 I created the article http://de.wikipedia.org/wiki/Nieder-Kainsbach. Searching for that article with the exact spelling "Nieder-Kainsbach" (mixed upper/lower case) finds the article, and I am forwarded to it. So far so good. However, searching for the term "nieder-kainsbach" (all lower case, http://de.wikipedia.org/wiki/Spezial:Search?search=nieder-kainsbach&fulltext=Suche) gives strange results. The page shows 3 results: the first and second entries could be correct, but the third entry, which is presumably the link to the newly created article, has no title and a relevancy of 0. As I have since learned from other users, they have the same problem with newly created articles. My assumption is that new articles are not indexed, or are only partly indexed.
I've experienced a similar problem with a 1.5RC4 installation. Our tests indicated that the search function treats the hyphen "-" as a boolean operator: it seems to search for the first term minus the term after the hyphen. Searching the MySQL database directly finds everything as expected. If that's the case, the search function should only treat a hyphen (or plus, or whatever) as a boolean operator if there is a blank space before it. If a hyphen is used inside a word, as in the example above, it should be treated as part of the word.
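To make the suggested rule concrete, here is a minimal sketch in Python. It is not MediaWiki code; the function name `parse_query` and the lower-casing of terms are assumptions made purely for illustration. A "-" or "+" counts as an operator only at the start of a whitespace-separated token, and a hyphen inside a token stays part of the word.

```python
def parse_query(query):
    """Split a search query into (included, excluded) term lists.

    Hypothetical helper, not part of MediaWiki. A leading '-' or '+'
    is an operator only at the start of a whitespace-separated token;
    a hyphen inside a token is kept as part of the word.
    """
    included, excluded = [], []
    for token in query.split():
        if token in ("-", "+"):
            # A bare operator with no term attached is ignored.
            continue
        if token.startswith("-"):
            excluded.append(token[1:].lower())
        elif token.startswith("+"):
            included.append(token[1:].lower())
        else:
            included.append(token.lower())
    return included, excluded
```

With this rule, "nieder-kainsbach" stays a single included term, while "silly -stuff" includes "silly" and excludes "stuff".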
Comment 1 seems to be incorrect. I do get a match on the page "Silly-Stuff", e.g. at: http://test.leuksman.com/view/Special:Search?search=silly-stuff&fulltext=Search whereas with a blank space before the hyphen, so that the minus means discarding the term, I don't get that result: http://test.leuksman.com/view/Special:Search?search=silly+-stuff&fulltext=Search Back to the original comment: indeed, capitals after hyphens are currently not tried in go search. Wikipedia does not use MediaWiki's default search implementation due to performance limitations; it uses a separate index which is only periodically refreshed, so it will generally not include recently created articles at this time.
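As a rough illustration of what trying capitals after hyphens in go search could look like, here is a sketch in Python. It is not the actual MediaWiki implementation; the function name `title_candidates` and the order in which variants are tried are assumptions. It returns the title spellings an exact-title lookup might attempt, including one with the letter after each hyphen or space capitalized.

```python
import re


def title_candidates(term):
    """Return title spellings to try for an exact-title lookup, in order.

    Hypothetical helper for illustration only: tries the term as typed,
    then with the first letter capitalized, then with the first letter
    of every hyphen- or space-separated part capitalized.
    """
    term = term.strip()
    variants = [
        term,
        term[:1].upper() + term[1:],
        re.sub(
            r"(^|[\s-])(\w)",
            lambda m: m.group(1) + m.group(2).upper(),
            term,
        ),
    ]
    # Drop empty strings and duplicates while keeping the order.
    result = []
    for variant in variants:
        if variant and variant not in result:
            result.append(variant)
    return result
```

For "nieder-kainsbach" the last candidate is "Nieder-Kainsbach", which matches the article title from the original report. Titles that mix case in other ways would still not be found by this approach.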
For the German-speaking community this bug is of critical severity and high priority, because a lot of words in the German language are composed as term-hyphen-term.
"Critical" and "high priority" would be appropriate for something like data corruption, security risks, spewing garbage instead of text during page rendering, etc. It's not appropriate for 'it's slightly less convenient to find something in search'. Please don't fiddle with the priority tags; they're here for us to prioritize our work. However I am working on this, as some common cases should be easy to support.
Various more complicated mixed-case inputs will still not work for now, of course, but Basic-Stuff Like-This should work now, including the example from the original report. Resolving FIXED.