Last modified: 2014-10-24 18:51:35 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T55013, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 53013 - can't find results for string ".?*/()44$$$" though it is present in a page
Status: NEW
Product: MediaWiki extensions
Classification: Unclassified
Component: CirrusSearch
Version: unspecified
Hardware: All All
Importance: Low normal with 2 votes
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:

Reported: 2013-08-19 02:14 UTC by Sumana Harihareswara
Modified: 2014-10-24 18:51 UTC
CC List: 6 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Description Sumana Harihareswara 2013-08-19 02:14:33 UTC
https://test2.wikipedia.org/wiki/User_talk:Sumanah has the string 

.?*/()44$$$

but searching for that string turns up no results, even when searching all namespaces:

https://test2.wikipedia.org/w/index.php?title=Special:Search&search=.%3F*%2F%28%2944%24%24%24&fulltext=Search&profile=all&redirs=1
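
As a quick check that the search URL above really carries the reported string, the query parameter can be decoded with Python's standard library. This is only an illustrative sketch; the variable names are made up here, and it says nothing about how the search backend handles the string afterwards.

```python
from urllib.parse import urlsplit, parse_qs

# The search URL from the description above.
url = (
    "https://test2.wikipedia.org/w/index.php"
    "?title=Special:Search&search=.%3F*%2F%28%2944%24%24%24"
    "&fulltext=Search&profile=all&redirs=1"
)

# parse_qs percent-decodes each parameter value and returns lists of values.
params = parse_qs(urlsplit(url).query)
search_term = params["search"][0]

print(search_term)  # .?*/()44$$$
```

The decoded value matches the string on the page, which suggests the missing results are not caused by URL encoding.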
Comment 1 Nik Everett 2013-08-19 11:54:19 UTC
Lowering importance for now compared to the other worse problems we're seeing.  I'm not sure if we should support searching stuff like this given that we're really designed to search for words.  At this point your search is tokenized as just "44".  Everything else is thrown away.
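
The effect described in this comment can be imitated with a very rough sketch. This is not the analyzer CirrusSearch actually uses; it assumes a simplified word tokenizer (hypothetical `word_tokens`) that keeps only runs of letters, digits, and underscores, and compares it with a plain whitespace split (hypothetical `whitespace_tokens`).

```python
import re


def word_tokens(text):
    """Simplified word tokenizer: keep runs of word characters, drop the rest."""
    return re.findall(r"\w+", text)


def whitespace_tokens(text):
    """Split on whitespace only, so punctuation survives inside tokens."""
    return text.split()


query = ".?*/()44$$$"
print(word_tokens(query))        # ['44']
print(whitespace_tokens(query))  # ['.?*/()44$$$']
print(word_tokens("/dev/null"))  # ['dev', 'null']
```

Under this assumption the punctuation never reaches the index, so a query made mostly of punctuation has almost nothing left to match on.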
Comment 2 Sumana Harihareswara 2013-08-19 15:46:24 UTC
I understand that we'll be primarily searching for words.  But there are English Wikipedia articles that have slashes in their titles, e.g., "/dev/null" and the results in https://en.wikipedia.org/w/index.php?search=%2Fdev%2F&title=Special%3ASearch&fulltext=1 , parentheses, e.g. https://en.wikipedia.org/wiki/%28I_Can%27t_Get_No%29_Satisfaction , question marks, e.g., ?uestlove , asterisks, e.g. *69 , and more.

Also, when someone is looking for technical help on mediawiki.org or in the help pages of the English Wikipedia (for instance, with templates), we will want to be able to help them, even if they are using {} and similar.

So I think we do need to worry about these kinds of characters.
Comment 3 Nik Everett 2013-08-19 15:53:17 UTC
Raising to normal - below being able to search for text that isn't really in the page, but above including unmentioned urls in search.
Comment 4 Nik Everett 2013-10-09 21:40:28 UTC
I'm not sure about exact matches on (I can't get no) satisfaction, but we can now find words delimited in camelCase on mediawiki.org which is an improvement.
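
To illustrate what finding words delimited in camelCase means, here is a minimal sketch that splits an identifier at case boundaries. The regular expression and the function name `split_camel_case` are assumptions made for this example; the analysis chain CirrusSearch really uses is not shown in this report.

```python
import re

# Matches, in order of preference: an optionally capitalized lowercase run,
# a run of capitals not followed by a lowercase letter, or a run of digits.
_CAMEL_PART = re.compile(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+")


def split_camel_case(identifier):
    """Return the word-like parts of a camelCase identifier."""
    return _CAMEL_PART.findall(identifier)


print(split_camel_case("camelCase"))     # ['camel', 'Case']
print(split_camel_case("wgServerName"))  # ['wg', 'Server', 'Name']
```

Indexing these parts alongside the original token would let a search for one of the inner words match the whole identifier.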
Comment 5 Nik Everett 2014-02-20 21:46:09 UTC
I think we've fixed most of the big problems with this.  There are a few that remain, mostly the !#$^@%#$%# kinds of searches.  We've fixed the /dev/null and (I can't get no) satisfaction searches.  Lowering priority.
Comment 6 Chad H. 2014-05-31 17:16:09 UTC
Considering we're better in most normal cases, I wonder if we'll be able to solve the remainder with the analysis on the non-expanded forms (cf bug 60487) which should definitely be able to find weird wikitext punctuation.
Comment 7 Thue Janus Kristensen 2014-10-24 18:08:42 UTC
Searching for just "<" returns

> An error has occurred while searching: The search backend returned an error:

Obviously such an empty error message is in itself an error, and quite confusing. Even if you decide to not return pages containing "<" for technical reasons.
Comment 8 Chad H. 2014-10-24 18:51:35 UTC
(In reply to Thue Janus Kristensen from comment #7)
> Searching for just "<" returns
> 
> > An error has occurred while searching: The search backend returned an error:
> 
> Obviously such an empty error message is in itself an error, and quite
> confusing. Even if you decide to not return pages containing "<" for
> technical reasons.

That is a different bug affecting the old search engine. See bug 66259.
