Last modified: 2014-05-08 16:46:11 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T59242, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 57242 - CirrusSearch: Problems on the Gujarati wikipedia that look like unicode normalization issues
Status: NEW
Product: MediaWiki extensions
Classification: Unclassified
Component: CirrusSearch
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: utf8
Reported: 2013-11-19 13:18 UTC by Nik Everett
Modified: 2014-05-08 16:46 UTC (History)

Description Nik Everett 2013-11-19 13:18:03 UTC
Hi Nik,
 
Thanks for deploying it on gu.wiki. I have been testing it and have always found it more useful than the normal search, but today I encountered an issue. Please see the four searches below, two with CirrusSearch and two without. I think the results I am getting with Cirrus enabled are a bit unexpected. The term I searched for is સૌરાષ્ટ્ર પ્રાંત
 
With Cirrus:
https://gu.wikipedia.org/w/index.php?search=%E0%AA%B8%E0%AB%8C%E0%AA%B0%E0%AA%BE%E0%AA%B7%E0%AB%8D%E0%AA%9F%E0%AB%8D%E0%AA%B0+%E0%AA%AA%E0%AB%8D%E0%AA%B0%E0%AA%BE%E0%AA%82%E0%AA%A4&button=&title=%E0%AA%B5%E0%AA%BF%E0%AA%B6%E0%AB%87%E0%AA%B7%3A%E0%AA%B6%E0%AB%8B%E0%AA%A7&srbackend=CirrusSearch
 
Without Cirrus:
https://gu.wikipedia.org/w/index.php?search=%E0%AA%B8%E0%AB%8C%E0%AA%B0%E0%AA%BE%E0%AA%B7%E0%AB%8D%E0%AA%9F%E0%AB%8D%E0%AA%B0+%E0%AA%AA%E0%AB%8D%E0%AA%B0%E0%AA%BE%E0%AA%82%E0%AA%A4&button=&title=%E0%AA%B5%E0%AA%BF%E0%AA%B6%E0%AB%87%E0%AA%B7%3A%E0%AA%B6%E0%AB%8B%E0%AA%A7
 
The same search as an exact match, in quotation marks: "સૌરાષ્ટ્ર પ્રાંત"
 
Without Cirrus:
https://gu.wikipedia.org/w/index.php?search=%22%E0%AA%B8%E0%AB%8C%E0%AA%B0%E0%AA%BE%E0%AA%B7%E0%AB%8D%E0%AA%9F%E0%AB%8D%E0%AA%B0+%E0%AA%AA%E0%AB%8D%E0%AA%B0%E0%AA%BE%E0%AA%82%E0%AA%A4%22&title=%E0%AA%B5%E0%AA%BF%E0%AA%B6%E0%AB%87%E0%AA%B7%3A%E0%AA%B6%E0%AB%8B%E0%AA%A7&fulltext=1
 
With Cirrus:
https://gu.wikipedia.org/w/index.php?search=%22%E0%AA%B8%E0%AB%8C%E0%AA%B0%E0%AA%BE%E0%AA%B7%E0%AB%8D%E0%AA%9F%E0%AB%8D%E0%AA%B0+%E0%AA%AA%E0%AB%8D%E0%AA%B0%E0%AA%BE%E0%AA%82%E0%AA%A4%22&title=%E0%AA%B5%E0%AA%BF%E0%AA%B6%E0%AB%87%E0%AA%B7%3A%E0%AA%B6%E0%AB%8B%E0%AA%A7&fulltext=1&srbackend=CirrusSearch
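For anyone wanting to repeat the comparison with other terms: the four URLs above differ only in whether the term is quoted and whether srbackend=CirrusSearch is appended. A minimal sketch of building such URLs follows. The helper name search_url and its arguments are made up for illustration, and the localized title= parameter from the original URLs is left out on the assumption that index.php handles the search parameter without it.

```python
from urllib.parse import urlencode

# Base endpoint taken from the URLs in the report.
BASE = "https://gu.wikipedia.org/w/index.php"


def search_url(term, exact=False, cirrus=False):
    """Build a search URL shaped like the ones in the report (hypothetical helper).

    exact=True wraps the term in double quotes for a phrase search.
    cirrus=True appends srbackend=CirrusSearch to pick the Cirrus backend.
    """
    query = f'"{term}"' if exact else term
    params = {"search": query, "fulltext": "1"}
    if cirrus:
        params["srbackend"] = "CirrusSearch"
    # urlencode percent-encodes UTF-8 bytes and turns spaces into '+'.
    return BASE + "?" + urlencode(params)


if __name__ == "__main__":
    term = "સૌરાષ્ટ્ર પ્રાંત"
    for exact in (False, True):
        for cirrus in (False, True):
            print(search_url(term, exact=exact, cirrus=cirrus))
```

Opening each pair of URLs side by side should show the difference between the two backends for the same query.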
Comment 1 Nik Everett 2014-05-08 16:46:11 UTC
This took me forever to pick up on, but I see this:
[[સૌરાષ્ટ્ર| સૌરાષ્ટ્ર પ્રાંત]]માં
in the page source of one of the pages that lsearchd finds and Cirrus doesn't. Cirrus sees the words સૌરાષ્ટ્ર પ્રાંતમાં, while lsearchd sees સૌરાષ્ટ્ર પ્રાંત માં, because lsearchd inserts a space after every link.
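
To make the difference concrete, here is a rough model of the two behaviours. It is only a sketch: the regex, the whitespace tokenizer and the function names are illustrative assumptions, not the actual lsearchd or CirrusSearch code, which use their own wikitext handling and analyzers.

```python
import re

# Matches [[target]] or [[target|label]]; group 2 is the optional label.
LINK = re.compile(r"\[\[([^\[\]|]*)(?:\|([^\[\]]*))?\]\]")


def strip_links(wikitext, pad_after_link):
    """Replace each link with its visible text.

    pad_after_link=True models the lsearchd behaviour described above
    (a space is inserted after every link); False models what Cirrus sees.
    """
    def repl(match):
        label = match.group(2) if match.group(2) is not None else match.group(1)
        return label + (" " if pad_after_link else "")
    return LINK.sub(repl, wikitext)


def words(text):
    # Naive whitespace tokenizer; real analyzers do much more than this.
    return text.split()


def phrase_match(query, text):
    """True if the query words occur consecutively, as whole words, in text."""
    q, t = words(query), words(text)
    return any(t[i:i + len(q)] == q for i in range(len(t) - len(q) + 1))


if __name__ == "__main__":
    source = "[[સૌરાષ્ટ્ર| સૌરાષ્ટ્ર પ્રાંત]]માં"
    query = "સૌરાષ્ટ્ર પ્રાંત"
    for pad in (True, False):
        text = strip_links(source, pad_after_link=pad)
        print(pad, words(text), phrase_match(query, text))
```

Under this model, the padded text keeps પ્રાંત as its own word, so the phrase can match; without padding the suffix માં fuses onto the last word of the link text, and a whole-word match on પ્રાંત fails.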
