Last modified: 2014-03-13 16:15:27 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T33697, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 31697 - in languages where the "User:" space has more than one gender flavor, autocomplete suggests pages from user space even before user types ":"
Status: REOPENED
Product: MediaWiki
Classification: Unclassified
Component: Search
Version: 1.21.x
Hardware: All All
Importance: High normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on: 32376
Blocks: gender 32634
Reported: 2011-10-14 14:33 UTC by kipod
Modified: 2014-03-13 16:15 UTC (History)
CC List: 13 users

Description kipod 2011-10-14 14:33:24 UTC
Autocomplete functionality is supposed to show only pages from article space (namespace 0), at least until one enters the first ":".

In Hebrew, when you type "משתמ" (the first 4 characters of the localized word for "User"), autocomplete suggests pages from user space, and shows only pages in the user space of users who have set their gender to "Female" in preferences.

In other words, it shows only pages of users whose "User:" prefix is rendered as "משתמשת:" rather than the "standard" form, which is "משתמש:".
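
The behavior the reporter expects can be sketched as below. This is a hypothetical illustration, not MediaWiki's actual code: the function name `namespace_to_search` and the small `NAMESPACE_IDS` table are assumptions made for the example. Only the namespace IDs themselves (0 for articles, 2 for User) are MediaWiki's real values.

```python
# Hypothetical sketch of the expected behavior, not MediaWiki's implementation.
# Both gender variants of the Hebrew "User" prefix map to the same namespace ID.
NAMESPACE_IDS = {
    "user": 2,
    "משתמש": 2,   # default form
    "משתמשת": 2,  # feminine form
}

MAIN_NAMESPACE = 0


def namespace_to_search(typed: str) -> int:
    """Return the namespace ID that suggestions should be limited to."""
    prefix, sep, _rest = typed.partition(":")
    if not sep:
        # No ":" typed yet: suggest only from article space.
        return MAIN_NAMESPACE
    # A recognized prefix selects its namespace; anything else is an
    # ordinary article title that happens to contain a colon.
    return NAMESPACE_IDS.get(prefix.strip().lower(), MAIN_NAMESPACE)
```

Under this sketch, the partial word "משתמ" stays in namespace 0, and both "משתמש:" and "משתמשת:" select namespace 2, so neither gender form is treated differently.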
Comment 1 Niklas Laxström 2011-10-14 14:38:31 UTC
Regression in behavior, marking highest priority.
Comment 2 Brion Vibber 2011-10-14 20:42:12 UTC
Hmm, that's not what it's "supposed" to do so much as what it happens to do because of the default implementation. Whether it's a desirable behavior I don't know if we've thought about that.
Comment 3 kipod 2011-10-15 19:42:33 UTC
(In reply to comment #2)
> Hmm, that's not what it's "supposed" to do so much as what it happens to do
> because of the default implementation. Whether it's a desirable behavior I
> don't know if we've thought about that.

it seems reasonable that "autocomplete" should behave the same as "search". search by default is limited to namespace 0, and so should autocomplete, until the user types the first ":". 

even if one thinks that typing the first few characters from the localised word for "User" should show pages from the "User:" space, it makes absolutely no sense to bring only pages of female users. IOW - bug.
Comment 4 Niklas Laxström 2011-10-17 13:13:47 UTC
Does WMF have any PrefixSearchBackend hook users or is it using the default implementation?
Comment 5 Mark A. Hershberger 2011-10-20 14:00:43 UTC
The patch on Bug 31602 seems to show that this is caused by HTML5 datalists.
Comment 6 Mark A. Hershberger 2011-10-20 15:00:07 UTC
see r100348 for possible fix.
Comment 7 kipod 2011-10-20 18:59:47 UTC
i did not test the patch (i would if i could...), but i do not believe the patch has anything to do with this issue. 
the issue has nothing to do with the appearance of the suggestion box, it's about its content (i.e., the data returned by the ajax call).

the problem is clearly in the API and not on the browser side, so r100348 is extremely unlikely to solve the issue.
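
One way to check the returned data independently of the browser is to request the suggestions directly. The sketch below only builds the request URL. It assumes the suggestions come from the MediaWiki `action=opensearch` API module with its `search`, `namespace`, `limit` and `format` parameters; the helper name `opensearch_url` is hypothetical, and whether the search box of that era used exactly this request is an assumption.

```python
from urllib.parse import urlencode


def opensearch_url(host: str, typed: str, namespace: int = 0, limit: int = 10) -> str:
    """Build a prefix-suggestion request URL for the given wiki host."""
    params = {
        "action": "opensearch",
        "search": typed,          # the text typed into the search box so far
        "namespace": namespace,   # 0 = article space
        "limit": limit,
        "format": "json",
    }
    # urlencode percent-encodes non-ASCII text as UTF-8.
    return f"https://{host}/w/api.php?{urlencode(params)}"


print(opensearch_url("he.wikipedia.org", "משתמ"))
```

If the response to such a request already contains user-space titles while `namespace` is 0, the suggestion box itself is not at fault.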
Comment 8 Brion Vibber 2011-11-22 18:38:10 UTC
Taking a peek over this per code review request...
Comment 9 Brion Vibber 2011-11-22 18:45:40 UTC
Ok I can reproduce this on https://es.wikipedia.org/ but not on a local trunk installation.

Reverting r100348 doesn't appear to have any effect; typing 'Usuar' in the search box when set to Spanish doesn't show any user pages. Only once I get to 'Usuario:' or 'Usuaria:' or 'User:' do they show up.

It doesn't look like an API or client-end problem; it looks like something on the Lucene search backend's end (or at least the search plugin).
Comment 10 Brion Vibber 2011-11-22 18:53:09 UTC
Nothing obvious about gendered-namespace support in MWSearch extension; the logic implementing the aliases might be in the lucene backend...

This may actually be a side-effect of bug 32376.

The XML export data that the search indexer is building from will have "undefined" namespaces like 'Usuaria:' (etc) in the export, which will be unrecognized and end up getting interpreted as ns 0.

This is why on es.wikipedia.org search for 'Usuar' turns up things in 'Usuaria:' and 'Usuario Discusión:' (different from the stock 'Usuario' and 'Usuario discusión') but not 'Usuario:' since they do get interpreted correctly.
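
The failure mode described in this comment can be sketched as follows. This is a hypothetical model of an indexer, not the actual lucene-search code: `KNOWN_NAMESPACES` and `index_namespace` are names invented for the example, and the table holds only the stock eswiki names mentioned above.

```python
# Hypothetical sketch: the only namespace names the indexer recognizes.
KNOWN_NAMESPACES = {
    "Usuario": 2,
    "Usuario discusión": 3,
}


def index_namespace(title: str) -> int:
    """Return the namespace ID the indexer would assign to an exported title."""
    prefix, sep, _rest = title.partition(":")
    if sep and prefix in KNOWN_NAMESPACES:
        return KNOWN_NAMESPACES[prefix]
    # Unrecognized prefix: the whole string is treated as an article title.
    return 0


for title in ("Usuario:Ejemplo", "Usuaria:Ejemplo", "Usuario Discusión:Ejemplo"):
    print(title, "->", index_namespace(title))
```

Titles exported with the 'Usuaria:' or 'Usuario Discusión:' spelling land in namespace 0 in this model, which matches the prefix-search results seen for 'Usuar'.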
Comment 11 Brion Vibber 2011-11-22 19:49:07 UTC
r103945 switches the export to canonical form titles for bug 32376; after search index updating this should clear this problem up.

Needs merge & deployment to 1.18...
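
The idea of exporting titles in canonical form can be sketched as below. This is a hypothetical illustration and not the code in r103945: the `ALIASES` and `CANONICAL` tables and the `canonical_title` helper are assumptions, populated only with the eswiki names mentioned in this report.

```python
# Hypothetical sketch: map alternate namespace spellings to one canonical name.
CANONICAL = {2: "Usuario", 3: "Usuario discusión"}

ALIASES = {
    "Usuario": 2,
    "Usuaria": 2,            # gendered variant
    "Usuario discusión": 3,
    "Usuario Discusión": 3,  # variant spelling seen in the search results
}


def canonical_title(title: str) -> str:
    """Rewrite a title so that its namespace prefix uses the canonical name."""
    prefix, sep, rest = title.partition(":")
    ns = ALIASES.get(prefix)
    if not sep or ns is None:
        # No namespace prefix, or one we do not know: leave the title alone.
        return title
    return f"{CANONICAL[ns]}:{rest}"
```

With titles normalized this way before export, a consumer that recognizes only the stock namespace names would no longer see unknown prefixes.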
Comment 12 Brion Vibber 2011-11-22 20:21:51 UTC
Merged to REL1_18 in r103953, 1.18wmf1 in r103954.

Roan's pushing the fix live; it may not fully update until a search index rebuild is triggered, not sure when those happen.
Comment 13 Brion Vibber 2011-11-22 20:59:32 UTC
from IRC:

[12:27] <RoanKattouw> The reindexing occurs between 06:00 and 06:30 UTC I think

so hopefully it'll be clearer by tomorrow. :)
Comment 14 Brion Vibber 2011-12-01 15:08:23 UTC
Removed vandal comment.
Comment 15 Mark A. Hershberger 2011-12-14 04:39:36 UTC
Testing on eswiki, I type "usuario" and see the expected "usuario" and also "usuarios". I do not see any "usuario:USERNAME".

Unexpectedly, I do see "Usuario Discusión:USERNAME".

Also unexpected, when I type "usuario:", I don't see any completions except "Usuario:! DanSkammelsrod !".

When I type "usuario:u" I get more completions of usernames starting with "u".

When I do this on enwiki, though, I get similar results.

Hrm... but testing "usuaria", I see a completion for "Usuaria:Miss Manzana/Retiro de nominación" so I don't think this is fixed yet.

If I type "usuaria:", though, I do see completions including "usuario:" so maybe that is the "canonical form" that brion talks about.
Comment 16 Brion Vibber 2011-12-16 20:06:09 UTC
Looks like there are still bad entries (some 'Usuaria:...' entries come up prefix-searching for 'usuari' on es.wikipedia.org).
Comment 17 Mark A. Hershberger 2011-12-17 19:28:56 UTC
lowered priority, but will try to get Ops to help out now. https://rt.wikimedia.org/Ticket/Display.html?id=2160
Comment 18 Mark A. Hershberger 2011-12-19 17:14:54 UTC
Adding rainman so he can take a peek at this, though there is some action on https://rt.wikimedia.org/Ticket/Display.html?id=2160, so who knows.
Comment 19 Mark A. Hershberger 2011-12-19 17:48:08 UTC
Copying this comment from RT:

Um... now eswiki is showing some REALLY funky results.

Before:

> When I type "usuario:u" I get more completions of usernames starting with "u".

Now I don't see user page results.

Before:

> Testing on eswiki, I type "usuario" and see the expected "usuario" and also
> "usuarios". I do not see any "usuario:USERNAME"

Now I do not see that. Instead, the only completion I see is Usuario:Leyón/Sobre mí (plus a lot of redirects for the same user).

Before:

> When I type "usuario:u" I get more completions of usernames starting with "u".

Now, no user pages are given.  (On enwiki, "user:u" gives usernames starting with "u").

Maybe I checked too soon?  But it looks worse now.
Comment 20 matanya 2012-07-27 12:42:38 UTC
Still no progress?
Comment 21 Amir E. Aharoni 2012-10-28 22:02:41 UTC
Peter, I've been told on IRC that you can look into this.
Comment 22 Siebrand Mazeland 2012-10-30 07:37:48 UTC
Assigning to Peter Youngmeister, as Amir indicated that this should be resolved by ops, and Peter allegedly has a solution (no details known).

For our information: The ops ticket has seen no changes since 2012-01-12 (That's January 12, 2012 for everyone who writes dates in a funny way).
Comment 23 Siebrand Mazeland 2012-11-15 06:41:16 UTC
Peter, any updates?
Comment 24 Andre Klapper 2012-11-19 10:55:32 UTC
Entering "Usuario" in the search bar on es.wikipedia.org lists quite a few "Usuario Discusión:" links, hence this is still valid.
Comment 25 Andre Klapper 2013-03-15 11:51:07 UTC
Peter: ping - any updates on this?
Comment 26 Nemo 2013-10-04 16:20:15 UTC
I wonder if the new prefix stuff in ElasticSearch will have an effect on this nasty bug...
Comment 27 Andre Klapper 2014-03-13 16:15:27 UTC
For the record, RT #2160 is closed "as wontfix because lsearchd is end of life and Cirrus is replacing it in the next few months".
