Last modified: 2010-07-21 11:01:45 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T26296, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 24296 - Bots excluded from flexible use of article title provided by lang. conv.
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: API
Version: unspecified
Hardware: All All
Importance: Normal trivial
Target Milestone: ---
Assigned To: Roan Kattouw
Keywords: patch, patch-need-review
Depends on:
Blocks: 24052
 
Reported: 2010-07-07 03:25 UTC by michael.angelkovich
Modified: 2010-07-21 11:01 UTC (History)



Attachments
Patch to ApiPageSet (809 bytes, patch)
2010-07-07 14:20 UTC, Bryan Tong Minh
Patch that does not change default behaviour (5.26 KB, patch)
2010-07-09 11:46 UTC, Bryan Tong Minh

Description michael.angelkovich 2010-07-07 03:25:01 UTC
Range:
This problem concerns bots, and is present on SR.WP, but could be present on all wikis that use language conversion.

Time of occurrence:
To the best of my observation, it appeared together with the new login authentication method.

Description:
As is probably already known, an article on SR.WP can have its title in either Cyrillic or Latin script, and an ordinary user can type the title in the other script and still get the right content. For example:

The article titled "Вектор" could also be requested as "Vektor", and the bot would still get the contents of the article.

This is how it used to work for bots: they could request a title in either script and get the right result. Since things changed, the bot MUST pick the right script, or it gets a result as if the page did not exist.

Can the previous behavior be restored?
Comment 1 Philip Tzou 2010-07-07 06:50:29 UTC
Hmm. Can you provide me some links for test?
Comment 2 Bawolff (Brian Wolff) 2010-07-07 06:55:48 UTC
(In reply to comment #1)
> Hmm. Can you provide me some links for test?

http://sr.wikipedia.org/w/api.php?titles=Vektor|%D0%92%D0%B5%D0%BA%D1%82%D0%BE%D1%80&action=query&prop=info

show that Vektor and Вектор are treated differently from the api, even though http://sr.wikipedia.org/wiki/%D0%92%D0%B5%D0%BA%D1%82%D0%BE%D1%80 and http://sr.wikipedia.org/wiki/Vektor link to the same page
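For anyone who wants to reproduce this from a script, here is a minimal sketch that only builds the same query URL as the link above. The function name is made up for illustration, and format=json is an assumption (the original link did not set a format).

```python
from urllib.parse import urlencode

# Endpoint taken from the link in this comment.
API_ENDPOINT = "http://sr.wikipedia.org/w/api.php"


def build_info_query(titles):
    """Return an action=query&prop=info URL for the given titles.

    Hypothetical helper: it only builds the URL and sends nothing.
    """
    params = {
        "action": "query",
        "prop": "info",
        "titles": "|".join(titles),  # the API separates titles with "|"
        "format": "json",  # assumption: not present in the original link
    }
    return API_ENDPOINT + "?" + urlencode(params)


# Both spellings of the same article, as in the report.
url = build_info_query(["Vektor", "Вектор"])
print(url)
```

Fetching that URL should return one entry per requested title, which is how the two spellings can be compared.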
Comment 3 Philip Tzou 2010-07-07 07:11:21 UTC
I think it's because the API returns a "real" result from database. Titles are auto converted in Language Converter but still remain differences in database. Perhaps we can an extra parameter to provide a "variant-insensitive" query.
Comment 4 Philip Tzou 2010-07-07 07:12:24 UTC
typo:
Perhaps we can an => Perhaps we can add an
Comment 5 Bawolff (Brian Wolff) 2010-07-07 07:15:52 UTC
I'd imagine (Without being all that familiar with the api internals, so all of this is imho, take with a grain of salt, etc) that a good way to handle it would be similar to how redirects are handled with the &redirects parameter
Comment 6 Liangent 2010-07-07 07:22:04 UTC
In my own bot framework, I provide a function which wraps a title in [[ ]] and sends it to action=parse to find any existing title in another variant. See also bug 24052.
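A rough sketch of what such a workaround request could look like, not the actual code from that bot framework. The function name is hypothetical, and prop=links and format=json are assumptions; the comment does not say which properties are requested or how the result is read.

```python
from urllib.parse import urlencode


def build_parse_probe(api_endpoint, title):
    """Build an action=parse URL whose wikitext is a single [[title]] link.

    Hypothetical helper: the caller would fetch the URL and inspect the
    links reported by the parser to see which title actually exists.
    """
    params = {
        "action": "parse",
        "text": "[[" + title + "]]",  # wrap the title in a wikilink
        "prop": "links",  # assumption: only the link list is needed
        "format": "json",  # assumption
    }
    return api_endpoint + "?" + urlencode(params)


probe_url = build_parse_probe("http://sr.wikipedia.org/w/api.php", "Vektor")
print(probe_url)
```

This costs one parse per title, which is probably why a dedicated query parameter is being discussed here.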
Comment 7 Bryan Tong Minh 2010-07-07 14:04:40 UTC
Philip, can you provide a pointer where MediaWiki normally handles this?
Comment 8 michael.angelkovich 2010-07-07 14:07:50 UTC
Bryan, I think that it worked normally back in January 2010. What I am certain of is that it worked normally since 2008/2009, when I started Wiki botwork.
Comment 9 Bryan Tong Minh 2010-07-07 14:20:26 UTC
Created attachment 7558 [details]
Patch to ApiPageSet

I don't know why it has worked before. There appears to be no relevant code in ApiPageSet, which is where I would expect such code to reside. There are also no revisions that are related to this.

Attached patch should work, but I have not tested it since I do not have a wiki with variants set up.
Comment 10 Liangent 2010-07-07 15:59:52 UTC
(In reply to comment #9)
> Created an attachment (id=7558) [details]
> Patch to ApiPageSet
> 

If your patch changes the current behavior, please add a parameter for it and do not activate it by default. I'm always using API as a way to avoid language converter.
Comment 11 michael.angelkovich 2010-07-07 16:40:02 UTC
> I'm always using API as a way to avoid language converter.

Just curiosity, why are you avoiding it?
Comment 12 Bryan Tong Minh 2010-07-07 17:44:36 UTC
(In reply to comment #10)
> (In reply to comment #9)
> > Created an attachment (id=7558) [details]
> > Patch to ApiPageSet
> > 
> 
> If your patch changes the current behavior, please add a parameter for it and
> do not activate it by default. I'm always using API as a way to avoid language
> converter.

The normalization performed will be in the normalization section, but this would indeed be a breaking-change, so we may want to introduce a parameter to explicitly enable conversion.
Comment 13 michael.angelkovich 2010-07-07 18:38:59 UTC
Maybe the switch should be per wiki. As a bot owner on SR.WP, I can't imagine a single reason why one would want to deactivate this behavior, which is why I put the question to Liangent.

Besides that, I have nothing against introducing a parameter that changes the default behavior.
Comment 14 Liangent 2010-07-08 02:14:29 UTC
At least one reason is that zh conversion is not as simple as sr conversion, so unexpected conversions often happen and are often reported on zh.wp.
Comment 15 michael.angelkovich 2010-07-08 13:58:57 UTC
I think both mine and Liangent's comments suggest that this type of conversion should be settable per Wiki. Not (only) through a parameter.
Comment 16 Liangent 2010-07-08 14:02:52 UTC
(In reply to comment #15)
> I think both mine and Liangent's comments suggest that this type of conversion
> should be settable per Wiki. Not (only) through a parameter.

What does your "Not (only) through a parameter." mean?

I guess you misunderstood my comment.
Comment 17 michael.angelkovich 2010-07-08 14:10:29 UTC
I understood that the Chinese wiki needs to remain unaffected by any change, while I also pointed out that the Serbian wiki would need just the opposite. I therefore suggested that letting each wiki decide its own default behavior is a better solution than a parameter that is likely to be always used on one project and never on another.
Comment 18 Liangent 2010-07-08 14:19:49 UTC
Actually I want a parameter, so bot operators can decide whether they use it or not.
Comment 19 Liangent 2010-07-08 14:23:00 UTC
I didn't use the API's ability to resolve redirects automatically much, so that may be the reason why I didn't say "Oh, I've been waiting for this feature for a long time!". But someone else *may* really need this feature.
Comment 20 michael.angelkovich 2010-07-08 14:37:33 UTC
(In reply to comment #18)
> Actually I want a parameter, so bot operators can decide whether they use it or
> not.

As I stated above (#13), introducing a parameter to change the default behavior is OK. But it still doesn't appear to be the best solution globally, whereas setting the default behavior per wiki does.
Comment 21 Bryan Tong Minh 2010-07-09 11:46:47 UTC
Created attachment 7560 [details]
Patch that does not change default behaviour

Attached patch will not change default behaviour but instead add a converttitles parameter. This way every API user can decide by themselves whether they need title conversion or not.
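A minimal sketch of how a bot might opt in, assuming the converttitles parameter described above. The helper names are hypothetical, and the response layout read by converted_map (a "converted" list of "from"/"to" pairs under "query") is an assumption that is not spelled out in this thread.

```python
from urllib.parse import urlencode


def build_query(api_endpoint, titles, convert_titles=False):
    """Build an action=query URL, optionally asking for title conversion."""
    params = {
        "action": "query",
        "prop": "info",
        "titles": "|".join(titles),
        "format": "json",  # assumption
    }
    if convert_titles:
        # Sent as a flag with an empty value; left out entirely
        # otherwise, so the default behaviour stays unchanged.
        params["converttitles"] = ""
    return api_endpoint + "?" + urlencode(params)


def converted_map(response):
    """Map each requested title to the title it was converted to.

    Assumes the decoded JSON has a "converted" list of
    {"from": ..., "to": ...} entries under "query".
    """
    entries = response.get("query", {}).get("converted", [])
    return {entry["from"]: entry["to"] for entry in entries}
```

A bot that wants to stay clear of the language converter simply leaves convert_titles at its default and sends the same request as before.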
Comment 22 Bryan Tong Minh 2010-07-09 11:47:49 UTC
Could somebody who has title conversion enabled test this patch? I don't have it.
Comment 23 michael.angelkovich 2010-07-10 11:17:07 UTC
This seems to be working. It just produces a warning when the argument 'converttitles' is not sent:

Warning: Invalid argument supplied for foreach() in C:\root\A\Apache2\htdocs\mw\includes\api\ApiQuery.php  on line 298

My suggestion about this patch is to also make the parameter name shorter ('ct', for instance), since it is going to be called frequently.
Comment 24 Bryan Tong Minh 2010-07-10 11:55:58 UTC
Fixed in r69237. I had to make some minor changes, so please test it again.

I chose not to abbreviate converttitles because it is unambiguous about its meaning, while ct is not. Besides, mostly bots will use the API, so you only have to type it once :)
Comment 25 michael.angelkovich 2010-07-11 04:01:00 UTC
Well, there might be something else unambiguous and shorter.

I suggested that not in order to spare my bot from longtyping but to save some traffic of WM servers. Actually that is why this thing should be settable per Wiki (i.e. every Wiki should be able to activate or deactivate it by default). Oh, well...
Comment 26 Liangent 2010-07-11 07:29:40 UTC
(In reply to comment #25)
> Well, there might be something else unambiguous and shorter.
> 
> I suggested that not in order to spare my bot from longtyping but to save some
> traffic of WM servers. Actually that is why this thing should be settable per
> Wiki (i.e. every Wiki should be able to activate or deactivate it by default).
> Oh, well...

Other stuff, such as HTTP headers, produces much more traffic.
Comment 27 michael.angelkovich 2010-07-11 07:48:23 UTC
(In reply to comment #26)
> Other stuff, such as HTTP headers, produces much more traffic.

Now, can you really shrink them? No. While there are other things you can.
Comment 28 Liangent 2010-07-11 07:51:02 UTC
But WMF doesn't shrink them. This means several extra bytes in requests are not a big problem.
Comment 29 michael.angelkovich 2010-07-11 07:53:38 UTC
I completely disagree with that one, since every byte costs some money, which doesn't fall from the sky.
