Last modified: 2010-07-21 11:01:45 UTC
Range: This problem concerns bots. It is present on SR.WP, but could affect any wiki that uses language conversion.
Time of occurrence: As far as I have observed, it appeared along with the new method of login authentication.
Description: As is probably already known, an article on SR.WP can have its title in either Cyrillic or Latin script, and an ordinary user can type the title in the other script and still get the proper content. For example, the article called "Вектор" can also be requested as "Vektor", and the bot would get the contents of the article. This is how it previously worked for bots: they could request a title in either script and get the right result. Since the change, the bot MUST pick the right script, or it gets a result as if the page didn't exist. Can the previous behavior be restored?
Hmm. Can you provide me some links for test?
(In reply to comment #1)
> Hmm. Can you provide me some links for test?
http://sr.wikipedia.org/w/api.php?titles=Vektor|%D0%92%D0%B5%D0%BA%D1%82%D0%BE%D1%80&action=query&prop=info shows that Vektor and Вектор are treated differently by the API, even though http://sr.wikipedia.org/wiki/%D0%92%D0%B5%D0%BA%D1%82%D0%BE%D1%80 and http://sr.wikipedia.org/wiki/Vektor link to the same page.
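For anyone who wants to reproduce the comparison from a script, here is a minimal sketch that rebuilds the test query above. The endpoint and the query parameters come from the link in this comment; the `format=json` parameter and the helper name `build_info_query` are my own assumptions for illustration.

```python
from urllib.parse import urlencode

# Endpoint taken from the test link above.
API_ENDPOINT = "http://sr.wikipedia.org/w/api.php"


def build_info_query(titles):
    """Return an action=query&prop=info URL for the given titles."""
    params = {
        "action": "query",
        "prop": "info",
        # Several titles are passed in one request, separated by "|".
        "titles": "|".join(titles),
        # Assumption: JSON output; the original link did not set a format.
        "format": "json",
    }
    # urlencode percent-encodes the Cyrillic title as UTF-8 and "|" as %7C.
    return API_ENDPOINT + "?" + urlencode(params)


url = build_info_query(["Vektor", "Вектор"])
print(url)
```

Fetching that URL and comparing the two page entries in the response should show the difference described here: one title resolves to an existing page, the other is reported as missing.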
I think it's because the API returns the "real" result from the database. Titles are auto-converted by Language Converter, but the differences still remain in the database. Perhaps we can an extra parameter to provide a "variant-insensitive" query.
typo: Perhaps we can an => Perhaps we can add an
I'd imagine (without being all that familiar with the API internals, so all of this is IMHO, take with a grain of salt, etc.) that a good way to handle it would be similar to how redirects are handled with the &redirects parameter.
In my own bot framework, I provide a function which wraps a title in [[ ]] and sends it to action=parse to find any existing title in another variant. See also bug 24052.
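A rough sketch of what such a workaround might look like, for readers who have not seen it. This is not the actual code from that bot framework: the function names are hypothetical, and the response layout (a `parse.links` list whose entries carry the title under `"*"` and an `exists` key for existing pages) is what I would expect from action=parse with prop=links in the older JSON format, so treat it as an assumption.

```python
from urllib.parse import urlencode


def build_parse_request(title, endpoint="http://sr.wikipedia.org/w/api.php"):
    """Build an action=parse URL whose wikitext is just a link to `title`."""
    params = {
        "action": "parse",
        # Wrap the title in [[ ]] so the parser resolves it like a wikilink.
        "text": "[[" + title + "]]",
        "prop": "links",
        "format": "json",
    }
    return endpoint + "?" + urlencode(params)


def existing_link_titles(response):
    """Return titles of links that the parser reports as existing pages.

    Assumes each link entry stores its title under "*" and carries an
    "exists" key only when the target page exists.
    """
    links = response.get("parse", {}).get("links", [])
    return [link["*"] for link in links if "exists" in link]
```

The idea is that the parser applies variant conversion when it resolves the link, so the title that comes back is the one actually stored in the database.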
Philip, can you provide a pointer where MediaWiki normally handles this?
Bryan, I think that it worked normally back in January 2010. What I am certain of is that it had worked normally since 2008/2009, when I started Wiki botwork.
Created attachment 7558 [details]
Patch to ApiPageSet
I don't know why it has worked before. There appears to be no relevant code in ApiPageSet, which is where I would expect such code to reside. There are also no revisions that are related to this. Attached patch should work, but I have not tested it since I do not have a wiki with variants set up.
(In reply to comment #9)
> Created an attachment (id=7558) [details]
> Patch to ApiPageSet
If your patch changes the current behavior, please add a parameter for it and do not activate it by default. I'm always using API as a way to avoid language converter.
> I'm always using API as a way to avoid language converter.
Just out of curiosity, why are you avoiding it?
(In reply to comment #10)
> (In reply to comment #9)
> > Created an attachment (id=7558) [details]
> > Patch to ApiPageSet
>
> If your patch changes the current behavior, please add a parameter for it and
> do not activate it by default. I'm always using API as a way to avoid language
> converter.
The normalization performed will be listed in the normalization section, but this would indeed be a breaking change, so we may want to introduce a parameter to explicitly enable conversion.
Maybe the switch should be per Wiki. As a bot owner on SR.WP, I can't imagine a single reason why one would want to deactivate this behavior, which is why I put the question to Liangent. Besides that, I have nothing against introducing a parameter that helps change the default behavior.
At least one reason is that zh conversion is not as simple as sr conversion, so unexpected conversions often happen and are often reported on zh.wp.
I think both mine and Liangent's comments suggest that this type of conversion should be settable per Wiki. Not (only) through a parameter.
(In reply to comment #15)
> I think both mine and Liangent's comments suggest that this type of conversion
> should be settable per Wiki. Not (only) through a parameter.
What does your "Not (only) through a parameter." mean? I guess you misunderstood my comment.
I understood that the Chinese Wiki needs to be unaffected by any changes, while I also pointed out that the Serbian Wiki requires just the opposite. Therefore I suggested that letting each Wiki decide its own default behavior is a better solution than a parameter that is likely to be always used on one project and never on another.
Actually I want a parameter, so bot operators can decide whether they use it or not.
I haven't used the API's ability to resolve redirects automatically much, so that may be the reason why I didn't say "Oh, I've been waiting for this feature for a long time!". But someone else *may* really need this feature.
(In reply to comment #18)
> Actually I want a parameter, so bot operators can decide whether they use it or
> not.
As I stated above (#13), introducing a parameter to change the default behavior is fine. But it still doesn't appear to be the best solution globally, whereas setting the default behavior per Wiki does.
Created attachment 7560 [details]
Patch that does not change default behaviour
Attached patch will not change the default behaviour but instead adds a converttitles parameter. This way every API user can decide for themselves whether they need title conversion or not.
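To illustrate the opt-in behaviour, a bot could switch conversion on per request roughly as sketched below. The `converttitles` parameter name is the one from the patch; the helper names, the `format=json` parameter, and the assumed response layout (a `query.converted` list of `from`/`to` pairs, by analogy with how normalized titles are reported) are my own assumptions and may not match the patch exactly.

```python
from urllib.parse import urlencode


def build_query(titles, convert=False,
                endpoint="http://sr.wikipedia.org/w/api.php"):
    """Build an action=query URL, optionally asking for title conversion."""
    params = {
        "action": "query",
        "prop": "info",
        "titles": "|".join(titles),
        "format": "json",
    }
    if convert:
        # Boolean API parameters are enabled simply by being present.
        params["converttitles"] = ""
    return endpoint + "?" + urlencode(params)


def converted_map(response):
    """Map each requested title to the title it was converted to.

    Assumes conversions are listed as {"from": ..., "to": ...} entries
    under query.converted; returns an empty dict if none are present.
    """
    converted = response.get("query", {}).get("converted", [])
    return {entry["from"]: entry["to"] for entry in converted}
```

With `convert=False` the request is unchanged, so existing bots that rely on the API to bypass the language converter keep their current behaviour.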
Could somebody who has title conversion enabled test this patch? I don't have it.
This seems to be working. It just produces a warning when the argument 'converttitles' is not sent:
Warning: Invalid argument supplied for foreach() in C:\root\A\Apache2\htdocs\mw\includes\api\ApiQuery.php on line 298
My suggestion about this patch is to also make the parameter name shorter ('ct', for instance), since it is going to be called frequently.
Fixed in r69237. I had to make some minor changes, so please test it again. I did not choose to abbreviate converttitles because it is unambiguous about its meaning, while ct is not. Besides, mostly bots will use the API, so you only have to type it once :)
Well, there might be something else unambiguous and shorter. I suggested that not in order to spare my bot from longtyping but to save some traffic of WM servers. Actually that is why this thing should be settable per Wiki (i.e. every Wiki should be able to activate or deactivate it by default). Oh, well...
(In reply to comment #25)
> Well, there might be something else unambiguous and shorter.
>
> I suggested that not in order to spare my bot from longtyping but to save some
> traffic of WM servers. Actually that is why this thing should be settable per
> Wiki (i.e. every Wiki should be able to activate or deactivate it by default).
> Oh, well...
Other stuff, such as HTTP headers, produces much more traffic.
(In reply to comment #26)
> Other stuff, such as HTTP headers, produces much more traffic.
Now, can you really shrink them? No. While there are other things you can.
But WMF doesn't shrink them. This means a few extra bytes in requests are not a big problem.
I completely disagree with that, since every byte costs some money, which doesn't fall from the sky.