Last modified: 2013-10-25 17:12:25 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T57227, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 55227 - interwiki problems in km wikipedia
Status: RESOLVED DUPLICATE of bug 55246
Product: Pywikibot
Classification: Unclassified
Component: interwiki.py
Version: unspecified
Hardware: All All
Importance: Unprioritized normal
Target Milestone: ---
Assigned To: Pywikipedia bugs
Reported: 2013-10-05 04:46 UTC by Kunal Mehta (Legoktm)
Modified: 2013-10-25 17:12 UTC (History)

Description Kunal Mehta (Legoktm) 2013-10-05 04:46:48 UTC
Originally from: http://sourceforge.net/p/pywikipediabot/bugs/1382/
Reported by: Anonymous user
Created on: 2011-11-27 12:57:44
Subject: interwiki problems in km wikipedia
Original description:
It seems that interwiki bots running different Python versions read Khmer text differently. Please see http://en.wikipedia.org/w/index.php?title=Angelina_Jolie&action=history. A bot running Python 2.7.1 adds a link to km and a bot running Python 2.5.1 removes one, but the removed link, when followed, in fact points to nothing. Is there any way to fix the problem?
Comment 1 Kunal Mehta (Legoktm) 2013-10-05 04:46:50 UTC
It seems Unicode bug #3081100 is back.
Comment 2 Kunal Mehta (Legoktm) 2013-10-05 04:46:52 UTC
Interwiki bots running under python 2.7.1 should just be blocked indefinitely for not paying attention to the pwb mailing list and console warnings.
Comment 3 Kunal Mehta (Legoktm) 2013-10-05 04:46:54 UTC
I think it is the other way around: Python 2.5.1 is the one that fails, although the Unicode test looks OK [1]. I checked these links and found that the bot running 2.5.1 drops the last 3 characters.

[1]: http://ru.wikipedia.org/wiki/%D0%9E%D0%B1%D1%81%D1%83%D0%B6%D0%B4%D0%B5%D0%BD%D0%B8%D0%B5_%D1%83%D1%87%D0%B0%D1%81%D1%82%D0%BD%D0%B8%D0%BA%D0%B0:Volkov#Khmer_wikilinks
Comment 4 xqt 2013-10-25 16:45:51 UTC
That means we should drop Python 2.5 for running pwb bots. This would make things easier.
Comment 5 Merlijn van Deen (test) 2013-10-25 17:09:08 UTC
Duplicate of #55256 - the cause is a buggy page name ( km:អែន​ជេ​លីណា ចូលី​ ends in \u200b zero width space ):

Not #3081100, but related. (cur | prev) 00:09, 12 November 2012‎ ElphiBot (talk | contribs)‎ m . . (95,243 bytes) (+10)‎ . . (r2.7.1) (Robot: Modifying km:អែនជេលីណា ចូលី to km:អែន​ជេ​លីណា ចូលី​) most clearly shows what is happening:

This is combined with a change in behavior -- to cite myself:

To clarify; the pywikipedia bug was caused by calling .strip() on the page
title. When working with Unicode < 4.0, this will strip the U+200B character
(python < 2.7), with Unicode > 4.0, this will *not* strip the U+200B character
(python >= 2.7).
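
The behaviour described in the quote above can be checked on a current interpreter. The following is a minimal Python 3 sketch, not pywikipedia's actual code: the helper name strip_title_compat and the sample title are hypothetical and exist only for illustration. It shows that a plain .strip() leaves a trailing U+200B in place, and how a caller could strip it explicitly to reproduce the older behaviour.

```python
import re
import unicodedata

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE

# Hypothetical sample title ending in a zero width space.
title = "Example title" + ZWSP

# Matches a leading or trailing run of whitespace and/or U+200B.
_EDGES = re.compile(r"^[\s\u200b]+|[\s\u200b]+$")


def strip_title_compat(text):
    """Hypothetical helper: strip whitespace and U+200B from both ends."""
    return _EDGES.sub("", text)


print(unicodedata.category(ZWSP))   # general category of U+200B
print(ZWSP.isspace())               # does Python treat it as whitespace?
print(title.strip() == title)       # plain strip() keeps the trailing U+200B
print(strip_title_compat(title) == "Example title")
```

Two bots that disagree on whether the trailing U+200B belongs to the title would keep rewriting the same link, which is consistent with the edit history linked in the original description.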

*** This bug has been marked as a duplicate of bug 55256 ***
Comment 7 Merlijn van Deen (test) 2013-10-25 17:12:25 UTC

*** This bug has been marked as a duplicate of bug 55246 ***
