Last modified: 2014-11-09 02:47:32 UTC
pywiki mostly depends on httplib2. There are a few cases of urllib.urlopen (and others) being used in the pywikibot library code, and a number of scripts use other HTTP request routines. Multiple routines result in multiple configurations and multiple sets of possible errors. Has there been any investigation into whether requests or urllib3 would suit our needs better (e.g. offloading some problems onto another project)?
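To illustrate the "one routine, one configuration, one set of errors" point, here is a minimal sketch of a single fetch helper. It is not pywikibot code: the names fetch, FetchError and USER_AGENT are hypothetical, and it uses the Python 3 standard library (urllib.request) purely for illustration rather than httplib2, requests or urllib3.

```python
from urllib.request import Request, urlopen

# Hypothetical value; a real bot would build this from its own configuration.
USER_AGENT = "example-bot/0.1"


class FetchError(Exception):
    """Single error type that callers need to handle."""


def fetch(url, opener=urlopen, user_agent=USER_AGENT, timeout=30):
    """Fetch url and return the response body as bytes.

    The opener argument is an assumption made for testability: any callable
    with the same call shape as urllib.request.urlopen can be passed in.
    """
    request = Request(url, headers={"User-Agent": user_agent})
    try:
        response = opener(request, timeout=timeout)
        try:
            return response.read()
        finally:
            response.close()
    except OSError as err:
        # URLError, HTTPError and socket errors are all OSError subclasses.
        raise FetchError("%s: %s" % (url, err))
```

With something of this shape, scripts would call one function and catch one exception type, and the user agent and timeout would be set in one place.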
There may be issues with using httplib2 for large downloads, such as those possible in upload.py. https://github.com/jcgregorio/httplib2/issues/224 A fork has been created for that, and for distributed caching. https://github.com/madlag/streaming_httplib2
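For context on why large downloads are a concern: reading a whole response body in one call holds it all in memory, whereas a streaming interface lets the caller copy it in bounded chunks. A minimal sketch of the chunked pattern follows, assuming only a file-like source with a read(size) method. The name copy_in_chunks is hypothetical, and this is neither httplib2's API nor the fork's.

```python
import io


def copy_in_chunks(source, destination, chunk_size=64 * 1024):
    """Copy source to destination without holding the whole body in memory.

    Both arguments are file-like objects; returns the number of bytes copied.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read signals end of stream
            break
        destination.write(chunk)
        total += len(chunk)
    return total


# Usage with in-memory streams standing in for a response and a file.
body = io.BytesIO(b"x" * 200000)
target = io.BytesIO()
copied = copy_in_chunks(body, target)
```

The standard library's shutil.copyfileobj does the same job; the loop is spelled out here only to show where the memory bound comes from.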
site.py & weblib.py use 'import urllib', but for urlencode

urllib:
pywikibot/page.py:1841: f = urllib.urlopen(self.fileUrl())
pywikibot/version.py:199: buf = urllib.urlopen(url).readlines()
scripts/upload.py
scripts/flickrripper.py
scripts/checkimages.py
scripts/weblinkchecker.py
scripts/imagerecat.py
scripts/maintenance/wikimedia_sites.py
scripts/data_ingestion.py

urllib2:
scripts/reflinks.py

httplib (not httplib2):
pywikibot/version.py:123: conn = httplib.HTTPSConnection('github.com')
scripts/weblinkchecker.py
scripts/reflinks.py
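A list like the one above could be regenerated with a small scan for legacy imports. The following is a rough sketch, not an existing pywikibot tool: legacy_imports and LEGACY_MODULES are hypothetical names, and the line-based regex only catches plain "import x" and "from x import y" statements (no multi-line imports, no dynamic imports).

```python
import re

# Modules this sketch treats as "legacy" HTTP routines.
LEGACY_MODULES = {"urllib", "urllib2", "httplib"}

_IMPORT_RE = re.compile(
    r"^\s*(?:import\s+(?P<names>[\w., ]+)|from\s+(?P<pkg>[\w.]+)\s+import\b)"
)


def legacy_imports(source):
    """Return (line number, module) pairs for legacy HTTP imports in source."""
    found = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = _IMPORT_RE.match(line)
        if not match:
            continue
        if match.group("pkg"):
            names = [match.group("pkg")]
        else:
            names = [n.strip() for n in match.group("names").split(",")]
        for name in names:
            # Drop any "as alias" suffix and any submodule path.
            top = name.split()[0].split(".")[0] if name else ""
            if top in LEGACY_MODULES:
                found.append((lineno, top))
    return found
```

Because the set is matched against whole module names, "import httplib2" is not reported, while "import httplib" is.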
Change 152200 had a related patch set uploaded by John Vandenberg: HTTP requests with user-agent without version https://gerrit.wikimedia.org/r/152200
Change 153300 had a related patch set uploaded by John Vandenberg: Replace httplib and urllib with httplib2 https://gerrit.wikimedia.org/r/153300
Change 152200 merged by jenkins-bot: User-agent graceful degradation https://gerrit.wikimedia.org/r/152200
Change 153300 merged by jenkins-bot: Replace httplib and urllib with httplib2 https://gerrit.wikimedia.org/r/153300
version.py now uses httplib2. In addition to the list above, generate_family_file.py also uses urllib2.
https://github.com/ross/python-asynchttp might be a good solution, but it doesn't appear to be very active.