
Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and links may be broken, except for those displaying bug reports and their history. See T18355, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 16355 - The robots.txt disallows logging in.
Status: RESOLVED INVALID
Product: Wikimedia
Classification: Unclassified
Component: General/Unknown (Other open bugs)
Version: unspecified
Hardware: All
OS: All
Priority: Normal
Severity: normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
URL: http://en.wikipedia.org/robots.txt
Keywords:
Depends on:
Blocks:

Reported: 2008-11-15 18:11 UTC by aliter
Modified: 2010-01-07 10:31 UTC
CC List: 2 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description aliter 2008-11-15 18:11:41 UTC
Bots are assumed to log in before making any changes. As far as I can determine there are two ways to do that, and both start with /w/. In robots.txt, bots are denied access to /w/.

So while a bot is supposed to log in to be conformant, a conformant bot actually can't log in.
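
A minimal sketch of the conformance check the reporter describes, using Python's standard urllib.robotparser; the user-agent name "ExampleBot" is a placeholder, and whether the calls return False depends on the live contents of robots.txt:

    from urllib.robotparser import RobotFileParser

    # Fetch and parse the live robots.txt (which, per this report,
    # disallowed /w/ for all user agents at the time).
    rp = RobotFileParser("http://en.wikipedia.org/robots.txt")
    rp.read()

    # A strictly conformant bot would run these checks before trying
    # to reach either login endpoint under /w/.
    print(rp.can_fetch("ExampleBot", "http://en.wikipedia.org/w/api.php"))
    print(rp.can_fetch("ExampleBot", "http://en.wikipedia.org/w/index.php"))

With a Disallow: /w/ rule in effect, both checks print False, which is exactly the contradiction the report points out.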
Comment 1 X! 2008-11-15 19:44:02 UTC
That's what the API is for. 
Comment 2 Casey Brown 2008-11-15 22:25:07 UTC
(In reply to comment #1)
> That's what the API is for. 
> 

Nice theory, but that doesn't really work, because the API is under the /w/ directory. :-) http://en.wikipedia.org/w/api.php
Comment 3 aliter 2008-11-17 21:55:22 UTC
Hi,

It seems my mention of two ways to log in is causing confusion for some. The two ways I know for a bot to log in are:

* http://wiki.project.org/w/api.php, which is under /w/, disallowed by robots.txt.
* http://wiki.project.org/w/index.php, which is under /w/ as well.


How, then, does the standards-aware programmer make his bot log on?
(I'd go on about actually changing pages, which has the same problem, but as it happens my own bot is just a reader. Then again, the new version of the bot can just as easily write, if there's a way to do that legally.)
Comment 4 Ilmari Karonen 2010-01-07 10:31:47 UTC
Robots.txt is meant to apply to general web crawlers, not to bots designed specifically to work with the MediaWiki software. In any case, the proper way for a bot to access modern versions of MediaWiki is via api.php only; if the API can't do something you need to do, file a bug on it. I'm closing this one as INVALID.
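
A minimal sketch of what such API-only access looks like against a modern MediaWiki, using Python's requests library; the wiki URL and credentials are placeholders, and this shows the current login flow (a login token from meta=tokens, then action=login with a bot password), not the 2008-era handshake:

    import requests

    API_URL = "https://en.wikipedia.org/w/api.php"  # placeholder target wiki

    session = requests.Session()

    # Step 1: fetch a login token.
    r = session.get(API_URL, params={
        "action": "query",
        "meta": "tokens",
        "type": "login",
        "format": "json",
    })
    login_token = r.json()["query"]["tokens"]["logintoken"]

    # Step 2: log in; the credentials are placeholders for a
    # bot-password pair created via Special:BotPasswords.
    r = session.post(API_URL, data={
        "action": "login",
        "lgname": "ExampleBot@ExampleBot",
        "lgpassword": "bot-password-here",
        "lgtoken": login_token,
        "format": "json",
    })
    print(r.json()["login"]["result"])  # "Success" on valid credentials

The crawler rules in robots.txt play no part in this flow; an API client identifies itself through its User-Agent header rather than by consulting robots.txt.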


