
Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and links may be broken, except for those displaying bug reports and their history. See T18355, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 16355 - The robots.txt disallows logging in.
Status: RESOLVED INVALID
Product: Wikimedia
Classification: Unclassified
Component: General/Unknown (Other open bugs)
Version: unspecified
Hardware: All
OS: All
Priority: Normal
Severity: normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
URL: http://en.wikipedia.org/robots.txt
Keywords:
Depends on:
Blocks:

Reported: 2008-11-15 18:11 UTC by aliter
Modified: 2010-01-07 10:31 UTC
CC List: 2 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description aliter 2008-11-15 18:11:41 UTC
Bots are assumed to log in before making any changes. As far as I can determine there are two ways to do that, and both start with /w/. In robots.txt, bots are denied access to /w/.

So while a bot is supposed to log in to be conformant, a conformant bot actually can't log in.
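
A minimal sketch of the conformance check the reporter describes, using Python's standard urllib.robotparser; the user-agent name "ExampleBot" is a placeholder, and whether the calls return False depends on the live contents of robots.txt:

    from urllib.robotparser import RobotFileParser

    # Fetch and parse the live robots.txt (which, per this report,
    # disallowed /w/ for all user agents at the time).
    rp = RobotFileParser("http://en.wikipedia.org/robots.txt")
    rp.read()

    # A strictly conformant bot would run these checks before trying
    # to reach either login endpoint under /w/.
    print(rp.can_fetch("ExampleBot", "http://en.wikipedia.org/w/api.php"))
    print(rp.can_fetch("ExampleBot", "http://en.wikipedia.org/w/index.php"))

With a Disallow: /w/ rule in effect, both checks print False, which is exactly the contradiction the report points out.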
Comment 1 X! 2008-11-15 19:44:02 UTC
That's what the API is for. 
Comment 2 Casey Brown 2008-11-15 22:25:07 UTC
(In reply to comment #1)
> That's what the API is for. 
> 

Nice theory, but that doesn't really work, because the API is under the /w/ directory. :-) http://en.wikipedia.org/w/api.php
Comment 3 aliter 2008-11-17 21:55:22 UTC
Hi,

It seems my mention of two ways to log in is causing confusion for some. The two ways I know for a bot to log in are:

* http://wiki.project.org/w/api.php, which is under /w/, disallowed by robots.txt.
* http://wiki.project.org/w/index.php, which is under /w/ as well.


How, then, does the standards-aware programmer make his bot log on?
(I'd go on about actually changing pages, which has the same problem, but as it happens my own bot is just a reader. Then again, the new version of the bot can just as easily write, if there's a way to do that legally.)
Comment 4 Ilmari Karonen 2010-01-07 10:31:47 UTC
Robots.txt is meant to apply to general web crawlers, not to bots designed specifically to work with the MediaWiki software. In any case, the proper way for a bot to access modern versions of MediaWiki is via api.php only; if the API can't do something you need to do, file a bug on it. I'm closing this one as INVALID.
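
A minimal sketch of what such API-only access looks like against a modern MediaWiki, using Python's requests library; the wiki URL and credentials are placeholders, and this shows the current login flow (a login token from meta=tokens, then action=login with a bot password), not the 2008-era handshake:

    import requests

    API_URL = "https://en.wikipedia.org/w/api.php"  # placeholder target wiki

    session = requests.Session()

    # Step 1: fetch a login token.
    r = session.get(API_URL, params={
        "action": "query",
        "meta": "tokens",
        "type": "login",
        "format": "json",
    })
    login_token = r.json()["query"]["tokens"]["logintoken"]

    # Step 2: log in; the credentials are placeholders for a
    # bot-password pair created via Special:BotPasswords.
    r = session.post(API_URL, data={
        "action": "login",
        "lgname": "ExampleBot@ExampleBot",
        "lgpassword": "bot-password-here",
        "lgtoken": login_token,
        "format": "json",
    })
    print(r.json()["login"]["result"])  # "Success" on valid credentials

The crawler rules in robots.txt play no part in this flow; an API client identifies itself through its User-Agent header rather than by consulting robots.txt.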


