Last modified: 2009-12-30 05:15:50 UTC
To read information from Wikipedia or other MediaWiki installations (and later write it back!), we need to get data from a Machine-friendly_wiki_interface. Categories help us select well-specified content. Currently, the URL http://de.wikipedia.org/wiki/Kategorie:Foo?action=raw as well as Special:Export do a good job. The problem is that for categories, action=raw returns the wiki tags. What we need is the evaluated content, i.e. the list of articles as it is presented in HTML in the default view. How about a keyword "evaluated" (and a matching option in Special:Export) in addition to action=raw? For example: http://de.wikipedia.org/wiki/Kategorie:Ort_in_der_Schweiz?action=raw&evaluated. Instead of the wiki tags ([[Kategorie: ....]] ....), this would return a list of all articles in the category.
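To make the proposed URL shape concrete, here is a minimal sketch in Python that only builds the request URLs described above. The function name `build_raw_url` is made up for illustration, and the `evaluated` keyword is the feature requested in this report, not an existing MediaWiki parameter.

```python
from urllib.parse import quote


def build_raw_url(host, title, evaluated=False):
    """Build an action=raw URL for a wiki page (hypothetical helper).

    `evaluated` appends the keyword proposed in this request; it is an
    assumption for illustration and is not implemented by MediaWiki.
    """
    # MediaWiki page URLs use underscores in place of spaces.
    page = quote(title.replace(" ", "_"), safe=":/")
    query = "action=raw"
    if evaluated:
        query += "&evaluated"  # proposed keyword, hypothetical
    return f"http://{host}/wiki/{page}?{query}"


if __name__ == "__main__":
    print(build_raw_url("de.wikipedia.org", "Kategorie:Foo"))
    print(build_raw_url("de.wikipedia.org", "Kategorie:Ort in der Schweiz", evaluated=True))
```

With the existing behaviour, the first URL returns the category page's own wiki text; the second shows how a client would ask for the member list if the proposal were adopted.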
I have looked again at what has been proposed so far and found this:
* The most similar request is "Bug 208: API for external access".
* There are some requests on Wikitech-l, such as a "Minimalistic Web-API for use by Tools and Bots" at http://mail.wikipedia.org/pipermail/wikitech-l/2004-September/025373.html .

These are the solutions proposed so far:
1. Use Special:Export with XML (Wikipedia built-in)
2. Use action=raw (Wikipedia built-in) with wiki text (e.g. with a Java API)
3. Use the Python framework on Wikipedia HTML
4. Use Perl on an SQL dump of Wikipedia

Remarks:
1. Special:Export uses an XML format, and I am not sure whether it is well defined for long-term use (found: http://wikipedia.sourceforge.net/xml/export-0.1.xsd).
2. action=raw returns the same text that users/editors see, and thus "the original" format (there is a Java API prototype underway; see http://meta.wikimedia.org/wiki/Wikimaps/GeonamenDB#Java).
3. The Python framework has been mentioned several times and we tested it: it is based on screen-scraping the HTML output of Wikipedia, so it is definitely error-prone and may break after changes in MediaWiki or even in individual style sheets.
4. Using Perl on an SQL dump of Wikipedia reads wiki text (like 2.) and is limited to local installations.

So the most promising approach to me is still "action=raw". But, as mentioned in this bug (= feature request), we still need an extension for the case of categories!

=> Can anyone point me to where to begin in the MediaWiki PHP code?

Stefan
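A small Python sketch of what approach 2 can recover from raw wiki text. The regular expression and the helper name `categories_in_wikitext` are illustrative assumptions, not part of any existing framework, and only the `Kategorie:` and `Category:` prefixes are handled.

```python
import re

# Matches [[Kategorie:Name]] and [[Kategorie:Name|sort key]] (also "Category:").
CATEGORY_LINK = re.compile(
    r"\[\[\s*(?:Kategorie|Category)\s*:\s*([^\]|]+?)\s*(?:\|[^\]]*)?\]\]",
    re.IGNORECASE,
)


def categories_in_wikitext(text):
    """Return the category names referenced by category tags in raw wiki text."""
    return [match.group(1) for match in CATEGORY_LINK.finditer(text)]


if __name__ == "__main__":
    sample = (
        "Bern ist eine Stadt in der [[Schweiz]].\n"
        "[[Kategorie:Ort in der Schweiz]]\n"
        "[[Kategorie:Hauptstadt|Bern]]"
    )
    print(categories_in_wikitext(sample))
```

This yields the categories a page belongs to, but not the reverse direction (all pages in a given category), which is exactly the gap this request is about.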
action=raw and Special:Export are for accessing editable page text. Things like category memberships are a distinct kind of data, and require a distinct kind of interface.
This would ideally be part of some future SOAP API, so I am marking it as a duplicate of bug 208.

*** This bug has been marked as a duplicate of 208 ***
I would like to thank the previous commenters for paying attention to this request, but pointing to #208 resolves only part of it. Please let me point out the following: this bug report mentions two requests ("get article by name" and "get category by category_name") and proposes one possible API as a solution via HTTP/URI (see "REST" http://c2.com/cgi/wiki?RestArchitecturalStyle). Now you propose SOAP as another solution - which is alright too. But:
1. Please don't forget to include "get category..." in the SOAP API of #208, and
2. don't consider SOAP as the only API: RESTful is most probably more adapted to this simple request with one(!) parameter for getting back an XML and/or a Wiki text stream. RESTful means simply parametrized HTTP GET (POST, UPDATE,...) calls and is definitely less time consuming as well as easier to implement than SOAP.
> 2. don't consider SOAP as the only API: RESTful is most probably more adapted to
> this simple request with one(!) parameter for getting back an XML and/or a Wiki
> text stream. RESTful means simply parametrized HTTP GET (POST, UPDATE,...) calls
> and is definitely less time consuming as well as easier to implement than SOAP.

Our needs aren't simple. Ideally, a SOAP API would handle all sorts of things, such as getting edit histories, feeds, rendered HTML, the pages that link to page $1, and so on. Also, there are standard APIs for most programming languages that implement SOAP.

I'm closing the bug again; please make further comments at bug 208 if you want to suggest other APIs.

*** This bug has been marked as a duplicate of 208 ***