Last modified: 2013-12-03 18:14:32 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T43837, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 41837 - Make MediaWiki more RESTful
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: All
Importance: Low enhancement
Assigned To: Nobody - You can work on this!
Reported: 2012-11-06 21:29 UTC by Mark Holmquist
Modified: 2013-12-03 18:14 UTC


Description Mark Holmquist 2012-11-06 21:29:02 UTC
I just started thinking about this when I saw bug #41836 breeze by on IRC, but we *really* don't have *any* RESTful interfaces available. We should consider changing that.

Things that should get better, off the top of my head:

* Pages/articles should be GET'd, like they are now. ?action=* should probably be replaced by root-level directories like /pageinfo/* and /parse/*, also as GET queries.

* Edits should be done via POST, but they should be done *at the same URI* as the page you're editing. So "GET /wiki/Barack_Obama" and "POST /wiki/Barack_Obama" mean exactly what you'd expect.

* The API probably needs a lot of changing, but I'm not sure where to start.
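
As a rough sketch of the first two points (written in Python rather than MediaWiki's PHP, with a made-up in-memory page store and hypothetical handler names), dispatching on the HTTP method plus the first path segment could look like this:

```python
from typing import Callable, Dict, Tuple

# Hypothetical in-memory page store standing in for the wiki database.
PAGES: Dict[str, str] = {"Barack_Obama": "Original text"}


def view_page(title: str, body: str) -> Tuple[int, str]:
    """GET /wiki/<title>: return the current text, or 404 if missing."""
    if title not in PAGES:
        return 404, "No such page"
    return 200, PAGES[title]


def save_page(title: str, body: str) -> Tuple[int, str]:
    """POST /wiki/<title>: store the submitted text at the same URI."""
    PAGES[title] = body
    return 200, "Saved"


def page_info(title: str, body: str) -> Tuple[int, str]:
    """GET /pageinfo/<title>: stand-in for an ?action=... style view."""
    if title not in PAGES:
        return 404, "No such page"
    return 200, f"{title}: {len(PAGES[title])} characters"


# (method, first path segment) -> handler; the route names are assumptions.
ROUTES: Dict[Tuple[str, str], Callable[[str, str], Tuple[int, str]]] = {
    ("GET", "wiki"): view_page,
    ("POST", "wiki"): save_page,
    ("GET", "pageinfo"): page_info,
}


def dispatch(method: str, path: str, body: str = "") -> Tuple[int, str]:
    """Route a request on its HTTP method and first path segment."""
    prefix, _, title = path.strip("/").partition("/")
    handler = ROUTES.get((method.upper(), prefix))
    if handler is None or not title:
        return 405, "Method or path not supported"
    return handler(title, body)
```

With this, dispatch("GET", "/wiki/Barack_Obama") and dispatch("POST", "/wiki/Barack_Obama", "New text") hit the same URI and differ only in the verb.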

These issues could probably all be separate bugs, with this as the tracker. This is decidedly a wishlist item, too, so I'll mark the priority as low and make sure it's an enhancement.

Also, this will change a lot of how the community interacts with MediaWiki, so it may need to be an extension, or at least configurable, at first. We'll absolutely need community consensus to enable this on Wikipedias, and that will almost certainly take a long time. Maybe this is a project we could do incrementally.

Any thoughts welcome!
Comment 1 Mark Holmquist 2012-11-06 21:30:52 UTC
Note, the above is just brainstorming, so you should feel free to replace my stupid ideas with your own, brilliant ones :)
Comment 2 Federico "Lox" Lucignano 2012-11-06 22:02:01 UTC
Hello Mark,

it's great to see interest around new API ideas!

There's a group of people that has started collecting notes and goals over at https://www.mediawiki.org/wiki/API/API_rewrite .

Although keeping the conversation on this topic here in the tracker is a possibility, it would benefit everyone interested or involved to continue it directly on the talk page connected to that documentation. What do you think?
Comment 3 Mark Holmquist 2012-11-06 22:06:32 UTC
I added a comment to the talk page, pointing here and saying I'd accept comments on the talk page as well. I'd prefer to keep track of the process on bugzilla, but we can also mirror it to mw.org periodically.
Comment 4 Federico "Lox" Lucignano 2012-11-06 22:16:37 UTC
Makes sense :)

Did you have a chance to go through the notes linked from the API Rewrite page yet?

https://www.mediawiki.org/wiki/API/API_rewrite/Kickoff_meeting

I see you're interested in making better use of the HTTP verbs and in less complex URLs; you should find a lot of familiar references in that document.

Also, have you read about ROA (Resource Oriented Architecture - http://www.infoq.com/articles/roa-rest-of-rest) before?
Comment 5 Mark Holmquist 2012-11-06 22:20:36 UTC
Interesting links both!

Note that this is more than only the API, though. RESTfulness could also be applied to the inner workings, like editing and submitting other forms. Through that process we can also become a little more equipped to change things into an AJAXy interface, as opposed to keeping around GET/POST interfaces only.

(we'll still need GET/POST interfaces for non-JavaScript browsers, but AJAX is helpful for those of us on modern browsers)
Comment 6 Brad Jorsch 2012-11-06 22:28:30 UTC
(In reply to comment #0)
> * The API probably needs a lot of changing, but I'm not sure where to start.

I don't think the API (as it is, anyway) is particularly amenable to being made RESTful; for example, how do you specify 500 page titles in a RESTful API like you can do with the API's title=A|B|C|...? See bug 38716 for further discussion on that.

As for other stuff, how does a "RESTful" setup handle posting for page edits versus preview versus diff without requiring the browser to support JavaScript to swap out the form action?
Comment 7 Mark Holmquist 2012-11-06 22:31:59 UTC
Brad, the API stuff could use a lot of iterating, for sure. I think /api/pages/A,B,C might be legal, if not ideal. I'm not sure how to do that better, but we can always fall back to old-API in the edge cases.

As for edits/previews/diffs, POST /preview/<article> and POST /diff/<article> seem sane to me. And POSTing the actual article would be as easy as POST /wiki/<article>, of course.
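
A minimal sketch of how a path segment like A,B,C might be split into titles, in Python for illustration. The function name is hypothetical, and it assumes clients percent-encode any comma that belongs to a title, since commas are legal in MediaWiki titles:

```python
from typing import List
from urllib.parse import unquote


def split_titles(path_segment: str, separator: str = ",") -> List[str]:
    """Split one URL path segment into page titles.

    The segment is split on the raw separator first and each piece is
    percent-decoded afterwards, so an encoded separator inside a title
    (for example %2C for a comma) survives as part of that title.
    """
    return [unquote(piece) for piece in path_segment.split(separator) if piece]
```

So split_titles("A,B,C") gives three titles, while split_titles("Washington%2C_D.C.,Paris") gives two. The need for that extra encoding rule is part of why this is "legal, if not ideal".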
Comment 8 Brad Jorsch 2012-11-06 22:35:30 UTC
(In reply to comment #7)
> Brad, the API stuff could use a lot of iterating, for sure. I think
> /api/pages/A,B,C might be legal, if not ideal. I'm not sure how to do that
> better, but we can always fall back to old-API in the edge cases.

Two APIs are unlikely to be a very good idea, unless the functionality of one is entirely a superset of the other so it is implemented as a wrapper (but then, why bother?).

> As for edits/previews/diffs, POST /preview/<article> and POST /diff/<article>
> seems sane to me. And POSTing the actual article would be as easy as POST
> /wiki/<article>, of course.

Exactly my question. With only HTML (no JavaScript), how do you convince the user's browser to post to /preview/<article> when you click one button in the form, /diff/<article> when you click another, and so on? Or is this another situation where you're having two parallel interfaces?
Comment 9 Platonides 2012-11-06 22:37:35 UTC
I'm not sure we are really interested in a REST interface. Remember that we will still need metadata such as summaries, the previous revision...

Also, it is possible for proxies to block you, for example by not recognizing the method and returning an error. Try explaining to your ISP/company that its proxy is blocking a RESTful interface.
Comment 10 Platonides 2012-11-06 22:39:32 UTC
> As for edits/previews/diffs, POST /preview/<article> and POST /diff/<article>
> seems sane to me. And POSTing the actual article would be as easy as POST
> /wiki/<article>, of course.

This is a headache on server config, with no clear benefit, since posting to index.php works equally well.
I thought you wanted something like PUT /wiki/Somearticle.

(and /diff/ is idempotent, so it should really be GET)
Comment 11 Mark Holmquist 2012-11-06 22:54:21 UTC
Brad, I agree about the 2 APIs problem. But I also think we have some brilliant minds that may be able to come up with a solution!

About the different POST actions, I think we already must access different URLs for the actions, unless I misunderstand the way the edit page works. I admit to only having looked at that code once or twice, but I think it's relatively similar.

Platonides, that's a good point, and we may need to mostly stick to GET and POST at the beginning. I'm sure there are other ways to solve that problem--I agree with Brad, that two APIs may not be a great idea, but it may solve some of these problems in the interim.

To your second comment, while index.php works, you're not actually asking for index.php, and you're certainly not modifying it. That's the benefit. Maybe it's so marginal at this point that we might not want to do it :)

/diff/article, however, would require that you POST your version of the code--or did I misunderstand what Brad was asking? It's not idempotent, though, because the response to POST /diff/Testing will be different on the second request if the second request includes a different text.

As for PUT /wiki/Somearticle, we'd want to overwrite an article that was already there, barring a dirty diff, so POST seems like the way to go. IIRC "PUT" means "add it, unless it exists, then throw this away". My HTTP dialects may be a little rusty, though :)
Comment 12 Ori Livneh 2012-11-06 23:09:21 UTC
Broadly speaking, a good API should fulfill two requirements:

1) It ought to be meaningful, intuitive, useful, etc. to human beings (the 'soft' requirement).
2) It ought to be performant, stable, and secure (the 'hard' requirement).

We might be able to use our current API to make progress by decoupling the two requirements. We could build a thin wrapper around the current API: each request to the new API would get munged and re-written as a request to the old API. This won't be especially fast, but it would allow us to iterate on the soft requirement and experiment with different URL schemes and the like while relying on the security and stability of the current API.

Once we're satisfied that the new API meets the soft requirement (i.e., it's intuitive, useful, etc.) we could, in a piecemeal fashion, put real implementations behind the new API.
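
A minimal sketch of such a thin wrapper, in Python for illustration. The new-style paths (/pages/..., /parse/...) and the /w/api.php endpoint default are assumptions made up for this example; the query parameters it emits (action=query, prop=info, titles, action=parse, page, format=json) are ones the current API accepts:

```python
from typing import Dict
from urllib.parse import urlencode


def to_legacy_query(method: str, path: str) -> Dict[str, str]:
    """Rewrite a new-style request as parameters for the existing API."""
    parts = path.strip("/").split("/", 1)
    if method.upper() != "GET" or len(parts) != 2 or not parts[1]:
        raise ValueError(f"unsupported request: {method} {path}")
    resource, rest = parts
    if resource == "pages":
        # Hypothetical /pages/A,B,C becomes the pipe-separated titles list.
        titles = "|".join(rest.split(","))
        return {"action": "query", "prop": "info",
                "titles": titles, "format": "json"}
    if resource == "parse":
        # Hypothetical /parse/<title> becomes action=parse on that page.
        return {"action": "parse", "page": rest, "format": "json"}
    raise ValueError(f"unknown resource: {resource}")


def to_legacy_url(method: str, path: str, endpoint: str = "/w/api.php") -> str:
    """Build the old-API URL that would serve the new-style request."""
    return endpoint + "?" + urlencode(to_legacy_query(method, path))
```

Because only the mapping table changes when the URL scheme changes, this kind of wrapper would let us experiment with the "soft" side without touching the code behind the old API.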
Comment 13 Brad Jorsch 2012-11-06 23:13:32 UTC
(In reply to comment #11)
> 
> About the different POST actions, I think we already must access different URLs
> for the actions, unless I misunderstand the way the edit page works. I admit to
> only having looked at that code once or twice, but I think it's relatively
> similar.

Not exactly. HTML forms have specific support for knowing which of multiple submit buttons was clicked. Each button has a different name: the Save page button is "wpSave", the Preview button is "wpPreview", and the Show changes button is "wpDiff". Depending on which button you click, that button's name gets included in the POST data along with the form text, edit summary, hidden fields, and so on.

The big difference here is that this is all handled by the browser, no scripting necessary.
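
To illustrate (a Python sketch, not the actual MediaWiki code), the server side only has to check which button name arrived in the form-encoded POST body. The function name and the "edit" fallback are assumptions for this example:

```python
from urllib.parse import parse_qs

# Button names from the edit form, checked in a fixed order.
BUTTON_ACTIONS = (
    ("wpSave", "save"),
    ("wpPreview", "preview"),
    ("wpDiff", "diff"),
)


def action_from_post(body: str) -> str:
    """Pick the action from the submit button present in the POST data."""
    fields = parse_qs(body, keep_blank_values=True)
    for button, action in BUTTON_ACTIONS:
        if button in fields:
            return action
    # Assumed fallback: no button name present, just show the form again.
    return "edit"
```

For a body such as "wpTextbox1=Hello&wpSummary=fix&wpPreview=Show+preview" (field values here are illustrative), this returns "preview", and the form's action URL is the same for all three buttons.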
Comment 14 Platonides 2012-11-06 23:16:20 UTC
/diff/1-2/article may always mean "compare versions one and two of this article". Or just put the revisions in the query string as we do now.


No, PUT means "place this in that location". It is normal that it replaces existing content. https://tools.ietf.org/html/rfc2616#section-9.6

A problem with less-common methods is that it's harder to hook to them in the server config, though.
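
To make the PUT semantics concrete, here is a small Python sketch with a hypothetical in-memory store. It only illustrates the idea from the RFC (PUT places content at a location and may replace what is there), not how MediaWiki would implement it:

```python
from typing import Dict, List


class Store:
    """Toy resource store contrasting PUT with an appending POST."""

    def __init__(self) -> None:
        self.pages: Dict[str, str] = {}
        self.comments: List[str] = []

    def put(self, title: str, text: str) -> int:
        """PUT: place text at title, replacing any existing content.

        Returns 201 if the page was created, 200 if it was replaced.
        """
        created = title not in self.pages
        self.pages[title] = text
        return 201 if created else 200

    def post_comment(self, text: str) -> int:
        """POST to a collection: every call adds a new entry."""
        self.comments.append(text)
        return 201
```

Repeating the same put() leaves the store in the same state, while repeating the same post_comment() adds a second entry.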
Comment 15 Bawolff (Brian Wolff) 2012-11-07 01:43:23 UTC
(In reply to comment #0)
> 
> * Pages/articles should be GET'd, like they are now. ?action=* should probably
> be replaced by root-level directories like /pageinfo/* and /parse/*, also as
> GET queries.

I'm not sure if I understand correctly, but if you mean from the index.php side of things (not the API), we've had support for that for a long time. http://wikitech.wikimedia.org/history/Main_Page vs http://wikitech.wikimedia.org/edit/Main_Page etc. Very few people use the feature, though.


----
/me not a fan of restful stuff in general, usually seems to be a solution in search of a problem.
Comment 16 Siebrand Mazeland 2012-11-07 03:08:49 UTC
Meeting notes: https://www.mediawiki.org/wiki/API/API_rewrite/Kickoff_meeting
Comment 17 Daniel Friesen 2012-11-07 08:41:48 UTC
I did some actual serious reading on "REST". Unfortunately it seems that "REST" is now used to refer to two completely different things.

The original REST. This is the REST that Roy Fielding defined in his doctoral dissertation. Its goal is NOT what you see in all these per-site/per-software proprietary APIs; it is intended for long-term things that work at the scale of the entire internet. Unfortunately, even after reading and understanding it, I'm not quite sure I could explain it that well. Anyway, no one who says they use REST is actually using REST. Better (but perhaps not perfect) examples of REST clients would be web browsers and Atom feed readers.

The second REST. This is the REST practically everyone is really talking about when they say they use REST. While in a way based on the real original REST, it is so twisted, distorted, and defiled that the only similarities between it and the original amount to using HTTP properly (i.e., practicing things that are good for caching, and never using GET for things that trigger an action). The remaining parts of this REST are not just different from the original; many of them actively violate the core principles of actual REST.

And don't bother with our Wikipedia article. It basically tries to fuse both types of REST into one page as if they were one and the same, and the result is a page that at best contradicts itself. It's not something all that easy to fix.
----
Now, the original REST is for internet-scale things. Its goal is different from the goal of our API. While implementing this type of REST would be a very admirable goal, it would have basically nothing to do with our API, and it would be a different kind of project.

As for the buzzword type of REST: this type has limited use. There aren't really any advantages to it. Many of its supposed advantages, such as cacheability and statelessness, are really just advantages of using HTTP intelligently and have nothing to do with the specific REST patterns and restrictions. I.e., this type of REST is about as worthless as MVC frameworks are to the good practice of MVC.

And now I should probably point something out. Switching from our query parameters to /pageinfo/... is not REST. The real original REST does not mandate URL formats. And using a root like /pageinfo/ also violates the URL patterns used in the mangled REST.

----
So, the general closing point: put some actual critical thought into each idea you have for the API and decide whether it's a good idea based on the pros and cons specific to it, instead of clinging to the idea because some desecrated buzzword says you should do it.
Comment 18 Federico "Lox" Lucignano 2012-11-10 02:08:38 UTC
Hello guys,

sorry for not getting back to this earlier (these are some busy days) :)

I'd like to clarify an important detail: the new API proposal that Siebrand and I linked to is not necessarily about rewriting the current API. (One could argue that "API rewrite", the title of the page on MW.org, doesn't really help from this perspective; if that is also the wording in the kickoff notes, then those are inaccurate, as notes taken while people were brainstorming/discussing can usually be.) It's about creating a new one using a totally different design for a "high REST service" ("internet scale REST", to use Daniel's words) following the ROA approach, since REST, like OOP, is a design criterion and ROA is an actual architectural design. Also, since REST is protocol agnostic, we're looking into alternate transport protocols in addition to HTTP(S), with the side goal/benefit of making MediaWiki a fully programmable (not-only-web) service.

Starting afresh will let us structure the whole thing with some built-in features that are not easy (at times almost impossible) to embed in the current API, such as authentication (OAuth), performance (caching, HipHop compliance), stability (versioning), accessibility (Service Description Language, automated docs based on PHPDoc, normalized URIs, data schema, existing REST client libraries), quality (unit tests) and re-usability (we're figuring out a way to use the API classes for developing MediaWiki extensions without the mediation of FauxRequest), just to mention a few.

Those have been recognized as improvements upon the current situation during the meeting between the Foundation and Wikia; of course there might be challenges ahead, but that's why we decided to cooperate and learn from each other's experiences.

During the same meeting a possible transition proposal was mentioned ("Legacy support for a while.... maybe support and keep updating the old one while building the new one. New endpoint, not backwards-compatible.", to use the exact wording from the notes). This is something which doesn't fall into the domain of the RFC work we've started recently. There are benefits and problems in both keeping and deprecating the current API that should be analyzed from many different perspectives (e.g. breaking old clients, updating/refactoring code, authentication via keys, maintaining both versions, quotas, performance/caching, in no specific order or grouping), and overall that process is not presented as a goal.

TL;DR: a new API proposal doesn't necessarily mean a death sentence for MW's API; an RFC, when ready, will be published and will represent just the research work done by Wikia in cooperation with the Foundation to see where this idea can lead MediaWiki as a programmable service/platform.

All your feedback is greatly appreciated (especially what you think is good/bad in the current solution, and what you would like to see added/removed/done differently); this is all extremely valuable information at this time, as it can help us design a better solution.

Now on to Mark's proposal of integrating a RESTful interface into the rest of the platform: the example of article creation/diff could be simple to address if we think of the Article class as being exposed as an addressable resource, and keep in mind that REST by no means excludes the use of parameters when they're necessary. Forms could use that resource directly via the REST entry point, with no need to modify any current behavior.

And now a small request: I know that some like to keep this in the tracker, but it would be great if we could move this conversation to the mediawiki-api mailing list (http://lists.wikimedia.org/pipermail/mediawiki-api/) where other API consumers and developers could join us in this engaging conversation.
Comment 19 Daniel Friesen 2012-11-10 02:42:35 UTC
(In reply to comment #18)
> Hello guys,
> 
> sorry for not getting back to this earlier (those are some busy days) :)
> 
> I'd like to clarify on an important detail: the new API proposal me and
> Siebrand linked to it's not necessarily about re-writing the current API (one
> would argue that "API rewrite", the title of the page on MW.org, doesn't really
> help from this perspective and if that is the wording also in the kickoff
> notes, then those are inaccurate as notes taken while people was
> brainstorming/discussing can usually be), it's about creating a new one using a
> totally different design for an "high REST service" ("internet scale REST" to
> use Daniel's words) using the ROA approach (since REST, like OOP, is a design
> criteria and ROA is an actual architectural design), just to say: since REST is
> protocol agnostic, we're inquiring alternate transport protocols as an addition
> to HTTP(S) too, with the side goal/benefit of making MediaWiki a fully
> programmable (not-only-web) service.
> [...]

Uhm, are you certain that you're actually talking about the original "internet scale REST" I talked about, not the modern buzzword REST?

Because the Kickoff page looks to be in complete conflict with that assertion.

The page links to references about buzzword REST and has piles of references to CRUD (PUT, DELETE, etc...) when CRUD has nothing to do with the original REST. The author of that REST even explained explicitly that PUT is not necessary; it can work perfectly fine with POST only.
Comment 20 Federico "Lox" Lucignano 2012-11-10 03:03:25 UTC
Daniel,

I understand your concerns and confusion. Please consider that those notes were taken by some participants in real time on Etherpad while the discussion was going on in a group of ~10 participants, brainstorming included.

I wouldn't take that as a white paper for the proposal ;)

You can trust my word, as I'm currently the main developer of the proposal, or you can wait for the RFC to be published once we complete the research.

If it was just about the "buzzword" you're mentioning (which, BTW, is called "low REST" or a REST-RPC hybrid; the definition and the differences from "high REST" are perfectly depicted in "RESTful Web Services" by Leonard Richardson and Sam Ruby, pp 68-69, but there are many other interesting sources), we'd have gone for a simple wrapper around the current API.

Instead we're evaluating a real ROA approach (ROA being a real-world architecture, not another marketing term) and protocol agnosticism, which, to be perfectly executed, require us to write a new entry point.
Comment 21 Mark Holmquist 2013-01-09 17:57:27 UTC
Brad, Ori has kindly loaned me "RESTful Web Services", and I'm about halfway through. I have to say that it looks like there are some cool new (well, "new" in the sense that IE8 is new) HTML5-y features that we can use to sidestep the issues you've brought up. Admittedly that's not a perfect solution, and it might not work out for Wikipedias (since we can't very well deploy changes to Wikipedias that would break e.g. IE6), but it's a start. I'll keep reading, and possibly play with some little changes, but of course this will be a longer process than a few patches, so bear with us :)

And while having two sets of URIs for a single API might be....less than good, we could certainly do that until things are more stable.
Comment 22 Bawolff (Brian Wolff) 2013-01-09 19:00:25 UTC
(In reply to comment #21)
> Brad, Ori has kindly loaned me "RESTful Web Services", and I'm about halfway
> through. I have to say that it looks like there are some cool new (well,
> "new" in the sense that IE8 is new) HTML5-y features that we can use to
> sidestep the issues you've brought up. Admittedly that's not a perfect
> solution, and it might not work out for Wikipedias (since we can't very well
> deploy changes to Wikipedias that would break e.g. IE6), but it's a start.

If these new features are essentially eye candy, I don't think it is a good idea to use such new technology just for that.
