Last modified: 2010-10-16 00:09:38 UTC
The action=parse API module was disabled because it was suspected of causing server load (or possibly bandwidth) issues. It should be re-enabled at some point. The relevant server admin log entry is here: http://wikitech.wikimedia.org/index.php?diff=28854&oldid=28853
I sincerely hope it is re-enabled soon. Now my application === broken :(
Dispenser noted in #wikimedia-tech that action=parse uses the parser cache, so the problem that warranted disabling it should be limited to cache misses.
If you're using action=parse then please describe your application and typical requests on this bug report.
We're using action=parse on Wikinews for the preview in a JavaScript tool that helps users change a template ( [[n:WN:ML]] ). An example request would look something like the following data posted to the API:

action: parse
format: xml
prop: text
pst: true
title: Main Page
text: {{Lead 2.0
|id=3 <!-- do not change. Each lead must have its own unique ID -->
|image=Kenny McKinley.JPG
|width=100x100px
|type=none
|title=Denver Broncos player Kenny McKinley found dead aged 23
|short_title=
|summary=Kenny McKinley, an American football player for the Denver Broncos, has been found dead at the age of 23.
}}
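For readers who want to reproduce a request like the one above, here is a minimal sketch in Python using only the standard library. The endpoint URL is an assumption (the comment targets English Wikinews); the parameter names come from the request described in the comment.

```python
from urllib.parse import urlencode

# Assumed endpoint; the comment above targets English Wikinews.
API_URL = "https://en.wikinews.org/w/api.php"

def build_parse_body(text, title="Main Page"):
    """Build the POST body for an action=parse preview with a
    pre-save transform (pst), as described above."""
    params = {
        "action": "parse",
        "format": "xml",
        "prop": "text",
        "pst": "true",  # apply the pre-save transform before parsing
        "title": title,
        "text": text,
    }
    return urlencode(params)

body = build_parse_body("{{Lead 2.0|id=3|title=Example headline}}")
```

The resulting string would then be sent as application/x-www-form-urlencoded POST data to API_URL.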
The WP 1.0 tools use it (http://toolserver.org/~enwp10 and subpages). The results from the API are (supposed to be) cached in a local database on the Toolserver with a 12-hour expiry, to lighten the load on the API. The things that are parsed are:

1) Templates such as [[en:Template:B-Class]], to get the formatting that has been specified inside them. This formatting is used to make the Toolserver program give output similar to the on-wiki tables. It would be silly to keep the web tool code in sync with the templates manually, and the overall collection of class templates is not predefined and can grow at the whim of a WikiProject.

2) The page [[en:User:SelectionBot/HomePage]] is parsed to fill in the contents of http://toolserver.org/~enwp10 . The goal of this was to prevent, as much as possible, hard-coding formatting into the web program that could or should be updatable by people other than the tool's maintainer.

If there is a caching failure in this tool, it will be somewhat apparent in the WMF server logs, because the requests would be coming from the Toolserver's web server. My logs show 4367 invocations in the last 24 hours.
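A 12-hour cache of the kind described above can be sketched like this. This is a simplified in-memory version for illustration; the real tool caches in a database, and the `fetch` callback stands in for the actual API call.

```python
import time

CACHE_TTL = 12 * 3600  # 12-hour expiry, as described above

_cache = {}  # title -> (fetched_at, parsed_html)

def get_parsed(title, fetch, now=None):
    """Return cached parse output for `title`, calling `fetch(title)`
    against the API only when the cached copy has expired."""
    now = time.time() if now is None else now
    entry = _cache.get(title)
    if entry is not None and now - entry[0] < CACHE_TTL:
        return entry[1]  # cache hit: no API traffic
    html = fetch(title)  # cache miss: one API call
    _cache[title] = (now, html)
    return html
```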
I use it in [[Wikipedia:RefToolbar 2.0]] to do previews of citation templates. This is one of the most popular gadgets on the English Wikipedia, though I don't know how much this particular feature is used; it is only available in the new toolbar version, so not all users have access to it. Every parse request is manually triggered by the user. Example request:

action: parse
title: wgPageName
prop: text
format: json
text: {{cite web|url=http://www.scu.edu.au/news/media.php?item_id=1023&action=show_item&type=M|title=Birds learn to eat cane toads safely|last=Marchant|first=Gillian|date=26 November 2007|work=Southern Cross University website|publisher=Southern Cross University|accessdate=2009-05-09}}
A third use from the WP 1.0 bot: parsing tables such as [[en:User:WP_1.0_bot/Tables/Project/Libraries]] if people want to see them in the web tool instead of on the wiki.
What about this?

http://commons.wikimedia.org/w/api.php?action=parse&pst&text=%7B%7BMediaWiki%3AImageAnnotatorTexts%7Clive%3D1%7D%7D&title=API&prop=text&uselang=en&maxage=14400&smaxage=14400&format=json

There are a lot of those.
(In reply to comment #2)
> Dispenser noted in #wikimedia-tech that action=parse uses the parser cache, the
> problem that needs disabling should be just misses.

It appears that the problem was caused by squid cache hits, not parser cache misses. The byte hit ratio at sq33:3128 spiked from 18% to 92%.
Created attachment 7695 [details] sq33:3128 hit ratio
Our application is a free iPad application with more than 200,000 users. It was the #1 free app for a week during its launch a couple of months ago, and it has never left the top 10 in the Lifestyle category. As you can see in its description, we're trying to create interesting alternative presentations of Wikipedia content to really make it look great on the iPad: http://itunes.apple.com/us/app/id384224429?mt=8

Discover uses the parse action as a more efficient way to retrieve the contents of pages from Wikipedia. It always uses the "page" argument, retrieving entire pages, which therefore should be in the cache (as indicated at the very end of the parse result by the various timestamps). As far as I understand, the application should be a good citizen toward the Wikipedia servers, and it downloads less data this way.

Please re-enable "action=parse" for entire pages as soon as possible, as Discover is effectively completely broken right now. For reference, here are all the exact parse API requests I could find in the source code:

* action=parse&prop=displaytitle%%7Ctext%%7Ccategories%%7Cexternallinks&page=%@&redirects&format=xml
* action=parse&prop=text%%7Cimages&page=%@&redirects&format=xml
* action=parse&prop=text%%7Cimages&page=%@&redirects&format=xml
* action=parse&prop=text&page=Wikipedia:Featured_articles&redirects&format=xml
* action=parse&prop=images&page=%@&redirects&format=xml
* action=parse&prop=links&page=Wikipedia:Featured_pictures&redirects&format=xml
* action=parse&prop=text&page=Template:In_the_news&redirects&format=xml
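Page-based requests like the ones listed above (with %@ standing for the page name in the app's format strings) can be assembled as follows. This is a hedged sketch: the endpoint is assumed to be English Wikipedia, and the parameter names are taken from the URLs in the comment.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"  # assumed endpoint

def parse_page_url(page, props=("text", "images")):
    """Build a GET URL matching the requests listed above:
    parse a whole page, following redirects."""
    params = {
        "action": "parse",
        "prop": "|".join(props),  # pipe-separated property list
        "page": page,
        "redirects": "",          # valueless flag parameter
        "format": "xml",
    }
    return API_URL + "?" + urlencode(params)

url = parse_page_url("Template:In_the_news", props=("text",))
```

Requesting whole pages this way lets the server answer from its parser cache, which is why the commenter expected these requests to be cheap.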
(In reply to comment #8)
> What about this?
> http://commons.wikimedia.org/w/api.php?action=parse&pst&text=%7B%7BMediaWiki%3AImageAnnotatorTexts%7Clive%3D1%7D%7D&title=API&prop=text&uselang=en&maxage=14400&smaxage=14400&format=json
> There's a lot of those.

[[commons:Help:Gadget-ImageAnnotator]]
[[WP:AWB]] uses action=parse for previews. This tool is used by thousands of Wikimedians. Typical requests are action=parse&prop=headhtml before the first preview and then just ordinary action=parse&prop=text in GET with title=...&text=... in POST for every preview. While previews aren't displayed by default, this feature is extensively used.
Note that other API modules don't return the same info as action=parse, so replacing it is not easy. For instance, links in a page retrieved via action=parse carry an extra attribute (exists="") indicating whether the link target exists. The alternative, action=query&prop=links, doesn't provide this info.
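To illustrate the point above, here is a sketch of reading that attribute from action=parse XML output. The sample document is hand-written to mirror the shape described in the comment (link entries carrying an exists attribute when the target exists); the element names are an assumption, not captured API output.

```python
import xml.etree.ElementTree as ET

# Hand-written sample mirroring the XML shape described above:
# link entries carry an exists="" attribute when the target page exists.
SAMPLE = """<api><parse><links>
  <pl ns="0" exists="">Existing page</pl>
  <pl ns="0">Missing page</pl>
</links></parse></api>"""

def split_links(xml_text):
    """Separate existing (blue) links from missing (red) ones."""
    root = ET.fromstring(xml_text)
    blue, red = [], []
    for pl in root.iter("pl"):
        (blue if "exists" in pl.attrib else red).append(pl.text)
    return blue, red

blue, red = split_links(SAMPLE)
```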
I'm using the "parse" action for a site I'm developing, and I certainly don't want it suddenly disappearing from the API once we go live with it. The site contains radio play lists and when you click on a music track it retrieves videos, images and information about each band. The information comes from our own D/B when we have it, but falls back on Wikipedia when we don't.
Timed text pages ( http://commons.wikimedia.org/wiki/Commons:Timed_Text_Demo_Page?withJS=MediaWiki:MwEmbed.js ) use the API to grab the subtitles for a given video, but they parse by page title, which should use the cache. For good measure I have added maxage=3600 to page parse requests, so they ideally hit the squids instead of the Apaches, but I don't think that has anything to do with the load issues. Most likely the issue is caused by some cache-miss problem in site-wide-enabled features like Image Annotator?
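The maxage trick mentioned above can be sketched as a small helper that decorates any request with cache-control parameters. The parameter names (maxage, smaxage) are the ones appearing in the URLs quoted earlier in this report.

```python
from urllib.parse import urlencode

def with_cache_params(params, seconds=3600):
    """Copy `params`, adding maxage/smaxage so identical GET requests
    can be answered by the squids instead of the Apache backends."""
    out = dict(params)
    out["maxage"] = str(seconds)   # max-age for the client's cache
    out["smaxage"] = str(seconds)  # s-maxage for shared (squid) caches
    return out

query = urlencode(with_cache_params(
    {"action": "parse", "page": "Example", "prop": "text", "format": "json"}
))
```

This only helps when many clients issue byte-identical URLs; per-user variation in the query string defeats shared caching.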
Not sure if related (my guess is yes though), but the iPad and iPhone versions of Wikipanion, likely the most popular Wikipedia app on iOS, are not working anymore either (page just loads endlessly but nothing shows up).
(In reply to comment #3)
> If you're using action=parse then please describe your application and typical
> requests on this bug report.

At fotopedia, we use action=parse in the following contexts:

- Server side, for displayable text retrieval: /w/api.php?action=parse&format=json&prop=text&page=.... These texts are cached on our side for 30 days, so a request only happens when somebody tries to display an article page on fotopedia for the first time in a month. Before the API endpoint was deactivated yesterday, we were only requesting a dozen pages a minute. Right now, this part of fotopedia works fine, as long as users don't wander into unexplored pages.

- Client side, for both article search and displayable text retrieval. These queries are triggered when a user adds a Wikipedia article to a fotopedia page. The typical scenario is a search, followed by a series of: /w/api.php?action=parse&format=xml&prop=text&page=.... We only have a handful of regular users of the client software, so I don't expect this to be a threat to Wikipedia server stability either. On the other hand, the impact on their side is important for us business-wise.
(In reply to comment #16)
> But most likely the issue is caused by some cache miss issue for the site wide
> enabled features like Image Annotator ?

I think that ImageAnnotator is indeed the most likely culprit. As I said in comment #9, we're looking for squid cache hits, which most likely means requests with a maxage parameter. Between 14:20 and 14:29, we logged 71 requests with a maxage parameter in the 1/1000 sampled log. 47 were from ImageAnnotator; the other 24 were for [[MediaWiki:Sitenotice-translation]]. And since the sitenotice requests all went to the same URL, it's unlikely they'd hit the disk of the squids, which is what we saw. None of the logged requests came from a site other than Commons.

Between 13:00 and 14:00, we logged an average of 74 requests per second from ImageAnnotator. 14:00 to 15:00 saw a decline to 60 req/s, presumably because sq31-33 were toast for most of that period.

I think the best thing to do for now is to disable ImageAnnotator pending a performance review. Since certain administrators on Commons like to revert me when I change things there, I will leave action=parse disabled on Commons until a regular Commons administrator removes it from [[MediaWiki:Common.js]].
Thanks for re-enabling the parse method.

(In reply to comment #3)
> If you're using action=parse then please describe your application and typical
> requests on this bug report.

For the record, my application makes requests for single pages, one at a time, from Wikipedia. An example request would be:

/w/api.php?action=parse&page=Art&format=json&prop=text|revid|links|displaytitle&redirects

I need to traverse the page content client-side, so it is essential that the content is parsed first; otherwise I would have to write a reliable parser myself and use the query method instead, which I hope is now not necessary. Thanks.
Hi, my web application is broken as well. I need the parse method to analyze image annotations (description & license) for inclusion in our online architecture database, www.archinform.net. This information is cached on our server and only updated when the images are refreshed (manually initiated), so it shouldn't cause much traffic. Please re-enable the parse method soon. Thanks, Sascha
I'm using action=parse for the featured article feed provided by wmde at <http://feeds.feedburner.com/wikimedia/wp-adt>. The feed should contain the teaser text, as it appears on the main page, as html. That is what I grab using action=parse.
Forgot to say, that a typical request looks like this: http://commons.wikimedia.org/w/api.php?action=parse&format=xml&prop=text&title=TITLE&text=TEXT
Image Annotator disabled: http://commons.wikimedia.org/w/index.php?title=MediaWiki:Common.js&diff=44218237&oldid=44154173 http://commons.wikimedia.org/w/index.php?title=MediaWiki:Gadgets-definition&diff=44218225&oldid=43070611
Cool, works again ;) Thank You!
Reenabled action=parse on Commons. Tim had previously reenabled it on all other wikis, so action=parse is now back across the board. I'll be keeping a close eye on the API Squids throughout the day.
Thanks, much appreciated!
Unfortunately the fix disabled the "Image Annotator" gadget, which is used to add localized descriptions to over 21k files. It would be great if some solution were found to re-enable this great tool.
The localized description is not added by the Image Annotator. It is added by Template:Information: http://commons.wikimedia.org/wiki/Template:Information
(In reply to comment #29)
> The localized description is not added by the Image Annotator. It is added by
> Template:Information:
> http://commons.wikimedia.org/wiki/Template:Information

I should have known better than to use a catch-all word like "localized". I agree that the localized/internationalized descriptions (in the language of the user) provided by the Information, Book or Artwork templates will still be there, but descriptions linked to specific locations in the image ("localized"?) are gone. Those descriptions were used to annotate each face in a group image (replacing "5th head, with hat, in the 6th row" kinds of descriptions), each building in a panorama, or a signature or other inscription in a painting. A lot of effort was put into annotating images to provide more information to the final user. For example, every known person in the famous http://commons.wikimedia.org/wiki/File:Stroop_Report_-_Warsaw_Ghetto_Uprising_06b.jpg was identified.
ImageAnnotator is enabled by default on hu.wikipedia (though only actually used on a handful of images). Is that a problem, or is it OK to use on low-traffic sites?
The discussion at [[Commons:Commons:Administrators'_noticeboard#Stats]] suggests that this isn't related to ImageAnnotator. The current fix for this problem broke a lot of pages on Commons. Please re-examine this problem.
(In reply to comment #31)
> ImageAnnotator is enabled by default on hu.wikipedia (though only actually used
> on a handful of images). Is that a problem, or is it OK to use on low-traffic
> sites?

Yes, it's OK to use it on hu.wikipedia.org for now. The problem was an overload; that's why it happened at the weekly peak time.

(In reply to comment #32)
> The dicussion at [[Commons:Commons:Administrators'_noticeboard#Stats]] suggests
> that this isn't related to ImageAnnotator.

All I see there is one single person (Slomox) doing some wishful thinking.
(In reply to comment #32)
> The dicussion at [[Commons:Commons:Administrators'_noticeboard#Stats]] suggests
> that this isn't related to ImageAnnotator.
>
> The current fix for this problem broke a lot of pages on Commons. Please
> reexamine this problem.

I'm re-resolving this bug as "fixed." This bug was about getting action=parse re-enabled on Wikimedia wikis. The bug summary and comment 0 both make this clear.

It's quite possible that other issues have been exposed subsequent to this bug. In particular, there should probably be a bug about ImageAnnotator being turned into an extension, if one hasn't been filed already. But that doesn't change the resolution of this bug. If there are new issues, file separate bugs. This issue (i.e., action=parse being disabled on Wikimedia wikis), as far as I'm aware, is completely resolved.
Ok, I re-opened it mainly because of the investigation part, but obviously if it's just the "re-enable" part that is important, no problem then.
ImageAnnotator can be re-enabled for now. Further testing indicates that the "byte hit ratio" figure in squid includes error messages, and a lot of them were probably being sent at the time in question. The error counters (server.http.errors and client_http.errors) are apparently broken and never incremented. The issue occurred at peak time; disabling action=parse probably reduced the server load to slightly below peak, bringing demand back under capacity. There are several things we could have disabled that would have had the same effect.