Last modified: 2006-08-15 15:54:49 UTC
The Special:Export XML export is bundling all revisions instead of just the latest, whatever the user asks for in the export. This started on Sunday 6 August. Here is an example in a wiki which uses Wikipedia pages "on the fly": http://www.wikinfo.org/wiki.php?title=Doune. The page "Doune" didn't exist in Wikinfo (at the time of writing, anyway). Wikinfo tries to import it from the XML, and this is the result. Here is an example from a test page on my own website: http://www.globalguide.org/test.html?id=100213 (see the bottom of the page).
*** Bug 6938 has been marked as a duplicate of this bug. ***
See r15959 (Added experimental history paging API, subject to change).
This change has caused problems for everybody using GetWiki 1.0 - a lot of people. Can this be put back ASAP? My website, for instance, which extracts Wikipedia data in this way, is no longer functional. I see other sites, such as Wikinfo, similarly affected.
The Special:Export page itself has a checkbox: "Include only the current revision, not the full history". This should be the default (no history) with a manual override - i.e., software like GetWiki should not see all revisions; it should get the current revision only by default.
Fix GetWiki.
Who is going to fix GetWiki? What an unhelpful comment. It is this change to MediaWiki which has caused the problem in the first place.
Wait a moment...because a feature was added to MediaWiki, users are outright complaining AGAIN? We're well within our rights to add stuff to the software. External sites using our interfaces to take content will need to keep up with the developments. However, this sounds like a duplicate of a more clarified bug report posted after this one. Finding that and marking this as a duplicate is an exercise left up to the reader.
I am not "users", I am a person who relies (relied) on GetWiki to import my data. I thought we were all in this together - I didn't realise it was an "us" and "them". This avenue is now closed to me and I have to spend the next few days inventing a new solution. P.S. What was the problem that this upgrade fixed?
As of r16018 (the version I have for my wiki, anyway) I am still able to get only the most current revision of an article by pointing my browser at <Wiki root>/Special:Export?pages=<Article Name>&curonly=1&action=submit
This is related to (but kind of the opposite of) Bug 9671
arg! sorry, typo: bug 6971
This is not a bug on MediaWiki; it's a bug on GetWiki. Its parsing code is hopelessly broken, and can only work by chance. See my comment on [[Wikipedia:Village pump (technical)#XML export format change]] for details. Marking INVALID.
Workaround (untested): change $wgExportwiki on GetWiki to http://en.wikipedia.org/w/index.php?title=Special:Export&curonly=1&action=submit&pages= instead of http://en.wikipedia.org/wiki/Special:Export/; this should make MediaWiki return exactly what it was returning before. HOWEVER, this is only a temporary workaround; if GetWiki is not fixed, it'll probably break again the next time anything is changed on MediaWiki's Special:Export. It's meant only as a stopgap fix. You really should dedicate some resources to fixing that code, before it breaks again.
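For anyone scripting against the export interface directly, here is a minimal sketch of building the workaround URL above. It is written in Python purely for illustration (GetWiki itself is PHP); the function name export_url and the EXPORT_BASE constant are made up for this example, and only the query parameters (title, curonly, action, pages) come from the workaround.

```python
from urllib.parse import urlencode

# Base URL taken from the workaround above; adjust for other wikis.
EXPORT_BASE = "http://en.wikipedia.org/w/index.php"


def export_url(page_title, base=EXPORT_BASE):
    """Build a Special:Export URL that asks for the current revision only.

    export_url is a hypothetical helper, not part of MediaWiki or GetWiki.
    """
    params = [
        ("title", "Special:Export"),
        ("curonly", "1"),       # current revision only, no history
        ("action", "submit"),
        ("pages", page_title),  # urlencode escapes spaces and other characters
    ]
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(export_url("Doune"))
```

Asking for curonly=1 explicitly, rather than relying on whatever the bare Special:Export/<title> path returns by default, is what keeps the request independent of changes to the default behaviour.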
Cesar, Thanks very much for your help. Your workaround indeed solves the problem (temporarily). Hopefully GetWiki will issue a version 2.0. Alternatively I will try to become a PHP programmer in the interim!
In the past Wikipedia recommended that webmasters not download the complete HTML page but use Special:Export instead, in order not to burden their servers too much. Now all the information about those directives is gone and we are left with a broken Special:Export. This is not a GetWiki bug. This is MediaWiki/Wikipedia creating a mess.
No, this is MediaWiki evolving to support more complicated access to the export interface. If there is a problem keeping up with the interfaces we provide, then consider other options; paid OAI updates or downloading our XML dumps are two such.
The format has not changed, it is exactly the same as it was. If something is not working the same today as it did two weeks ago, please be very specific. Also please check the *CURRENT STATUS RIGHT NOW* as bugs introduced earlier in the month were fixed.
Yes, this is a GetWiki bug. Go read what I wrote on the Village pump. The GetWiki code to read the exported XML is completely wrong, so that even the smallest change can make it break. MediaWiki didn't change anything on the format; it's still using the same 0.3 schema, and it's not MediaWiki's fault if GetWiki cannot follow the schema.
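To illustrate what following the schema could look like, here is a rough sketch (in Python, not GetWiki's actual PHP code) that reads the export with a real XML parser and picks the newest revision of each page, so it gives the same answer whether the export contains one revision or the whole history. The element names used (page, title, revision, timestamp, text) are my reading of the export format and should be checked against the schema; latest_revisions and local are hypothetical helper names.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def latest_revisions(xml_text):
    """Return {page title: text of the newest revision} for an export document.

    Assumes each <page> has a <title> and one or more <revision> elements,
    each with a <timestamp> (ISO 8601, UTC) and a <text>.
    """
    root = ET.fromstring(xml_text)
    result = {}
    for page in root.iter():
        if local(page.tag) != "page":
            continue
        title = None
        best = None  # (timestamp, text) of the newest revision seen so far
        for child in page:
            name = local(child.tag)
            if name == "title":
                title = child.text
            elif name == "revision":
                ts, text = "", ""
                for field in child:
                    fname = local(field.tag)
                    if fname == "timestamp":
                        ts = field.text or ""
                    elif fname == "text":
                        text = field.text or ""
                # ISO 8601 UTC timestamps of equal length sort correctly as strings.
                if best is None or ts > best[0]:
                    best = (ts, text)
        if title is not None and best is not None:
            result[title] = best[1]
    return result
```

Because the newest revision is chosen by timestamp rather than by position in the file, a reader written this way does not depend on how many revisions the export happens to include.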
Fixed in r16069.