Last modified: 2006-08-15 15:54:49 UTC
The XML Special:Pages export is bundling all revisions instead of just the latest,
whatever the user asks for in the export. This started on Sunday 6 August.
Here is an example in a Wiki which uses Wikipedia pages "On the fly":
http://www.wikinfo.org/wiki.php?title=Doune. The page "Doune" didn't exist in Wikinfo
(at the time of writing, anyway). Wikinfo tries to import it from the XML and this happens.
Here is an example of my own website in a test page:
(see bottom of my page)
*** Bug 6938 has been marked as a duplicate of this bug. ***
See r15959 (Added experimental history paging API, subject to change).
This change has caused problems to everybody using GetWiki 1.0 - a lot of people.
Can this be put back ASAP? My website, for instance, which extracts Wikipedia data in
this way is no longer functional.
I see other sites, such as Wikinfo, similarly affected.
The Special:Export page itself has a checkbox: "Include only the current revision,
not the full history". This should be the default (no history) with a manual
override - i.e., software like GetWiki should not see all revisions - the current
revision only should be the default.
Who is going to fix GetWiki? What an unhelpful comment.
It is this change to MediaWiki which has caused the problem in the first place.
Wait a moment...because a feature was added to MediaWiki, users are outright
complaining AGAIN? We're well within our rights to add stuff to the software.
External sites using our interfaces to take content will need to keep up with
However, this sounds like a duplicate of a more clarified bug report posted
after this one. Finding that and marking this as a duplicate is an exercise left
up to the reader.
I am not "users", I am a person who relies (relied) on GetWiki to import my data. I
thought we were all in this together - I didn't realise it was an "us" and "them".
This avenue is now closed to me and I have to spend the next few days inventing a new
P.S. What was the problem that this upgrade fixed?
As of r16018 (the version I have for my wiki anyways) I am still able to get only
the most current revision of an article by pointing my browser at <Wiki
This is related to (but kind of the opposite of) Bug 9671
arg! sorry, typo: bug 6971
This is not a bug on MediaWiki; it's a bug on GetWiki. Its parsing code is
hopelessly broken, and can only work by chance. See my comment on
[[Wikipedia:Village pump (technical)#XML export format change]] for details.
Workaround (untested): change $wgExportwiki on GetWiki to
instead of http://en.wikipedia.org/wiki/Special:Export/; this should make
MediaWiki return exactly what it was returning before. HOWEVER, this is only a
temporary workaround; if GetWiki is not fixed, it'll probably break again the
next time anything is changed on MediaWiki's Special:Export. It's meant only as
a stopgap fix. You really should dedicate some resources to fixing that code,
before it breaks again.
Thanks very much for your help. Your workaround indeed solves the problem
(temporarily). Hopefully GetWiki will issue a version 2.0. Alternatively I will try
to become a PHP programmer in the interim!
In the past Wikipedia recommended that webmasters not download the complete HTML page
but use Special:Export instead, so as not to burden the servers too much. Now all
the information about those directives is gone and we are left with a broken Special:Export.
This is not a GetWiki bug. This is MediaWiki/Wikipedia creating a mess.
No, this is MediaWiki evolving to support more complicated access to the export
interface. If there is a problem keeping up with the interfaces we provide, then
consider other options; paid OAI updates or downloading our XML dumps are two such.
The format has not changed, it is exactly the same as it was. If
something is not working the same today as it did two weeks ago,
please be very specific.
Also please check the *CURRENT STATUS RIGHT NOW* as bugs
introduced earlier in the month were fixed.
Yes, this is a GetWiki bug. Go read what I wrote on the Village pump. The
GetWiki code to read the exported XML is completely wrong, so that even the
smallest change can make it break. MediaWiki didn't change anything on the
format; it's still using the same 0.3 schema, and it's not MediaWiki's fault if
GetWiki cannot follow the schema.
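To illustrate what following the schema could look like for a consumer, here is a
rough Python sketch (not GetWiki's code, which is PHP, and not anything shipped with
MediaWiki). It reads the export with a real XML parser and picks the newest
<revision> of each <page> by its <timestamp>, so it should not matter whether the
export contains one revision or the full history. The function names, the sample
document, and the namespace URI in the sample are assumptions made up for this
example.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop any '{namespace}' prefix so the code is not tied to one schema version."""
    return tag.rsplit("}", 1)[-1]


def child_text(elem, name):
    """Return the text of the first direct child with this local name, or ''."""
    for c in elem:
        if local_name(c.tag) == name:
            return c.text or ""
    return ""


def latest_revisions(xml_text):
    """Map each page title to the text of its most recent revision."""
    root = ET.fromstring(xml_text)
    result = {}
    for page in root.iter():
        if local_name(page.tag) != "page":
            continue
        revisions = [r for r in page if local_name(r.tag) == "revision"]
        if not revisions:
            continue
        # Timestamps in the same ISO 8601 UTC form sort correctly as strings.
        newest = max(revisions, key=lambda r: child_text(r, "timestamp"))
        result[child_text(page, "title")] = child_text(newest, "text")
    return result


# Hypothetical sample input, hand-written for this sketch.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/" version="0.3">
  <page>
    <title>Doune</title>
    <revision>
      <timestamp>2006-08-01T10:00:00Z</timestamp>
      <text>old text</text>
    </revision>
    <revision>
      <timestamp>2006-08-06T12:30:00Z</timestamp>
      <text>current text</text>
    </revision>
  </page>
</mediawiki>"""

if __name__ == "__main__":
    print(latest_revisions(SAMPLE))  # {'Doune': 'current text'}
```

Choosing the revision by timestamp rather than by its position in the file means the
result does not depend on the order in which revisions happen to be exported.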
Fixed in r16069.