Last modified: 2008-10-06 23:51:44 UTC
Special:Export is supposed to show that a page is locked. This works for the main page on de.wikipedia: http://de.wikipedia.org/wiki/Spezial:Exportieren/Hauptseite has the line <restrictions>edit=sysop:move=sysop</restrictions>, as expected. http://de.wikipedia.org/wiki/Spezial:Exportieren/Adolf_Hitler also works fine, showing this line: <restrictions>edit=autoconfirmed:move=autoconfirmed</restrictions>. But for other locked or semi-locked pages, this doesn't work: http://de.wikipedia.org/wiki/Sekte is fully locked, yet http://de.wikipedia.org/wiki/Spezial:Exportieren/Sekte has no restrictions tag. http://de.wikipedia.org/wiki/Nationalsozialismus is semi-locked, but http://de.wikipedia.org/wiki/Spezial:Exportieren/Nationalsozialismus has no restrictions tag either.
Note that this does not occur only on the German Wikipedia. For example, http://nl.wikipedia.org/wiki/Speciaal:Export/Sjabloon:Atlas also lacks a restrictions tag, although http://nl.wikipedia.org/wiki/Sjabloon:Atlas is fully locked.
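For reference, the value inside the restrictions tag quoted above is a colon-separated list of action=level pairs. A minimal sketch of splitting it up, assuming only that shape (the function name parse_restrictions is made up for illustration, and parts without an "=" are simply skipped here):

```python
def parse_restrictions(text):
    """Split a string like 'edit=sysop:move=sysop' into {'edit': 'sysop', 'move': 'sysop'}.

    Hypothetical helper for illustration; only the action=level form
    seen in the export output above is handled.
    """
    restrictions = {}
    for part in text.split(":"):
        action, sep, level = part.strip().partition("=")
        if not sep or not action:
            # Not an action=level pair; ignored in this sketch.
            continue
        restrictions[action] = level
    return restrictions


if __name__ == "__main__":
    print(parse_restrictions("edit=autoconfirmed:move=autoconfirmed"))
```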
The export tool seems to use the obsolete "page.page_restrictions" field, while the table "page_restrictions" is currently used and updated. It should probably use the table.
Marking as blocking bug 700 (code quality) and increasing severity, since it's a data consistency issue.
For context, here is how I noticed this bug: the PyWikipediaBot framework relies on the restrictions tag; if it is missing, the bot will try to edit locked pages with a non-sysop account, which leads to an error message.
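To illustrate the client side, here is a rough sketch of how a bot might read the restrictions tag from an export document. This is not PyWikipediaBot's actual code; the function names are hypothetical, and the only assumption about the XML is that it contains a restrictions element somewhere, in whatever namespace.

```python
import xml.etree.ElementTree as ET


def find_restrictions(export_xml):
    """Return the text of the first restrictions element, or None if absent."""
    root = ET.fromstring(export_xml)
    for elem in root.iter():
        # Compare the local name only, so any XML namespace is accepted.
        if elem.tag.rsplit("}", 1)[-1] == "restrictions":
            return (elem.text or "").strip()
    return None


def required_edit_level(export_xml):
    """Return the level named for 'edit', or None if none is reported."""
    text = find_restrictions(export_xml)
    if not text:
        return None
    for part in text.split(":"):
        action, sep, level = part.partition("=")
        if sep and action.strip() == "edit":
            return level.strip()
    return None
```

With the bug described here, a return value of None does not prove the page is unprotected, so a cautious client would still need to handle a failed edit.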
The unfortunate thing about this is that it's hard to cleanly join on the page_restrictions table in the bulk export query. Hmmm, maybe a join with a COUNT(*), and then look up the individual items just for the pages that have protection entries? Need to test to ensure that won't bork up the speed of the query.
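The COUNT idea could be sketched roughly like this. It is a toy using an in-memory SQLite database with a heavily simplified schema (only page_id, page_title, pr_page, pr_type and pr_level), not the real export query, and the function names are made up for illustration. It says nothing about how this would perform on a full-size wiki.

```python
import sqlite3


def demo_database():
    """Build a tiny in-memory database with a simplified, assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT);
        CREATE TABLE page_restrictions (pr_page INTEGER, pr_type TEXT, pr_level TEXT);
        INSERT INTO page VALUES (1, 'Hauptseite'), (2, 'Sandbox');
        INSERT INTO page_restrictions VALUES (1, 'edit', 'sysop'), (1, 'move', 'sysop');
        """
    )
    return conn


def export_restrictions(conn):
    """Return (title, restrictions string) per page, with one bulk query
    plus a follow-up lookup only for pages that have protection entries."""
    bulk = conn.execute(
        "SELECT page_id, page_title, COUNT(pr_page) "
        "FROM page LEFT JOIN page_restrictions ON pr_page = page_id "
        "GROUP BY page_id, page_title ORDER BY page_id"
    ).fetchall()
    result = []
    for page_id, title, count in bulk:
        if count == 0:
            # No protection entries: skip the second query entirely.
            result.append((title, ""))
            continue
        rows = conn.execute(
            "SELECT pr_type, pr_level FROM page_restrictions "
            "WHERE pr_page = ? ORDER BY pr_type",
            (page_id,),
        ).fetchall()
        result.append((title, ":".join(f"{t}={lvl}" for t, lvl in rows)))
    return result


if __name__ == "__main__":
    print(export_restrictions(demo_database()))
```

The per-page lookup is a separate query, so a protection change between the two reads could still yield slightly inconsistent output unless both run in the same transaction.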
Created attachment 3326: Attempted quick hack to use page_restrictions table
The query looks ok to me, but what do I know. :) There are a few issues with this, though:
a) Information drawn from the page_restrictions table may be out of sync. A protected page could be deleted, or have its protection levels changed, between the start of the query and the time the row is read, leading to slightly inconsistent output if transactions aren't doing the right thing.
b) New features such as expirations and cascade options are not reported. We should probably think about an expandable schema for protection information and toss that in.
c) Who knows what else might be wrong. ;)
As for the bot editing case, I have to warn that a page that's protected by cascade from another page wouldn't be listed here anyway, so I'm not sure how much we can really do here. Maybe some other approach would be better?
Mass component change: <some> -> Export/Import
Done in r41786