Last modified: 2014-07-27 11:38:01 UTC
When using maintenance/importDump.php to import namespace 10 (templates), taken from a Meta dump and run through MWDumper to extract just this namespace, I get two instances of the "Got bogus revision with null title!" message. Tracking back through the XML after adding some extra debug information, the two pages that cause this are Template:UTC+0800 and Template:UTC+0800s. The common theme for both is the "+" sign: they are the only two <title> elements in the XML that contain one. I'm guessing that there's some inconsistency between the way the XML dumps are exported and how they are interpreted when imported via importDump.php, and also that there might be some front-end inconsistency in the edit process that allows titles with a "+" sign to be entered into the database in the first place. This may affect SpecialImport.php too, since it relies on many of the same routines.
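The observation that only two titles contain a plus sign can be checked directly against the dump. What follows is a minimal sketch, not part of the original report: the function name find_plus_titles and the file name templates.xml are hypothetical, and it assumes each <title> element sits on a single line, as in standard MediaWiki exports.

```shell
# Hypothetical helper: list every <title> line in a dump that contains a "+".
# Assumes each <title>...</title> element sits on a single line.
find_plus_titles() {
    # In a basic regular expression, "+" matches a literal plus sign.
    # -n prefixes each match with its line number in the dump.
    grep -n '<title>[^<]*+[^<]*</title>' "$1"
}

# Example (templates.xml is a placeholder file name):
# find_plus_titles templates.xml
```

The line numbers make it easy to jump to the affected <page> elements in an editor.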
+ is a valid character in titles as of MediaWiki 1.6.0. It's not permitted in the default configuration, however. See $wgLegalTitleChars in DefaultSettings.php, and copy across to (and modify in) LocalSettings.php to permit them. This (should|might) solve it. :)
Rob, thanks for the pointer. You are right! But after reading all the comments in DefaultSettings.php just above the $wgLegalTitleChars initialisation, I am not convinced that adding a + sign to it is a good idea. It all seems rather prone to problems if Apache is not configured correctly. Given that the Help:Page Name documentation on Meta doesn't even mention issues with using + signs, I think it's still a bad idea to enable it.

Between these darned plus signs and having to alter my LocalSettings.php to deal with subpages inherited from imported XML dump files from Meta, I think we need to improve the documentation associated with importDump.php. It took me a LONG time to even begin to understand what is going on, and to do some preliminary investigation before posting this bug. If it happened to me, it will probably stump others.

The "bogus" message generated by the importDump.php script doesn't give any clues as to where the error is, short of changing $reportingInterval to 1, which itself is a headache if you're importing a dump with thousands of pages. My solution, to at least identify the specific entries, was to change the output message (very hacker-like) to:

    if (!$title) {
        $this->progress("Got bogus revision with null title! $this->pagecount");
        wfDebug("BOGUS\n");
    }

Then I added a line to my LocalSettings.php to enable a log file for debug messages:

    $wgDebugLogFile = 'import.log';

Finally, I removed the leading # from this line in includes/SpecialImport.php:

    #wfDebug ( "IMPORT: $data\n" );

Then I could search for the word BOGUS in the log file and see what was going on. Nasty, maybe, but it worked!
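Searching the debug log for the BOGUS marker can be sketched as follows. This is an illustration only: the function name show_bogus_context is hypothetical, and it assumes the log is the import.log file named in the comment above, with the "IMPORT:" lines written just before each marker.

```shell
# Hypothetical helper: print each BOGUS marker with the lines logged before it.
# $1 = debug log file, $2 = number of preceding lines to show (default 3).
# Note: -B is supported by GNU and BSD grep but is not part of POSIX.
show_bogus_context() {
    grep -n -B "${2:-3}" 'BOGUS' "$1"
}

# Example:
# show_bogus_context import.log 5
```

How many preceding lines are needed depends on how much the import writes per page, so the second argument may need adjusting.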
The problem also occurs when importing a dump through Special:Import, except that a PHP error is raised (triggered because $this->mTitle is not an object in Article::insertOn()). We are left wondering what was imported and what was not!
If a bogus title is encountered, a more intuitive error message makes sense.
(In reply to Rob Church from comment #4)
> If a bogus title is encountered, a more intuitive error message makes sense.

$ php maintenance/importDump.php ~/documents/temp/invalidtitle
Page "Template:In>valid" was not imported because the name to which it would be imported is invalid on this wiki.
Done!
You might want to run rebuildrecentchanges.php to regenerate RecentChanges

$ cat ~/documents/temp/invalidtitle
<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.8/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.mediawiki.org/xml/export-0.8/ http://www.mediawiki.org/xml/export-0.8.xsd" version="0.8" xml:lang="en">
  <page>
    <title>Template:In>valid</title>
    <ns>10</ns>
    <id>456</id>
    <revision>
      <id>456</id>
      <timestamp>1972-01-01T00:00:00Z</timestamp>
      <contributor>
        <ip>::5</ip>
      </contributor>
      <comment>Created page with 'ha'</comment>
      <text xml:space="preserve" bytes="2">ha</text>
      <sha1>t78jgj4yk5qdbeuu0kfwdxdrxmvfl65</sha1>
      <model>wikitext</model>
      <format>text/x-wiki</format>
    </revision>
  </page>
</mediawiki>