Last modified: 2014-04-12 16:04:30 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T25403, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 23403 - Not importing all templates and articles
Status: NEW
Product: Utilities
Classification: Unclassified
Component: mwdumper
Version: unspecified
Hardware: PC Linux
Importance: Normal major
Target Milestone: ---
Assigned To: Brion Vibber
Depends on:
Blocks:
Reported: 2010-05-04 15:27 UTC by federico.mora
Modified: 2014-04-12 16:04 UTC (History)
CC List: 4 users

Description federico.mora 2010-05-04 15:27:55 UTC
Installed:

Software:
  MediaWiki: 1.15.3
  PHP: 5.2.6-1+lenny8 (apache2handler)
  MySQL: 5.0.51a-24+lenny3

Extensions installed:
  ExpandTemplates
  Cite
  ParserFunctions
  TidyTab

Downloaded http://download.wikimedia.org/enwiki/20100312/enwiki-20100312-pages-articles.xml.bz2

Used mwdumper.jar to import the Wikipedia dump.

Bug:

Some articles and templates that are present in enwiki-20100312-pages-articles.xml are missing from the wiki after the import.

I also tried using Special:Export on Wikipedia and Special:Import on my wiki, and the wiki didn't seem to accept them.

Test article:
  Bill_Clinton is missing its infoboxes.
Comment 1 federico.mora 2010-05-04 16:06:01 UTC
mwdumper information:

URL: http://svn.wikimedia.org/svnroot/mediawiki/trunk/mwdumper
Repository Root: http://svn.wikimedia.org/svnroot/mediawiki
Repository UUID: dd0e9695-b195-4be7-bd10-2dea1a65a6b6
Revision: 65562
Node Kind: directory
Schedule: normal
Last Changed Author: rainman
Last Changed Rev: 59325
Last Changed Date: 2009-11-21 20:21:03 -0500 (Sat, 21 Nov 2009)


Special:Import seems to be working now.
Comment 2 Q. Alex Zhao 2010-08-05 01:08:43 UTC
I'm having the same problem, with the latest mwdumper, MediaWiki 1.16.0, MySQL 5.0.45, and the 2010-06-22 enwiki dump. The page I'm missing is JavaServer_Faces, and I only have about 173K rows in the text table. That's way lower than the 3.3 million articles that Wikipedia claims.
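
One way to narrow this down is to check whether a missing page is actually in the dump file, which separates "not in the dump" from "lost during import". The following is a minimal Python sketch, not part of mwdumper; the function names, the `SAMPLE` data and the "Template:Example" title are made up for illustration. It assumes the usual `<page><title>` layout of the export XML, where titles use spaces rather than underscores (so JavaServer_Faces appears as "JavaServer Faces").

```python
import io
import xml.etree.ElementTree as ET

# Tiny made-up stand-in for a pages-articles dump (illustration only).
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.4/">
  <page><title>Bill Clinton</title></page>
  <page><title>Template:Example</title></page>
</mediawiki>"""


def local_name(tag):
    """Strip the XML namespace: '{uri}title' -> 'title'."""
    return tag.rsplit("}", 1)[-1]


def find_titles(source, wanted):
    """Return the subset of `wanted` titles that occur as <page><title>."""
    wanted = set(wanted)
    found = set()
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event == "end" and local_name(elem.tag) == "page":
            for child in elem:
                if local_name(child.tag) == "title" and child.text in wanted:
                    found.add(child.text)
            root.clear()  # drop finished pages so memory stays bounded
    return found


wanted = {"Bill Clinton", "JavaServer Faces"}
found = find_titles(io.BytesIO(SAMPLE), wanted)
print("in dump:", sorted(found))
print("not in dump:", sorted(wanted - found))
```

For a real dump, `source` could be the file object returned by `bz2.open(path, "rb")` on the downloaded .xml.bz2 file, so the dump does not need to be decompressed first.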
Comment 3 piotr.jagielski 2014-04-12 16:04:30 UTC
I just tried to import enwiki-20140304-pages-articles.xml using mwdumper. It ran successfully, with the last line saying "14 313 024 pages (4 172,545/sec), 14 313 024 revs (4 172,545/sec)". However, a lot of articles were missing. When I checked the row counts in the page and text tables, they were only 2002000. I ran it twice with the same result. The MD5 sum of the dump file was correct. The exact command I used was:

"java -Xmx512m -Xms128m -XX:NewSize=32m -XX:MaxNewSize=64m -XX:SurvivorRatio=6 -XX:+UseParallelGC -XX:GCTimeRatio=9 -XX:AdaptiveSizeDecrementScaleFactor=1 -server -jar mwdumper-1.16.jar --format=sql:1.5 --filter=namespace:NS_MAIN,NS_CATEGORY 2014-03/enwiki-20140304-pages-articles.xml | mysql -u root -p enwiki --default-character-set=utf8"
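
To know how many rows the page table should end up with, the pages in the dump can be counted per namespace and compared with the result of SELECT page_namespace, COUNT(*) FROM page GROUP BY page_namespace. Below is a minimal Python sketch of that count; it is not part of mwdumper, the function names and `SAMPLE` data are made up for illustration, and it assumes each `<page>` carries an `<ns>` element, as in recent export schema versions. With the filter above, only namespaces 0 (main) and 14 (category) should reach the database.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Tiny made-up stand-in for a pages-articles dump (illustration only).
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.8/">
  <page><title>Bill Clinton</title><ns>0</ns></page>
  <page><title>JavaServer Faces</title><ns>0</ns></page>
  <page><title>Template:Example</title><ns>10</ns></page>
  <page><title>Category:Example</title><ns>14</ns></page>
</mediawiki>"""


def local_name(tag):
    """Strip the XML namespace: '{uri}ns' -> 'ns'."""
    return tag.rsplit("}", 1)[-1]


def count_pages_by_namespace(source):
    """Count <page> elements in the dump, keyed by their <ns> text."""
    counts = Counter()
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event == "end" and local_name(elem.tag) == "page":
            ns = next(
                (c.text for c in elem if local_name(c.tag) == "ns"), None
            )
            counts[ns] += 1
            root.clear()  # drop finished pages so memory stays bounded
    return counts


counts = count_pages_by_namespace(io.BytesIO(SAMPLE))
for ns, n in sorted(counts.items(), key=lambda kv: str(kv[0])):
    print(f"namespace {ns}: {n} pages")
```

If the per-namespace totals from the dump are far above the database counts, the pages are being lost somewhere between mwdumper's output and MySQL rather than being absent from the dump; the sketch only measures the gap and does not identify its cause.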
