Last modified: 2014-04-19 16:34:14 UTC
I am trying to convert the Japanese Wikipedia XML dump "jawiki-latest-pages-meta-history.xml" to an SQL file with mwdumper, but I get a Java exception and the import fails. The message is:
-----------------
Exception in thread "main" java.io.IOException: java.sql.SQLException: Incorrect string value: '\xF0\xA1\x9A\xB4' for column 'rev_user_text' at row 5014
-----------------
The status of MySQL on my PC is as follows:
--------------
mysql Ver 14.14 Distrib 5.1.50, for Win32 (ia32)
Connection id: 4
Current database: wikidb
Current user: root@localhost
SSL: Not in use
Using delimiter: ;
Server version: 5.1.50-community MySQL Community Server (GPL)
Protocol version: 10
Connection: localhost via TCP/IP
Server characterset: utf8
Db characterset: utf8
Client characterset: utf8
Conn. characterset: utf8
TCP port: 3306
Uptime: 52 min 20 sec
Threads: 1 Questions: 1594 Slow queries: 1 Opens: 379 Flush tables: 1 Open tables: 3 Queries per second avg: 0.507
--------------
I don't know much about MySQL or Java, so this may be a simple problem, but I could not solve it myself. Please tell me what is happening and how to fix it.
I have the same issue when trying to import the English Wikipedia dump enwiki-20140402-pages-articles.xml. The exception is: java.io.IOException: java.sql.SQLException: Incorrect string value: '\xF0\x9D\x9E\xB1_\xF0...' for column 'page_title' at row 192. I'm using the mwdumper GUI.
I was able to work around it by creating the database with the following command: "create database wiki default character set binary;"
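For anyone wondering why the binary character set helps: both rejected values start with the byte \xF0, which in UTF-8 opens a 4-byte sequence. MySQL's "utf8" character set stores at most 3 bytes per character, so that is most likely why the insert is refused, while a binary column accepts the bytes as-is. Below is a small Python sketch for checking this against the bytes from the two error messages. It is not part of mwdumper, and the function names are made up for illustration.

```python
def describe_utf8(raw: bytes):
    """Decode UTF-8 bytes and return (code point, encoded length) per character.

    Hypothetical helper for illustration only; not part of mwdumper or MySQL.
    """
    text = raw.decode("utf-8")
    return [(f"U+{ord(ch):04X}", len(ch.encode("utf-8"))) for ch in text]


def needs_four_bytes(raw: bytes) -> bool:
    """True if any character in the UTF-8 data takes 4 bytes to encode."""
    return any(length == 4 for _, length in describe_utf8(raw))


# Byte sequences copied from the two error messages in this thread.
rev_user_text_bytes = b"\xF0\xA1\x9A\xB4"
page_title_bytes = b"\xF0\x9D\x9E\xB1"

for label, raw in [("rev_user_text", rev_user_text_bytes),
                   ("page_title", page_title_bytes)]:
    print(label, describe_utf8(raw), "4-byte char:", needs_four_bytes(raw))
```

Both values decode to code points above U+FFFF, which need 4 bytes in UTF-8. Ordinary Japanese text such as "日本語" uses 3 bytes per character, which would explain why most rows import fine and only a few fail.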