Last modified: 2014-11-09 16:07:59 UTC
Escaping of the gzip'd text is broken when running compressOld.php. Apostrophes in the compressed data are not escaped properly, so the generated query fails with a syntax error.
Can you give the exact error?
# php maintenance/storage/compressOld.php
Compressing database wikidb
----------------------------------------------------------------------------
Starting from 0 of 8301
1 Main_Page .xxx/
2 Help:Assigning_permissions
[...]
150 MediaWiki:Common.js/watchlist.js ..PHP Warning: pg_query(): Query failed: ERROR: syntax error at or near "NPwF2"
LINE 1: ...N!Ć{J©ZRMUtZvkyп6}=b̛}vt&89\t]\q.t;''z'NPwF23?=...
^ in /var/www/com/w/includes/db/DatabasePostgres.php on line 607
Warning: pg_query(): Query failed: ERROR: syntax error at or near "NPwF2"
LINE 1: ...N!Ć{J©ZRMUtZvkyп6}=b̛}vt&89\t]\q.t;''z'NPwF23?=...
^ in /var/www/com/w/includes/db/DatabasePostgres.php on line 607
A database error has occurred
Query: UPDATE pagecontent SET old_text = 'O:27:"ConcatenatedGzipHistoryBlob":4:{s:8:"mVersion";i:0;s:11:"mCompressed";b:1;s:6:"mItems";s:1001:"▒V▒o▒H▒▒_1▒N▒!Ć▒▒▒▒{J©▒ZR▒MU▒tZv▒▒▒ky▒▒п▒▒6▒▒▒}=▒▒b▒̛}▒vt▒&89\t▒]▒\▒▒▒▒q▒▒▒▒.▒t▒;▒▒''▒z▒▒'▒NP▒▒wF▒2▒3▒▒?▒=▒▒3▒▒▒▒Z▒o6▒▒$▒Ϭ▒ʀ▒0g▒Oci▒Q▒▒▒_^5▒▒',old_flags = 'object,utf-8' WHERE old_id = '151'
Function: Database::update
Error: 1 ERROR: syntax error at or near "NPwF2"
LINE 1: ...N!Ć{J©ZRMUtZvkyп6}=b̛}vt&89\t]\q.t;''z'NPwF23?=...
^
Occurs with r107680 and PostgreSQL 8.4.4 as well:
| [tim@passepartout /var/www/html/w/maintenance]$ php storage/compressOld.php
| Compressing database tim
| ----------------------------------------------------------------------------
| Starting from 0 of 68
| 1 Main_Page
| [...]
| 10 Template:Mapsources .....PHP Warning: pg_query(): Query failed: ERROR: invalid byte sequence for encoding "UTF8": 0xecbd5b
| HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding". in /var/www/html/w/includes/db/DatabasePostgres.php on line 254
| A database error has occurred. Did you forget to run maintenance/update.php after upgrading? See: https://www.mediawiki.org/wiki/Manual:Upgrading#Run_the_update_script
| Query: UPDATE "timipedia"."pagecontent" SET old_text = 'O:27:"ConcatenatedGzipHistoryBlob":4:{s:8:"mVersion";i:0;s:11:"mCompressed";b:1;s:6:"mItems";s:73995:"��[s� �c����x���j5u���n#���
| X�2�օ*Q������/��p6N��y�s6bw6���?���?�L��X7�%Y�ձ;�����D"��L$2i}����Wj�GkՕ�^mu��N�V�Uk�:�N��Q�=���������������*����1���%g�Gzdlyd��q-B�]B�Am[3��TFtF/▒�3�i��$��U�s�2�.ՇĆ�����shGC���h&wOu�5�_��Գ����7g��HT�r��I]�K�}W,���vAT�r�<r-U�t�s�#�ݱΔG=�t�\�b�����G?}���c��.m����Λ�����n����İ�dC���.@Ƃ:T��3�u]��1�uk��`GN��C�s���S�ڗ�[��%���ן�h�',old_flags = 'object,utf-8' WHERE old_id = '11'
| Function: DatabaseBase::update
| Error: 1 ERROR: invalid byte sequence for encoding "UTF8": 0xecbd5b
| HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
| [tim@passepartout /var/www/html/w/maintenance]$
The "culprit" in the background is "pagecontent.old_text", which has the type "text". I don't see how binary data can be stored there without escaping:
| tim=# CREATE TEMPORARY TABLE tmpTest (t TEXT);
| CREATE TABLE
| tim=# INSERT INTO tmpTest (t) VALUES (E'\0');
| ERROR: invalid byte sequence for encoding "UTF8": 0x00
| TIP: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
| tim=# INSERT INTO tmpTest (t) VALUES (encode(E'\\000'::BYTEA, 'escape'));
| INSERT 0 1
| tim=#
If it were just for compressing old pages, I'd suggest leaving this problem to PostgreSQL, which is much better at that and doesn't bother the user. But this can also occur with serialized objects in compressWithConcat(). Instead of trying to mimic bad habits here, I would refer PostgreSQL users to External Storage and disable compressOld.php for PostgreSQL databases without External Storage, so that it doesn't try to store binary data in text columns.
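To illustrate why compressed data does not fit into a text column, here is a minimal Python sketch. It is not MediaWiki code: zlib.compress() stands in for PHP's gzdeflate(), the helper names are hypothetical, and base64 is only one possible text-safe encoding, not something compressOld.php does today.

```python
import base64
import zlib


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def to_text_safe(data: bytes) -> str:
    """Hypothetical helper: encode arbitrary bytes as plain ASCII text."""
    return base64.b64encode(data).decode("ascii")


def from_text_safe(text: str) -> bytes:
    """Hypothetical helper: reverse to_text_safe()."""
    return base64.b64decode(text)


# Stand-in for a page revision; the apostrophes are deliberate.
revision = ("It's a wiki page with 'quoted' words. " * 20).encode("utf-8")

# Assumption: zlib.compress() plays the role of PHP's gzdeflate().
compressed = zlib.compress(revision)

# The compressed stream is arbitrary binary data, not UTF-8 text.
print("valid UTF-8 after compression:", is_valid_utf8(compressed))

# A text-safe encoding contains no apostrophes or NUL bytes and
# round-trips back to the original revision.
encoded = to_text_safe(compressed)
print("contains apostrophe:", "'" in encoded)
print("round trip ok:", zlib.decompress(from_text_safe(encoded)) == revision)
```

The encoded form is roughly a third larger than the raw compressed bytes, which is part of why a binary column type or External Storage may be the better fit.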