Last modified: 2010-05-15 15:33:23 UTC

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T3639, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 1639 - PostgreSQL database encoding causes problems
Product: MediaWiki
Classification: Unclassified
Component: Installer
Hardware: All Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks: postgres 385
Reported: 2005-03-06 22:05 UTC by Damon Buckwalter
Modified: 2010-05-15 15:33 UTC (History)

Description Damon Buckwalter 2005-03-06 22:05:38 UTC
When installing under PostgreSQL (using directions from ...), if the database encoding is
set to 'UNICODE', some problems are encountered inserting items into the
"objectcache" table.  I believe the problem stems from using the 'text' column
type in PostgreSQL, versus a MEDIUMBLOB in MySQL.  Perhaps 'bytea' should be
used instead?  This does require some special formatting of the input, but is a
more analogous type to BLOBs.
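
The "special formatting" mentioned above can be illustrated with a minimal sketch of PostgreSQL's bytea "escape" text format, in which a backslash is doubled and non-printable octets are written as three-digit octal escapes. The function name to_bytea_escape is hypothetical and is not taken from MediaWiki. The sketch covers only the bytea layer, not the additional quoting needed inside an SQL string literal; in practice a database driver's parameter binding would normally handle both.

```python
def to_bytea_escape(data: bytes) -> str:
    """Render raw bytes in bytea "escape" text form (illustrative sketch)."""
    out = []
    for b in data:
        if b == 0x5C:
            # A literal backslash is written as two backslashes.
            out.append("\\\\")
        elif 0x20 <= b <= 0x7E:
            # Printable ASCII passes through unchanged.
            out.append(chr(b))
        else:
            # Any other octet becomes a backslash plus three octal digits.
            out.append("\\%03o" % b)
    return "".join(out)


if __name__ == "__main__":
    # A NUL byte, a high byte and a backslash all need escaping.
    print(to_bytea_escape(b"a\x00\xff\\"))
```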

Setting the database encoding to LATIN1 also avoids the problem (the 'text' type
then no longer looks for valid UTF-8 strings).
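
A small sketch of why the encoding setting matters, using Python's codecs as a stand-in for the validation PostgreSQL applies to 'text' values (an assumption; the server's own checks are not reproduced here). The sample bytes are made up to represent binary cache data. They are not valid UTF-8, but every byte value is a valid character in Latin-1, so the same data passes once no UTF-8 validation is applied.

```python
def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if data is a valid byte sequence in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical binary payload: 0x9C and 0xFF cannot appear here in valid UTF-8.
blob = bytes([0x78, 0x9C, 0xFF, 0x41])

if __name__ == "__main__":
    print("valid UTF-8:  ", decodes_as(blob, "utf-8"))
    print("valid Latin-1:", decodes_as(blob, "latin-1"))
```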
Comment 1 Brian Herlihy 2005-06-08 06:34:14 UTC
If the data is not UTF-8 encoded, then using the LATIN1 client encoding will make
everything run smoothly.  But the data stored in the database will in fact be
the Unicode translations of the characters you are storing.  This means that
the high bytes (those above 0x7F) are each stored as 2 or 3 bytes in Unicode.  This
wastes space and requires the data to be decoded when fetched and encoded when stored.
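
The space overhead described above can be sketched as follows, assuming LATIN1 means ISO-8859-1 and that the stored form is UTF-8. Under that assumption each byte above 0x7F maps to a two-byte UTF-8 sequence, while ASCII bytes keep their size. The helper name utf8_size_of_latin1 is hypothetical.

```python
def utf8_size_of_latin1(data: bytes) -> int:
    """Size in bytes after treating data as Latin-1 text and storing it as UTF-8."""
    return len(data.decode("latin-1").encode("utf-8"))


ascii_bytes = bytes(range(0x00, 0x80))   # 128 bytes, all below 0x80
high_bytes = bytes(range(0x80, 0x100))   # 128 bytes, all with the high bit set

if __name__ == "__main__":
    print(len(ascii_bytes), "->", utf8_size_of_latin1(ascii_bytes))
    print(len(high_bytes), "->", utf8_size_of_latin1(high_bytes))
```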

You can save yourself a lot of headaches by making the database use LATIN1.
Comment 2 Greg Sabino Mullane 2006-07-17 01:57:11 UTC
This field is currently bytea (and it does cost us some contortions), so I am
closing the bug for now.