Last modified: 2010-05-15 15:59:48 UTC

Bug 12735 - UTF8 user names with PostgreSQL backend cause login errors
Product: MediaWiki
Classification: Unclassified
Component: Database
Hardware: All All
Importance: Normal major
Assigned To: Nobody - You can work on this!
Depends on:
Blocks: postgres
Reported: 2008-01-22 00:31 UTC by Antti Louko
Modified: 2010-05-15 15:59 UTC (History)

A quick patch to fix this. (541 bytes, patch)
2008-01-22 00:31 UTC, Antti Louko

Description Antti Louko 2008-01-22 00:31:16 UTC
Created attachment 4569 [details]
A quick patch to fix this.

When a user whose name contains non-ASCII characters (encoded as multi-byte UTF8 sequences) tries to log in, PostgreSQL complains of illegal characters in the query. For example, the user name "Pertti Höytylä", whose UTF8 encoding in hex is "5065727474692048c3b67974796cc3a4" (16 bytes), causes this error:

A database error has occurred
Query: UPDATE mwuser SET user_touched = 'xxx' WHERE user_id = '1122'
Function: User::invalidateCache
Error: 1 ERROR: invalid byte sequence for encoding "UTF8": 0xc32e
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".

The problem is in includes/Database.php, where the user name is put into the SQL query comment after being truncated to 15 bytes (in its encoded form). Sometimes the cut falls in the middle of a multi-byte UTF8 character, which produces an invalid byte sequence that the PostgreSQL backend rejects. In the example above, byte 15 is 0xc3, the first half of the two-byte "ä". Truncation should be done in a UTF8-aware way.
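
To illustrate what UTF8-aware truncation means here, the following is a minimal sketch in Python. It is not the code in includes/Database.php, nor the attached patch; the function name truncate_utf8 is hypothetical, and the sketch assumes the input is already valid UTF8. It backs the cut point up past any continuation bytes (those of the form 10xxxxxx) so that a character is never split.

```python
def truncate_utf8(data: bytes, max_bytes: int) -> bytes:
    """Cut data to at most max_bytes without splitting a UTF-8 character.

    Hypothetical helper for illustration; assumes data is valid UTF-8.
    """
    if len(data) <= max_bytes:
        return data
    cut = max_bytes
    # data[cut] is the first byte that would be dropped. If it is a
    # continuation byte (10xxxxxx), the cut falls inside a character,
    # so move back until the cut sits on a character boundary.
    while cut > 0 and (data[cut] & 0xC0) == 0x80:
        cut -= 1
    return data[:cut]


name = "Pertti Höytylä".encode("utf-8")  # 16 bytes, as in the report

naive = name[:15]                 # ends with a lone 0xc3 lead byte
safe = truncate_utf8(name, 15)    # drops the whole trailing "ä" instead

print(naive.hex())
print(safe.decode("utf-8"))
```

With the user name from this report, the naive 15-byte slice ends in a dangling 0xc3, while the boundary-aware version returns the 14 bytes that encode "Pertti Höytyl", which decode cleanly.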
Comment 1 Greg Sabino Mullane 2008-02-04 17:34:07 UTC
Made a fix in r30536, please test it out.
