Last modified: 2012-05-05 23:59:47 UTC
There is an account 'User:Szymon Świerkosz' on labsconsole wiki, however gerrit shows it as 'Szymon ?wierkosz'. I have provided the URL for an example page.
Adjusting bug summary... I assume this is upstream, but don't really know for sure.
Probably dupe of the other gerrit unicode bug.
This is very likely an upstream problem, but it seems to be specific to user names. For example, in https://gerrit.wikimedia.org/r/4040 , Szymon's name is shown correctly in the "committer" field, but incorrectly in the "reviewer" and "owner" fields.
(In reply to comment #3) > This is very likely an upstream problem, but it seems to be specific to user > names. What about this issue suggests it's an upstream problem?
Well pretty much everything with gerrit is an upstream problem ;-) Like the other unicode bugs, we can probably work around this though.
Here's an interesting one: http://code.google.com/p/gerrit/issues/detail?id=1082 They say UTF-8 won't work with MySQL :/
MediaWiki works absolutely fine with MySQL and Unicode. The correct phrasing would be: Gerrit does not support Unicode when using MySQL as its backend.
(In reply to comment #7) > MediaWiki works absolutely fine with MySQL and Unicode. > > The correct phasing would be Gerrit does not support Unicode when using MySQL > as backend. Was about to say, this definitely sounds like a Gerrit problem.
(In reply to comment #8) > (In reply to comment #7) > > MediaWiki works absolutely fine with MySQL and Unicode. > > > > The correct phasing would be Gerrit does not support Unicode when using MySQL > > as backend. > > Was about to say, this definitely sounds like a Gerrit problem. As I said upstream, Gerrit claiming this doesn't work is just silly. I've already theorized that we can just change the collations and this will work, but I haven't tested yet. If someone wants to test this theory, we can set you up with access to the gerrit project on labs (which is already running 2.3).
Nope - tested with 2.3-rc0-158-g34ab429 - I have utf8_unicode_ci on all MySQL tables and I get question marks. A bit newer Gerrit deployed on PostgreSQL is fine.
(In reply to comment #10)
> Nope - tested with 2.3-rc0-158-g34ab429 - I have utf8_unicode_ci on all MySQL
> tables and I get question marks.

We've got 2.3 final on gerrit-dev on labs so we can test there. Want me to add you? I'm wondering if making the fields binary like we do in MediaWiki would work... but that's a bigger change than just the collations on the tables.

> A bit newer Gerrit deployed on PostgreSQL is fine.

I really don't see us moving to PG or H2, so we need to find a fix. I *refuse* to believe Gerrit that this is unfixable on MySQL.
Created attachment 10411: Tell gerrit to use UTF-8 with MySQL. My MySQL database is in UTF-8 and it seems that gerrit stores the values properly. The attached patch forces gerrit to use UTF-8 when connecting to MySQL.
^demon, can you try this change in the configuration (assuming we can have tables in UTF-8):

[database]
  type = JDBC
  driver = com.mysql.jdbc.Driver
  url = jdbc:mysql://localhost/reviewdb?characterSetResults=utf8&characterEncoding=utf8&connectionCollation=utf8_unicode_ci
  username = gerrit2

The "database" and "hostname" entries should be removed. "username" should stay.
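To make the three URL options in the proposed configuration easier to read, here is a minimal Python sketch that assembles the same connection string. The option names and values come from the comment above; the helper name `build_jdbc_url` is hypothetical, the per-option remarks reflect my understanding of MySQL Connector/J, and the script only builds a string without contacting MySQL or Gerrit.

```python
from urllib.parse import urlencode


def build_jdbc_url(host, database, options):
    """Assemble a MySQL JDBC URL (hypothetical helper, for illustration only)."""
    return "jdbc:mysql://{}/{}?{}".format(host, database, urlencode(options))


# Options quoted in the comment above. As far as I understand Connector/J:
UTF8_OPTIONS = {
    "characterSetResults": "utf8",             # charset the server uses for result sets
    "characterEncoding": "utf8",               # charset the driver uses for string data it sends
    "connectionCollation": "utf8_unicode_ci",  # collation set for the session
}

url = build_jdbc_url("localhost", "reviewdb", UTF8_OPTIONS)
print(url)
```

The printed value can be pasted into the `url` entry of the `[database]` section.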
*** Bug 35455 has been marked as a duplicate of this bug. ***
I don't think that a dataloss bug should be Low/Normal.
The following tables are definitely affected and need some sort of fix:

  account_external_ids
  accounts
  changes
  patch_comments

These tables aren't currently affected, but could be if we put non-ASCII data into them:

  account_group_names
  account_groups
  approval_categories
  approval_category_values
  change_messages
  tracking_ids
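One way to change the charset and collation of these tables is MySQL's `ALTER TABLE ... CONVERT TO CHARACTER SET`. The sketch below only prints candidate statements for the tables listed above. It is an assumption that this statement is suitable here: the thread does not say which statements were actually run, and converting columns whose bytes are already mis-stored can make things worse, so take a backup first.

```python
# Tables named in the comment above.
AFFECTED = ["account_external_ids", "accounts", "changes", "patch_comments"]
AT_RISK = [
    "account_group_names",
    "account_groups",
    "approval_categories",
    "approval_category_values",
    "change_messages",
    "tracking_ids",
]


def conversion_statements(tables, charset="utf8", collation="utf8_unicode_ci"):
    """Return one ALTER TABLE statement per table (hypothetical helper)."""
    template = "ALTER TABLE `{}` CONVERT TO CHARACTER SET {} COLLATE {};"
    return [template.format(table, charset, collation) for table in tables]


statements = conversion_statements(AFFECTED + AT_RISK)
for statement in statements:
    print(statement)
```

Review the generated SQL by hand before feeding it to the `mysql` client.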
Ok, collation has been updated on all tables, and https://gerrit.wikimedia.org/r/#change,6439 has been submitted to change the connection url.
(In reply to comment #13)
> ^demon, can you try this change in the configuration (assuming we can have
> tables in UTF-8):
>
> [database]
> type = JDBC
> driver = com.mysql.jdbc.Driver
> url = jdbc:mysql://localhost/reviewdb?characterSetResults=utf8&characterEncoding=utf8&connectionCollation=utf8_unicode_ci
> username = gerrit2
>
> "database" and "hostname" entries should be removed. "username" should stay.

Ok, I changed the collation/charset on all the tables, and we updated the connection string. The database is now showing the correct data (yay!), but we're still not getting the right data to the UI. See the owner on https://gerrit.wikimedia.org/r/#change,6388 which is an improvement although still not correct.
Looks "better" now. I would then try connecting to MySQL via JDBC directly to see if the data is okay. You can try https://gerrit-review.googlesource.com/#/c/34670/ to play live with data obtained from the SQL database via Gerrit's ORM, or play with the database directly. You can also try this code: http://thread.gmane.org/gmane.science.linguistics.wikipedia.technical/60187/focus=60206 to check what JDBC really sees. I hope you didn't end up with double-encoded UTF-8 in the database (quite easy to do with MySQL, harder to recover from), so that Ś is stored not as 0xC5 0x9A but as 0xC3 0x85 0xC2 0x9A.
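The double-encoding mentioned above can be reproduced in a few lines of Python. This is a sketch under one assumption: Python's `latin-1` codec stands in for a connection that wrongly treats UTF-8 bytes as a single-byte charset. The helper names `hexbytes` and `repair` are made up for this example.

```python
def hexbytes(data):
    """Format bytes as space-separated 0xNN values."""
    return " ".join("0x{:02X}".format(byte) for byte in data)


def repair(double_encoded):
    """Undo one round of double-encoding (UTF-8 read as latin-1, re-encoded as UTF-8)."""
    return double_encoded.decode("utf-8").encode("latin-1").decode("utf-8")


letter = "Ś"

# Correct storage: the UTF-8 encoding of the character.
correct = letter.encode("utf-8")

# Double-encoding: the UTF-8 bytes are misread as latin-1 text,
# and that text is encoded as UTF-8 a second time.
double = correct.decode("latin-1").encode("utf-8")

print(hexbytes(correct))  # 0xC5 0x9A
print(hexbytes(double))   # 0xC3 0x85 0xC2 0x9A
print(repair(double) == letter)  # True
```

If the second byte pattern shows up in `SELECT HEX(full_name) ...` output, the stored data was encoded twice and needs a repair pass along the lines of `repair`, not just a collation change.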
Some data from my MySQL instance:

$ mysql -u root -p reviewdb --default-character-set=utf8
Enter password:
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1212
Server version: 5.0.92 FreeBSD port: mysql-server-5.0.92

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> \s
--------------
mysql Ver 14.12 Distrib 5.0.92, for portbld-freebsd8.2 (amd64) using 5.2

Connection id:          1212
Current database:       reviewdb
Current user:           root@localhost
SSL:                    Not in use
Current pager:          stdout
Using outfile:          ''
Using delimiter:        ;
Server version:         5.0.92 FreeBSD port: mysql-server-5.0.92
Protocol version:       10
Connection:             Localhost via UNIX socket
Server characterset:    latin1
Db characterset:        utf8
Client characterset:    utf8
Conn. characterset:     utf8
UNIX socket:            /tmp/mysql.sock
Uptime:                 21 days 8 hours 24 min 17 sec

Threads: 6 Questions: 339557 Slow queries: 0 Opens: 85 Flush tables: 1 Open tables: 64 Queries per second avg: 0.184
--------------

mysql> show full columns from accounts;
+----------------------------------------+--------------+-----------------+------+-----+-------------------+-------+---------------------------------+---------+
| Field                                  | Type         | Collation       | Null | Key | Default           | Extra | Privileges                      | Comment |
+----------------------------------------+--------------+-----------------+------+-----+-------------------+-------+---------------------------------+---------+
| registered_on                          | timestamp    | NULL            | NO   |     | CURRENT_TIMESTAMP |       | select,insert,update,references |         |
| full_name                              | varchar(255) | utf8_bin        | YES  | MUL | NULL              |       | select,insert,update,references |         |
| preferred_email                        | varchar(255) | utf8_bin        | YES  | MUL | NULL              |       | select,insert,update,references |         |
| contact_filed_on                       | timestamp    | NULL            | YES  |     | NULL              |       | select,insert,update,references |         |
| maximum_page_size                      | smallint(6)  | NULL            | NO   |     | 0                 |       | select,insert,update,references |         |
| show_site_header                       | char(1)      | utf8_unicode_ci | NO   |     | N                 |       | select,insert,update,references |         |
| use_flash_clipboard                    | char(1)      | utf8_unicode_ci | NO   |     | N                 |       | select,insert,update,references |         |
| download_url                           | varchar(20)  | utf8_bin        | YES  |     | NULL              |       | select,insert,update,references |         |
| download_command                       | varchar(20)  | utf8_bin        | YES  |     | NULL              |       | select,insert,update,references |         |
| copy_self_on_email                     | char(1)      | utf8_unicode_ci | NO   |     | N                 |       | select,insert,update,references |         |
| date_format                            | varchar(10)  | utf8_bin        | YES  |     | NULL              |       | select,insert,update,references |         |
| time_format                            | varchar(10)  | utf8_bin        | YES  |     | NULL              |       | select,insert,update,references |         |
| display_patch_sets_in_reverse_order    | char(1)      | utf8_unicode_ci | NO   |     | N                 |       | select,insert,update,references |         |
| display_person_name_in_review_category | char(1)      | utf8_unicode_ci | NO   |     | N                 |       | select,insert,update,references |         |
| inactive                               | char(1)      | utf8_unicode_ci | NO   |     | N                 |       | select,insert,update,references |         |
| account_id                             | int(11)      | NULL            | NO   | PRI | 0                 |       | select,insert,update,references |         |
+----------------------------------------+--------------+-----------------+------+-----+-------------------+-------+---------------------------------+---------+
16 rows in set (0.02 sec)

mysql> select full_name from accounts where preferred_email like 'saper%' \G
*************************** 1. row ***************************
full_name: Marcin Cieślak
1 row in set (0.00 sec)
Additionally, here's the output of my sane MySQL Gerrit instance via the Gerrit Inspector feature (patching your gerrit with https://gerrit-review.googlesource.com/#/c/34670/ should be mostly harmless :):

(lots of startup messages on Gerrit console)
"jettyserver" is "com.google.gerrit.pgm.http.jetty.JettyServer@1fdac8a5"
"db" is "com.google.gerrit.reviewdb.server.ReviewDb_Schema_GwtOrm$$25@1c8aeedc"

Welcome to the Gerrit Inspector
Enter help() to see the above again, EOF to quit and stop Gerrit
Jython 2.5.2 (Release_2_5_2:7206, Mar 2 2011, 23:12:06)
[OpenJDK 64-Bit Server VM (Sun Microsystems Inc.)] on java1.6.0 running for Gerrit 2.4-rc0-78-g8ed6c15
>>> for z in db.accounts().iterateAllEntities():
...     print z.fullName
...
Marcin Cieślak
Marcin Cieslak (via gmail)
>>>
Um... I am unable to log in to gerrit right now:

  Application Error
  Server Error
  Cannot assign user name
Ok, everything should be squared away now. Usernames[0], cover comments[1] and inline comments[2] are now showing up properly. We also tested IRC, which works. E-mail notifications are working. The only thing left to test is new user creation and login. Then we can mark this fixed.

[0] https://gerrit.wikimedia.org/r/#change,6008
[1] https://gerrit.wikimedia.org/r/#change,3962 (last comment)
[2] https://gerrit.wikimedia.org/r/#patch,sidebyside,3962,4,RELEASE-NOTES-1.20
I can confirm logging in - works.
I've now created a user account via https://labsconsole.wikimedia.org/wiki/Special:CreateAccount for Paweł Sadowski and am waiting for Paweł to confirm that login for Labs & Gerrit works.
I went ahead and made myself a testing account so I can use it in the future. It worked https://gerrit.wikimedia.org/r/#dashboard,240 Marking this FIXED.
As of now, the IRC bot says:

Lastlog:
04:42 < gerrit-wm> New review: Szymon ?wierkosz; "(no comment)" [mediawiki/extensions/ProofreadPage] (master) C: 1; - https://gerrit.wikimedia.org/r/6345
04:49 < gerrit-wm> New review: Szymon ?wierkosz; "(no comment)" [mediawiki/extensions/ProofreadPage] (master) C: 1; - https://gerrit.wikimedia.org/r/6340
13:39 < gerrit-wm> New patchset: Szymon ?wierkosz; "Convert a JS variable for horizontal layout to a preference." [mediawiki/extensions/ProofreadPage] (master) - https://gerrit.wikimedia.org/r/6388
13:39 < gerrit-wm> New patchset: Szymon ?wierkosz; "Bug fixed : the proofreadpage_default_layout='horizontal' option doesn't work because of a change in the html generated by wikieditor." [mediawiki/extensions/ProofreadPage] (master) - https://gerrit.wikimedia.org/r/6003
13:41 < gerrit-wm> New review: Szymon ?wierkosz; "Nothing changed between Patch Set 1 and Patch Set 2. It is one of my another failed attempts at usin..." [mediawiki/extensions/ProofreadPage] (master) C: 1; - https://gerrit.wikimedia.org/r/6003
20:27 < gerrit-wm> New patchset: Szymon ?wierkosz; "Convert a JS variable for horizontal layout to a preference." [mediawiki/extensions/ProofreadPage] (master) - https://gerrit.wikimedia.org/r/6388
20:27 < gerrit-wm> New patchset: Szymon ?wierkosz; "Bug fixed : the proofreadpage_default_layout='horizontal' option doesn't work because of a change in the html generated by wikieditor." [mediawiki/extensions/ProofreadPage] (master) - https://gerrit.wikimedia.org/r/6003
13:08 < gerrit-wm> New review: Szymon ?wierkosz; "(no comment)" [mediawiki/core] (master) C: 0; - https://gerrit.wikimedia.org/r/6596

Fortunately, the HTML output seems fine - but something might have changed (is it because of 2.3)?

Can you have a look at 2.3 database again? Maybe it's just some interface to the IRC bot?
Did a simple test: added a UTF-8 comment to https://gerrit.wikimedia.org/r/#/c/3289/

Results:

$ ssh wikimedia gerrit stream-events
{"type":"comment-added","change":{"project":"test/mediawiki/core","branch":"master","topic":"master","id":"Icdc8f7e26c4cba920eda69a042702b8358797554","number":"3289","subject":"Testing git review...","owner":{"name":"IAlex","email":"ialex.wiki@gmail.com"},"url":"https://gerrit.wikimedia.org/r/3289"},"patchSet":{"number":"1","revision":"e5e3aafbce66df1b0a1094be7aa62c34a617c181","ref":"refs/changes/89/3289/1","uploader":{"name":"IAlex","email":"ialex.wiki@gmail.com"},"createdOn":1332230770},"author":{"name":"saper","email":"saper@saper.info"},"comment":"ąćęłńóśźć comment utf-8"}

But:

20:44 < gerrit-wm> New review: saper; "????????? comment utf-8" [test/mediawiki/core] (master) - https://gerrit.wikimedia.org/r/3289
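Since the stream-events JSON carries the comment intact, the loss appears to happen somewhere between that stream and the IRC channel. The thread does not establish the cause, but the pattern of question marks is what a lossy re-encoding into a narrower charset produces. A minimal Python sketch of that mechanism (the `lossy` helper is hypothetical, and Python merely stands in for whatever code feeds the bot):

```python
def lossy(text, charset):
    """Round-trip text through a charset, replacing unencodable characters with '?'."""
    return text.encode(charset, errors="replace").decode(charset)


comment = "ąćęłńóśźć comment utf-8"

# All nine Polish letters fall outside ASCII, so each becomes '?'.
print(lossy(comment, "ascii"))    # ????????? comment utf-8

# latin-1 does contain 'ó', so a latin-1 bottleneck would let that one letter through.
print(lossy(comment, "latin-1"))  # ?????ó??? comment utf-8

print(lossy("Szymon Świerkosz", "ascii"))  # Szymon ?wierkosz
```

The IRC line above shows nine question marks with no surviving "ó", which is consistent with an ASCII-only step rather than a latin-1 one, though that remains a guess.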
(In reply to comment #27)
> Can you have a look at 2.3 database again? Maybe it's just some interface to
> the IRC bot?

Could this be bug 36487?