Last modified: 2008-08-19 16:50:53 UTC
Observe, in the "comment" box on the above URL, how a 3-byte UTF-8 character has been truncated. Please consider characters, not bytes, when determining where to truncate. Otherwise you produce invalid byte sequences that are neither ASCII nor valid UTF-8, and browsers show them with the invalid character symbol.
Also, please be sure you don't truncate UTF-8 mid-character when including snippets, which I often see in, e.g., my http://commons.wikimedia.org/wiki/Special:Contributions/Jidanni !
img_description tinyblob NOT NULL. This field is binary and does not store any encoding along with the data. That's why, when the string is truncated to fit in the field (255 bytes), SQL does not check whether doing so breaks the encoding. In my opinion, using a varchar(255), which actually stores the encoding with the field, would solve that problem.
That's incorrect; VARCHAR would have the same issue. Note that we do not use MySQL's utterly broken UTF-8 support as it does not actually support UTF-8, but only a limited subset of UTF-8. As a result, we use binary fields for data safety. As with the related bugs, the correct fix is to apply UTF-8-safe truncation on input data that's destined for short fields.
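For illustration, here is a minimal sketch of what UTF-8-safe truncation to a byte limit could look like. It is not the actual MediaWiki fix; the function name truncate_utf8 and the 255-byte default are assumptions chosen to match the tinyblob limit mentioned above. It backs the cut point up past any continuation bytes (those of the form 10xxxxxx) so the result always ends on a character boundary.

```python
def truncate_utf8(text: str, max_bytes: int = 255) -> bytes:
    """Encode text as UTF-8 and cut it to at most max_bytes
    without splitting a multi-byte character.

    Hypothetical helper for illustration only.
    """
    data = text.encode("utf-8")
    if len(data) <= max_bytes:
        return data

    cut = max_bytes
    # data[cut] is the first byte that would be dropped. If it is a
    # continuation byte (0b10xxxxxx), the cut falls inside a character,
    # so move the cut back until it sits on a lead byte or ASCII byte.
    while cut > 0 and (data[cut] & 0xC0) == 0x80:
        cut -= 1
    return data[:cut]


if __name__ == "__main__":
    comment = "a" * 254 + "日"  # 254 ASCII bytes plus one 3-byte character
    naive = comment.encode("utf-8")[:255]  # cuts the last character in half
    safe = truncate_utf8(comment)
    print(len(naive), len(safe))
    print(safe.decode("utf-8") == "a" * 254)
```

Note that this only protects code point boundaries. It could still separate a base character from a following combining mark, which is valid UTF-8 but may change how the text is displayed.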
Isn't this a duplicate of bug 332?
I guess so. *** This bug has been marked as a duplicate of bug 332 ***