My connection string has the following properties: useUnicode=true&characterEncoding=utf8&character_set_server=utf8mb4&charset=utf8mb4
I also set these JPA properties:
jpaProperties.put("hibernate.connection.useUnicode", true);
jpaProperties.put("hibernate.connection.characterEncoding", "utf8");
jpaProperties.put("hibernate.connection.CharSet", "utf8mb4");
The DB also supports utf8mb4, since a record I add manually is saved correctly.
I am still getting this error when trying to save an emoji:
Incorrect string value: '\xF0\x9F\x98\x88\xF0\x9F...' for column 'name' at row 1
I'm quite confident that you clearly expressed your intention to use UTF-8 across your whole technology stack... except your data.
Your real issue here is that your data (the original string) is not valid UTF-8 to begin with. You can easily verify this with the following snippet:
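import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

public static boolean isValidUTF8(byte[] input) {
    // A fresh decoder reports malformed input, so decode() throws on invalid bytes.
    CharsetDecoder utf8Decoder = StandardCharsets.UTF_8.newDecoder();
    try {
        utf8Decoder.decode(ByteBuffer.wrap(input));
        return true;
    } catch (CharacterCodingException e) {
        return false;
    }
}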
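As a way to try this kind of check on concrete input, the sketch below feeds it the first four bytes quoted in the error message (F0 9F 98 88) plus a deliberately truncated copy. The class name `Utf8CheckDemo` and the sample arrays are illustrative assumptions, and the rest of the original string is elided in the error message, so this says nothing about the bytes that follow.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; only the four leading bytes come from the error message.
class Utf8CheckDemo {
    // Strict check: a new decoder reports malformed input, so decode() throws.
    static boolean isValidUtf8(byte[] input) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        try {
            decoder.decode(ByteBuffer.wrap(input));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Complete four-byte sequence for U+1F608.
        byte[] emoji = {(byte) 0xF0, (byte) 0x9F, (byte) 0x98, (byte) 0x88};
        // Same sequence with the last byte cut off (assumed example of bad data).
        byte[] truncated = {(byte) 0xF0, (byte) 0x9F, (byte) 0x98};
        System.out.println(isValidUtf8(emoji));
        System.out.println(isValidUtf8(truncated));
    }
}
```

Those four leading bytes are a well-formed UTF-8 encoding of a single code point outside the Basic Multilingual Plane. If the whole string passes a check like this, the data itself is fine and the column or connection character set is the more likely place to look.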
You should instead use utf8mb4 all the way, including your column definition, which should be ... CHARSET=utf8mb4 COLLATE utf8mb4_general_ci or ... CHARACTER SET utf8mb4 COLLATE utf8mb4_bin.
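The reason utf8mb4 matters is that MySQL's plain utf8 character set stores at most three bytes per character, while an emoji takes four bytes in UTF-8. A minimal sketch of how to tell whether a given Java string needs utf8mb4 (the class and method names here are made up for illustration):

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library.
class Utf8mb4Need {
    // True if the text contains a code point above U+FFFF,
    // which UTF-8 encodes as four bytes.
    static boolean needsUtf8mb4(String text) {
        return text.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        // U+1F608, the code point whose bytes appear in the error message.
        String emoji = new String(Character.toChars(0x1F608));
        System.out.println(emoji.getBytes(StandardCharsets.UTF_8).length);
        System.out.println(needsUtf8mb4("name " + emoji));
        System.out.println(needsUtf8mb4("caf\u00e9"));
    }
}
```

Any string for which this returns true will be rejected by a column that is still declared with the three-byte utf8 character set, regardless of the connection settings.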
You also need to pay extra attention to the MySQL Connector version (and its configuration), according to the documentation.
I solved it by upgrading mysql-connector-java to 5.1.49 and adding the following to the connection string:
{connection string}?characterEncoding=UTF-8&useUnicode=true
Reference: https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-charsets.html
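A rough sketch of appending those two parameters to an existing JDBC URL in Java. The helper name `withUnicodeParams` and the host and database in the example URL are placeholders I made up; only the two parameters come from the answer above.

```java
// Hypothetical helper for illustration; not part of Connector/J.
class JdbcUrlSketch {
    // Appends the two parameters, using '&' if the URL already has a query part.
    static String withUnicodeParams(String baseUrl) {
        String separator = baseUrl.contains("?") ? "&" : "?";
        return baseUrl + separator + "characterEncoding=UTF-8&useUnicode=true";
    }

    public static void main(String[] args) {
        // Placeholder host and database name.
        System.out.println(withUnicodeParams("jdbc:mysql://localhost:3306/exampledb"));
        System.out.println(withUnicodeParams("jdbc:mysql://localhost:3306/exampledb?useSSL=false"));
    }
}
```

The resulting string would then be passed to the data source or to `DriverManager.getConnection` as usual.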