Recovering from Double-Encoded UTF8 Characters
Data previously imported using LOAD DATA INFILE was mistakenly assumed to be encoded in Latin1. Consequently, multibyte characters were split into individual bytes and subsequently encoded in UTF8, resulting in double-encoding. This leads to corrupted characters such as "ñ" instead of "ñ".
To rectify these errors, a special MySQL function can be utilized to restore the correct UTF8 strings after double-encoding:
CONVERT(CAST(CONVERT(field USING latin1) AS BINARY) USING utf8)
This function can be integrated into an UPDATE statement to correct the affected fields:
UPDATE tablename SET field = CONVERT(CAST(CONVERT(field USING latin1) AS BINARY) USING utf8);
By executing this statement, the database will replace the double-encoded values with their correct UTF8 counterparts, resolving the character anomalies.
Disclaimer: All resources provided are partly from the Internet. If there is any infringement of your copyright or other rights and interests, please explain the detailed reasons and provide proof of copyright or rights and interests and then send it to the email: [email protected] We will handle it for you as soon as possible.
Copyright© 2022 湘ICP备2022001581号-3