PHP json_encode Function Converts UTF-8 Strings to Hexadecimal Entities: Why and How to Resolve It
The PHP json_encode function encodes PHP values as JSON (JavaScript Object Notation). By default, however, it writes every non-ASCII character in a UTF-8 string as a \uXXXX escape sequence (sometimes loosely called a hexadecimal entity). JSON itself handles UTF-8 without any problem; the escaping is simply json_encode's default behavior, and it keeps the output pure ASCII.
Why does PHP Convert UTF-8 Strings to Hexadecimal Entities?
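The JSON specification allows non-ASCII characters to appear in a string either literally or as a \uXXXX escape, where XXXX is the hexadecimal value of a UTF-16 code unit (characters outside the Basic Multilingual Plane are written as a surrogate pair). The specification does not require escaping. PHP's json_encode escapes all non-ASCII characters by default, which keeps the output safe for older applications and devices that may not handle UTF-8 correctly.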
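As a quick check on what these escape sequences mean to a consumer, the sketch below decodes an escaped JSON string and a literal UTF-8 one and compares the results. The two JSON documents are hand-written sample data and the variable names are illustrative only.

```php
<?php
// Two hand-written JSON documents for the same text (sample data).
// Single quotes stop PHP from interpreting the backslashes itself,
// so json_decode receives the \uXXXX sequences unchanged.
$escapedJson = '"\u0411\u0430\u0437\u0430"';
$literalJson = '"База"';

$fromEscaped = json_decode($escapedJson);
$fromLiteral = json_decode($literalJson);

// Both forms decode to the same PHP string.
var_dump($fromEscaped === $fromLiteral); // bool(true)
```

Because a conforming parser treats the two forms identically, the choice between them mainly affects readability and output size, not the decoded data.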
Resolving the Conversion Issue
To keep UTF-8 strings unescaped, use the JSON_UNESCAPED_UNICODE option, available since PHP 5.4.0. This option instructs json_encode to emit multi-byte Unicode characters literally, as UTF-8, instead of as \uXXXX escape sequences.
Example
Suppose you have the following PHP script:
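A minimal sketch of such a script, assuming the message is a plain string passed to json_encode with no options (the variable name $message is illustrative):

```php
<?php
// Assumed input: a Bulgarian error message stored as a UTF-8 string.
$message = "База данни грешка.";

// No flags: json_encode escapes every non-ASCII character as \uXXXX.
echo json_encode($message);
```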
This code will output the following JSON string, where the Cyrillic characters are encoded as \uXXXX escape sequences:
"\u0411\u0430\u0437\u0430 \u0434\u0430\u043d\u043d\u0438 \u0433\u0440\u0435\u0448\u043a\u0430."
To output the UTF-8 characters directly, you can use the JSON_UNESCAPED_UNICODE option:
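A sketch of the same call with the flag passed as the second argument (again, $message is an illustrative name):

```php
<?php
// Assumed input: the same UTF-8 string as before.
$message = "База данни грешка.";

// JSON_UNESCAPED_UNICODE (PHP 5.4.0+) keeps multi-byte characters as UTF-8.
echo json_encode($message, JSON_UNESCAPED_UNICODE);
```

The second argument is a bitmask, so this flag can be combined with others using the bitwise OR operator, for example JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES.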
This will produce the following JSON string, where the Cyrillic characters appear literally as UTF-8:
"База данни грешка."