JavaScript's atob() function is designed to decode base64-encoded strings. Users may encounter issues when decoding UTF-8 encoded strings, resulting in ASCII-encoded characters instead of proper UTF-8 representation.
Base64 expects binary data as input, and JavaScript considers strings with characters occupying one byte as binary data. Characters occupying more than one byte in UTF-8 encoded strings, however, trigger exceptions during encoding.
The recommended fix is to encode to and decode binary strings:
Encoding UTF-8 to Binary
function toBinary(string) { const codeUnits = new Uint16Array(string.length); for (let i = 0; iDecoding Binary to UTF-8
function fromBinary(encoded) { const binary = atob(encoded); const bytes = new Uint8Array(binary.length); for (let i = 0; iThis solution converts the original UTF-8 string to a binary representation, preserving UTF-16 encoding, a native representation in JavaScript.
Solution 2: ASCII Base64 Interoperability
An alternative solution focused on UTF-8 interoperability is to maintain plaintext base64 strings:
Encoding UTF-8 to Base64
function b64EncodeUnicode(str) { return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g, function toSolidBytes(match, p1) { return String.fromCharCode('0x' p1); })); }Decoding Base64 to UTF-8
function b64DecodeUnicode(str) { return decodeURIComponent(atob(str).split('').map(function(c) { return '%' ('00' c.charCodeAt(0).toString(16)).slice(-2); }).join('')); }This solution efficiently handles UTF-8 encoded strings without altering their representation.
TypeScript Support
// Encoding UTF-8 ⇢ base64 function b64EncodeUnicode(str) { return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g, function(match, p1) { return String.fromCharCode(parseInt(p1, 16)) })) } // Decoding base64 ⇢ UTF-8 function b64DecodeUnicode(str) { return decodeURIComponent(Array.prototype.map.call(atob(str), function(c) { return '%' ('00' c.charCodeAt(0).toString(16)).slice(-2) }).join('')) }Historical Solution (Deprecated)
function utf8_to_b64( str ) { return window.btoa(unescape(encodeURIComponent( str ))); } function b64_to_utf8( str ) { return decodeURIComponent(escape(window.atob( str ))); }While still functional, this approach is now deprecated in modern browsers.
Disclaimer: All resources provided are partly from the Internet. If there is any infringement of your copyright or other rights and interests, please explain the detailed reasons and provide proof of copyright or rights and interests and then send it to the email: [email protected] We will handle it for you as soon as possible.
Copyright© 2022 湘ICP备2022001581号-3