JavaScript's atob() function is designed to decode base64-encoded strings. Users may encounter issues when decoding UTF-8 encoded strings, resulting in ASCII-encoded characters instead of proper UTF-8 representation.
Base64 expects binary data as input, and JavaScript considers strings with characters occupying one byte as binary data. Characters occupying more than one byte in UTF-8 encoded strings, however, trigger exceptions during encoding.
The recommended fix is to encode to and decode binary strings:
Encoding UTF-8 to Binary
function toBinary(string) { const codeUnits = new Uint16Array(string.length); for (let i = 0; i < codeUnits.length; i++) { codeUnits[i] = string.charCodeAt(i); } return btoa(String.fromCharCode(...new Uint8Array(codeUnits.buffer))); }
Decoding Binary to UTF-8
function fromBinary(encoded) { const binary = atob(encoded); const bytes = new Uint8Array(binary.length); for (let i = 0; i < bytes.length; i++) { bytes[i] = binary.charCodeAt(i); } return String.fromCharCode(...new Uint16Array(bytes.buffer)); }
This solution converts the original UTF-8 string to a binary representation, preserving UTF-16 encoding, a native representation in JavaScript.
An alternative solution focused on UTF-8 interoperability is to maintain plaintext base64 strings:
Encoding UTF-8 to Base64
function b64EncodeUnicode(str) { return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g, function toSolidBytes(match, p1) { return String.fromCharCode('0x' + p1); })); }
Decoding Base64 to UTF-8
function b64DecodeUnicode(str) { return decodeURIComponent(atob(str).split('').map(function(c) { return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2); }).join('')); }
This solution efficiently handles UTF-8 encoded strings without altering their representation.
// Encoding UTF-8 ⇢ base64 function b64EncodeUnicode(str) { return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g, function(match, p1) { return String.fromCharCode(parseInt(p1, 16)) })) } // Decoding base64 ⇢ UTF-8 function b64DecodeUnicode(str) { return decodeURIComponent(Array.prototype.map.call(atob(str), function(c) { return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2) }).join('')) }
function utf8_to_b64( str ) { return window.btoa(unescape(encodeURIComponent( str ))); } function b64_to_utf8( str ) { return decodeURIComponent(escape(window.atob( str ))); }
While still functional, this approach is now deprecated in modern browsers.
Haftungsausschluss: Alle bereitgestellten Ressourcen stammen teilweise aus dem Internet. Wenn eine Verletzung Ihres Urheberrechts oder anderer Rechte und Interessen vorliegt, erläutern Sie bitte die detaillierten Gründe und legen Sie einen Nachweis des Urheberrechts oder Ihrer Rechte und Interessen vor und senden Sie ihn dann an die E-Mail-Adresse: [email protected] Wir werden die Angelegenheit so schnell wie möglich für Sie erledigen.
Copyright© 2022 湘ICP备2022001581号-3