How to Decode Base64 Strings in JavaScript While Handling UTF-8 Encoding?

Published on 2024-11-08

Decode Base64 Using JavaScript atob Function: Handling UTF-8

JavaScript's atob() function decodes base64-encoded strings. When the encoded data is UTF-8 text, however, atob() returns one character per byte, so any character outside ASCII comes back as a garbled sequence of several characters instead of the original text.

Challenge: Understanding the Unicode Problem

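Base64 operates on binary data. btoa() and atob() represent that data as a "binary string": a JavaScript string in which every character has a code point between 0 and 255 and stands for one byte. Passing btoa() a string that contains any character above U+00FF throws an InvalidCharacterError, and atob() never reassembles multi-byte UTF-8 sequences into single characters.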
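
As a minimal sketch of both symptoms (the sample strings are illustrative only), the snippet below encodes a Latin-1 string, shows btoa() rejecting a character above U+00FF, and shows atob() returning raw bytes for UTF-8 input:

```javascript
// Characters in the range 0..255 encode without trouble.
const ok = btoa('café'); // 'é' is U+00E9, so this works: 'Y2Fm6Q=='

// A character above U+00FF makes btoa throw.
let threw = false;
try {
  btoa('✓'); // U+2713
} catch (err) {
  threw = true; // InvalidCharacterError
}

// atob returns one character per byte, so UTF-8 text comes back garbled.
// '4pyT' is the base64 form of the UTF-8 bytes E2 9C 93, which spell '✓'.
const garbled = atob('4pyT');

console.log(ok, threw, garbled.length); // garbled has 3 characters, not 1
```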

Solution 1: Binary Interoperability

The recommended fix is to convert the text to a binary string before encoding it, and to reverse that conversion after decoding:

Encoding UTF-8 to Binary

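function toBinary(string) {
  // Copy each UTF-16 code unit of the string into a 16-bit slot.
  const codeUnits = new Uint16Array(string.length);
  for (let i = 0; i < codeUnits.length; i++) {
    codeUnits[i] = string.charCodeAt(i);
  }
  // Reinterpret the same buffer as bytes, so every character is in 0..255.
  // Note: spreading a very large array can exceed the engine's argument limit.
  return btoa(String.fromCharCode(...new Uint8Array(codeUnits.buffer)));
}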

Decoding Binary to UTF-8

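function fromBinary(encoded) {
  // atob yields one character per byte; copy those bytes into a buffer.
  const binary = atob(encoded);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  // Reinterpret each pair of bytes as one UTF-16 code unit.
  return String.fromCharCode(...new Uint16Array(bytes.buffer));
}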

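This solution base64-encodes the string's UTF-16 code units, the native string representation in JavaScript, so the round trip is lossless. The encoded payload is UTF-16 rather than UTF-8, however, so it will not decode correctly in systems that expect base64-encoded UTF-8.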
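
A short round-trip sketch may make this concrete. The two functions are repeated so the snippet runs on its own; the sample strings are illustrative only:

```javascript
// toBinary and fromBinary as in Solution 1, repeated for self-containment.
function toBinary(string) {
  const codeUnits = new Uint16Array(string.length);
  for (let i = 0; i < codeUnits.length; i++) {
    codeUnits[i] = string.charCodeAt(i);
  }
  return btoa(String.fromCharCode(...new Uint8Array(codeUnits.buffer)));
}

function fromBinary(encoded) {
  const binary = atob(encoded);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return String.fromCharCode(...new Uint16Array(bytes.buffer));
}

const original = 'hi ✓';
const encoded = toBinary(original);
const roundTrip = fromBinary(encoded);
console.log(roundTrip === original); // true

// The payload holds two bytes per UTF-16 code unit, so it differs from what
// a UTF-8 consumer expects: on a little-endian machine toBinary('hi') is
// 'aABpAA==', whereas plain btoa('hi') is 'aGk='.
console.log(toBinary('hi'), btoa('hi'));
```

The byte order of the payload follows the platform's endianness, which is one more reason to treat this format as private to JavaScript code that both encodes and decodes it.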

Solution 2: ASCII Base64 Interoperability

An alternative, better suited to interoperability, is to base64-encode the string's UTF-8 bytes, so that the result is the same plain base64 text other systems produce and expect for UTF-8 data:

Encoding UTF-8 to Base64

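function b64EncodeUnicode(str) {
  // encodeURIComponent yields percent-encoded UTF-8; turn each %XX escape
  // back into the single character with that byte value, then base64 it.
  return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g,
    function toSolidBytes(match, p1) {
      return String.fromCharCode(parseInt(p1, 16));
  }));
}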

Decoding Base64 to UTF-8

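function b64DecodeUnicode(str) {
  // Percent-encode every byte returned by atob, then let decodeURIComponent
  // reassemble the UTF-8 sequences into characters.
  return decodeURIComponent(atob(str).split('').map(function(c) {
    return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2);
  }).join(''));
}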

Because the base64 payload is plain UTF-8, it can be exchanged with other languages and tools unchanged. Note that b64DecodeUnicode throws a URIError if the decoded bytes are not valid UTF-8.
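
The decoder throws when its input is not valid base64 or does not contain valid UTF-8. The sketch below wraps it in a hypothetical helper, tryB64DecodeUnicode (not part of the original solution), that returns null instead. The decoder is repeated so the snippet runs on its own:

```javascript
// Decoder from Solution 2, repeated for self-containment.
function b64DecodeUnicode(str) {
  return decodeURIComponent(atob(str).split('').map(function (c) {
    return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2);
  }).join(''));
}

// Hypothetical helper: returns null instead of throwing on bad input.
function tryB64DecodeUnicode(str) {
  try {
    return b64DecodeUnicode(str);
  } catch (err) {
    // Either atob rejected the base64, or decodeURIComponent raised a
    // URIError because the bytes are not valid UTF-8.
    return null;
  }
}

const good = tryB64DecodeUnicode('4pyTIMOgIGxhIG1vZGU='); // '✓ à la mode'
const bad = tryB64DecodeUnicode('6Q=='); // lone byte 0xE9 is not valid UTF-8
console.log(good, bad);
```

Whether to swallow the error like this or let it propagate depends on the caller; returning null is only one possible convention.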

TypeScript Support

// Encoding UTF-8 ⇢ base64

function b64EncodeUnicode(str: string): string {
    return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g, function(match: string, p1: string) {
        return String.fromCharCode(parseInt(p1, 16))
    }))
}

// Decoding base64 ⇢ UTF-8

function b64DecodeUnicode(str: string): string {
    return decodeURIComponent(atob(str).split('').map(function(c: string) {
        return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2)
    }).join(''))
}

Historical Solution (Deprecated)

function utf8_to_b64( str ) {
  return window.btoa(unescape(encodeURIComponent( str )));
}

function b64_to_utf8( str ) {
  return decodeURIComponent(escape(window.atob( str )));
}

This approach still works, but escape() and unescape() are deprecated legacy functions and should be avoided in new code.
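
One way to get the same UTF-8 payload without escape() and unescape() is the standard TextEncoder and TextDecoder APIs, which are available in current browsers and Node.js. This is a different technique from the ones above, shown as a sketch; the function names utf8ToBase64 and base64ToUtf8 are hypothetical:

```javascript
// Hypothetical names; TextEncoder and TextDecoder are standard globals.
function utf8ToBase64(str) {
  const bytes = new TextEncoder().encode(str); // string -> UTF-8 bytes
  let binary = '';
  for (const byte of bytes) {
    binary += String.fromCharCode(byte); // one character per byte for btoa
  }
  return btoa(binary);
}

function base64ToUtf8(b64) {
  const binary = atob(b64); // one character per byte
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes); // UTF-8 bytes -> string
}

const sample = '✓ à la mode';
const b64 = utf8ToBase64(sample);
console.log(b64); // '4pyTIMOgIGxhIG1vZGU='
console.log(base64ToUtf8(b64) === sample); // true
```

By default TextDecoder replaces invalid byte sequences with U+FFFD rather than throwing; constructing it as new TextDecoder('utf-8', { fatal: true }) makes it throw a TypeError instead.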
