Base64

Base64

·

4 min read

Base64

Base64 is an algorithm to encode binary data by only 64 characters(plus = as a placeholder). Let's first see how the algorithm works and them implement it from scratch.

Say we want to encode string ab into base64 string.

First, let encode this string into UTF8 encoding binary data first.

ab 
=> [97, 98]

Then we express it in binary format.

[97, 98]
=> [01100001, 01100010]

Then we split bits, each 6 bits into one group. Note that here, if the last group donesn't has enough 6 bits, just append 0's at the end.

[01100001, 01100010]
=> [011000, 010110, 001000]

Then we make every element as a valid uint8 value by padding 2 zeros from element's top.

[011000, 010110, 001000]
=> [00011000, 00010110, 00001000]

Let's see it now in 10 base format.

[00011000, 00010110, 00001000]
=> [24, 22, 8]

Next, we map each value with a table to get the corresponding characters. The table is as below.

// used for encode
const table = [
    'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
    'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z',

    'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
    'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',

    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',

    '+', '/'
];

// used for decode
const reverseTable = table.reduce((pre: Record<string, number>, cur: string, i: number) => {
    pre[cur] = i;
    return pre;
}, {});
// value as table's index
[24, 22, 8]
=> [Y, W, I]

At last, we append = at the end according to the value of original binary array's length % 3. If the value is 2, then append =. If the value is 1, then append ==.

That's all of it. Simple.

Code

Let's implement the process in code.

function encode(text: string): string {
    const before = new TextEncoder().encode(text);
    const after = new Uint8Array(Math.ceil(before.length * 8 / 6));

    let i = 0; // index of before
    let j = 7; // index of current bit(all bit)

    let x = 0; // index of after
    let y = 5; // index of current bit(right 6 bit)

    // bit index
    // 76543210 
    // 00000000

    while (true) {
        // current j, need to be y
        const diff = j - y;
        let v = (before[i] & (1 << j));
        if (diff >= 0) {
            v >>= diff;
        } else {
            v <<= -1 * diff;
        }
        after[x] |= v;

        if (j > 0) {
            j -= 1;
        } else {
            if (i < before.length - 1) {
                i += 1;
                j = 7;
            } else {
                break;
            }
        }

        if (y > 0) {
            y -= 1;
        } else {
            x += 1;
            y = 5;
        }
    }

    let result = [...after].map(i => table[i]);

    const remainder = before.length % 3;
    if (remainder === 2) {
        result.push("==");
    } else if (remainder === 1) {
        result.push("=");
    }

    return result.join("");
}

function decode(text: string): string {
    let result = [...text.replace(/=+$/, "")];
    let after = new Uint8Array(result.map(char => reverseTable[char]));
    let before = new Uint8Array(Math.floor(after.length * 6 / 8));

    let i = 0; // index of before
    let j = 7; // index of current bit(all bit)

    let x = 0; // index of after
    let y = 5; // index of current bit(right 6 bit)

    // bit index
    // 76543210 
    // 00000000

    while (true) {
        // current y, need to be j
        let v = (after[x] & (1 << y));
        const diff = y - j;
        if (diff >= 0) {
            v >>= diff;
        } else {
            v <<= -1 * diff;
        }
        before[i] |= v;

        if (j > 0) {
            j -= 1;
        } else {
            i += 1;
            j = 7;
        }

        if (y > 0) {
            y -= 1;
        } else {
            if (x < after.length - 1) {
                x += 1;
                y = 5;
            } else {
                break;
            }
        }
    }

    return new TextDecoder().decode(before);
}

Lastly, make some tests.

const text1 = "ab";
const encoded1 = encode(text1);
const decoded1 = decode(encoded1);
console.log({ text1, encoded1, decoded1 }, text1 === decoded1);

const text2 = "こ";
const encoded2 = encode(text2);
const decoded2 = decode(encoded2);
console.log({ text2, encoded2, decoded2 }, text2 === decoded2);

const text3 = "速对表";
const encoded3 = encode(text3);
const decoded3 = decode(encoded3);
console.log({ text3, encoded3, decoded3 }, text3 === decoded3);

And the result.

{ text1: "ab", encoded1: "YWI==", decoded1: "ab" } true
{ text2: "こ", encoded2: "44GT", decoded2: "こ" } true
{ text3: "速对表", encoded3: "6YCf5a+56KGo", decoded3: "速对表" } true