String based data encoding: Base64 vs Base64url

Suragch

What is the difference between Base64 and Base64url that I see in things like JSON web tokens?

Suragch

Both Base64 and Base64url are ways to encode binary data in string form. You can read about the theory of base64 here. The problem with Base64 is that it contains the characters +, /, and =, which have a reserved meaning in some filesystem names and URLs. So base64url solves this by replacing + with - and / with _. The trailing padding character = can be eliminated when not needed, but in a URL it would instead most likely be % URL encoded. Then the encoded data can be included in a URL without problems.

Here is a chart of the differences:

Index  Base64  Base64Url

0      A       A 
1      B       B 
2      C       C 
3      D       D 
4      E       E 
5      F       F 
6      G       G 
7      H       H 
8      I       I 
9      J       J 
10     K       K 
11     L       L 
12     M       M 
13     N       N 
14     O       O 
15     P       P 
16     Q       Q 
17     R       R 
18     S       S 
19     T       T 
20     U       U 
21     V       V 
22     W       W 
23     X       X 
24     Y       Y 
25     Z       Z 
26     a       a 
27     b       b 
28     c       c 
29     d       d 
30     e       e 
31     f       f 
32     g       g 
33     h       h 
34     i       i 
35     j       j 
36     k       k 
37     l       l 
38     m       m 
39     n       n 
40     o       o 
41     p       p 
42     q       q 
43     r       r 
44     s       s 
45     t       t 
46     u       u 
47     v       v 
48     w       w
49     x       x
50     y       y
51     z       z
52     0       0
53     1       1
54     2       2
55     3       3
56     4       4
57     5       5
58     6       6
59     7       7
60     8       8
61     9       9
62     +       -
63     /       _

       =       (optional)

Below I will quote the definitions from the standards.

RCF 4648 specs

4. Base 64 Encoding

The following description of base 64 is derived from 3, [4], [5], and [6]. This encoding may be referred to as "base64".

The Base 64 encoding is designed to represent arbitrary sequences of octets in a form that allows the use of both upper- and lowercase letters but that need not be human readable.

A 65-character subset of US-ASCII is used, enabling 6 bits to be
represented per printable character. (The extra 65th character, "=", is used to signify a special processing function.)

The encoding process represents 24-bit groups of input bits as output strings of 4 encoded characters. Proceeding from left to right, a 24-bit input group is formed by concatenating 3 8-bit input groups. These 24 bits are then treated as 4 concatenated 6-bit groups, each of which is translated into a single character in the base 64 alphabet.

Each 6-bit group is used as an index into an array of 64 printable characters. The character referenced by the index is placed in the
output string.

                  Table 1: The Base 64 Alphabet

 Value Encoding  Value Encoding  Value Encoding  Value Encoding
     0 A            17 R            34 i            51 z
     1 B            18 S            35 j            52 0
     2 C            19 T            36 k            53 1
     3 D            20 U            37 l            54 2
     4 E            21 V            38 m            55 3
     5 F            22 W            39 n            56 4
     6 G            23 X            40 o            57 5
     7 H            24 Y            41 p            58 6
     8 I            25 Z            42 q            59 7
     9 J            26 a            43 r            60 8
    10 K            27 b            44 s            61 9
    11 L            28 c            45 t            62 +
    12 M            29 d            46 u            63 /
    13 N            30 e            47 v
    14 O            31 f            48 w         (pad) =
    15 P            32 g            49 x
    16 Q            33 h            50 y

Special processing is performed if fewer than 24 bits are available at the end of the data being encoded. A full encoding quantum is
always completed at the end of a quantity. When fewer than 24 input
bits are available in an input group, bits with value zero are added
(on the right) to form an integral number of 6-bit groups. Padding
at the end of the data is performed using the '=' character. Since
all base 64 input is an integral number of octets, only the following cases can arise:

(1) The final quantum of encoding input is an integral multiple of 24 bits; here, the final unit of encoded output will be an integral multiple of 4 characters with no "=" padding.

(2) The final quantum of encoding input is exactly 8 bits; here, the final unit of encoded output will be two characters followed by two "=" padding characters.

(3) The final quantum of encoding input is exactly 16 bits; here, the final unit of encoded output will be three characters followed by one "=" padding character.

5. Base 64 Encoding with URL and Filename Safe Alphabet

The Base 64 encoding with an URL and filename safe alphabet has been used in [12].

An alternative alphabet has been suggested that would use "~" as the 63rd character. Since the "~" character has special meaning in some file system environments, the encoding described in this section is recommended instead. The remaining unreserved URI character is ".", but some file system environments do not permit multiple "." in a filename, thus making the "." character unattractive as well.

The pad character "=" is typically percent-encoded when used in an URI [9], but if the data length is known implicitly, this can be
avoided by skipping the padding; see section 3.2.

This encoding may be referred to as "base64url". This encoding
should not be regarded as the same as the "base64" encoding and
should not be referred to as only "base64". Unless clarified
otherwise, "base64" refers to the base 64 in the previous section.

This encoding is technically identical to the previous one, except for the 62:nd and 63:rd alphabet character, as indicated in Table 2.

     Table 2: The "URL and Filename safe" Base 64 Alphabet

 Value Encoding  Value Encoding  Value Encoding  Value Encoding
     0 A            17 R            34 i            51 z
     1 B            18 S            35 j            52 0
     2 C            19 T            36 k            53 1
     3 D            20 U            37 l            54 2
     4 E            21 V            38 m            55 3
     5 F            22 W            39 n            56 4
     6 G            23 X            40 o            57 5
     7 H            24 Y            41 p            58 6
     8 I            25 Z            42 q            59 7
     9 J            26 a            43 r            60 8
    10 K            27 b            44 s            61 9
    11 L            28 c            45 t            62 - (minus)
    12 M            29 d            46 u            63 _
    13 N            30 e            47 v           (underline)
    14 O            31 f            48 w
    15 P            32 g            49 x
    16 Q            33 h            50 y         (pad) =

Collected from the Internet

Please contact [email protected] to delete if infringement.

edited at
0

Comments

0 comments
Login to comment

Related

From Dev

Base64 String Encoding C# vs TSQL

From Dev

How to choose between base 64 and base64url encoding

From Dev

Base64 encoding of String in Qt 5

From Dev

Base64 Encoding difference in a particular String

From Dev

Python Base64 encoding Vs Java Base64 encoding

From Dev

node.js base64 encoding a string yields slightly different results (local vs heroku)

From Dev

Is data:image/png;base64 the only base64 encoding for the img tag?

From Dev

Base64 encoding or bin2hex random string

From Dev

OutOfMemoryError while decoding and encoding Base64 String into Bitmap

From Dev

Android encoding PNG to base64 string producing invalid results

From Dev

Base64 encoding of String in Node.js

From Dev

Cannot find the right encoding parameters (string to base64)

From Dev

Base64 encoding of a SHA256 string

From Java

Aes decrypt file java with base64 data encoding

From Dev

Stylus inline SVG data uri without base64 encoding

From Dev

Issue with base64 encoding of data using HMAC

From Dev

Browser-based local file-system to SVG base64 data string

From Dev

Base64 - Encoding and Decoding

From Java

Encoding as Base64 in Java

From Dev

Base64 encoding not working

From Dev

Base64 encoding fails

From Dev

Base64 - Invalid Encoding

From Dev

Java: Different results when decoding base64 string with java.util.Base64 vs android.util.Base64

From Dev

Nodejs post base64 string as form data

From Dev

DatatypeConverter vs Base64

From Dev

DatatypeConverter vs Base64

From Dev

how long is 24bytes after base64url encoding

From Dev

Improper base64url

From Dev

Convert base64 encoding to pdf