1. Overview
Base32 is a binary-to-text encoding scheme that can serve as an alternative to the well-known Base64 scheme. It has several variants, which use similar but distinct alphabets.
In this tutorial, we’ll discuss a specific variant of Base32, namely Crockford’s Base32. We’ll also see an example using this scheme.
2. What Is Crockford’s Base32?
Base32 is an encoding and decoding scheme that uses an alphabet of 32 symbols. Therefore, each symbol represents 5 bits, since 2^5 = 32. Base64, on the other hand, has an alphabet of 64 symbols, i.e., each of its symbols represents 6 bits.
Base32 aims to be more readable than Base64 when messages pass between computer systems and humans. It’s a compromise between Base64 and hexadecimal representation: more compact than hexadecimal, and easier to read and transcribe than Base64.
Base32 has several variants such as Base32 RFC 4648 and z-base-32. Crockford’s Base32 is one of those variants. Its name stems from its creator, Douglas Crockford.
2.1. Crockford’s Base32 Alphabet
Crockford’s Base32 consists of the following symbols:
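| Value | Encode Digit | Decode Digit | Value | Encode Digit | Decode Digit |
| --- | --- | --- | --- | --- | --- |
| 0 | 0 | 0 o O | 16 | G | g G |
| 1 | 1 | 1 i I l L | 17 | H | h H |
| 2 | 2 | 2 | 18 | J | j J |
| 3 | 3 | 3 | 19 | K | k K |
| 4 | 4 | 4 | 20 | M | m M |
| 5 | 5 | 5 | 21 | N | n N |
| 6 | 6 | 6 | 22 | P | p P |
| 7 | 7 | 7 | 23 | Q | q Q |
| 8 | 8 | 8 | 24 | R | r R |
| 9 | 9 | 9 | 25 | S | s S |
| 10 | A | a A | 26 | T | t T |
| 11 | B | b B | 27 | V | v V |
| 12 | C | c C | 28 | W | w W |
| 13 | D | d D | 29 | X | x X |
| 14 | E | e E | 30 | Y | y Y |
| 15 | F | f F | 31 | Z | z Z |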
The alphabet consists of 10 digits and 22 letters. Unlike Base64, the encoded string never contains lowercase letters. However, the decoder accepts lowercase letters as equivalents of their uppercase forms. Besides, Crockford’s Base32 excludes the letters I, L, O, and U from the encode alphabet for several reasons. For example, O isn’t included as it might be confused with the digit 0, and I and L might be confused with the digit 1. When decoding, O is read as 0, and I and L are read as 1.
It’s also possible to insert hyphens into the encoded string to improve readability. Hyphens carry no value, so we skip them while decoding.
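The decoding rules in the table can be sketched as a lookup table. This is only a minimal illustration: the names `ALPHABET`, `DECODE_MAP`, and `symbols_to_values` are our own choices for this sketch and don't come from any library.

```python
# Encode alphabet: 10 digits and 22 letters (no I, L, O, U).
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

# Decode table: every symbol in both cases, plus the look-alike aliases.
DECODE_MAP = {}
for value, symbol in enumerate(ALPHABET):
    DECODE_MAP[symbol] = value
    DECODE_MAP[symbol.lower()] = value
for alias in "oO":
    DECODE_MAP[alias] = 0  # the letter O decodes as the digit 0
for alias in "iIlL":
    DECODE_MAP[alias] = 1  # the letters I and L decode as the digit 1


def symbols_to_values(text):
    """Map an encoded string to its list of 5-bit values, skipping hyphens."""
    values = []
    for ch in text:
        if ch == "-":
            continue  # hyphens only improve readability
        if ch not in DECODE_MAP:
            raise ValueError(f"invalid Crockford Base32 symbol: {ch!r}")
        values.append(DECODE_MAP[ch])
    return values


print(symbols_to_values("oIl-z"))  # [0, 1, 1, 31]
```

A dictionary keeps the alias handling explicit. Note that U isn't in the table, so this sketch rejects it as a data symbol.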
A string encoded with Base32 is roughly 20% longer than its Base64 counterpart, because a Base32 symbol carries 5 bits while a Base64 symbol carries 6 bits (6 / 5 = 1.2).
2.2. Encoding and Decoding Process
We first need to convert the string to be encoded to an array of bits. We use the ASCII code of each character in the string to convert the string to binary numbers. Then, we concatenate the binary numbers and split them into 5-bit chunks starting from the left, i.e., the most significant bit. Finally, we encode each 5-bit chunk using the symbols in Crockford’s Base32 table.
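If the last chunk has fewer than 5 bits, we pad it with zeros on the right until it’s 5 bits long.
The decoding process is basically the reverse of the encoding process: we map each symbol back to its 5-bit value, concatenate the bits, drop the padding bits at the end, and split the result into 8-bit characters.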
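The steps above can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions rather than a reference implementation: the function names `encode` and `decode` are hypothetical, the input is a byte string, and the encoded text carries no check symbol.

```python
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def encode(data: bytes) -> str:
    # Concatenate the 8-bit binary form of every byte.
    bits = "".join(f"{byte:08b}" for byte in data)
    # Pad the last chunk with zeros on the right to a multiple of 5 bits.
    bits += "0" * (-len(bits) % 5)
    # Map every 5-bit chunk to its symbol.
    return "".join(ALPHABET[int(bits[i:i + 5], 2)] for i in range(0, len(bits), 5))


def decode(text: str) -> bytes:
    # Normalize: drop hyphens, upper-case, and map the aliases O, I, L.
    cleaned = text.replace("-", "").upper().translate(str.maketrans("OIL", "011"))
    # str.index raises ValueError for a symbol outside the alphabet.
    bits = "".join(f"{ALPHABET.index(ch):05b}" for ch in cleaned)
    # Ignore trailing padding bits that don't fill a whole byte.
    usable = len(bits) - len(bits) % 8
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))


print(encode(b"A"))  # 84
print(decode("84"))  # b'A'
```

For the single character ‘A’ (01000001), padding gives the chunks 01000 and 00100, i.e., the symbols 8 and 4. Working on bit strings is slow for large inputs, but it mirrors the description step by step.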
2.3. Error Detection
Optionally, it’s possible to append a check symbol to the end of the encoded string to detect errors. For this purpose, we interpret the encoded string as a base-32 number, calculate this number modulo 37, and append the symbol corresponding to the remainder. Modulo 37 is used since 37 is the smallest prime number greater than 32. As the remainder ranges from 0 to 36, Crockford’s Base32 needs five additional symbols for the values 32, 33, 34, 35, and 36. It uses the following symbols for the checksum:
| Symbol Value | Decode Symbol | Encode Symbol |
| --- | --- | --- |
| 32 | * | * |
| 33 | ~ | ~ |
| 34 | $ | $ |
| 35 | = | = |
| 36 | U u | U |
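As a rough sketch, the check symbol can be computed with ordinary integer arithmetic. The names `CHECK_ALPHABET` and `check_symbol` are hypothetical, and the function assumes its input is already normalized (uppercase, no hyphens, no aliases).

```python
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
# Values 0-31 reuse the data symbols; 32-36 use the five extra check symbols.
CHECK_ALPHABET = ALPHABET + "*~$=U"


def check_symbol(encoded: str) -> str:
    """Return the check symbol for a normalized encoded string."""
    value = 0
    for ch in encoded:
        # Read the string as a base-32 number, most significant symbol first.
        value = value * 32 + ALPHABET.index(ch)
    return CHECK_ALPHABET[value % 37]


print(check_symbol("14"))  # 1 * 32 + 4 = 36, so this prints U
```

Python integers have arbitrary precision, so the whole number fits without overflow. In a language with fixed-width integers, we could reduce modulo 37 after each step instead.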
3. An Example
Let’s now encode the string “BAELDUNG”, using Crockford’s Base32.
We list the ASCII code of each character in “BAELDUNG” together with the binary representation in the following table:
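| Character | B | A | E | L | D | U | N | G |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Decimal | 66 | 65 | 69 | 76 | 68 | 85 | 78 | 71 |
| Binary | 01000010 | 01000001 | 01000101 | 01001100 | 01000100 | 01010101 | 01001110 | 01000111 |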
For example, the ASCII code of the character ‘B’ is 66 and its binary value is 01000010.
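Next, we concatenate the binary values starting from the left and then group them into 5-bit chunks. The 64 bits fill twelve chunks and leave 4 bits over, so we pad the last chunk with a single zero on the right. Finally, we find the corresponding symbol in the alphabet for each 5-bit chunk: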
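| Binary | 01000 | 01001 | 00000 | 10100 | 01010 | 10011 | 00010 | 00100 | 01010 | 10101 | 00111 | 00100 | 01110 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Decimal | 8 | 9 | 0 | 20 | 10 | 19 | 2 | 4 | 10 | 21 | 7 | 4 | 14 |
| Crockford’s Base32 Symbol | 8 | 9 | 0 | M | A | K | 2 | 4 | A | N | 7 | 4 | E |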
Consequently, the encoded string corresponding to “BAELDUNG” is “890MAK24AN74E” in Crockford’s Base32.
If we want to add a checksum, then we need to calculate the modulo 37 of the decimal number corresponding to “890MAK24AN74E”. The corresponding decimal number is 9548346547711417486. The modulo 37 of this number is 30, and its symbol is Y according to Crockford’s Base32 alphabet. Therefore, the encoded string together with the checksum is “890MAK24AN74EY”.
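We can double-check this arithmetic with a short Python snippet. It relies on one observation: the 13 symbols hold the 64 data bits followed by one zero padding bit, so the number they represent is the 64-bit value of “BAELDUNG” shifted left by one bit. The variable names are our own.

```python
data = b"BAELDUNG"

# 64 data bits followed by one zero pad bit, read as a single integer.
number = int.from_bytes(data, "big") << 1
remainder = number % 37

print(number)     # 9548346547711417486
print(remainder)  # 30, which corresponds to the symbol Y
```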
4. Conclusion
In this article, we discussed Crockford’s Base32 encoding and decoding scheme. Firstly, we learned that Base32 is the 32-symbol counterpart of Base64, with each symbol representing 5 bits. Crockford’s Base32 is one of the several variants of Base32. We saw that it aims to be more readable than Base64. Finally, we saw an example encoding the string “BAELDUNG” using Crockford’s Base32.