1. Overview

Base32 is an encoding and decoding scheme that can serve as an alternative to the well-known Base64 scheme. It has several variants, which use similar but not identical alphabets.

In this tutorial, we’ll discuss a specific variant of Base32, namely Crockford’s Base32. We’ll also see an example using this scheme.

2. What Is Crockford’s Base32?

Base32 is an encoding and decoding scheme that uses an alphabet of 32 symbols. Therefore, each symbol represents 5 bits. Base64, on the other hand, has an alphabet of 64 symbols, i.e., each of its symbols represents 6 bits.

Base32 aims to be more readable than Base64 when messages pass between computer systems and humans. It’s a compromise between Base64 and hexadecimal representation: it’s more compact than hexadecimal, which carries only 4 bits per symbol, and less compact than Base64.

Base32 has several variants such as Base32 RFC 4648 and z-base-32. Crockford’s Base32 is one of those variants. Its name stems from its creator, Douglas Crockford.

2.1. Crockford’s Base32 Alphabet

Crockford’s Base32 consists of the following symbols:

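| Value | Encode Digit | Decode Digit | Value | Encode Digit | Decode Digit |
|-------|--------------|--------------|-------|--------------|--------------|
| 0     | 0            | 0 o O        | 16    | G            | g G          |
| 1     | 1            | 1 i I l L    | 17    | H            | h H          |
| 2     | 2            | 2            | 18    | J            | j J          |
| 3     | 3            | 3            | 19    | K            | k K          |
| 4     | 4            | 4            | 20    | M            | m M          |
| 5     | 5            | 5            | 21    | N            | n N          |
| 6     | 6            | 6            | 22    | P            | p P          |
| 7     | 7            | 7            | 23    | Q            | q Q          |
| 8     | 8            | 8            | 24    | R            | r R          |
| 9     | 9            | 9            | 25    | S            | s S          |
| 10    | A            | a A          | 26    | T            | t T          |
| 11    | B            | b B          | 27    | V            | v V          |
| 12    | C            | c C          | 28    | W            | w W          |
| 13    | D            | d D          | 29    | X            | x X          |
| 14    | E            | e E          | 30    | Y            | y Y          |
| 15    | F            | f F          | 31    | Z            | z Z          |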

The alphabet consists of 10 digits and 22 letters. Unlike Base64, the encoded string contains only uppercase letters. However, the decoder also accepts lowercase letters. Besides, Crockford’s Base32 excludes the letters I, L, O, and U from the encoding alphabet. For example, the letter O isn’t included as it might be confused with the digit 0, so the decoder treats O as 0. Similarly, it treats I and L as 1. The letter U is excluded to reduce the chance of accidental obscenities.

It’s also possible to use hyphens in the encoded string to improve readability. However, we need to skip these hyphens while decoding.
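As a rough illustration of these decoding rules, the following Python sketch builds a lookup table from the decode digits in the table above and skips hyphens. The names ENCODE_ALPHABET, DECODE_MAP, and symbols_to_values are our own choices for this example and don’t come from any specification or library:

```python
ENCODE_ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

# Map every accepted decode symbol to its 5-bit value.
DECODE_MAP = {}
for value, symbol in enumerate(ENCODE_ALPHABET):
    DECODE_MAP[symbol] = value
    DECODE_MAP[symbol.lower()] = value  # lowercase is accepted when decoding

# Look-alike letters are decoded as the digits they resemble.
for alias in "oO":
    DECODE_MAP[alias] = 0
for alias in "iIlL":
    DECODE_MAP[alias] = 1


def symbols_to_values(text):
    """Translate symbols to their values, skipping hyphens."""
    return [DECODE_MAP[ch] for ch in text if ch != "-"]


print(symbols_to_values("89-0m"))  # [8, 9, 0, 20]
```

In this sketch, the letter U isn’t in the lookup table, so a data symbol such as “U” raises a KeyError instead of being decoded silently.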

Since each Base32 symbol carries 5 bits while each Base64 symbol carries 6 bits, a string encoded with Base32 is roughly 20% longer than its Base64 counterpart (6/5 = 1.2).
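We can sanity-check this ratio with Python’s standard base64 module. Note the assumption here: the module implements the RFC 4648 variant of Base32 rather than Crockford’s, but both variants use 5 bits per symbol, so the lengths should be comparable. The 30-byte input is an arbitrary choice that avoids padding in both encodings:

```python
import base64

# 30 bytes is a multiple of both 5 and 3, so neither encoding adds padding.
data = bytes(range(30))

b32_len = len(base64.b32encode(data))  # 8 symbols per 5 bytes
b64_len = len(base64.b64encode(data))  # 4 symbols per 3 bytes

print(b32_len, b64_len)  # 48 40
```

Here, 48 symbols against 40 symbols gives exactly the 6/5 ratio.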

2.2. Encoding and Decoding Process

We first need to convert the string to be encoded to an array of bits. We use the ASCII code of each character in the string to convert the string to binary numbers. Then, we concatenate the binary numbers and split them into 5-bit chunks starting from the left, i.e., the most significant bit. Finally, we encode each 5-bit chunk using the symbols in Crockford’s Base32 table.

If the last chunk has fewer than 5 bits, we pad it with zeros on the right to bring its length to 5.
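The encoding steps can be sketched in a few lines of Python. This is a minimal illustration under the assumptions above (byte input, zero-padding of the last chunk), not a reference implementation. The function name crockford_encode is hypothetical:

```python
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def crockford_encode(data: bytes) -> str:
    # Concatenate the 8-bit binary form of every byte.
    bits = "".join(f"{byte:08b}" for byte in data)
    # Pad the tail with zeros so the length becomes a multiple of 5.
    bits += "0" * (-len(bits) % 5)
    # Map each 5-bit chunk, starting from the left, to its symbol.
    return "".join(ALPHABET[int(bits[i:i + 5], 2)] for i in range(0, len(bits), 5))


print(crockford_encode(b"A"))  # 84
```

For the single byte 01000001, the sketch pads the bits to 01000 00100, which correspond to the symbols 8 and 4.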

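The decoding process is basically the reverse of the encoding process: we map each symbol back to its 5-bit value, concatenate the bits, split them into 8-bit chunks, and discard any leftover padding bits.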
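A minimal Python sketch of the reverse direction might look as follows. It assumes the input was produced by the padding convention described above, so any bits left over after the last full byte are treated as padding. The names crockford_decode, VALUES, and LOOKALIKES are hypothetical:

```python
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
VALUES = {symbol: value for value, symbol in enumerate(ALPHABET)}
# Look-alike letters are replaced by the digits they resemble.
LOOKALIKES = str.maketrans("OIL", "011")


def crockford_decode(text: str) -> bytes:
    # Accept lowercase input, skip hyphens, and normalize look-alikes.
    normalized = text.upper().replace("-", "").translate(LOOKALIKES)
    # Concatenate the 5-bit binary form of every symbol.
    bits = "".join(f"{VALUES[ch]:05b}" for ch in normalized)
    # Keep only whole bytes; the remaining bits are padding.
    usable = len(bits) - len(bits) % 8
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))


print(crockford_decode("84"))  # b'A'
```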

2.3. Error Detection

Optionally, it’s possible to append a check symbol to the end of the encoded string to detect errors. For this purpose, we take the number represented by the encoded string modulo 37 and append the symbol corresponding to the result. Modulo 37 is used since 37 is the smallest prime number greater than 32. As the result can be any value from 0 to 36, Crockford’s Base32 needs five additional symbols for the values 32, 33, 34, 35, and 36. It uses the following symbols for the check symbol:

| Symbol Value | Decode Symbol | Encode Symbol |
|--------------|---------------|---------------|
| 32           | *             | *             |
| 33           | ~             | ~             |
| 34           | $             | $             |
| 35           | =             | =             |
| 36           | U u           | U             |
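The check symbol calculation can be sketched in Python as follows. We assume, as in this tutorial, that the number is obtained by reading the encoded symbols as one base-32 number. The helper names symbols_to_int, append_check_symbol, and verify are hypothetical, and the sketch expects an uppercase body without hyphens:

```python
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
CHECK_SYMBOLS = ALPHABET + "*~$=U"  # five extra symbols for the values 32..36


def symbols_to_int(encoded: str) -> int:
    # Read the symbols as the digits of a base-32 number.
    number = 0
    for ch in encoded:
        number = number * 32 + ALPHABET.index(ch)
    return number


def append_check_symbol(encoded: str) -> str:
    return encoded + CHECK_SYMBOLS[symbols_to_int(encoded) % 37]


def verify(encoded_with_check: str) -> bool:
    body, check = encoded_with_check[:-1], encoded_with_check[-1]
    # upper() lets a lowercase "u" match the check symbol U.
    return CHECK_SYMBOLS[symbols_to_int(body) % 37] == check.upper()


print(append_check_symbol("10"))  # 10*
```

The string “10” represents the number 32, and 32 modulo 37 is 32, so the check symbol is one of the five extra symbols, namely *.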

3. An Example

Let’s now encode the string “BAELDUNG”, using Crockford’s Base32.

We list the ASCII code of each character in “BAELDUNG” together with the binary representation in the following table:

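| Character | B        | A        | E        | L        | D        | U        | N        | G        |
|-----------|----------|----------|----------|----------|----------|----------|----------|----------|
| Decimal   | 66       | 65       | 69       | 76       | 68       | 85       | 78       | 71       |
| Binary    | 01000010 | 01000001 | 01000101 | 01001100 | 01000100 | 01010101 | 01001110 | 01000111 |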

For example, the ASCII code of the character ‘B’ is 66 and its binary value is 01000010.

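Next, we concatenate the binary values starting from the left and then split them into 5-bit chunks. The 64 bits yield twelve full chunks and a last chunk of only 4 bits, 0111, which we pad with a zero to get 01110. Finally, we find the corresponding symbol in the alphabet for each 5-bit chunk: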

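| Binary                    | 01000 | 01001 | 00000 | 10100 | 01010 | 10011 | 00010 | 00100 | 01010 | 10101 | 00111 | 00100 | 01110 |
|---------------------------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| Decimal                   | 8     | 9     | 0     | 20    | 10    | 19    | 2     | 4     | 10    | 21    | 7     | 4     | 14    |
| Crockford’s Base32 Symbol | 8     | 9     | 0     | M     | A     | K     | 2     | 4     | A     | N     | 7     | 4     | E     |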

Consequently, the encoded string corresponding to “BAELDUNG” is “890MAK24AN74E” in Crockford’s Base32.

If we want to add a check symbol, then we need the number represented by “890MAK24AN74E” modulo 37. Reading the 13 symbols as a single base-32 number gives 9548346547711417486. This number modulo 37 is 30, and the symbol for 30 is Y according to Crockford’s Base32 alphabet. Therefore, the encoded string together with the check symbol is “890MAK24AN74EY”.
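We can reproduce this arithmetic with a short Python snippet. It relies on the observation that the 13 symbols hold the 64 bits of “BAELDUNG” followed by one padding zero, so the number is the 64-bit value shifted left by one bit. The variable names are our own:

```python
# 64 data bits plus 1 padding bit give the 65-bit value of the 13 symbols.
number = int.from_bytes(b"BAELDUNG", "big") << 1
check_value = number % 37

print(number)       # 9548346547711417486
print(check_value)  # 30
print("0123456789ABCDEFGHJKMNPQRSTVWXYZ*~$=U"[check_value])  # Y
```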

4. Conclusion

In this article, we discussed Crockford’s Base32 encoding and decoding scheme. First, we learned that Base32 is the 32-symbol counterpart of Base64 and that Crockford’s Base32 is one of its several variants. We saw that it aims to be more readable than Base64. Finally, we saw an example encoding the string “BAELDUNG” using Crockford’s Base32.