1. Overview
Converting textual data into hexadecimal has various applications in data analysis and computing.
In this tutorial, we’ll learn several techniques to convert a string to hexadecimal using command-line utilities in Linux.
2. Hexadecimal System
The hexadecimal system is a base-16 numeral system for representing data. It uses the digits 0–9 and the letters A–F for the values 10–15.
Let’s take a look at the decimal and hexadecimal values of a few letters, namely, a, b, and c, in both lowercase and uppercase:
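| Character | Decimal (lowercase) | Decimal (uppercase) | Hexadecimal (lowercase) | Hexadecimal (uppercase) |
|-----------|---------------------|---------------------|-------------------------|-------------------------|
| a         | 97                  | 65                  | 61                      | 41                      |
| b         | 98                  | 66                  | 62                      | 42                      |
| c         | 99                  | 67                  | 63                      | 43                      |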
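We can see that the lowercase letter a, with the decimal value 97, is represented by the hexadecimal value 61 (16^1*6 + 16^0*1). Likewise, the uppercase letter A, with the decimal value 65, is represented by the hexadecimal value 41 (16^1*4 + 16^0*1). Similar logic applies to the other letters.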
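To check this arithmetic from the command line, here is a minimal sketch that relies only on shell arithmetic and the standard printf utility. The variable names are illustrative:

```shell
# Positional expansion of hexadecimal 61: 6 * 16^1 + 1 * 16^0
expanded=$(( 6 * 16 + 1 ))
echo "$expanded"                 # 97

# printf converts in both directions.
dec=$(printf '%d' 0x61)          # hexadecimal -> decimal
hex=$(printf '%x' 97)            # decimal -> hexadecimal
echo "$dec $hex"                 # 97 61
```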
Further, let’s also check the hexadecimal values for a few more characters that represent numbers:
| Character | Decimal | Hexadecimal |
|-----------|---------|-------------|
| 0         | 48      | 30          |
| 1         | 49      | 31          |
| 2         | 50      | 32          |
As before, the character 0, with the decimal value 48, is represented by the hexadecimal value 30 (16^1*3 + 16^0*0). Moreover, we can infer that the digits 0–9 map to the consecutive hexadecimal values 30–39.
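We can confirm the 30–39 range with a short loop. This sketch assumes a POSIX-compatible printf, which treats an argument with a leading quote as the numeric code of the character that follows. The helper name char_to_hex is made up for this example:

```shell
# char_to_hex: print the hexadecimal code of a single ASCII character.
# A leading single quote makes printf use the character's numeric code.
char_to_hex() {
    printf '%x' "'$1"
}

# Walk through the digit characters and print each one with its hex code.
for d in 0 1 2 3 4 5 6 7 8 9; do
    printf '%s -> %s\n' "$d" "$(char_to_hex "$d")"
done
```

The first line of the output is 0 -> 30 and the last is 9 -> 39.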
3. Using xxd Utility
Let’s start by learning about the xxd utility to get the hexadecimal value of the text.
3.1. Setup
The xxd utility is a part of the vim package on many Linux distributions. So, to install the xxd utility, we need to install the vim-common package:
$ apt install vim-common -y
Now, let’s go ahead and verify that the xxd utility is available on the system:
$ xxd --version
xxd 2021-10-22 by Juergen Weigert et al.
It’s ready for use!
3.2. Default Behavior
Let’s use the xxd command to get the hexadecimal value of the “hello” string passed via stdin:
$ echo -n "hello" | xxd
00000000: 6865 6c6c 6f hello
We can see that xxd shows the output in three columns. The leftmost column shows the offset of the data, the middle column shows the hexadecimal values, and the rightmost column shows the corresponding printable ASCII characters.
3.3. With the -p Option
We can use the -p option to restrict the output to the hexadecimal value without the offset and ASCII values:
$ echo -n "hello" | xxd -p
68656c6c6f
That’s it! We’ve got the desired result.
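For longer inputs, it helps to know that xxd -p starts a new output line after every 30 input bytes. The following sketch assumes xxd is installed and uses an arbitrary sample sentence. It joins the lines with tr and then uses xxd -r -p to turn the hexadecimal string back into text:

```shell
# Sample input longer than 30 bytes (43 bytes), chosen only for illustration.
text="The quick brown fox jumps over the lazy dog"

# xxd -p wraps after 30 input bytes (60 hex digits) per line,
# so remove the newlines to get one continuous hex string.
hex=$(printf '%s' "$text" | xxd -p | tr -d '\n')
echo "$hex"

# xxd -r -p reverses a plain hex dump back into the original bytes.
roundtrip=$(printf '%s' "$hex" | xxd -r -p)
echo "$roundtrip"
```

The round trip is a quick sanity check that no bytes were lost or reordered during the conversion.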
4. Using od Utility
In this section, let’s explore the octal dump (od) utility to convert text into hexadecimal content.
4.1. With the -x Option
As the name suggests, the default behavior of the octal dump (od) utility is to give the octal representation. However, we can use the -x option to get the hexadecimal value of the content.
Let’s go ahead and get the hexadecimal value for the “hello” string:
$ echo -n "hello" | od -x
0000000 6568 6c6c 006f
0000005
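It’s worth noting that the od utility shows the offset values on the left and the hexadecimal values on the right. Furthermore, the -x option reads the input in two-byte units and prints each unit as a 16-bit number in the host’s byte order. On a little-endian machine, the bytes 68 and 65 therefore appear swapped as 6568, and the final lone byte 6f is padded with a zero byte to give 006f.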
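To make the effect of byte order visible, the sketch below fixes it explicitly instead of relying on the host. It assumes GNU od from coreutils 8.23 or later, because the --endian option isn’t part of POSIX od:

```shell
# Assumption: GNU od (coreutils >= 8.23), which provides --endian.

# Two-byte units interpreted as little-endian: each byte pair appears swapped.
little=$(printf 'hello' | od -An --endian=little -x)
echo "$little"      #  6568 6c6c 006f

# The same units interpreted as big-endian: bytes appear in input order,
# and the trailing lone byte 6f is padded with a zero byte.
big=$(printf 'hello' | od -An --endian=big -x)
echo "$big"         #  6865 6c6c 6f00
```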
4.2. With the -t Option
Usually, we’d want to see the hexadecimal values in the same order as the input bytes. For this purpose, we can pass the -t x1 option, which makes od print one byte at a time:
$ echo -n "hello" | od -t x1
0000000 68 65 6c 6c 6f
0000005
Perfect! We can sequentially map hexadecimal values to the characters in the “hello” string.
4.3. With the -An Option
Let’s now exclude the offset values on the left side so that we don’t have any noise in our output. To do so, we can use the -An option:
$ echo -n "hello" | od -An -t x1
68 65 6c 6c 6f
The output looks much more concise now, except for the additional spaces.
Lastly, let’s use the tr command to get rid of the spaces:
$ echo -n "hello" | od -An -t x1 | tr -d ' '
68656c6c6f
Fantastic! It looks like we nailed this one.
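For inputs longer than 16 bytes, od prints several lines and replaces consecutive identical lines with a single *. As a hedged sketch, the pipeline above can be wrapped in a small function that adds the -v option and also strips newlines. The function name to_hex and the sample input are made up for this example:

```shell
# to_hex: read stdin and print one continuous hexadecimal string.
#   -An    omit the offset column
#   -v     print every line, even consecutive duplicates (no "*")
#   -t x1  print one hexadecimal byte per unit
to_hex() {
    od -An -v -t x1 | tr -d '[:space:]'
}

printf 'hello' | to_hex; echo        # 68656c6c6f

# 48 identical bytes span three od lines; without -v, od would squeeze them.
long=$(printf '%048d' 0 | tr 0 a)
printf '%s' "$long" | to_hex; echo
```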
5. Using hexdump Utility
In this section, we’ll learn how to use the hexdump utility for converting text to hexadecimal output.
5.1. Setup
An average user is unlikely to use an advanced tool like hexdump. As a result, many Linux distributions don’t have the hexdump utility preinstalled, which also helps to keep the base system minimal.
To install the hexdump utility on Ubuntu, we can install the bsdmainutils package:
$ apt-get install bsdmainutils -y
Further, let’s verify that hexdump is available for use after the package installation:
$ hexdump --version
hexdump from util-linux 2.37.2
Great! It’s ready for use.
5.2. Default Behavior
Let’s start by using the hexdump utility to convert the “hello” string into hexadecimal content:
$ echo -n 'hello' | hexdump
0000000 6568 6c6c 006f
0000005
We can see that hexdump shows the output in a two-column format. The first column shows the byte offsets, and the second column shows the hexadecimal values.
Furthermore, it shows the output as two-byte units in the host byte order, which is little-endian on x86 machines, so the least significant byte (LSB) is stored first. As a result, at first glance, it seems that hexdump is swapping the bytes.
A reader who is unaware of this default behavior may be puzzled to see the hexadecimal value 65 before 68 when the input has h before e. So, let’s go ahead and address this issue.
5.3. With the --format Option
We can use the --format option to specify how hexdump should process and display the output:
$ hexdump --format '[iteration count]/[byte count] "[format specifier]"'
Before moving further, let’s understand the different components of the format string from left to right.
Firstly, the iteration count indicates how many times the format specifier should be applied. Secondly, the byte count indicates the number of bytes to consume at a time. Lastly, the format specifier indicates how the bytes specified by the byte count should be displayed on the output.
Now, it’s time to apply this learning to specify the output format such that hexdump reads one byte at a time (/1) and shows the output in hexadecimal format (%x):
$ echo -n "hello" | hexdump --format '/1 "%x"'
68656c*
6f
We can see a sequential match between the hexadecimal values in the output and input characters. However, we see that the hexadecimal value 6c that represents the character “l” appears only once, and there is an * (asterisk) after it. That’s because hexdump is squeezing the repeated characters.
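The commands in the rest of this section use %02x rather than %x. The format specifiers follow printf conventions, so %x prints any byte below 0x10 as a single digit, which makes a continuous hexadecimal string ambiguous. The sketch below uses the shell’s printf as a stand-in for hexdump (an illustrative substitution, not a hexdump command) to show the difference for the byte values 10 and 160:

```shell
# Without padding, the bytes 0x0a and 0xa0 run together as three digits.
unpadded=$(printf '%x%x' 10 160)
echo "$unpadded"     # aa0

# With %02x, every byte always occupies exactly two digits.
padded=$(printf '%02x%02x' 10 160)
echo "$padded"       # 0aa0
```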
Lastly, let’s see an obvious drawback of the squeezing behavior by checking the hexadecimal output for three strings, namely, “helllo”, “hellllo”, and “helllllo”:
$ echo -n "helllo" | hexdump --format '/1 "%02x"'
68656c*
6f
$ echo -n "hellllo" | hexdump --format '/1 "%02x"'
68656c*
6f
$ echo -n "helllllo" | hexdump --format '/1 "%02x"'
68656c*
6f
We can see that the output for all three input strings is the same, because hexdump only indicates that bytes are repeated but doesn’t give us the count of repeated bytes. Let’s see how to resolve this issue.
5.4. With the --format and --no-squeezing Options
To disable the squeezing behavior, we can use the --no-squeezing option together with the --format option:
$ echo -n "hello" | hexdump --no-squeezing --format '/1 "%02x"'
68656c6c6f
We find that “6c”, the hexadecimal value for “l”, appears twice without any * symbol.
Further, let’s also verify that this approach works as expected for longer runs of repeated characters in the strings “helllo”, “hellllo”, and “helllllo”:
$ echo -n "helllo" | hexdump --no-squeezing --format '/1 "%02x"'
68656c6c6c6f
$ echo -n "hellllo" | hexdump --no-squeezing --format '/1 "%02x"'
68656c6c6c6c6f
$ echo -n "helllllo" | hexdump --no-squeezing --format '/1 "%02x"'
68656c6c6c6c6c6f
Fantastic! It works as expected.
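For repeated use, the command can be wrapped in a small function. This is only a sketch: it assumes hexdump is installed, and it uses -v and -e, the short forms of --no-squeezing and --format. The function name str_to_hex is made up for this example:

```shell
# str_to_hex: print the hexadecimal value of the first argument.
#   -v  disable squeezing of repeated bytes (same as --no-squeezing)
#   -e  format string (same as --format): one byte at a time, two hex digits
str_to_hex() {
    printf '%s' "$1" | hexdump -v -e '/1 "%02x"'
    echo    # hexdump prints no trailing newline with this format
}

str_to_hex "hello"       # 68656c6c6f
str_to_hex "helllllo"    # 68656c6c6c6c6c6f
```

Using printf '%s' instead of echo -n avoids differences in how shells interpret echo options.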
6. Conclusion
In this article, we learned how to convert text into hexadecimal values. Furthermore, we explored different utilities, such as xxd, od, and hexdump, to solve our use case.