1. Introduction

In this tutorial, we’ll explain what octet strings are and where we use them.

2. Octets vs. Bytes

An octet string is a sequence of related octets, where an octet is a unit of exactly 8 bits. We use “octet” when the term “byte,” which usually also means 8 bits, could cause confusion.

Today, both terms mean 8 bits, and “byte” is the more popular of the two. However, when dealing with legacy systems, the meaning of “byte” may be unclear because its size has historically been platform-dependent. It only began gaining popularity as a synonym for 8 bits around the early 1980s.

Just like a byte, an octet can take the prefixes kilo (K), mega (M), and so on to denote its multiples. For example, Mo stands for megaoctet, and Go for gigaoctet.

Another term worth noting is “nibble”. Computer scientists typically use this to refer to 4 bits rather than calling them octet halves.
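To make the relationship between an octet and its two nibbles concrete, here’s a minimal Python sketch. The function name split_nibbles is our own, chosen for illustration:

```python
def split_nibbles(octet: int) -> tuple[int, int]:
    """Return the (high, low) 4-bit halves of an 8-bit value."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("an octet must be in the range 0..255")
    # The high nibble is the top 4 bits, the low nibble the bottom 4 bits.
    return octet >> 4, octet & 0x0F


high, low = split_nibbles(0xA7)
print(hex(high), hex(low))  # 0xa 0x7
```

Each hexadecimal digit corresponds to exactly one nibble, which is why an octet is conveniently written as two hex digits.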

3. Where Do We Use Octet Strings?

3.1. IP Addressing

We use octet strings in IPv4 addressing. An IPv4 address is 32 bits long and is divided into 8-bit sections. Therefore, an IPv4 address has four octets:

[Figure: IPv4 address format diagram]

Each octet is written in decimal and, since it holds 8 bits, can have values from 0 to 255 inclusive. Here’s an example:

192.16.127.132
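We can check the four octets of this address with Python’s standard ipaddress module. This is only a small sketch, and the helper name ipv4_octets is our own:

```python
import ipaddress


def ipv4_octets(text: str) -> list[int]:
    """Return the four octets of a dotted-decimal IPv4 address."""
    # .packed is the address as a 4-octet string in network byte order.
    return list(ipaddress.IPv4Address(text).packed)


print(ipv4_octets("192.16.127.132"))                          # [192, 16, 127, 132]
print(ipaddress.IPv4Address("192.16.127.132").packed.hex())   # c0107f84
```

The second line prints the same four octets in hexadecimal, two digits per octet.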

An IPv6 address is 128 bits long, so it has 16 octets instead. There are several forms that an IPv6 octet string can take. However, the preferred one is:

n:n:n:n:n:n:n:n

In the above colon-separated format, each n represents a 16-bit value, i.e., two octets.

Unlike IPv4, IPv6 writes each group in hexadecimal, and leading zeros within a group may be omitted. Here’s an example address:

3F2E:FEF:7654:F89A:125:BA8:3910:9062
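As a quick check that this address really holds 16 octets, we can again use the standard ipaddress module. The helper name ipv6_groups is our own, for illustration only:

```python
import ipaddress


def ipv6_groups(text: str) -> list[bytes]:
    """Split an IPv6 address into its eight 2-octet groups."""
    packed = ipaddress.IPv6Address(text).packed  # 16 octets, network byte order
    return [packed[i:i + 2] for i in range(0, 16, 2)]


groups = ipv6_groups("3F2E:FEF:7654:F89A:125:BA8:3910:9062")
print(len(b"".join(groups)))      # 16
print([g.hex() for g in groups])
# ['3f2e', '0fef', '7654', 'f89a', '0125', '0ba8', '3910', '9062']
```

Groups such as FEF and 125 are written with fewer than four hex digits in the address, but each still occupies two full octets once the leading zeros are restored.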

Generally, the Internet Engineering Task Force (IETF) uses octet strings to specify various aspects of network communication, such as addressing and data sizes.

3.2. Network Headers and Footers

We also use octet strings to describe the contents of a frame in data transmission, in particular to mark where a frame begins and ends. There are two common techniques: octet counting and octet stuffing.

In octet counting, the sender states how many octets the frame contains, typically in a field at the start of the frame. As a result, the receiver knows exactly where the frame ends, which prevents it from misinterpreting the data that follows. For example:

[Figure: Octet counting]
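Octet counting can be sketched in a few lines of Python. This is an illustration under our own assumptions, not the format of any particular protocol: each frame starts with a single count octet, and that count covers the payload only. The function names are hypothetical.

```python
def encode_frames(payloads: list[bytes]) -> bytes:
    """Prefix each payload with a one-octet count (assumed format)."""
    stream = bytearray()
    for payload in payloads:
        if len(payload) > 255:
            raise ValueError("payload does not fit a one-octet count")
        stream.append(len(payload))  # the count octet
        stream += payload
    return bytes(stream)


def decode_frames(stream: bytes) -> list[bytes]:
    """Split a stream back into payloads using the count octets."""
    frames = []
    pos = 0
    while pos < len(stream):
        count = stream[pos]
        start, end = pos + 1, pos + 1 + count
        if end > len(stream):
            raise ValueError("truncated frame")
        frames.append(stream[start:end])
        pos = end  # the next count octet starts right after this payload
    return frames


stream = encode_frames([b"abc", b"", b"hello"])
print(stream)                  # b'\x03abc\x00\x05hello'
print(decode_frames(stream))   # [b'abc', b'', b'hello']
```

The sketch also hints at the weakness of the technique: if a count octet is corrupted in transit, the receiver loses track of every frame boundary that follows.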

Octet stuffing lets the receiver distinguish control characters, such as frame delimiters, from the actual data being sent. Whenever an octet in the data has the same value as a control character, the sender inserts an extra octet in front of it.

[Figure: Octet stuffing]

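The inserted octet serves as an escape: it tells the receiver that the octet following it is data, not a control character, and the receiver discards the escape itself because it isn’t part of the data. A common choice is the ASCII escape character (ESC), which has the hexadecimal value 0x1B, although individual protocols may define their own escape octet.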
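Here’s a minimal sketch of octet stuffing in Python. The escape octet 0x1B comes from the text above, while the frame delimiter 0x7E and the function names are assumptions we make purely for illustration:

```python
FLAG = 0x7E  # assumed frame delimiter, chosen for illustration
ESC = 0x1B   # escape octet (ASCII ESC)


def stuff(payload: bytes) -> bytes:
    """Wrap the payload in FLAG octets, escaping any FLAG or ESC inside it."""
    out = bytearray([FLAG])
    for octet in payload:
        if octet in (FLAG, ESC):
            out.append(ESC)  # the stuffed octet
        out.append(octet)
    out.append(FLAG)
    return bytes(out)


def unstuff(frame: bytes) -> bytes:
    """Recover the payload from a frame produced by stuff()."""
    if len(frame) < 2 or frame[0] != FLAG or frame[-1] != FLAG:
        raise ValueError("frame must start and end with FLAG")
    out = bytearray()
    body = iter(frame[1:-1])
    for octet in body:
        if octet == ESC:
            try:
                octet = next(body)  # take the escaped octet literally
            except StopIteration:
                raise ValueError("dangling escape octet") from None
        out.append(octet)
    return bytes(out)


data = bytes([0x41, 0x7E, 0x1B, 0x42])
frame = stuff(data)
print(frame.hex())             # 7e411b7e1b1b427e
print(unstuff(frame) == data)  # True
```

Note that the escape octet itself must also be escaped when it occurs in the data. Otherwise, the receiver couldn’t tell a stuffed ESC from a genuine one.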

Both techniques ensure reliable data transmission by providing clear boundaries for interpreting frames.

3.3. MIME Attachments

Web applications may serve some files with the Content-Type header value “application/octet-stream.” This content type represents arbitrary binary data. An octet stream is an octet string whose last octet may include padding bits.

The usual reason a file is labeled as an octet stream is that its MIME (Multipurpose Internet Mail Extensions) type is unknown. If we know what kind of document or application it is, we can often fix this by adding the appropriate file extension (e.g., .jpeg) before opening it.
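The link between file extensions and MIME types can be illustrated with Python’s standard mimetypes module. This is a sketch of the fallback idea only, not how any particular web server decides on a content type, and the helper name content_type_for is our own:

```python
import mimetypes


def content_type_for(filename: str) -> str:
    """Guess a MIME type from the file name, falling back to octet-stream."""
    guessed, _encoding = mimetypes.guess_type(filename)
    # guess_type returns None for the type when the extension is unknown.
    return guessed or "application/octet-stream"


print(content_type_for("photo.jpeg"))  # typically image/jpeg
print(content_type_for("photo"))       # application/octet-stream
```

Without an extension, there’s nothing to base the guess on, so the generic binary type is the only safe label.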

4. Conclusion

In this article, we explained octet strings. We use this term when the term “byte” could be ambiguous. More specifically, we use “octet” to refer to units of exactly 8 bits, especially in legacy systems where the length of a byte is platform-dependent.