1. Overview
In this tutorial, we’ll discuss the differences between pipes and sockets. We’ll also give some information about Inter-Process Communication (IPC) mechanisms and why we need them in the first place.
2. IPC Mechanisms
Processes don't share an address space, stack, or registers, so they need IPC mechanisms to cooperate. There is more than one way to communicate between processes: pipes and sockets are two of them, and shared memory is another.
Each of these mechanisms has its own features and suits different kinds of tasks. Shared memory is a good solution for processes on the same machine, but when a process needs to communicate with a process running on a different machine, we need another solution such as sockets. Let's talk about pipes and sockets in detail in the next sections.
3. Pipes
Pipes are one of the most widely used IPC methods. As the name suggests, a pipe is a channel with two ends, backed by a buffer in kernel memory. The pipe() system call takes an array of two integers and, on success, fills it with two file descriptors: fd[0] for reading from the pipe and fd[1] for writing to it. The figure below shows how we create a pipe to communicate between processes.
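The two ends of a pipe can be sketched in Python, assuming a Unix-like system. Python's os.pipe() wraps the pipe system call and returns the pair (read end, write end), which corresponds to fd[0] and fd[1]. The function name pipe_roundtrip is hypothetical and chosen only for this illustration, and it uses a single process purely to keep the example short.

```python
import os


def pipe_roundtrip(message: bytes) -> bytes:
    """Write `message` into a pipe and read it back in the same process.

    Assumes `message` fits in the pipe's kernel buffer; a larger write
    would block, because nobody is reading yet.
    """
    read_fd, write_fd = os.pipe()  # read_fd plays fd[0], write_fd plays fd[1]
    try:
        try:
            os.write(write_fd, message)  # bytes now sit in the kernel buffer
        finally:
            os.close(write_fd)  # closing the write end lets the reader see EOF

        chunks = []
        while True:
            chunk = os.read(read_fd, 4096)
            if not chunk:  # empty result: all write ends closed, buffer drained
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(read_fd)
```

Calling pipe_roundtrip(b"hello") returns the same bytes that were written, which shows that the data travels through the kernel rather than through shared variables.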
3.1. How Do Pipes Work?
We usually use a pipe together with the fork() system call, which creates a new process. There is little point in using a pipe when we have only one process. After fork(), the child inherits copies of both file descriptors, so both processes hold both ends of the pipe. The figure below shows the resulting two-way arrangement between the parent and child process when we don't close the unnecessary file descriptors.
When we close the unused file descriptors, we get the correct version shown in the figure below. Right after pipe() and fork(), both the parent and the child can read from and write to the pipe. However, since a pipe is unidirectional, each process should keep only the end it needs: the writer closes fd[0] and the reader closes fd[1]. We should be careful if we want to communicate bidirectionally, that is, if both parent and child need to send data to each other. In that case, one pipe isn't enough and we need two pipes: one for data flowing from parent to child, and one for data flowing from child to parent. We should also close the unneeded descriptors of each pipe.
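The two-pipe arrangement can be sketched as follows, assuming a Unix-like system where os.fork() is available. The names ask_child and _read_all are hypothetical helpers for this illustration, and the upper-casing done by the child is just a stand-in for real work.

```python
import os


def _read_all(fd: int) -> bytes:
    """Read from fd until EOF, i.e. until every write end is closed."""
    chunks = []
    while True:
        chunk = os.read(fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def ask_child(request: bytes) -> bytes:
    """Send `request` to a forked child and return the child's reply."""
    p2c_read, p2c_write = os.pipe()  # pipe 1: parent -> child
    c2p_read, c2p_write = os.pipe()  # pipe 2: child -> parent

    pid = os.fork()
    if pid == 0:
        # Child: close the ends it does not use, then serve one request.
        try:
            os.close(p2c_write)
            os.close(c2p_read)
            data = _read_all(p2c_read)
            os.write(c2p_write, data.upper())
            os.close(p2c_read)
            os.close(c2p_write)
        finally:
            os._exit(0)  # never fall through into the parent's code path

    # Parent: close the ends that belong to the child.
    os.close(p2c_read)
    os.close(c2p_write)
    os.write(p2c_write, request)
    os.close(p2c_write)  # without this, the child would wait for EOF forever
    reply = _read_all(c2p_read)
    os.close(c2p_read)
    os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    return reply
```

Closing the unused ends is not only tidy. A reader sees end-of-file only once every copy of the write end is closed, so a forgotten descriptor can leave a process blocked in read.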
Pipes are best suited to communication between related processes, where the data is simple enough to exchange as a raw stream of bytes. The pipes we use in shell scripts are a good example: the shell runs separate programs and connects the output of one to the input of the next. The limitations of the pipes described here follow directly: we can use them only between related processes, and the communication is one-to-one.
For more advanced IPC, there are other mechanisms such as shared memory, message queues, and sockets.
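To illustrate the shell case, here is a rough Python sketch of what a shell does for a two-stage pipeline such as producer | consumer. It uses the standard subprocess module. The function name run_pipeline and the two inline one-line programs are assumptions made for this example, not part of any real shell.

```python
import subprocess
import sys


def run_pipeline(text: str) -> str:
    """Connect two child processes with a pipe, like `producer | consumer`."""
    # Stage 1 writes `text` to its standard output.
    producer = subprocess.Popen(
        [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[1])", text],
        stdout=subprocess.PIPE,
    )
    # Stage 2 reads its standard input from the producer's pipe.
    consumer = subprocess.Popen(
        [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"],
        stdin=producer.stdout,
        stdout=subprocess.PIPE,
    )
    # Drop our copy of the read end so the producer gets SIGPIPE
    # if the consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output.decode()
```

Both stages are children of the same parent, which is what makes the anonymous pipe between them possible.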
4. Sockets
Sockets play a significant role in today's internet. The term socket was first used in RFC 147 in 1971, in the context of the ARPANET, where it was defined as the unique identification to or from which information is transmitted in the network. Today's implementations derive from Berkeley sockets, the application programming interface (API) introduced in the Berkeley Software Distribution (BSD), a Unix derivative. This origin is why sockets are so closely tied to operating systems and processes.
For each socket an application creates, the API of the network protocol stack returns a handle called a socket descriptor. In Unix-like operating systems, it is a kind of file descriptor. The process keeps it and uses it for every read and write operation on the channel.
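A short Python sketch, assuming a Unix-like system, makes the descriptor visible. The standard socket module exposes it through fileno(), and os.fstat() reports the file type behind it. The function name describe_socket is hypothetical.

```python
import os
import socket
import stat


def describe_socket():
    """Create a TCP/IPv4 socket and inspect the descriptor behind it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        fd = sock.fileno()  # small non-negative integer, like a file descriptor
        is_socket = stat.S_ISSOCK(os.fstat(fd).st_mode)  # file type is "socket"
        return fd, is_socket
```

The returned descriptor is an ordinary small integer, the same kind of value that pipe() hands back for each end of a pipe.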
4.1. How Do Sockets Work?
A network socket is bound to a combination of three things: the type of network protocol to use for transmissions, the host's network address, and a port number. Ports are numbered resources on the node. They identify the service a process offers and act as an externally addressable location, allowing other hosts to connect to it. We can use network sockets for a persistent connection between two nodes, or for connectionless and multicast communication.
To sum up, with sockets we can establish a connection between processes that run even on different machines. The socket API provides send and recv operations, which copy message buffers between the process and the kernel-level communication buffer.
The socket call creates a kernel-level socket buffer. It also sets up any kernel-level processing associated with the socket, for example protocol processing such as TCP/IP, which accompanies the actual message movement. As we've mentioned, the communicating processes can be on different machines.
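The combination of protocol, address, and port can be sketched with Python's socket module. This example assumes the loopback address 127.0.0.1 and asks the operating system for any free port by passing port 0. The function name bound_endpoints is hypothetical.

```python
import socket


def bound_endpoints():
    """Bind one TCP and one UDP socket and report their (address, port) pairs."""
    # Connection-oriented: SOCK_STREAM over IPv4 means TCP.
    tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Connectionless: SOCK_DGRAM over IPv4 means UDP.
    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        tcp.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
        tcp.listen()                # now other processes can connect to it
        udp.bind(("127.0.0.1", 0))
        return tcp.getsockname(), udp.getsockname()
    finally:
        tcp.close()
        udp.close()
```

A real service would normally bind to a fixed, well-known port so that clients know where to find it.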
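As a minimal sketch of send and recv in action, the following Python example runs a one-shot echo server and a client over TCP. It assumes both ends run on the loopback interface, with the server in a thread, only to keep the example self-contained. The same calls work between different machines when the client connects to the server's real address. The function name echo_once is hypothetical.

```python
import socket
import threading


def _recv_all(sock: socket.socket) -> bytes:
    """Receive until the peer shuts down its sending side."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def echo_once(message: bytes) -> bytes:
    """Send `message` to a local echo server and return what comes back."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.settimeout(5)  # do not wait forever if the client never connects
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]

    def serve():
        conn, _ = server.accept()
        with conn:
            conn.sendall(_recv_all(conn))  # echo everything back

    worker = threading.Thread(target=serve)
    worker.start()
    try:
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)  # tell the server we are done sending
            return _recv_all(client)
    finally:
        worker.join()
        server.close()
```

Note that one connected socket carries data in both directions here, whereas the pipe version of such an exchange needs two pipes.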
5. Differences Between Pipes and Sockets
We've explained pipes and sockets and tried to give an intuition for how they work. As we've seen, they play quite different roles when we connect processes, and which one is more suitable depends on the situation and the problem. Still, we can underline some of the differences between them:
- Communication over a pipe is unidirectional, while communication over a socket is bidirectional.
- To communicate through a pipe created with pipe(), the processes should be related to each other, for example as parent and child. We don't have such a restriction for sockets.
- We can use pipes only to connect processes on the same physical machine. Sockets, on the other hand, can also connect processes on different physical machines. That's why they are one of the fundamental concepts in network systems.
- There isn't any concept of packets in pipes: a pipe carries a plain stream of bytes. Sockets that communicate over IPv4 or IPv6 send data as packets, so a large amount of data can be divided into smaller chunks and sent that way. Pipes don't do that.
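Two of these differences, direction and message boundaries, can be observed in a small Python sketch. It assumes a Unix-like system and uses a Unix-domain datagram socket pair as a local stand-in for a network socket, which is a substitution made here so that the example runs on one machine. The function name pipe_vs_datagram is hypothetical.

```python
import os
import socket


def pipe_vs_datagram():
    """Contrast a pipe's byte stream with a datagram socket's messages."""
    # Pipe: two separate writes merge into one undifferentiated byte stream.
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"one")
    os.write(write_fd, b"two")
    os.close(write_fd)
    pipe_data = os.read(read_fd, 1024)
    os.close(read_fd)

    # Datagram socket pair: each send arrives as its own message.
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    with left, right:
        left.send(b"one")
        left.send(b"two")
        messages = [right.recv(1024), right.recv(1024)]
        # The same pair also carries data in the opposite direction.
        right.send(b"ack")
        ack = left.recv(1024)

    return pipe_data, messages, ack
```

The pipe returns the bytes b"onetwo" in a single read, while the socket pair delivers b"one" and b"two" separately and still lets the receiver answer over the same channel.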
6. Conclusion
In this tutorial, we've briefly explained why we need IPC mechanisms in the first place. Then we defined pipes and sockets, two IPC mechanisms that fall into the category of message passing. Finally, we pointed out some of the differences between them.
Even though we've discussed these IPC mechanisms in the scope of Unix-like operating systems, some of them are available on Windows as well.