1. Overview

Kubernetes lets us control how the containers inside a Pod are restarted by using restart policies. A restart policy is a self-healing feature: it specifies the conditions under which a container inside a Pod is automatically restarted after it exits. For example, we might need to restart a container only if it exits because of an error, or we might need to restart it even if it exits after successful completion. Which restart policy we choose depends on the job the container performs.
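The decision described above can be sketched as a small shell function. This is only an illustration of the rule, not how Kubernetes implements it: should_restart is a hypothetical helper name, and the sketch also includes Never, the third value the restartPolicy field accepts, for completeness.

```shell
#!/bin/sh
# Sketch only: should_restart is a hypothetical helper, not a Kubernetes command.
# Usage: should_restart <policy> <exit_code>
should_restart() {
  policy=$1
  exit_code=$2
  case "$policy" in
    # Always: restart no matter how the container exited
    Always)    echo yes ;;
    # OnFailure: restart only on a non-zero exit code
    OnFailure) if [ "$exit_code" -ne 0 ]; then echo yes; else echo no; fi ;;
    # Never: leave the container terminated
    Never)     echo no ;;
    *)         echo "unknown policy: $policy" >&2; return 1 ;;
  esac
}

should_restart Always 0        # prints: yes
should_restart OnFailure 0     # prints: no
should_restart OnFailure 127   # prints: yes
```

The exit code is the only input besides the policy, which is why the examples in the following sections focus on how the container's command terminates.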

In this tutorial, we’ll cover the difference between two of the most common restart policies: Always and OnFailure.

2. The Always Restart Policy

The Always restart policy is the default policy in Kubernetes. If we create a Pod without setting the restartPolicy field, Kubernetes automatically sets it to Always. The field is set once in the Pod spec and applies to all containers in the Pod.

With this policy, containers are restarted whenever they terminate, regardless of their exit status. In other words, the policy doesn’t care why the container exited, whether through successful completion or an error. It always tries to keep the container running.

This policy is useful for containers whose applications need to run continuously, such as web servers, because it keeps those containers available.

Let’s check this with an example:

$ cat always-up.pod 
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: always-up
  name: always-up
spec:
  containers:
  - image: alpine
    name: always-up
    command: ["sh","-c","sleep 20"]
  restartPolicy: Always

Here, we’ve created a manifest for our Pod. We’ve used the alpine image and set sleep 20 as our container command. This means that the container runs the sleep command for 20 seconds and then exits successfully. However, because we’ve set restartPolicy to Always, the container is restarted every time it exits.

Let’s deploy our Pod to the cluster:

$ kubectl apply -f always-up.pod 
pod/always-up created

Now that we’ve created our Pod, let’s check its status:

$ kubectl get po
NAME        READY   STATUS    RESTARTS   AGE
always-up   1/1     Running   0          7s

We can see that our Pod is running. Let’s check its status again after 20 seconds:

$ kubectl get po
NAME        READY   STATUS    RESTARTS     AGE
always-up   1/1     Running   1 (4s ago)   25s

Now we can see that our Pod has one restart because our Always policy restarted the container after the sleep command completed successfully.
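If we only need the restart count, for example in a script, we can read it from the Pod status instead of parsing the table. The following is a minimal sketch: restart_count is a hypothetical helper name, and it assumes that kubectl is configured for our cluster and that the Pod has a single container (index 0).

```shell
# restart_count is a hypothetical helper name.
# It assumes kubectl points at the right cluster and that the Pod has
# exactly one container, so index 0 is the one we care about.
restart_count() {
  kubectl get pod "$1" -o jsonpath='{.status.containerStatuses[0].restartCount}'
}
```

Running restart_count always-up should print the same number that appears in the RESTARTS column.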

We can also verify the exit status of our container by using the describe command:

$ kubectl describe po always-up
------ OUTPUT TRIMMED -----
Containers:
  always-up:
    Container ID:  containerd://8c73f00b9131384bc549dd1e5bcd26eb0887d7b188e0a055df4d767368a07877
    Image:         alpine
------ OUTPUT TRIMMED -----
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0

Here, we can see that our container has completed successfully with Exit Code 0.

3. The OnFailure Restart Policy

The OnFailure restart policy restarts a container only if the container process exits with an error, that is, with a non-zero exit code. This means that the exit code of the container determines whether the policy restarts it. We can use this policy for containers that should run to successful completion and then stop.

Let’s check this with an example:

$ cat on-failure.pod 
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: on-failure
  name: on-failure
spec:
  containers:
  - image: alpine
    name: on-failure
    command: ["sh","-c","sleep 20"]
  restartPolicy: OnFailure

Here, we’ve created our Pod manifest with the same configuration, changing only the restartPolicy to OnFailure.

Now let’s create our Pod:

$ kubectl apply -f on-failure.pod 
pod/on-failure created

So here, our Pod was created successfully. Let’s check its status:

$ kubectl get po
NAME         READY   STATUS    RESTARTS   AGE
on-failure   1/1     Running   0          6s

We can see our Pod is running with no issues. Let’s check the status again after 20 seconds:

$ kubectl get po
NAME         READY   STATUS      RESTARTS   AGE
on-failure   0/1     Completed   0          94s

The status is now Completed, and we don’t see any restarts because our policy restarts the container only if it exits with a failure.
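The STATUS column is a summary that kubectl prints. The Pod object itself records a phase, and for a Pod whose containers have all exited successfully and won’t be restarted, that phase is Succeeded. As a minimal sketch, assuming kubectl is configured for our cluster, we could read it like this (pod_phase is a hypothetical helper name):

```shell
# pod_phase is a hypothetical helper name.
# It assumes kubectl is configured for the cluster that runs the Pod.
pod_phase() {
  kubectl get pod "$1" -o jsonpath='{.status.phase}'
}
```

For the completed Pod above, pod_phase on-failure should report Succeeded.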

Let’s verify the exit code of the container:

$ kubectl describe po on-failure
----- OUTPUT TRIMMED -----
Containers:
  on-failure:
----- OUTPUT TRIMMED -----
    State:          Terminated
      Reason:       Completed
      Exit Code:    0

We can see that our container has an Exit Code 0, which indicates that it has completed successfully.

Let’s now deliberately misconfigure our container’s command to make it exit with an error:

$ cat on-failure.pod 
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: on-failure
  name: on-failure
spec:
  containers:
  - image: alpine
    name: on-failure
    command: ["sh","-c","sleeeep 20"]
  restartPolicy: OnFailure

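Here, we’ve changed our command to a misspelled value that should fail. The command of an existing Pod can’t be changed in place, so we first delete the previous Pod (for example, with kubectl delete po on-failure) and then create it again with this configuration: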

$ kubectl apply -f on-failure.pod 
pod/on-failure created

Now let’s check our Pod status:

$ kubectl get po
NAME         READY   STATUS             RESTARTS     AGE
on-failure   0/1     CrashLoopBackOff   1 (3s ago)   6s

We can see that our Pod has one restart because the container exited with an error code, which triggered the OnFailure policy to restart it.
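The CrashLoopBackOff status means that Kubernetes is waiting before the next restart attempt. The Kubernetes documentation describes this delay as an exponential back-off (10s, 20s, 40s, and so on) capped at five minutes. The sketch below only reproduces that documented sequence; backoff_delay is a hypothetical helper name, and the timing we actually observe, especially for the first restart, may differ.

```shell
# backoff_delay is a hypothetical helper: it prints the Nth value (N >= 1)
# of the documented back-off sequence, assuming a 10s base that doubles
# on each step and is capped at 300s.
backoff_delay() {
  n=$1
  delay=10
  i=1
  while [ "$i" -lt "$n" ] && [ "$delay" -lt 300 ]; do
    delay=$((delay * 2))
    i=$((i + 1))
  done
  # Apply the five-minute cap
  if [ "$delay" -gt 300 ]; then delay=300; fi
  echo "$delay"
}

for step in 1 2 3 4 5 6 7; do
  echo "step $step: wait $(backoff_delay "$step")s"
done
```

The loop prints 10, 20, 40, 80, 160, 300, and 300 seconds, which is why the gaps between restarts of a failing container grow quickly and then level off.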

Let’s verify our container exit code:

$ kubectl describe po on-failure
----- OUTPUT TRIMMED -----
Containers:
  on-failure:
----- OUTPUT TRIMMED -----
    Last State:     Terminated
      Reason:       Error
      Exit Code:    127

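Exit code 127 is what the shell returns when it can’t find the command, which is the case for our misspelled sleeeep. Because this exit code is non-zero, the OnFailure policy treated the exit as a failure and the container was restarted.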
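We can reproduce both exit codes without a cluster by running the container commands in a local shell. This is only a sketch: it shortens the sleep to one second and assumes that no command named sleeeep exists on our PATH.

```shell
# Valid command: exits with 0, so OnFailure would not restart the container
ok_status=0
sh -c "sleep 1" || ok_status=$?

# Misspelled command: the shell cannot find it and exits with 127
# (assumes nothing named "sleeeep" is installed locally)
bad_status=0
sh -c "sleeeep 20" 2>/dev/null || bad_status=$?

echo "sleep:   $ok_status"
echo "sleeeep: $bad_status"
```

The first line of output shows 0 and the second shows 127, matching the exit codes reported by kubectl describe for the two versions of our Pod.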

4. Conclusion

In this article, we’ve covered the basics of Kubernetes restart policies and compared two of the most common ones, Always and OnFailure. Restart policies specify the conditions under which a container inside a Pod is restarted. They let us configure a self-healing mechanism that matches the job the container performs.

The Always restart policy will restart a container whenever it is terminated, regardless of the exit status. This ensures that the container keeps running all the time. On the other hand, the OnFailure restart policy will only restart a container if it exits with an error. This is useful in situations when we need our container to perform a specific task and then terminate if the task is completed successfully.