How to Generate a Dockerfile From a Docker Image

1. Introduction

On a typical day, we go from Dockerfiles to images. But what if we could also do the reverse — go from images to Dockerfiles?

While there’s no official or standardized method of creating Dockerfiles from container images, there are workarounds that get us pretty close to the goal. So, in this tutorial, we’ll go over some ways to generate a Dockerfile from an image.

2. Using docker history

With docker history, we can examine the layers of a container image, gaining insight into the commands and instructions used in its parent Dockerfile. Then, with those instructions, we can recreate the image’s Dockerfile.

To illustrate the use of docker history to generate a Dockerfile from an image, we’ll follow these steps:

Create a Dockerfile
Build an image from the Dockerfile
Examine the layers of the image using docker history
Recreate the Dockerfile using information from docker history
Adjust the recreated Dockerfile if necessary

2.1. Creating a Dockerfile

In this illustration, we’ll create a Dockerfile named ParentDockerfile. Then, we’ll cat its content:

$ cat ParentDockerfile
FROM python:3.10-bullseye
EXPOSE 80

COPY static static
COPY app.py app.py
COPY index.html index.html

RUN pip install flask

ENTRYPOINT [ "/bin/bash", "-c", "flask run --debug -p 80 -h 0.0.0.0" ]

This Dockerfile starts from a Python image based on Debian Bullseye. Then, it declares the port the container listens on and copies three paths from the build context into the image.

Next, it specifies a build command that will install Flask and an entry point executable that runs a flask application.

2.2. Building the Image

Now that we have the Dockerfile, we’ll build our image using docker build:

$ docker build . -f ParentDockerfile -t baeldung-image
[+] Building 67.8s (10/10) FINISHED                                                                      docker:default
 => [internal] load build definition from ParentDockerfile                                                         0.1s
 => => transferring dockerfile: 235B                                                                               0.0s
 => [internal] load metadata for docker.io/library/python:3.10-bullseye                                            0.9s
...truncated...
 => => naming to docker.io/library/baeldung-image                                                                  0.0s

To confirm that we have the image, we’ll run docker images:

$ docker images
REPOSITORY       TAG       IMAGE ID       CREATED          SIZE
baeldung-image   latest    a4a6f7d78efc   15 seconds ago   923MB

At 923MB, this image is far from lean. In a real deployment, an image this large would slow down every pull and push. But since we’re only using it for illustration, we’ll run with it.

2.3. Examining the Image Layers With docker history

To examine the layers of the image (baeldung-image), all we have to do is pass the image name as an argument to docker history:

$ docker history baeldung-image
IMAGE          CREATED          CREATED BY                                      SIZE      COMMENT
a4a6f7d78efc   12 minutes ago   ENTRYPOINT ["/bin/bash" "-c" "flask run --de…   0B        buildkit.dockerfile.v0
<missing>      12 minutes ago   RUN /bin/sh -c pip install flask # buildkit     11.7MB    buildkit.dockerfile.v0
...truncated...

The CREATED BY column contains the instructions and commands we’ll need to recreate the Dockerfile for this image. However, some of its entries are cut off with an ellipsis, so we can’t copy them as they are.

To fix that, we’ll add the --no-trunc option to the command:

$ docker history --no-trunc baeldung-image
IMAGE                                                                     CREATED          CREATED BY                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          SIZE      COMMENT
sha256:a4a6f7d78efc50ef485cdb2d7f9b22560fe8ce7628f896cc517fcf0e7cae8cbe   22 minutes ago   ENTRYPOINT ["/bin/bash" "-c" "flask run --debug -p 80 -h 0.0.0.0"]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   


                                                                                                       0B        buildkit.dockerfile.v0
<missing>                                                                 22 minutes ago   RUN /bin/sh -c pip install flask # buildkit                                                                                                          


                                                                                                       11.7MB    buildkit.dockerfile.v0
<missing>                                                                 22 minutes ago   COPY index.html index.html # buildkit
...truncated...

As seen above, using the --no-trunc option alone may leave us with a skewed output. But we can work around this by ensuring the output only includes the columns we need. To do this, we’ll use the --format option.

Let’s try the --format option out by restricting the output to the CREATED BY column only:

$ docker history --no-trunc --format '{{.CreatedBy}}' baeldung-image
ENTRYPOINT ["/bin/bash" "-c" "flask run --debug -p 80 -h 0.0.0.0"]
...truncated...
/bin/sh -c #(nop) ADD file:ff6bc341b5945acf6b9c190d70b5f5806fb3fae7b5c568ad6395aec1b95ba89c in /

As expected, this output, which includes only the CREATED BY column, is a bit more readable.

The entries in the CREATED BY column are in reverse. In other words, the last instruction in the Dockerfile comes first in the docker history output. This means that the instructions and commands for the base image will come last in the output.

Since docker history output is in reverse order of the parent Dockerfile, we’ll have to work our way from the bottom to the top when recreating the Dockerfile.

Better still, we could pipe the output to tac to reverse the order:

$ docker history --no-trunc --format '{{.CreatedBy}}' baeldung-image | tac
/bin/sh -c #(nop) ADD file:ff6bc341b5945acf6b9c190d70b5f5806fb3fae7b5c568ad6395aec1b95ba89c in /
...truncated...
ENTRYPOINT ["/bin/bash" "-c" "flask run --debug -p 80 -h 0.0.0.0"]
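
Since tac is part of GNU coreutils, it may be missing on systems that don’t ship those tools. As a fallback, a small awk-based helper can do the same job. This is only a sketch, and the function name reverse_lines is our own, not something Docker provides:

```shell
# reverse_lines: print stdin with its lines in reverse order (a stand-in for tac).
# Hypothetical helper name. It buffers the whole input in memory, which is
# fine for the few dozen lines that docker history prints.
reverse_lines() {
  awk '{ lines[NR] = $0 } END { for (i = NR; i >= 1; i--) print lines[i] }'
}

# Intended usage (requires Docker and the image built earlier):
#   docker history --no-trunc --format '{{.CreatedBy}}' baeldung-image | reverse_lines
```

Either way, we end up with the oldest instruction first, which is the order a Dockerfile expects.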

2.4. Recreating the Dockerfile

To recreate the Dockerfile, we’ll go through the docker history output manually and pick out the instructions and commands we need.

First, let’s get the untruncated output:

$ docker history --no-trunc --format '{{.CreatedBy}}' baeldung-image | tac
...truncated...
RUN /bin/sh -c set -eux;  for src in idle3 pydoc3 python3 python3-config; do   dst="$(echo "$src" | tr -d 3)";   [ -s "/usr/local/bin/$src" ];   [ ! -e "/usr/local/bin/$dst" ];   ln -svT "$src" "/usr/local/bin/$dst";  done # buildkit
ENV PYTHON_PIP_VERSION=23.0.1
ENV PYTHON_SETUPTOOLS_VERSION=65.5.1
ENV PYTHON_GET_PIP_URL=https://github.com/pypa/get-pip/raw/dbf0c85f76fb6e1ab42aa672ffca6f0a675d9ee4/public/get-pip.py
ENV PYTHON_GET_PIP_SHA256=dfe9fd5c28dc98b5ac17979a953ea550cec37ae1b47a5116007395bfacff2ab9
RUN /bin/sh -c set -eux;   wget -O get-pip.py "$PYTHON_GET_PIP_URL";  echo "$PYTHON_GET_PIP_SHA256 *get-pip.py" | sha256sum -c -;   export PYTHONDONTWRITEBYTECODE=1;   python get-pip.py   --disable-pip-version-check   --no-cache-dir   --no-compile   "pip==$PYTHON_PIP_VERSION"   "setuptools==$PYTHON_SETUPTOOLS_VERSION"  ;  rm -f get-pip.py;   pip --version # buildkit
CMD ["python3"]
EXPOSE map[80/tcp:{}]
COPY static static # buildkit
COPY app.py app.py # buildkit
COPY index.html index.html # buildkit
RUN /bin/sh -c pip install flask # buildkit
ENTRYPOINT ["/bin/bash" "-c" "flask run --debug -p 80 -h 0.0.0.0"]

Upon examining the output above, we found that the commands and instructions before EXPOSE map[80/tcp:{}] are for the base image. After that, the rest are the custom instructions used in the parent Dockerfile.
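
If we have a candidate base image at hand, we can also check where the base image ends instead of eyeballing it. The sketch below assumes the derived image’s history starts with exactly the same CREATED BY lines as the candidate’s, and that both histories were saved oldest-first to files. The helper name custom_instructions and the file names are hypothetical:

```shell
# custom_instructions BASE_FILE IMAGE_FILE
# Both files hold CREATED BY lines, oldest first (for example, saved via "| tac > file").
# Prints the lines of IMAGE_FILE that come after the prefix it shares with BASE_FILE.
# Hypothetical helper; assumes BASE_FILE is not empty.
custom_instructions() {
  awk '
    NR == FNR { base[FNR] = $0; n = FNR; next }   # first file: remember the base lines
    FNR > n || $0 != base[FNR] { diverged = 1 }   # second file: detect the first difference
    diverged { print }                            # print everything from that point on
  ' "$1" "$2"
}

# Intended usage (requires Docker; <candidate-base-image> is a placeholder):
#   docker history --no-trunc --format '{{.CreatedBy}}' <candidate-base-image> | tac > base.txt
#   docker history --no-trunc --format '{{.CreatedBy}}' baeldung-image | tac > image.txt
#   custom_instructions base.txt image.txt
```

If the candidate isn’t the real base image, the two histories diverge early, and the helper prints most of the file. That in itself is a hint to try another candidate.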

When generating the Dockerfile, we could try rebuilding the base image by replaying its instructions on top of a different base image, with some adjustments.

Alternatively, if we can figure out the base image from the instructions in the output, we could use that. Then again, we can choose another base image with similar (or the same set of) resources.

Thanks to the PYTHON_* ENV instructions in the output, we can almost be sure that the base image in the parent Dockerfile is a Python image. So, even if we don’t know the actual Python image, we can try working with another Python image when recreating the Dockerfile:

$ cat Dockerfile
FROM python:3.11-bookworm
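
To narrow the choice of tag, we can look for a PYTHON_VERSION variable among the ENV lines, since the official Python images set one. Here’s a minimal sketch, assuming the history output contains a line of the form ENV PYTHON_VERSION=...; the helper name python_version_from_history is hypothetical:

```shell
# python_version_from_history: read CREATED BY lines from stdin and print the
# value of the first PYTHON_VERSION assignment, if there is one.
# Hypothetical helper; prints nothing when the variable is absent.
python_version_from_history() {
  sed -n 's/^.*ENV PYTHON_VERSION=//p' | head -n 1
}

# Intended usage (requires Docker and the image built earlier):
#   docker history --no-trunc --format '{{.CreatedBy}}' baeldung-image | python_version_from_history
```

The value only tells us the interpreter version, not the Debian release or image variant, so the tag we pick is still a best guess.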

After the substitute base image, we’ll add the custom instructions:

$ cat Dockerfile
FROM python:3.11-bookworm
EXPOSE map[80/tcp:{}]
COPY static static # buildkit
COPY app.py app.py # buildkit
COPY index.html index.html # buildkit
RUN /bin/sh -c pip install flask # buildkit
ENTRYPOINT ["/bin/bash" "-c" "flask run --debug -p 80 -h 0.0.0.0"]

2.5. Adjusting the Dockerfile

While our recreated Dockerfile is almost set for use, we must make some adjustments to avoid errors. First, we need to adjust the EXPOSE instruction:

$ cat Dockerfile
FROM python:3.11-bookworm
EXPOSE 80
...truncated...

Next, we’ll remove the # buildkit comments:

$ cat Dockerfile
...truncated...
COPY static static
COPY app.py app.py
COPY index.html index.html
RUN /bin/sh -c pip install flask
...truncated...

Finally, we’ll add commas between the elements of the ENTRYPOINT array, since the exec form must be a valid JSON array:

$ cat Dockerfile
...truncated...
ENTRYPOINT ["/bin/bash", "-c", "flask run --debug -p 80 -h 0.0.0.0"]

Now, we have a usable Dockerfile:

$ cat Dockerfile
FROM python:3.11-bookworm
EXPOSE 80
COPY static static
COPY app.py app.py
COPY index.html index.html
RUN /bin/sh -c pip install flask
ENTRYPOINT ["/bin/bash", "-c", "flask run --debug -p 80 -h 0.0.0.0"]
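
The three adjustments are mechanical, so we could script them. The sed-based sketch below is our own helper, not part of Docker, and it rests on a few assumptions: it only handles the patterns seen in this tutorial, it treats every quote-space-quote sequence on an ENTRYPOINT or CMD line as an element boundary, and it keeps the protocol suffix on EXPOSE (EXPOSE 80/tcp is equivalent to EXPOSE 80, because TCP is the default):

```shell
# dockerfile_fixups: apply this section's manual adjustments to CREATED BY
# lines read from stdin. Hypothetical helper name. The expressions, in order:
#   1. drop the trailing "# buildkit" marker
#   2. unwrap "map[...]" on EXPOSE lines
#   3. drop the ":{}" left after each port on EXPOSE lines
#   4. add commas between array elements on ENTRYPOINT and CMD lines
dockerfile_fixups() {
  sed -E \
    -e 's/ # buildkit$//' \
    -e '/^EXPOSE / s/map\[(.*)\]$/\1/' \
    -e '/^EXPOSE / s/:[{][}]//g' \
    -e '/^(ENTRYPOINT|CMD) \[/ s/" "/", "/g'
}

# Intended usage (requires Docker; review the result before building):
#   docker history --no-trunc --format '{{.CreatedBy}}' baeldung-image | tac | dockerfile_fixups
```

The output still contains the base image’s instructions, so we’d replace those with a FROM line by hand, as we did above.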

So, let’s build with it:

$ docker build . -t new-image
[+] Building 93.1s (10/10) FINISHED                                                                      docker:default
 => [internal] load build definition from Dockerfile                                                               0.0s
 => => transferring dockerfile: 242B                                                                               0.0s
 => [internal] load metadata for docker.io/library/python:3.11-bookworm                                            1.6s
...truncated...
 => => naming to docker.io/library/new-image

3. Using Custom Tools

Some custom tools like dfimage and dedockify can also help when trying to convert an image to a Dockerfile. But then, we may still need to adjust the Dockerfile we create from their output before use.

To use alpine/dfimage, dedockify, and other similar tools, we can define an alias that runs the tool in a throwaway container, which is removed once it exits:

$ alias dfimage="docker run -v /var/run/docker.sock:/var/run/docker.sock --rm alpine/dfimage"

In this instance, we’re using alpine/dfimage — one of the popular versions of dfimage.

Once that’s done, we can use dfimage on an image:

$ dfimage baeldung-image
Analyzing baeldung-image
Docker Version:
GraphDriver: overlay2
Environment Variables
...truncated...
|PYTHON_GET_PIP_URL=https://github.com/pypa/get-pip/raw/dbf0c85f76fb6e1ab42aa672ffca6f0a675d9ee4/public/get-pip.py
|PYTHON_GET_PIP_SHA256=dfe9fd5c28dc98b5ac17979a953ea550cec37ae1b47a5116007395bfacff2ab9

Open Ports
|80

Image user
|User is root

Potential secrets:
|Found match etc/ssh/ssh_config Client SSH Config .?ssh_config[\s\S]* a005da213eb60592ae486fa7ea13764c01b91580580c00dce77ea7b91b191fd3/layer.tar
|Found match etc/ssh/ssh_config.d/ Client SSH Config .?ssh_config[\s\S]* a005da213eb60592ae486fa7ea13764c01b91580580c00dce77ea7b91b191fd3/layer.tar
Dockerfile:
CMD ["bash"]
...truncated...
ENTRYPOINT ["/bin/bash" "-c" "flask run --debug -p 80 -h 0.0.0.0"]

Typically, tools like dfimage and dedockify may offer an ordered output right off the bat. So, there would be no need for a command like tac.

4. Conclusion

In this article, we discussed methods for generating a Dockerfile from an image. We can generate Dockerfiles from images using data from docker history, dfimage, or dedockify. However, we may need to put in some extra work before we arrive at a usable Dockerfile.
