Taming the Dockerfile beast - Dockerventure Part 2
TL;DR Build and share your own managed docker images.
In this second part of the Dockerventure
series, we are going to focus on how we can define our own custom images with Dockerfiles
, how we can build those images, and publish them in Dockerhub
so we can share them with the rest of the world!
If you missed my intro to Docker
click here Exploring the Docker world
Docker images & Dockerfile basics
In Docker’s world, the image definition is described inside a file called Dockerfile
.
The Dockerfile describes how to assemble an environment for a container that contains all the necessary information and metadata on how to run it based on this image.
To containerize our applications, we have to write Dockerfiles that define step by step how are images are built.
Let’s look at one Dockerfile and discuss it’s building blocks. This Dockerfile defines an image that basically sets up a python flask server and shows a message with a color that we picked.
FROM python:3.7-alpine
# set env var color
ENV color=blue
# set working directory
WORKDIR /server
# copy the dependencies file to the working directory
COPY requirements.txt .
# install dependencies
RUN pip install -r requirements.txt
# copy the script file to the working directory
COPY server.py .
CMD [ "python", "./server.py" ]
The order
of the commands in a Dockerfile matters as they are executed top-down
and each one of them is considered as one layer
of our image.
For reference, this is our python file server.py
that outputs the selected color and we would like to dockerize:
from flask import Flask
import os
server = Flask(__name__)
color = os.environ['color']
@server.route("/")
def hey():
return "Hey there my favourite color is: {}!".format(color)
if __name__ == "__main__":
server.run(host='0.0.0.0')
FROM
Let’s start with the FROM
command which is required in every Dockerfile. It’s the initial image on top of which we start to build our custom one.
Can be either a minimum Linux distribution or some image configured already with the basic tools we need for our application. In our case, since we are going to run a python script, we used python:3.7-alpine
.
ENV
ENV
refers to an environment variable and is one of the ways to set them. These environment variables are used extensively to inject key/value pairs for building and running containers.
Environment variables are quite handy for these use cases as they work on every os, config, environment. In our case, we set the color
we picked.
RUN
RUN
command is used to execute shell commands inside the container while it’s building it. We use RUN
to create files, folders, install dependencies, run shell scripts, and various other tasks that we would like to run inside our container at creation time to prepare it.
To combine multiple commands with the same RUN
statement you can use &&
. This way we can include multiple commands in a single layer in our image.
WORKDIR
WORKDIR
is changing the working directory inside our container.
COPY
COPY
is used to copy files inside the container. This is useful to transfer for example executable files and various other files used by the container. In our case, we copy our python script along with its dependencies definitions.
CMD
CMD
is the final command that will be run every time we launch a new container from this image or restart a container. In our example, we are just executing our python app at runtime.
Build an image from a Dockerfile
To build our Dockerfile and produce an image that can be used to create a containerized version of our app, we need to execute the docker build
command.
The docker build command requires just a path in order to set the build context
, the directory where to build from. That is enough if we use the default name Dockerfile
for our build files.
If we choose to pick a custom Dockerfile name, we can specify this with the -f
flag. We can also specify a custom tag and a repository at which to save the new image with the -t
flag.
To build our image, we have to run this command from the directory where our Dockerfile1
lives.
docker build -t myflaskserver:0.0.1 -f Dockerfile1 .
In the above command, we specify -t
flag and we add a custom tag for our image, myflaskserver
. We can also define a numbered version after :
like 0.0.1
.
Alright if you tried the above command with our Dockerfile you should have built an image locally. Let’s verify this by executing:
docker images
There we can see our new image along with some info like REPOSITORY
TAG
IMAGE
ID
CREATED
SIZE
.
Now we are ready to use our newly created image to generate a containerized version of our python flask server. Let’s go ahead and try that.
docker run -d -p 5000:5000 myflaskserver:0.0.1
docker ps
Now try curl on localhost on port 5000 or open your browser at locahost:5000 and you should see a message like below:
curl localhost:5000
Amazing our containerized python server works as expected!
Let’s try something else. Let’s use the environment variable color
to change the message. First,let’s kill the previous container by executing docker rm -f <docker-container-id>
. Then:
docker run -d -p 5000:5000 -e color=green myflaskserver:0.0.1
And again if you try curl or go to your browser at localhost:5000
you should see the color has changed:
Publish a custom Docker image
Alright, so far we have our custom image and verified that it works as expected.
Next, we would like to store this image on Dockerhub, so we can share it with other people.
To do that we’ll need a Dockerhub
account so if you don’t have one go ahead and register here. After you created your account, you can log in from the command line with docker login
.
In order to publish our image to our account on Dockerhub, we must tag it using with the docker account id
like that:
docker build <Your Docker ID>/myflaskserver:0.0.1
For example, my Docker ID is moustakis
:
docker build -t moustakis/myflaskserver:0.0.1 -f Dockerfile1 .
And then we can simply run:
docker push moustakis/myflaskserver:0.0.1
This will automatically create a public repository for our image. Go ahead and explore your newly published image here. Congrats on your first published image!
Dockerfile Best Practises
As the last step let’s examine some best practices on creating Dockerfiles and try to improve our previous Dockerfile a bit.
Instruction order matters
To leverage efficiently the build cache mechanism place the instructions that tend to change more frequently after the ones that change less often.
For example, taken from our previous Dockerfile, our dependencies tend to change less frequently than our code. So we decided to first copy the requirements.txt
, install them and then copy the server.py
.
Notice that we could copy both the requirements.txt
and server.py
using the same layer but we chose not to leverage the caching mechanism on the less often changed requirements.txt
.
# copy the dependencies file to the working directory
COPY requirements.txt .
# install dependencies
RUN pip install -r requirements.txt
# copy the script file to the working directory
COPY server.py .
Ephemeral containers
Keep in mind that containers by nature are destroyed and replaced all the time, so the containers that we create with Dockerfiles should be able to terminate and rebuilt.
Use .dockerignore
Similar to .gitignore
, used to exclude files that aren’t relevant to the build.
Multi-stage builds FTW
By including files or packages that aren’t necessary in our final image we end up with larger build times and larger image sizes.
That results in more time to build, push, and pull our images along with some other disadvantages like increasing the possible attack surface by some malicious outsider.
To avoid this, we can create different stages in our Dockerfile, for example:
FROM python:3.7-alpine as base
ENV color=blue
FROM base as builder
RUN mkdir /install
WORKDIR /install
# copy the dependencies file to the working directory
COPY requirements.txt /requirements.txt
# install dependencies
RUN pip install --prefix=/install -r /requirements.txt
FROM base
COPY --from=builder /install /usr/local
# set the working directory in the container
WORKDIR /server
COPY server.py ./
ENTRYPOINT [ "python" ]
CMD ["./server.py" ]
In the above Dockerfile we use the builder
stage to install build our dependencies but strip them from our final application image.
The first stage is only used for building dependencies and we copy to our final stage only the necessary files produced from the first stage. To define a stage, we use the as
argument to the FROM
command as shown.
Go ahead and build this image and then run docker images
and compare the size of the two images, the second one should be smaller:
docker build -t <Your Docker ID>/myflaskserver2:0.0.1 -f Dockerfile2 .
docker images
Install only the necessary packages
It’s never a good idea to install more stuff than you need, keep it simple and add only completely necessary packages otherwise you’ll find yourself fighting to maintain dependencies.
One container for one job
Each of your images should be defined to execute one specific job. The Docker architecture favors decoupling applications as much as possible so try to separate your container’s responsibilities as much as possible.
This way you can achieve horizontal scaling, reusability, easier maintenance, and faster development lifecycles.
Other useful Dockerfile instructions
LABEL
Labels can be added to the image for various reasons like organization, automation, versioning, etc. Use the LABEL
command followed by a key-value
pair:
LABEL version="0.0.1"
EXPOSE
Defines the ports on which the container listens for connections to be used.
ENTRYPOINT
Defines the default command to run in a container if specified. To overwrite, we need to specify the command --entrypoint
at runtime.
If only the CMD
instruction exists then the CMD
is executed and can be overwritten by placing at the end of the docker run
another command.
If an ENTRYPOINT
is defined in a dockerfile then the value of CMD
e.g. CMD ["5"]
will be passed as a parameter to the value of ENTRYPOINT
ENTRYPOINT ["sleep]
. In our second Dockerfile
example we modified the initial
CMD [ "python", "./server.py" ]
to:
ENTRYPOINT [ "python" ]
CMD ["./server.py" ]
This way we specify that python
is the image’s main command and the "./server.py"
is the default flag passed to the entrypoint.
For example, if we would just like to check the python version, we could specify the --version
after our docker run command which will replace the execution of our python file "./server.py"
.
VOLUME
Used to define some persistent volumes for any mutable parts of the image. Anything that isn’t considered ephemeral and should be persisted further the container’s lifecycle.
Summary
That concludes the second part of our Dockerventure. Hope you enjoyed our deep dive into Dockerfiles
as much as I did.
We explained and analyzed some of the fundamental and most used Dockerfile instructions
, wrote a simple Dockerfile for our python server, and learned how to build and push our custom images to Dockerhub
.
In the end, we saw some best practices
around Dockerfiles and tried to implement some of these to improve our first Dockerfile. Here you can find the next episode on Useful Docker commands.