ARM, docker

Building Docker containers in 2019

It has been a few years now from the container explosion and the ecosystem is starting to settle down. After all the big names in industry jumped in the containerization wagon, very often proposing their own solution, it seems like Docker based platforms are finally here to stay.

After Docker became the de facto standard, many things have evolved, most often under the hood. The Open Container Iniciative (OCI) was created with the objective of standardizing run times and image formats and Docker changed its internal plumbing to accommodate, such as implementing runC, and containerd-shim.

Now, after years of Docker freezing its user facing APIs we are starting to see movement again on this front, and many of the common practices we are used to seeing when building and running containers might have better alternatives. Other things have just been deprecated or fallen out of favor.

Buildkit is here

Now Dockerfiles can be built with the Buildkit backend. This backend is smarter than the classic one, and is able to speed up the build process by achieving smarter caching, parallelizing build steps and also it accepting new Dockerfile options. To quote the official documentation

Starting with version 18.09, Docker supports a new backend for executing your builds that is provided by the moby/buildkit project. The BuildKit backend provides many benefits compared to the old implementation. For example, BuildKit can: Detect and skip executing unused build stages Parallelize building independent build stages Incrementally transfer only the changed files in your build context between builds Detect and skip transferring unused files in your build context Use external Dockerfile implementations with many new features Avoid side-effects with rest of the API (intermediate images and containers) Prioritize your build cache for automatic pruning.

In order to select this backend, we export the environment variable DOCKER_BUILDKIT=1. If you miss the detailed output, just add --progress=plain.

Use multi-stage builds

We are used to seeing a lot of trickery to try to keep each layer size to a minimum. Very often we use a RUN statement to download the source and build an artifact, for instance a .deb file or to compile a binary, and try to clean it up in the same statement to keep the layer tidy.

This both wastes time each build and is ugly for what it does. It is better to use multi-stage builds.

In some cases, in order to avoid this I have seen people adding a binary blob in the git repository so it can be copied. Much better to use this approach.

Use –squash

While it doesn’t always makes sense for every Dockerfile, we can often squash our first layer and achieve a more maintainable Dockerfile.

--squash is an experimental argument that needs to be enabled like we explained here. Instead of

, we can do

Use one Dockerfile for all architectures

We talked about a technique to use qemu-user-static for creating builds for different architectures in a past article. When building different versions of the same image for different architectures, it is very common to see three exact copies of the same Dockerfile where the only thing that changes is the FROM line.

Just have one single file like so

, and build with

Use multi-arch manifests

So we now have three builds with one Dockerfile, and we tag them like this

That’s all good, but we can now simplify the instructions the user has to type by creating a multi-arch manifest.

This is still an experimental CLI feature, so we need to export DOCKER_CLI_EXPERIMENTAL=enabled to be able to access it.

Now your users of any architecture only have to do

, and they will receive the correct version of the image.

Add a HEALTHCHECK

Even if we run with --restart=unless-stopped, the only clue Docker or Docker Swarm to know that things are OK is that the container has not crashed. If it is unresponsive or returning errors it won’t be restarted properly.

It is more robust to add a HEALTHCHECK statement to the Dockerfile

Don’t run as root

Even though containers are theoretically isolated, it is not good security practice to run processes as root inside them in the same way as you don’t run your web server as root.

Towards the end of your build you should add something like

Also, if possible avoid relying on sudo and if you don’t control the Dockerfile at least run in a different user namespace with docker run -u nonroot:nonroot.

Don’t forget to build with –pull

Use docker build --pull in your scripts so you are always on the latest base image.

Don’t use MAINTAINER

MAINTAINER is deprecated. Instead of

, use LABELs instead so they can be inspected just like any other metadata.

Avoid ENV where possible

ENV variables remain in the container at run time and pollute its own environment. Use ARGs instead.

Build with cache mount

Speed up your builds by providing a cache for your package manager, ccache, git and so on. This needs to be enabled with DOCKER_CLI_EXPERIMENTAL=enabled

Use SSH agent

If you require SSH credentials for your build don’t copy ~/.ssh because it will stay in the layer even if you remove it later.
Set up SSH agent, and use the experimental feature for ssh mounts

Use build secrets

If you need sensitive files that should not be public for your build, use secrets. This way, those files will only be visible to that RUN command during its execution and its contents will disappear without a trace from all layers after that.

Read the official recommendations again

While I pointed out what for me are the biggest offenders, it is always good to go back and review the officially recommended best practices.

Most of them we are familiar with: write small containers, rearrange layers to take advantage of build cache, don’t add unnecessary packages, but it’s not uncommon to go back to it and discover something we might have missed before.

Author: nachoparker

Humbly sharing things that I find useful [ github dockerhub ]

Leave a Reply

Your email address will not be published. Required fields are marked *