Automatic Continuous Deployment of Docker containers

This article explains how to achieve Continuous Deployment of Docker-based software, using either pull-based approaches (external tools such as watchtower and harbormaster), or push-based techniques (deployment from the CI/CD pipeline). I explain the advantages and disadvantages of each approach, and also illustrate how automated testing greatly reduces the risk of unnoticed failed deployments.

Table Of Contents

Introduction
Pull-based approaches
Push-based approaches (CI Ops)
Push- vs. pull-based approach
System testing
Conclusion

Introduction

When you use Docker (not Kubernetes) in production, doing Continuous Deployment (CD), that is, pushing changes to production (or other environments), is challenging. While Kubernetes has fancy GitOps controllers, such as ArgoCD or FluxCD, which are excellent CD tools, the Docker ecosystem seems to lack such mechanisms. Still, you need a mechanism that automatically updates your deployment whenever your images have been updated, or when configuration values have changed. At the same time, you want to prevent a fully automated deployment process from breaking your production system without you noticing.

Any manual solution, where you connect to the production system (e.g. with SSH) and update deployment manifest files (such as docker-compose.yml), or use Portainer’s web interface to update image tags, is not really sustainable. It involves manual effort on every rollout, which can seriously decrease the deployment frequency, e.g. because it is an annoying task that teams keep procrastinating on, or because of the process overhead of asking the operations team to do the deployment.

Let’s take a look at different approaches to automate the deployment with Docker.

Pull-based approaches

A pull-based approach is one where you install an additional tool on the Docker server, which continuously pulls and applies changes from a different system, such as a Git repository, or a Docker image registry. Three prominent implementations are:

watchtower: this tool regularly checks for newer versions of the images of your running Docker containers in the corresponding image registry. If it finds a newer image, it pulls it, and restarts the corresponding container, using the same settings that were used when originally starting it. For this to work, you need to reference stable image tags for your containers, e.g. postgres:latest or postgres:12. In other words, watchtower makes little sense when you use very strict version pinning (e.g. postgres:12.2), because watchtower does not change the tag itself. You can configure a lot of settings, such as limiting which images watchtower should watch, or whether it should send notifications (e.g. to MS Teams or via email) whenever it has updated a container.
harbormaster: this Python tool reads a YAML config file (created by you) stored on the server. This YAML file tells harbormaster to regularly (every few minutes) pull a specific set of one or more Git repositories, which are expected to contain a docker-compose.yml file. Harbormaster then (re)starts the referenced services of the docker-compose.yml files, in case they changed. Harbormaster requires you to pin the image tag versions in the docker-compose.yml file, as only changes in the versions will actually lead to a redeployment. In essence, harbormaster is like a GitOps Kubernetes controller, but for Docker.
What’s Up Docker: watches Docker registries for updated images and then activates “triggers” that you define. Many different trigger implementations are available, e.g. updating the docker-compose.yml file, sending notifications via email/messenger/sms, or invoking web hooks.
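At their core, these tools repeatedly compare a declared state with the state that is actually running, and act on the difference. The following is a minimal sketch of that comparison step only. It is not how watchtower or harbormaster are implemented; the function name and the example data are hypothetical.

```python
from typing import Dict, List


def services_to_update(declared: Dict[str, str], running: Dict[str, str]) -> List[str]:
    """Return the services whose running image differs from the declared one.

    Both arguments map a service name to an image reference, e.g.
    {"db": "postgres:12.2"}. Services that are declared but not running
    are reported as well, because they still need to be started.
    """
    return sorted(
        name for name, image in declared.items()
        if running.get(name) != image
    )


# Hypothetical example data: in a real tool, the declared state would be read
# from the docker-compose.yml in Git, the running state from the Docker daemon.
declared = {"db": "postgres:12.3", "web": "example/web:1.4.0"}
running = {"db": "postgres:12.2", "web": "example/web:1.4.0"}

print(services_to_update(declared, running))  # prints ['db']
```

A real tool would run this comparison on a timer and then pull the images and restart the affected services.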

There is one caveat to pull-based systems, though: you need to make sure that a broken deployment update does not go unnoticed. This could easily happen, because the pull-based system is a separate, isolated system, and if errors happen there, you don’t automatically get notified. To prevent this, you additionally have to set up some monitoring mechanism for your containers, such as the Prometheus stack, which I discussed in this article, and configure it to notify you in case of errors.

Push-based approaches (CI Ops)

A push-based approach pushes changes via a CI/CD pipeline from a Git repository to your production system. As is common when using GitOps and CI Ops, the idea is that you store a set of configuration files in Git which declare the state of the production system. This configuration file could be, for instance, a docker-compose.yml file, which contains the concrete version tags of the images used by each service. You can find out more details about GitOps and CI Ops in my GitOps introduction article.

On a technical level, you need to establish a reliable method for the CI/CD job to reach your production environment. Two common methods are:

SSH: if the runners / agents (or whatever they are called by your CI/CD system) have direct (TCP) connectivity to the production environment server(s), they can log in via SSH and run the deployment-update commands on these servers. This requires that the servers open the SSH port (possibly to the Internet), a dedicated (Linux) user account for the CD job, and that you store the SSH keys somewhere in the CI/CD system (e.g. as CI/CD variables when using GitLab CI or GitHub Actions).
Install a runner on the production server: this works only if you use a CI/CD system that has a distributed architecture, where the job management system/server is separate from the machines or containers that actually run the job. Examples of such distributed systems are GitHub Actions and GitLab CI/CD. These systems allow you to run self-hosted runners, which you install yourself, and which you configure to connect to the GitLab/GitHub server. Compared to SSH, this approach is more flexible with respect to connectivity, and it does not require you to store secrets (such as SSH keys) in your CI/CD system.

For Docker, the sequence that updates your deployment can be as simple as this:

(Only for the SSH-based approach) Copy the updated docker-compose.yml from the Git repo to the server
Run docker-compose pull && docker-compose up -d to restart the containers
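For the SSH-based variant, the two steps above could be scripted roughly as follows. This is only a sketch: the host name, user account and remote directory are hypothetical placeholders, and the script assumes that scp, ssh and a working SSH key are available in the CI/CD job. By default it only prints the commands (dry run) instead of executing them.

```python
import shlex
import subprocess
from typing import List


def build_deploy_commands(host: str, user: str, remote_dir: str,
                          compose_file: str = "docker-compose.yml") -> List[List[str]]:
    """Build the two commands of the SSH-based deployment sequence."""
    target = f"{user}@{host}"
    remote_cmd = (
        f"cd {shlex.quote(remote_dir)} && "
        "docker-compose pull && docker-compose up -d"
    )
    return [
        # Step 1: copy the updated compose file from the Git checkout to the server
        ["scp", compose_file, f"{target}:{remote_dir}/{compose_file}"],
        # Step 2: pull the new images and (re)start the containers
        ["ssh", target, remote_cmd],
    ]


def deploy(commands: List[List[str]], dry_run: bool = True) -> None:
    """Run the commands in order; stop at the first failure."""
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            subprocess.run(cmd, check=True)


# Hypothetical values, for illustration only.
commands = build_deploy_commands("prod.example.com", "deploy", "/srv/app")
deploy(commands)  # dry run: only prints the commands
```

Because subprocess.run is called with check=True, a failing command raises an exception, which makes the CD job fail visibly instead of silently continuing.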

Push- vs. pull-based approach

The following table highlights the key differences between the two variants of the push-based approach vs. the pull-based approach:

	Push-based (SSH)	Push-based (Runner)	Pull-based
Operational complexity	Create user account and SSH keys, register them with the CI/CD system	Install runner, register it with the CI/CD system, keep it updated regularly	Install tool (watchtower, harbormaster) and configure it; install monitoring stack and keep it updated regularly
Guaranteed continuous synchronization of declared vs. actual deployment state	❌	❌	✅
Observability	Limited: only at (re-)deployment time, done by the CI/CD or SCM system (e.g. GitLab-generated mails for failed CD jobs)	Same as push-based (SSH)	Full observability (at deploy- and run-time), assuming a separate monitoring stack
Persistent deployment log, stored for the deployment jobs	✅ (CD job log)	✅ (CD job log)	❌ (only non-persistent logs of the watchtower/harbormaster container exist)
Production system credentials are hidden from the CI/CD system	❌	✅	✅

When it comes to the operational complexity (considering the number of systems you need to maintain), the push-based (SSH) approach is the clear winner, because you don’t need to manage any additional systems. I’m assuming that SSH is already enabled on the production server anyway, and that you have a CI/CD system in place. In contrast, the pull-based approach is the clear loser: you need to maintain two additional systems: the tool that pulls (e.g. watchtower), and the monitoring stack.

Regarding state synchronization: the pull-based approach (using harbormaster) is the only one that can guarantee that the system state declared in Git (in the docker-compose.yml file) always matches the actually deployed state – none of the other approaches do this. The pull-based system is also the only one that achieves full observability that also covers the system while it is running (not just at deployment time), because it requires a monitoring stack anyway. Of course, if you instead choose a push-based approach, you can additionally install a monitoring stack to achieve full observability as well.

Having persistent deployment logs is useful for auditing purposes, and can be helpful to diagnose issues in production. Push-based approaches are better in this respect, because the CI/CD system keeps logs for the CI/CD pipelines anyway, including the relationship between Git commits and the pipeline instances. Pull-based tools are much more limited here, as they only offer logs of the container of the tool (e.g. harbormaster).

Finally, you want to limit access to your production system to as few people as possible. The pull-based approach offers the best protection, because only a few (operations) people have access to the servers, and they configure credentials to Git or the image registry – none of the developers get access. The runner-based push-approach is also quite good: it does not leak any credentials to the SCM / CI/CD system (and therefore, to developers). However, it does not protect you completely: a developer could update the CD job to do “rm -rf /”, destroying your production environment. At least this change will be stored in the CI/CD pipeline (and Git commit) log, which lets you track down the responsible developer, leading to a serious conversation with them…

System testing

Having automated tests at the system-level is generally a good idea, to improve your system’s reliability. At the very least, you should have a smoke test, which only starts the containers of your application, verifies that none of the containers immediately crashed, and performs a few simple HTTP/TCP requests to them.
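The "simple TCP request" part of such a smoke test can be sketched with nothing but the Python standard library. The function name and its default values are my own choices for illustration; the sketch assumes that the containers have already been started and that their ports are reachable from the test code.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.5) -> bool:
    """Return True as soon as a TCP connection to host:port succeeds.

    Retries until `timeout` seconds have passed, then returns False.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means the service has opened its port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Port not open yet (or host unreachable): wait and retry.
            time.sleep(interval)
    return False
```

A smoke test would call, for instance, wait_for_port("localhost", 8080) for every published port and fail the pipeline if any call returns False. Note that an open port only proves that the process is listening, not that the application behind it works, which is why a few HTTP requests on top are worthwhile.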

If you have tests, you should run them in your CI/CD pipeline before triggering the deployment job (or before pushing the images to the registry using the production-level tags). Having successfully run smoke- and system-level tests greatly reduces the risk of a failed deployment. If you use the pull-based approach (where you are, otherwise, “blind” w.r.t. deployment errors) you can (at least initially) skip monitoring of the deployment process.

On the implementation level, there are many ways to run such system or smoke tests in a pipeline. In a nutshell, I highly recommend you use a proper tool for running these tests, such as testcontainers, over creating low-level Bash scripts. With Bash scripts, you need to spend a lot of effort on things like:

Detecting when the containers under test are actually fully up and ready to accept traffic. For instance, containers often run some initialization steps before they become ready and open their ports.
Reliably cleaning up containers (or compose stacks) at the end of the test execution, and making sure that the test does not hang indefinitely.
Network connectivity: how can the test code reach the services running in the containers? This highly depends on the CI/CD platform and the way the Docker socket is made available to the CI test job.
- For instance, if you use a mounted Daemon socket (vs. using Docker-in-Docker), the commands docker run (or docker-compose up) will spawn sibling containers, which makes connecting to them more involved.
- Another alternative (for compose stacks) would be to have a dedicated test container defined as service in your docker-compose.yml file, which makes connectivity easier. You would let this container write out the test results via volume-mapped files. You would start the stack with docker-compose up --abort-on-container-exit, so that Docker shuts down the entire stack once the test container (or any other container) has finished. However, this approach requires you to build a dedicated testing container.
Avoiding port clashes if you run test jobs in parallel.
Getting the test result in a format that a developer can consume without effort, in case of failed tests.
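To give an impression of the effort involved, here is how you might tackle just the port-clash item by hand: ask the operating system for a currently unused port and publish the container on it. This is a sketch with a hypothetical helper name, not a complete solution.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the operating system for a currently unused TCP port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Binding to port 0 lets the OS pick any free port.
        sock.bind((host, 0))
        return sock.getsockname()[1]


port = find_free_port()
print(f"publish the container on host port {port}")
```

Even this small helper has a flaw: the port is released again before the container starts, so another parallel job could grab it in the meantime. Letting Docker itself assign a random host port and then querying the assigned port avoids that race.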

To solve these problems all at once, take a look at testcontainers. testcontainers is a solution that lets you start and stop containers (or compose stacks) from testing frameworks, with support for Java, JavaScript, Scala, Python, Go, and Rust (see here). testcontainers is a library (ported to all these programming languages) that wraps calls to the Docker socket. It offers a high-level interface to control all kinds of container-related aspects, such as volume mounts, logs, or networking (ports). It ensures that containers are safely shut down (no matter whether your tests failed or succeeded) and it also solves connectivity issues, or waiting for a container to be ready.

Note that testcontainers is not a container itself, nor is it shipped as one, nor does it orchestrate the tests. It is just a “normal” library that you need to install into your project, like you would install other testing-related libraries such as JUnit, pytest, etc. You use it in your “normal” Java/Python/Go/Scala/… testing code (as if you were writing unit tests), which you start using your typical test runners, such as JUnit or pytest.
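As an illustration, a smoke test using the Python port of testcontainers might look roughly like the following sketch. It assumes that the testcontainers package is installed and that a Docker daemon is reachable. The nginx:alpine image merely stands in for your own application image, and the helper functions are my own, not part of the library (which ships its own wait helpers that I leave out here).

```python
import time
import urllib.request


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code of a GET request to `url`."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def wait_for_http_ok(url: str, timeout: float = 30.0) -> bool:
    """Poll `url` until it answers with status 200 or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if http_status(url) == 200:
                return True
        except OSError:
            # Connection refused/reset or HTTP error status: not ready yet.
            pass
        time.sleep(0.5)
    return False


def smoke_test_nginx() -> None:
    # Imported here so that the helpers above work without the library.
    from testcontainers.core.container import DockerContainer

    # The context manager starts the container and always stops and removes
    # it at the end, even if the assertion below fails.
    with DockerContainer("nginx:alpine").with_exposed_ports(80) as container:
        host = container.get_container_host_ip()
        port = container.get_exposed_port(80)  # random free host port
        assert wait_for_http_ok(f"http://{host}:{port}/")
```

If you rename the function so that it starts with test_, pytest collects and runs it like any other test. Since Docker maps the container port to a random host port, parallel test jobs do not clash.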

Conclusion

In this article, I presented both push- and pull-based solutions that offer automatic Continuous Deployment of Docker containers, and contrasted them. Which approach you choose is up to you, and depends on your requirements. Unless you are being audited and need to actually prove the deployments to an outside party, continuous synchronization (offered only by the pull-based approach) is not necessary. In this case, a push-based approach is preferable, because it is easier to implement, keeps persistent deployment logs, and does not need an additional tool on the server.

Irrespective of the approach, it is highly recommended that you create and run tests (at least smoke tests), and that you implement full observability by installing a monitoring stack, such as the Prometheus stack.