Operating a self-hosted GitLab runner with Docker

In this article I explain how (and why) you install and use a Linux-based self-hosted GitLab CI/CD runner that executes jobs of your GitLab pipelines. I go into a few caveats and how you can reduce maintenance efforts for the runner to a minimum.

Introduction

GitLab CI/CD has a distributed architecture that consists of a GitLab server and one or more runners. The server renders the web interface and manages the CI/CD pipelines, distributing the jobs of the pipelines to the (shared) runners. The server could be the SaaS offering (gitlab.com), or your company’s self-hosted instance. The GitLab runner is essentially just a CLI tool, running on some dedicated machine, that registers with the server and then keeps polling it over HTTP(S), waiting for jobs that it should execute. You can find more details about how GitLab’s CI/CD system works in this article.

While there are often shared runners you can use (e.g. on GitLab.com), it can make sense to operate your own runners, for reasons like:

  • There are no shared runners available at all, so you must set up your own runners anyway,
  • The available shared runners (e.g. the ones from GitLab.com, or those offered by your IT department) do not meet your requirements. They might be too expensive, or they might be lacking technical prerequisites. For instance, they might be unable to access protected resources located on your company’s intranet, or they might lack a Docker socket, which you need in your CI jobs. Reasons for using a Docker socket include:
    • Building Docker images (although there are alternative tools to do this without a Docker socket, e.g. kaniko),
    • Ability to run containers, e.g. for integration- or system-level tests,
    • Ability to run Docker-based tools such as Earthly or docker-slim.

This article describes how you can set up your own GitLab runner on Linux, which uses the Docker executor (docs) to run CI jobs as isolated containers. I illustrate how to make the host’s Docker daemon socket available to the CI job container (which is optional). Finally, I explain some caveats and how to reduce maintenance efforts for the runner to a minimum.

Security aspects of sharing the Docker daemon socket

Many people say that it is bad practice to share the host’s Docker daemon socket with any containers, because then these containers can do basically anything on the host, given that the Docker daemon runs as root (background). While this is generally true, the security implications really are context-dependent. In this article I’m assuming that you are setting up self-hosted GitLab runners for a limited circle of employees of your organization, who all act responsibly, so they won’t attempt to attack your runner infrastructure.

Hardware considerations

When it comes to hardware, either a virtual machine or a real (bare-metal) host with some Linux distro is fine. The machines could e.g. be located in a data center, or you can rent cloud machines. Performance-wise, it does not make a huge difference whether you use VMs or real hosts. The advantage of VMs is that you can scale them up/down (regarding CPU/RAM) easily, if you find that your pipeline needs more (or fewer) resources. Regarding the Linux distro, you should choose one that is officially supported by Docker to install the Docker engine.

The CPU, RAM and disk requirements depend heavily on the nature of the jobs you run in CI/CD (and the frequency of pipeline runs). You will have to try things out, and adjust the specs over time. If you have a lot of temporarily cached build artefacts (such as the Docker build cache) and want fast pipelines (achieved via GitLab’s caching and Docker’s build cache), you should provision a large disk (at least 128 GB), so that you need to purge the build caches less often. Because GitLab’s local caching (which is the default) will always perform better than distributed caching (where cache files are stored remotely in S3, see docs), having fewer, but more powerful GitLab runner machines is better than having many smaller, weaker machines. The flip side is that the more powerful machines are also more expensive, and you waste resources whenever they are barely in use, e.g. at night.

Often, a part of the CI job is to download and upload GitLab CI/CD artefacts, or push & pull Docker images. Therefore, you should make sure that the runners have a good network link to the respective servers, such as:

  • The GitLab server, to download & upload artefacts or Docker images (in case you use the GitLab image registry)
  • Other image registries, or a pull-through image cache (details) that you might use
  • The distributed cache storage (S3 server), if you use GitLab’s distributed caching (docs)

Software installation

Once you have a “naked” Linux OS running on your hardware, you need to install the Docker engine and the GitLab runner.

Installation of the Docker engine

To install Docker, go to the official docs and follow the instructions for the Linux distribution you are using. This installs Docker using the respective Linux distro’s package manager. Once done, make sure you also follow the Linux post-installation steps documented here, so that you do not need sudo to run Docker CLI commands.

Before continuing with the other steps described below, first configure the Docker daemon appropriately, by editing the /etc/docker/daemon.json file. I’ve discussed useful hints in a previous article here. Most notably, you should limit the disk consumption of container logs. Once you have updated the daemon.json file, restart the Docker daemon to apply the changes, e.g. via sudo systemctl restart docker (when using systemd).
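As a minimal sketch of the log limit mentioned above: the following function prints a daemon.json that caps the logs of containers using Docker's json-file logging driver. The function name and the concrete values (10m, 3 files) are assumptions chosen for illustration, not recommendations from the linked article.

```shell
# Print a minimal daemon.json that caps the size of container logs.
# Assumption: "10m" and "3" are placeholder values; tune them to your disk size.
generate_daemon_json() {
  cat <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
EOF
}
```

You could write the result with generate_daemon_json | sudo tee /etc/docker/daemon.json, but note that this overwrites the file. If the file already contains other settings, merge the two keys by hand instead. The limits only apply to containers created after the daemon restart.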

Installation of the GitLab runner

There are two easy approaches to install the GitLab runner: installing it via the Linux package manager (docs), or running the runner itself as a Docker container (docs). Which option you use is completely up to you. GitLab usually pushes updates of the runner to the different Linux repositories (and, as an image, to Docker Hub) without delay. When installing it via the package manager, you just need to double-check that your chosen Linux distro is supported by the GitLab runner!

The official docs I just linked above are doing an excellent job describing the installation, so I won’t repeat them here. Here are a few hints / caveats:

  • The process of configuring unattended updates of the runner (see section below for details) differs, depending on whether you use the runner’s Docker image, or install it using the Linux package manager. However, the incurred work effort for unattended runner updates is very similar for both installation types.
  • If you use the Docker image, consider mounting the configuration file (config.toml) from the host (docs) rather than a Docker volume, because this makes it easier to edit the configuration file right on the host.
  • After installing the GitLab runner, consider changing some of the global (that is, repository-independent) configuration options (docs), e.g. the concurrent or check_interval setting.
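The last two hints can be sketched as follows. This is a sketch under assumptions, not a definitive setup: the host directory /srv/gitlab-runner/config follows the convention used in GitLab's Docker installation docs, and the function names start_runner_container and set_global_option are made up for illustration.

```shell
# Start the GitLab runner itself as a container (only defined here, not called).
# The first -v bind-mounts a host directory, so config.toml can be edited on the host.
# The second -v gives the runner access to the host's Docker daemon.
start_runner_container() {
  docker run -d --name gitlab-runner --restart always \
    -v /srv/gitlab-runner/config:/etc/gitlab-runner \
    -v /var/run/docker.sock:/var/run/docker.sock \
    gitlab/gitlab-runner:latest
}

# Set a global (top-level) option such as "concurrent" or "check_interval".
# Arguments: path to config.toml, option name, numeric value.
set_global_option() {
  file="$1"; key="$2"; value="$3"
  tmp="$(mktemp)"
  if grep -q "^${key} = " "$file"; then
    # The option already exists at the top level: replace its value.
    sed "s/^${key} = .*/${key} = ${value}/" "$file" > "$tmp"
  else
    # Global options must precede the first [[runners]] table, so prepend.
    { printf '%s = %s\n' "$key" "$value"; cat "$file"; } > "$tmp"
  fi
  cat "$tmp" > "$file"   # keeps owner and permissions of the original file
  rm -f "$tmp"
}
```

For example, set_global_option /srv/gitlab-runner/config/config.toml concurrent 4 (run as a user who may write that file) would allow up to four jobs at the same time. The value 4 is only an example.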

GitLab runner registration

Registering a runner means that you tell it to connect to a particular GitLab (API) server and continuously poll for jobs for a specific GitLab project or group, or even from any group or project, meaning that it is a shared runner. You can run the registration command for a single GitLab runner Linux process multiple times, which results in a dedicated [[runners]] section in the config.toml file for each registration. As will become clear in a second, the runner’s Linux process only listens for jobs, but does not actually execute them – this is outsourced to the executor.

The registration command (docs) needs a few pieces of information. It either asks you for them interactively in the shell, or you use the non-interactive command (docs) to provide all details at once, without any prompts. The minimal set of information needed is:

  • Server URL and Registration token, which you obtain from the GitLab web interface. The Registration token can be obtained either per project, per group, or per server (in the latter case, the GitLab runner is a shared runner). If you register the runner per group, it will accept jobs from any of the group’s sub-projects. The steps to obtain the Registration token from the GitLab web interface are documented here.
  • The Executor, which dictates what the GitLab runner does when receiving a job. There are many executors to choose from (docs), and here we use the docker executor, which spins up a separate Docker container for each job. But there are many other executors, that e.g. spin up a new virtual machine for each job, or schedule a job as pod on Kubernetes.
  • A comma-separated list of tags: tags are simple strings. You use these tags in the .gitlab-ci.yml file to enforce that a job is run by that specific executor.
  • The default Docker image which will be used by Docker- or Kubernetes-based executors that run a job for which you have not specified an image in the .gitlab-ci.yml file.
  • A descriptive name for the executor (optional): this helps you keep multiple executors apart, if you assigned the same tags to multiple executors. A concrete example could be when you installed the GitLab runner onto two machines, so you could provide the machine’s host name as descriptive name. The GitLab web interface shows the descriptive executor name at the top of a CI job log.

There are many other optional configuration options, most of which have reasonable default values that you usually won’t need to touch. You can see them (including their documentation and default value) by running gitlab-runner register -h (or, if you use Docker, docker run --rm gitlab/gitlab-runner register -h).

Once you have retrieved the Registration Token and Server URL, and decided on the list of tags and the other arguments listed above, register a runner as follows:


sudo gitlab-runner register \
  --non-interactive \
  --url "https://replaceWithYourServer.com/" \
  --registration-token "replaceWithYourToken" \
  --executor "docker" \
  --docker-image "replaceWithYourImage:latest" \
  --description "Replace with your description" \
  --tag-list "docker,aws" \
  --docker-volumes "/var/run/docker.sock:/var/run/docker.sock"

The last part of the command ensures that the executor mounts the Docker daemon socket of the host into every CI job container, which lets you use Docker to build or run containers in a CI job.
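To illustrate how a job would end up on this runner and use the mounted socket, here is a hypothetical job definition, printed by a small shell function so it can be redirected into a file. The job name build-image, the image docker:latest and the image tag example-image:latest are assumptions for illustration only.

```shell
# Print a hypothetical .gitlab-ci.yml job that targets the runner registered above.
print_example_job() {
  cat <<'EOF'
build-image:
  image: docker:latest        # any image that ships the Docker CLI
  tags:
    - docker                  # must match a tag from --tag-list
  script:
    - docker info             # talks to the host daemon via the mounted socket
    - docker build -t example-image:latest .
EOF
}
```

Running print_example_job > .gitlab-ci.yml in an otherwise empty test repository (with a Dockerfile) should be enough to try it out. Because the job talks to the host's daemon, the built image ends up in the host's image store and is visible to other jobs on the same machine.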

Unattended automatic updates

To minimize downtime and the time spent on maintaining the GitLab runner machines, you should automate all update processes as much as possible. Your selected Linux distribution will usually have some kind of auto-update mechanism that regularly updates system packages. In most cases, you can configure it to also update third-party packages that you installed via the package manager (here: the Docker engine and the GitLab runner).

There is a small caveat to be aware of if you also run the GitLab server yourself, documented here: for compatibility reasons, the GitLab runner’s major and minor version should stay in sync with the GitLab server’s major and minor version (however, the runner’s patch version may be newer!). If this is not the case, everything might still work, but there might also be problems. You can choose to handle this in various ways:

  • The easiest way (which also makes sense when you use the GitLab.com SaaS offering) is to simply use the newest GitLab runner (because the SaaS server also always uses the newest version anyway).
  • If you are self-hosting the GitLab server, ask whether the admins of the GitLab server can update the server and runners in lock-step.
  • If you are self-hosting, and the admins are not willing to update the runners (e.g. because they are operated by other departments, internally), then whoever manages the runners also has to write routines that regularly compare the GitLab server’s version to the runner version, and update the runner when appropriate.
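The comparison at the heart of such a routine can be sketched as follows. The function names are hypothetical, and how you obtain the two version strings is left open here.

```shell
# Extract "MAJOR.MINOR" from a version string such as "16.4.1" or "v16.4.1-ee".
major_minor() {
  printf '%s\n' "$1" | sed -E 's/^v?([0-9]+)\.([0-9]+).*/\1.\2/'
}

# Succeed (exit status 0) when both versions share the same major and minor number.
# The patch version is ignored on purpose, since the runner's patch version may differ.
versions_in_sync() {
  [ "$(major_minor "$1")" = "$(major_minor "$2")" ]
}
```

A call such as versions_in_sync "16.4.2" "16.4.0-ee" succeeds, while versions_in_sync "16.4.2" "16.5.0" fails (the version numbers are made up). In a real routine, the server version could come from the GitLab version API (GET /api/v4/version, which requires an access token) and the runner version from the output of gitlab-runner --version.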

Additionally, it may make sense to regularly reboot the entire machine, e.g. once per week or month, because automatically installed kernel updates only take effect after a reboot.

Example for Debian

To update the Docker engine, I use Debian’s unattended-upgrades feature:

  1. Make sure that the file /etc/apt/apt.conf.d/20auto-upgrades exists, with the content described on the unattended-upgrades page (which also describes how to generate that file, if it is missing).
  2. Edit the file /etc/apt/apt.conf.d/50unattended-upgrades and add a line
    "origin=Docker";
    into the Unattended-Upgrade::Origins-Pattern section (if that section does not exist, create it). This tells unattended-upgrades to also automatically install all updates of the Docker repository.

To update the GitLab runner, edit the file /etc/apt/apt.conf.d/50unattended-upgrades and add a line
"origin=packages.gitlab.com/runner/gitlab-runner";
into the Unattended-Upgrade::Origins-Pattern section. I determined that origin name by looking at the output of the command sudo unattended-upgrade -d which reveals which packages were skipped by unattended-upgrades.
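After editing the file, a quick check like the following can confirm that both origins are really listed and not commented out. This is a sketch under assumptions: the function name has_origin_pattern is made up, and the check is a plain text search that does not parse the apt configuration syntax.

```shell
# Expected shape of the section after both edits (default entries omitted):
#   Unattended-Upgrade::Origins-Pattern {
#       "origin=Docker";
#       "origin=packages.gitlab.com/runner/gitlab-runner";
#   };

# Succeed when the given file contains the origin pattern on a line
# that is not commented out with "//".
has_origin_pattern() {
  file="$1"; origin="$2"
  grep -v '^[[:space:]]*//' "$file" | grep -qF "\"origin=${origin}\";"
}
```

For example, has_origin_pattern /etc/apt/apt.conf.d/50unattended-upgrades Docker && echo ok prints ok once the Docker line is in place. Running sudo unattended-upgrade --dry-run -d afterwards shows what would be installed without changing anything.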

To ensure that updates of services and of the Linux kernel actually take effect, create a cron job that regularly reboots the machine, e.g. once per week.
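Such a cron job could look like this. The schedule (Sundays at 04:00), the function name and the file name used below are assumptions, so pick a time at which no important pipelines run.

```shell
# Print an /etc/cron.d entry that reboots the machine every Sunday at 04:00.
# Fields: minute hour day-of-month month day-of-week user command
weekly_reboot_cron() {
  printf '%s\n' '0 4 * * 0 root /sbin/shutdown -r now'
}
```

To install it, run weekly_reboot_cron | sudo tee /etc/cron.d/weekly-reboot. Files in /etc/cron.d need the extra user field (here: root), unlike entries in a personal crontab.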

Finally, you should ensure that the Debian distribution itself is kept up-to-date. Otherwise, you will no longer receive security updates. Run cat /etc/debian_version to determine the Debian version. If it is outdated (colored red on https://wiki.debian.org/LTS), upgrade it. It is highly recommended to read the Debian manual for the specific version you are upgrading to, to learn about caveats and changes (e.g. here for upgrading Debian 10 to 11). You can also follow tutorials on the Internet, which are faster to follow than the official manual, such as this one (for Debian 9 → 10), but make sure to use a tutorial that is specifically written for the right Debian version! If your distro is not outdated yet, add an entry to your calendar that reminds you to run the distro upgrade. Choose a date that is a few days or weeks before the distro becomes outdated, according to the Debian LTS table.

An alternative to upgrading a Linux distro is to regularly re-install the current version of the OS, with all the tools (Docker + runner), in an automated way. There are tools such as Vagrant or Terraform that help you with destroying and recreating (virtual) infrastructure, and tools such as Ansible to install and set up all necessary tools and (Linux) user accounts.

Regular disk cleaning

Assuming that the only purpose of the machine is to be a GitLab runner, you can simply create a cron job that regularly runs “docker system prune -af --volumes”, e.g. once per week, depending on your storage capacity. This command deletes all images, volumes, stopped containers, Docker build caches, etc.

If you are worried about the poor performance of the first run of the CI/CD pipeline that happens right after you ran the above command, you can coordinate a scheduled CI/CD pipeline accordingly. For instance, if your regular cleaning cron job runs on Saturdays at 3 AM, you can schedule a CI/CD pipeline for the most important branch(es) on Saturdays at 3:15 AM.
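The Saturday example can be written down as two cron expressions. This is a sketch: the function names, the /etc/cron.d format and the path /usr/bin/docker are assumptions about your setup.

```shell
# Print an /etc/cron.d entry that prunes all unused Docker data on Saturdays at 03:00.
weekly_prune_cron() {
  printf '%s\n' '0 3 * * 6 root /usr/bin/docker system prune -af --volumes'
}

# Print the cron expression for the warm-up pipeline, 15 minutes after the prune.
# This value would go into the "interval pattern" of a GitLab pipeline schedule.
warmup_schedule_cron() {
  printf '%s\n' '15 3 * * 6'
}
```

The first entry can be installed with weekly_prune_cron | sudo tee /etc/cron.d/docker-prune. Keep in mind that the machine's cron and the GitLab pipeline schedule may use different time zones, so check both before relying on the 15-minute gap.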

Conclusion

As I already illustrated, there are three good reasons to invest time and money to operate your own GitLab runner(s):

  1. Extend the runner’s technical capabilities (e.g. full access to a Docker daemon socket)
  2. Improve CI/CD pipeline performance (due to local GitLab caching, and due to Docker’s build cache)
  3. Save costs

The last point, costs, requires extra attention. Suppose you use GitLab’s shared SaaS runners and exceed the included CI/CD minutes (e.g. 400 minutes per month for the free tier, at the time of writing). GitLab then charges you $10 for every 1000 additional minutes. If your pipelines run for about 5h per day, you’d need to spend $10 roughly every 3 business days. In such scenarios, getting a server for $20 per month (e.g. a VPS) will be less expensive. However, you also need to account for the time spent on administrative efforts, setting up the runner, and fixing it in case it goes down. Consequently, if you think about having your own runners only to save costs (and not because of performance gains), you might want to hold off until the formula “(serverCosts + adminCosts) < sharedGitLabRunnerCosts” holds.
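The arithmetic behind this comparison can be sketched in a few lines of shell. Two assumptions go beyond the numbers above: a month is taken to have 21 business days, and additional minutes are bought in whole packs of 1000, so the result is rounded up.

```shell
# Monthly cost (in whole dollars) of additional shared-runner minutes.
# Arguments: minutes used per month, included minutes, price per 1000-minute pack.
# Assumption: extra minutes are bought in whole packs of 1000 (rounded up).
shared_runner_cost() {
  used="$1"; included="$2"; pack_price="$3"
  extra=$(( used - included ))
  [ "$extra" -gt 0 ] || extra=0
  packs=$(( (extra + 999) / 1000 ))
  echo $(( packs * pack_price ))
}

# 5 h/day on 21 business days (assumed) = 6300 minutes per month.
shared_runner_cost $(( 5 * 60 * 21 )) 400 10   # prints 60
```

With these numbers, the shared runners cost $60 per month. A $20 server therefore pays off only if the administrative effort is worth less than about $40 per month, which is the inequality above with concrete values.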
