This article demonstrates how you can do infrastructure testing for Ansible roles and playbooks. I explain how the tools Vagrant and Molecule+Docker let you easily provision temporary VMs or Docker containers in which you can experimentally run your Ansible roles/playbooks, or even run unit testing in Continuous Integration.
Introduction to infrastructure testing
Ansible is a CLI tool that provisions other machines. If you are new to Ansible, check out my introduction guide.
When creating Ansible playbooks (or writing your own roles), it is easy to make mistakes, so you need a way to test them. Welcome to the world of infrastructure testing, which is about applying your knowledge of software testing to infrastructure tools, making sure that these tools really do what you think they do. When relating software testing to infrastructure testing, there are two levels of testing we often want:
- Smoke testing: the minimal level of testing you can get by with (“better than nothing”)
  - In software testing, this means simply compiling and starting the application, then shutting it down in an orderly fashion, and checking that there are no errors
  - In infrastructure testing, this means simply running the Ansible playbook, maybe even twice (to check its idempotency), and checking that Ansible does not raise any errors
- Unit/System testing: a more fail-safe way of testing
  - In software testing, we run individual components (unit tests) or the whole system (system tests): we put it into a defined initial state, run some functionality, and compare the output with some hand-crafted expected output, the “test oracle”. The test fails if the output does not match the expected one, or if the component or system crashed.
  - In infrastructure testing, we can do the same: unit tests test an Ansible role, and system tests test an entire playbook.
So the question is: what tooling can I use to write such tests, and how do I easily set up test infrastructure (locally or in CI), so that I don't need to use the (real) hardware of my production system?
There are two tools at your disposal:
- Vagrant: Vagrant lets you do smoke testing of your Ansible playbook or role, by creating and destroying locally-running virtual machines (VMs), using various hypervisors (e.g. VirtualBox)
- Molecule & Docker: Molecule is a testing framework (like JUnit, pytest, etc.) for Ansible roles and playbooks. Molecule uses Docker containers to provision temporary environments, which are faster to spin up than VMs, but require a few tricks to get systemd-based services to work. Molecule can be used locally and in CI/CD environments.
Using Vagrant for infrastructure testing
Vagrant is a CLI tool that uses recipes (a Vagrantfile) to download, start and provision VMs. See Vagrant introduction and use cases to learn more about Vagrant and how to use it. The basic idea is that with the right Vagrantfile, all you need to do is to run “vagrant up”, and a few minutes later you have one (or more) VMs with a bare Linux distro installed, which you can then provision with Ansible.
There are two approaches to integrate Vagrant with Ansible:
- Install Ansible and Vagrant on the host, and use Vagrant’s Ansible provisioner, as explained in the Ansible docs.
  - What happens here is that once “vagrant up” has completed, Vagrant automatically calls the ansible-playbook CLI (on the host) for you, pointing it to an inventory file that Vagrant generated for that VM.
  - This approach is the one I would generally recommend. It is easy to get working on any UNIX-based host, e.g. macOS or Linux. On Windows, it is more complicated, because Ansible does not run natively on Windows, so you must use WSL. You then also have to get Vagrant to work inside WSL, which is considerably more complicated, and official support is experimental. There are guides for this, but your mileage may vary.
- Install only Vagrant on the host, and install Ansible into one of the VMs
  - There are two sub-variants: install Ansible into an extra control VM, or into the same VM that Ansible is supposed to provision.
  - This is the most “platform-independent” approach, and probably the most stable one if some members of your team use Windows on the host, where approach #1 is difficult to set up.
  - However, the first sub-variant is more resource intensive (extra control VM), and the second is “unclean”, because installing Ansible “taints” the VM that Ansible is supposed to provision.
Using Molecule for infrastructure testing
While the Molecule docs focus on describing how to test Ansible roles, you can also test complete playbooks, as described next.
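To get started, you need to install the Python packages for Molecule and Docker, e.g. via
pip3 install "molecule[docker]"
On newer Molecule releases, the Docker driver ships in the separate molecule-plugins package, so there you would likely run pip3 install molecule "molecule-plugins[docker]" instead.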
In your project directory (where your playbook.yml is stored), run molecule init scenario to create a new “molecule” folder.
Delete the INSTALL.rst file in the molecule/default directory, because it is generally not necessary.
Open the molecule/default/converge.yml file and make it look like this:
```yaml
# converge.yml
- name: Converge
  hosts: all

# Tell Molecule not to test a role, but run our playbook
- import_playbook: ../../playbook.yml
```
Next, we have to change the molecule/default/molecule.yml file. The platforms section of that file defines which Docker container(s) to spin up. You could have multiple entries there, which tells Molecule to spin up a container for each defined image, and run your playbook inside it.
Here is an example:
```yaml
# molecule.yml
dependency:
  name: galaxy
driver:
  name: docker
platforms:
  # Configures the list of environments to which Molecule applies our playbook/role
  - name: instance
    image: "geerlingguy/docker-rockylinux8-ansible:latest"
    pre_build_image: true
    # The following 4 lines are needed only for making systemd work
    command: ""  # stops Molecule from overriding the container's start command, so the image's init system binary runs instead
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
provisioner:
  name: ansible
verifier:
  name: ansible
```
There is one big caveat: your playbook will most likely use Ansible modules/roles that ensure that some Linux service is started (or that restart one or more services). This requires a working init system (e.g. upstart, systemd, …). Normally, Docker images do not contain an init system, because init systems require a very high degree of Linux privileges, which containers (by default) should not be given. But here we do need an init system, which is why privileged: true is set in the above file. We also need a Docker image that contains an init system. Jeff Geerling, a very active Ansible community member, maintains a list of such images. On that page, scroll down to the “Container Images for Ansible Testing” section, and check the Maintained column to find which images are (still) actively maintained.
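To illustrate the point about multiple platforms entries, here is a minimal sketch of a platforms section that runs the same playbook against two distributions. The instance names and the Ubuntu image are my assumptions for illustration (pick whichever images from the list are still maintained); the systemd-related settings are simply repeated for each entry.

```yaml
# molecule.yml (excerpt) - a sketch, not a definitive configuration
platforms:
  # Entry taken from the example above
  - name: rockylinux8
    image: "geerlingguy/docker-rockylinux8-ansible:latest"
    pre_build_image: true
    command: ""
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
  # Assumed second image, added only to show a multi-platform setup
  - name: ubuntu2204
    image: "geerlingguy/docker-ubuntu2204-ansible:latest"
    pre_build_image: true
    command: ""
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
```

With such a file, Molecule creates one container per entry and converges each of them, so distribution-specific problems in the playbook show up in the same test run.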
You can now use the following commands (make sure the working directory of your terminal is the project root):
- molecule converge: creates a new Docker container and runs your playbook or role in it (applying the playbook/role is what Molecule calls “converging”)
- molecule destroy: destroys a possibly-existing Docker container
- molecule test: destroys and recreates the container, and runs your playbook in it
- molecule login: gives you a shell into the running Docker container, which is useful for debugging failing playbooks/roles
- The content of the scenario-default-list snippet (that starts with scenario:) has the format <name-of-command>_sequence. For instance, whatever steps are shown in create_sequence are what Molecule does when you run molecule create.
- A few details about these steps:
  - dependency: if you were to test a role that requires other roles, you could add a requirements.yml file in the default scenario directory of the role, and Molecule would install them in this step
  - create: creates the Docker container
  - prepare: if you create a prepare.yml playbook in the default scenario directory of a role, Molecule runs it
  - cleanup: if you create a cleanup.yml playbook, Molecule runs it. This is for cleaning up test infrastructure that may not be present on the instance that will be destroyed. The primary use case is “cleaning up” changes that were made outside of Molecule’s test environment, for example remote database connections or user accounts. It is intended to be used in conjunction with the prepare step, to modify external resources when required.
  - converge: Molecule runs your role or playbook (whatever you defined in the converge.yml file) in the running Docker container
  - destroy: destroys the running Docker container
  - idempotence: runs converge twice, and complains if Ansible reported that something has changed
  - verify: runs your verify.yml playbook, which corresponds to the “test oracle” of software testing. Here you write Ansible tasks that verify whether all the services and files installed by the role/playbook are really working. For instance, for a playbook that installs a web server, these tasks would check that the web server is actually up.
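To make the verify step more concrete, here is a minimal sketch of a verify.yml for the web server case. It assumes that the playbook installs nginx as a systemd service and that it listens on port 80; both are assumptions for illustration, so adapt the service name and URL to whatever your playbook actually installs.

```yaml
# molecule/default/verify.yml - a sketch under the assumptions stated above
- name: Verify
  hosts: all
  gather_facts: false
  tasks:
    - name: Collect the state of all services
      ansible.builtin.service_facts:

    - name: Check that the (assumed) nginx service is running
      ansible.builtin.assert:
        that:
          - "'nginx.service' in ansible_facts.services"
          - "ansible_facts.services['nginx.service'].state == 'running'"

    - name: Check that the web server answers HTTP requests
      ansible.builtin.uri:
        url: http://localhost:80
        status_code: 200
```

The tasks run inside the container, so localhost refers to the test instance. If any assertion or the HTTP check fails, the verify step (and with it the test run) fails.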
- You can change the scenario sequence as you see fit.
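As a rough sketch of such a customization, the following excerpt overrides the sequence that molecule test runs. I am assuming here that the override is placed in molecule/default/molecule.yml; the selection of steps is only an example, which keeps the steps described above and leaves out the others.

```yaml
# molecule/default/molecule.yml (excerpt) - assumed location of the override
scenario:
  # Steps that "molecule test" runs, in this order
  test_sequence:
    - dependency
    - destroy
    - create
    - prepare
    - converge
    - idempotence
    - verify
    - destroy
```

Leaving out idempotence, for example, can be useful while a role still contains tasks that are known to report changes on every run.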
If you want to know how to run Molecule in CI (GitHub Actions), see here.
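For orientation, a CI setup could look roughly like the following GitHub Actions workflow. This is only a sketch: the file name, the job name, the Python version and the exact pip packages are my assumptions (the package names match newer Molecule releases, where the Docker driver lives in molecule-plugins), so adjust them to your project.

```yaml
# .github/workflows/molecule.yml - hypothetical file name, a sketch only
name: Molecule test
on: [push, pull_request]

jobs:
  molecule:
    # GitHub-hosted Ubuntu runners come with Docker preinstalled
    runs-on: ubuntu-latest
    steps:
      - name: Check out the repository
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install Ansible, Molecule and the Docker driver
        run: pip3 install ansible molecule "molecule-plugins[docker]"

      - name: Run the default Molecule scenario
        run: molecule test
```

The workflow runs molecule test from the repository root, which mirrors running it locally from the project directory.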
The presented tools, Vagrant and Molecule, improve the life of an automation engineer significantly. During the experimentation phase, while you are still figuring out the structure of the role or playbook, you can now work faster and save money, because you don’t need to pay for real hardware, nor wait for the provisioning of virtual hardware. In addition, Molecule’s ability to run in CI and to express unit/system tests (as you know them from software testing) improves the quality of your infrastructure code, and catches problems early.