This article demonstrates how you can do infrastructure testing for Ansible roles and playbooks. I explain how the tools Vagrant and Molecule+Docker let you easily provision temporary VMs or Docker containers in which you can experimentally run your Ansible roles/playbooks, or even run unit testing in Continuous Integration.
Introduction to infrastructure testing
Ansible is a CLI tool that provisions other machines. If you are new to Ansible, check out my introduction guide.
When creating Ansible playbooks (or writing your own roles), it is easy to make mistakes, so you need a way to test them. Welcome to the world of infrastructure testing, which is about applying your knowledge of software testing to infrastructure tools, making sure that these tools really do what you think they do. When relating software testing to infrastructure testing, there are two levels of testing we often want:
- Smoke testing: the minimal level of testing you can get by with (“better than nothing”)
  - In software testing, this means simply compiling and starting the application, then shutting it down again in an orderly fashion, and checking that there are no errors
  - In infrastructure testing, this means simply running the Ansible playbook, maybe even running it twice (to check its idempotency), and checking that Ansible does not raise any errors
- Unit/System testing: a more fail-safe way of testing
  - In software testing, we run individual components (unit tests) or the whole system (system tests), put it into a defined initial state, then run some functionality, and compare the output of the system with some hand-crafted expected output, the “test oracle”. The test fails if the output does not match the expected one, or if the component or system crashed.
  - In infrastructure testing, we can do the same as in software testing: a unit test exercises a single Ansible role, and a system test exercises an entire playbook.
So the question is: what tooling can I use to write such tests, and how do I set up testing infrastructure easily (locally or in CI), so that I do not have to use the (real) hardware of my production system?
There are two tools at your disposal:
- Vagrant: Vagrant lets you do smoke testing of your Ansible playbook or role, by creating and destroying locally-running virtual machines (VMs), using various hypervisors (e.g. VirtualBox)
- Molecule & Docker: Molecule is a testing framework (like JUnit, pytest, etc.) for Ansible roles and playbooks. Molecule uses Docker containers to provision temporary environments, which are faster to spin up than VMs, but require a few tricks to get systemd-based services to work. Molecule can be used locally and in CI/CD environments.
Using Vagrant for infrastructure testing
Vagrant is a CLI tool that uses recipes (a Vagrantfile) to download, start and provision VMs. See Vagrant introduction and use cases to learn more about Vagrant and how to use it. The basic idea is that with the right Vagrantfile, all you need to do is to run “vagrant up”, and a few minutes later you have one (or more) VMs with a bare Linux distro installed, which you can then provision with Ansible.
There are two approaches to integrate Vagrant with Ansible:
- Install Ansible and Vagrant on the host, and use Vagrant’s Ansible provisioner, as explained in the Ansible docs here.
  - What happens here is that once “vagrant up” has completed, Vagrant automatically calls the ansible-playbook CLI (on the host) for you, pointing it to an inventory file that Vagrant generated for that VM.
  - This approach is the one I would generally recommend. It is easy to get working on any UNIX-based host, e.g. macOS or Linux. On Windows, getting Vagrant and Ansible to work together is more complicated, because you must use WSL (the Ansible control node does not run natively on Windows). You then also have to get Vagrant to work inside WSL, which is considerably more complicated, and official support is experimental. There are guides such as this one, but your mileage may vary.
- Install only Vagrant on the host, and install Ansible into one of the VMs.
  - There are two sub-variants:
    - Create a dedicated control VM that contains only Ansible, see here for pointers. Make sure to set config.ssh.insert_key to false for all the VMs that are not the control node.
    - Have Vagrant install Ansible into the VM that shall be provisioned, using the ansible_local provisioner.
  - This approach is the most “platform-independent” one, and is probably the most stable if some members of your team use Windows on the host, where approach #1 is difficult to set up.
  - However, these sub-variants are either more resource intensive (extra control VM) or “unclean”, because installing Ansible into the same VM that Ansible is supposed to provision “taints” that VM.
Using Molecule for infrastructure testing
While the Molecule docs focus on describing how to test Ansible roles, you can also test complete playbooks, as described next.
To get started, install the Python packages for Molecule and its Docker driver, e.g. via pip3 install "molecule[docker]" (newer Molecule releases ship the Docker driver in the separate molecule-plugins package, in which case the command is pip3 install molecule "molecule-plugins[docker]").
In your project directory (where your playbook.yml is stored), run molecule init scenario to create a new “molecule” folder.
Delete the files create.yml, destroy.yml and INSTALL.rst in the molecule/default directory, because they are generally not necessary.
Open the molecule/default/converge.yml file and make it look like this:
# converge.yml
---
- name: Converge
  hosts: all
# Tell Molecule not to test a role, but run our playbook
- import_playbook: ../../playbook.yml
Next, we have to change the molecule/default/molecule.yml file. The platforms section of that file defines which Docker container(s) to spin up. You can have multiple entries there, which tells Molecule to spin up a container for each defined image and run your playbook inside each of them.
Here is an example:
# molecule.yml
---
dependency:
  name: galaxy
driver:
  name: docker
platforms:  # Configures the list of environments to which Molecule applies our playbook/role
  - name: instance
    image: "geerlingguy/docker-rockylinux8-ansible:latest"
    pre_build_image: true
    # The following 4 lines are needed only for making systemd work
    command: ""  # keeps Molecule from overriding the container's start command, so that the image's init system binary runs
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
provisioner:
  name: ansible
verifier:
  name: ansible
There is one big caveat: your playbook will most likely use Ansible modules/roles that ensure that some Linux service is started (or that restart one or more services). This requires a working init system (e.g. upstart, systemd, …). Normally, Docker images do not contain an init system, because init systems require a very high degree of Linux privileges, which containers (by default) should not be given. But here we do need an init system, which is why privileged: true is set in the above file. We also need a Docker image that contains an init system. See here for a list of available images maintained by a very active Ansible community member, Jeff Geerling. On that page, scroll down to the “Container Images for Ansible Testing” section, and check the Maintained column to find which images are (still) actively maintained.
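To illustrate the point about multiple platforms entries, here is a sketch of a platforms section that runs the same playbook against two distributions. The instance names are made up, and the second image name is an assumption that follows the naming scheme of the Rocky Linux image used above; check the image list for what is actually maintained.

```yaml
# molecule.yml (excerpt, illustrative): one container per entry
platforms:
  - name: rocky8          # hypothetical instance name
    image: "geerlingguy/docker-rockylinux8-ansible:latest"
    pre_build_image: true
    command: ""
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
  - name: ubuntu2204      # hypothetical instance name
    # Assumed image name, following the same naming scheme
    image: "geerlingguy/docker-ubuntu2204-ansible:latest"
    pre_build_image: true
    command: ""
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
```

Because the converge.yml shown earlier targets hosts: all, both containers would receive the playbook in a single run.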
Systemd containers and WSL2
If you are on Windows and use WSL2, you cannot use Docker containers that internally use systemd, because the WSL2 environment itself does not come with systemd by default. It seems that you need a “host” OS with a properly working systemd. You can achieve this with a Hyper-V or VirtualBox VM on Windows, into which you install Docker, Ansible and Molecule. Newer WSL releases can enable systemd through /etc/wsl.conf, which may lift this limitation.
You can now use the following commands (make sure the working directory of your terminal is the project root):
- molecule converge: creates a new Docker container and runs your playbook or role in it (applying the playbook/role is what Molecule calls “converging”)
- molecule destroy: destroys a possibly-existing Docker container
- molecule test: destroys and recreates the container, and runs your playbook in it
- molecule login: gives you a shell into the running Docker container, which is useful for debugging failing playbooks/roles
There are quite a few more commands, see docs. You can deduce what these commands do in detail by looking at the scenario-default list here.
- The content of the scenario-default-list snippet (that starts with scenario:) has the format <name-of-command>_sequence. For instance, whatever steps are shown in create_sequence are what Molecule does when you run molecule create.
- A few details about these steps:
  - dependency: if you were to test a role which requires other roles, you could add a requirements.yml file in the default scenario directory of the role, and Molecule would install them in this step
  - create: creates the Docker container
  - prepare: if you create a prepare.yml playbook in the default scenario directory of a role, Molecule runs it
  - cleanup: if you create a cleanup.yml playbook, Molecule runs it. This is for cleaning up test infrastructure that may not be present on the instance that will be destroyed. The primary use case is “cleaning up” changes that were made outside of Molecule’s test environment, for example remote database connections or user accounts. It is intended to be used in conjunction with the prepare step, to modify external resources when required.
  - converge: Molecule runs your role or playbook (whatever you defined in the converge.yml file) in the running Docker container
  - destroy: destroys the running Docker container
  - idempotence: runs converge again (so that the playbook has been applied twice), and fails if Ansible reports that something has changed
  - verify: runs your verify.yml playbook, which corresponds to the “test oracle” of software testing. Here you write Ansible tasks that verify whether all the services and files installed by the role/playbook are really working. For instance, for testing a playbook that installs a web server, you would have a file such as this.
- You can change the scenario sequence as you see fit in your molecule.yml file!
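As an illustration of the verify step, the following is a minimal sketch of a verify.yml for the web server case. It assumes, purely for the sake of the example, that the playbook installs nginx as a systemd service listening on port 80; the service name, port and URL are placeholders, so adapt them to what your playbook actually sets up.

```yaml
# molecule/default/verify.yml (illustrative sketch)
# Assumption: the playbook under test installs nginx on port 80.
---
- name: Verify
  hosts: all
  gather_facts: false
  tasks:
    - name: Collect the state of all services
      ansible.builtin.service_facts:

    - name: Check that the nginx service exists and is running
      ansible.builtin.assert:
        that:
          - "'nginx.service' in ansible_facts.services"
          - ansible_facts.services['nginx.service'].state == 'running'

    - name: Request the default page from inside the container
      ansible.builtin.uri:
        url: http://localhost:80/
        status_code: 200
```

With name: ansible set as the verifier in molecule.yml, running molecule verify against an already converged container executes these tasks, and any failing task fails the test.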
If you want to know how to run Molecule in CI (GitHub Actions), see here.
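For orientation, a minimal workflow could look like the following sketch. It is not taken from the linked guide: the file name, the action versions and the installed package names are assumptions (in particular, whether the Docker driver comes from molecule[docker] or molecule-plugins[docker] depends on your Molecule version).

```yaml
# .github/workflows/molecule.yml (hypothetical file name)
name: Molecule
on: [push, pull_request]

jobs:
  molecule:
    runs-on: ubuntu-latest  # GitHub-hosted Linux runners come with Docker
    steps:
      - name: Check out the repository
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.x"

      # Assumption: a recent Molecule release, where the Docker driver
      # is provided by the molecule-plugins package
      - name: Install Ansible, Molecule and the Docker driver
        run: pip3 install ansible molecule "molecule-plugins[docker]"

      - name: Run the default Molecule scenario
        run: molecule test
        env:
          PY_COLORS: "1"
          ANSIBLE_FORCE_COLOR: "1"
```

The two environment variables only enable colored log output and can be dropped.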
Conclusion
The presented tools, Vagrant and Molecule, improve the life of an automation engineer significantly. During the experimentation phase, while you are still figuring out the structure of the role or playbook, you can now work faster and save money, because you neither need to pay for real hardware nor wait for virtual hardware to be provisioned. In addition, Molecule’s ability to run in CI, and to let you write unit/system tests (as you know them from software testing), improves the quality of your infrastructure code and catches problems early.