Container image security part 1: Fallacies of image scanners

This article explores why container image vulnerability scanners, such as Trivy, often produce false positives and negatives. It outlines the resulting issues and provides specific examples of these inaccuracies. Additionally, an analysis of eight popular Docker Hub images reveals that Trivy’s open-source version rarely detects the tested CVEs in the image’s primary component compared to Grype.

Table Of Contents

Introduction
What makes an image secure (or vulnerable)
How scanners work (including background on Linux distros)
False negatives
False positives
Hands-on analysis of 8 top-downloaded Docker Hub images
Mitigation options for false negatives
Mitigation options for false positives
Conclusion

Container image security series

This article is part of a multi-part series:

Part 1: Fallacies of image scanners (this article): explains how scanners work and which false positives/negatives they produce. Also available in German on heise online.
Part 2: Minimal container images: provides a list of off-the-shelf, free (and paid) minimal images for bare Linux, PHP, Python, Java, C#, and Node.js. Also available in German on heise online.
Part 3: Building custom minimal container images: how to build your own minimal images based on Chainguard/WolfiOS, Ubuntu Chiseled, and Azure Linux
Part 4: Choosing the best container image: discusses 8 selection criteria for images, describing what they are, why they matter, and how to evaluate them

Introduction

Container image scanners, such as Trivy or Grype, are indispensable tools for detecting vulnerabilities in container images. During a scan, they generate a Software Bill of Materials (SBOM) and compare it with various vulnerability databases to identify potentially exploitable components.

Many IT teams use these scanners under the assumption that they accurately detect all vulnerabilities. However, it is less known that these scanners are prone to errors and often produce both “false positives” and “false negatives”. This leads to significant efforts for the teams to analyze the findings, and can pose security risks when the scanner does not report false negatives.

This article explains how the scanners operate and provides a comprehensive list of all false positives and negatives, accompanied by concrete examples. An analysis of 8 of the most downloaded Docker Hub images using Trivy and Grype reveals significant false negatives in almost all cases, resulting in security blind spots. Finally, solution approaches are outlined.

What makes an image secure (or vulnerable)

When you examine popular official images on Docker Hub, the Docker Hub UI displays several vulnerabilities, even for the most recently pushed tags. Clearly, neither Docker Inc. nor the image maintainers would tolerate the most recent images being constantly vulnerable. Something must be off! Where do these supposed vulnerabilities come from, and are they exploitable?

When you take a closer look at images, you find that they typically contain a large number of software components (such as binaries and libraries). For instance, a PostgreSQL image such as postgres:17.1 does not just include one component, PostgreSQL, but over 200 components, including OpenSSL, the Bash shell, and various system packages and libraries. Some of these components may be vulnerable to exploits, which is the case if and only if a component is loaded into memory while the image runs and the vulnerable part of the component is executed.

The core problem is that the report generated by any image vulnerability scanner (such as Trivy or Grype) is fallible, for various technical reasons (discussed further below). What we, the consumers of images, would ideally want are scanners that produce only “true positives” and “true negatives”. But in reality, these scanners will also create “false negatives” and “false positives”.

Terminology like “false positive” comes from statistics and medicine (see Wikipedia). Take the term “true positive”: the second word, “positive”, is the assessment of what the scanner thinks (and reports to you), and the first word, “true”, is whether this assessment is correct (which you could determine via a detailed, manual analysis). So, in our case, the meanings are:

True positive: the scanner reported a vulnerable component, and the assessment is correct, because during the container’s execution, this component is really loaded to memory and its vulnerable code paths are executed, allowing an attacker to exploit them. You need to react to such findings, e.g., by updating or removing the affected component.
False positive: the scanner reported a vulnerable component. But it cannot be exploited in this image, for various reasons explained further below. The scanner was wrong. Such results are noise, consuming a lot of your time, because you initially don’t know whether a “positive” reported finding is a true positive or a false positive. You need to invest time to determine that manually, also referred to as triage.
True negative: the scanner did not report a component (that is part of the image) as vulnerable, and rightfully so, because that component really was not vulnerable.
False negative: the scanner did not report a component (that is part of the image) as vulnerable, but it should have, because that component actually is vulnerable. This is a dangerous situation, because your image is vulnerable and you don’t even know it.

How scanners work (including background on Linux distros)

The following figure outlines how a scanner like Trivy works when scanning an image:

First, the scanner builds an (internal) SBOM, a list of all software components. The implementation details of each scanner differ, but the general idea is that each tool has several catalogers that know how to identify software packages. Important catalogers include:
- Linux package manager metadata: every scanner first identifies the Linux distribution of the image (e.g. Alpine or Ubuntu) and then looks at the distro’s package manager metadata that records which packages are installed via the package manager (e.g. /var/lib/dpkg/status for apt). Each scanner supports a slightly different set of distros!
- Programming-language package manager metadata: most scanners can index the dependency manifests of various programming languages (e.g., requirements.txt for Python) to detect language-specific components.
- Embedded SBOMs: to allow scanners to detect software copied into the image without the use of a package manager, some scanners (such as Trivy, but not Grype) support finding embedded SBOMs that are part of the image’s file system. This is documented here for Trivy. Both Trivy and Grype also support some proprietary embedded SBOM formats, e.g., for Bitnami images.
- Metadata embedded into binaries: the compilers of some programming languages (such as Rust or Go) can embed metadata (with dependency data) into the native binaries they produce. Scanners like Trivy or Grype understand this metadata and can use it to identify components (and their transitive dependencies).
- Binary scanning: to identify components in non-packaged native binaries (other than Go or Rust), the Syft binary cataloger (which is internally used by Grype) identifies various binaries using path regexes. Trivy’s Open Source edition lacks such a cataloger, but the commercial Aqua Security edition supports this.
Second, the scanner CLI downloads a vulnerability database (unless it already has a recent copy in its local cache). The scanner vendors precompile this file daily. Its size is ~60-70 MB for Trivy and Grype. The file uses a scanner-proprietary format and includes vulnerabilities from various vulnerability databases. The advantages of these precompiled databases are:
- The scanner can efficiently search in them (because they are in a proprietary format)
- The scanner does not need to query each vulnerability database over the Internet at scan-time (many databases don’t have a query interface anyway)
- The scanner can work in air-gapped environments
Finally, the scanner cross-references all components of the internally built SBOM with the entries of its vulnerability database. It reports any match as a positive finding.

As the following sections will soon make apparent, the final step (cross-referencing) is very challenging to implement for the scanners. The underlying reason is that a software package might be distributed in many different repositories, under different identifiers and versions, which are each affected by different vulnerabilities.

This is best explained by an example: suppose your image contains a Python interpreter. Python is distributed in many different forms. You could download it from the python.org homepage, where the binary builds are created from the “upstream” source code. But Python also comes packaged in most Linux distros (like Red Hat Enterprise Linux, SUSE Linux, Alpine, Debian, or Ubuntu). These distros build Python themselves and store the binaries in their official distro repositories.

Here is where it becomes interesting: whenever the maintainers of a Linux distro decide to create a new version of their distro (e.g., Ubuntu 22.04 or Alpine 3.21), they decide which packages (Python, etc.) to include in their official repositories, and in which version. The distros then keep that package version the same, for stability reasons, for the entire lifetime of that distro version, to avoid surprises for their distro users. For instance, in the case of Python, this is Python 3.12.3 for Ubuntu 24.04, as shown here. Because such an old Python version quickly becomes vulnerable, the distro maintainers create backports of security patches. Consequently, the version identifier in Linux distros is typically no longer a normal semantic version (such as Python version 3.12.3), but contains distro-specific suffixes, e.g. 3.12.3-1ubuntu0.5 for Ubuntu, or 3.12.9-r0 for Alpine. Each distro has its own security tracker where the distro maintainers analyze which of their builds are affected by which CVE (see overview of security trackers from the Trivy docs).

Consequently, when scanners like Trivy cross-reference vulnerabilities for components like Python (that were installed via the distro’s package manager), the scanners scope their cross-referencing algorithm (from step 3) to the distro-specific vulnerability database. They only detect vulnerabilities for packages installed from the official repositories. Thus, even if “generic” vulnerability databases (such as the NVD) contain Python vulnerabilities that relate to Python’s upstream source versions (e.g., using a CPE identifier such as cpe:2.3:a:python:python:3.12.3:*:*:*:*:*:*:* whose vulnerabilities are found here), the scanners won’t cross-reference the found Python package with these NVD entries. If they did, it would produce too many false positives.

In summary, that implies that the scanner’s cross-referencing algorithm will fail to detect vulnerable components installed via the distro’s package manager, if they were not installed from the official repository, but from custom repositories (or, e.g., “.deb” files).

Note: cross-referencing for packages that are stored in a single “authoritative” registry (e.g., Node.js packages stored in npmjs.org) is not affected by this problem.

False negatives

The following table summarizes the reasons why scanners omit reporting vulnerabilities:

Reason	Example(s)
The scanner does not detect a component because it was not installed via a package manager.	A multi-stage `Dockerfile` builds a native C/C++ binary in the build stage and copies it to the image’s final stage.
The scanner does not detect a component, even though it was installed by a package manager, because the package manager’s metadata was deleted.	Tools like mint (formerly DockerSlim) delete such metadata when optimizing the image, or people deleted it accidentally or on purpose, as explained in the Malicious Compliance KubeCon 2023 talk and in this KubeCon 2025 follow-up talk.
The scanner detects a component, but it is not known in the scanner tool’s database, which only covers packages from the official Linux distro’s repository.	A package was installed into a Debian-based image from a third-party Debian repository or using a “.deb” file (e.g., done by PostgreSQL).
The scanner detects a component, and (in principle) this component is known as being vulnerable in the scanner’s vulnerability database. However, the component detection used a different identifier for the component than the identifier used in the database. Consequently, the cross-referencing process fails.	When attempting to use Trivy or Grype to scan SBOMs produced by other tools (e.g., by Bazel or apko), no vulnerabilities will be found (e.g., for the SBOMs provided by Google’s distroless images or Chainguard’s images).
Problems with the scanners’ vulnerability database.	– The scanner vendor did not correctly update/build their bundled database, e.g., missing an entire vulnerability database provider – The scanner tool did not update the locally cached database for a long time, maybe due to connectivity issues – The vulnerability in the component’s code has not been reported by anyone yet, or it has not been given a CVSS rating yet (see report of growing NVD backlog)

False positives

The following table summarizes the reasons when scanners report vulnerabilities that are not really vulnerable:

Reason	Example(s)
The scanner finds a component in the image that does have vulnerabilities, but that component is either not loaded at all throughout the container’s lifetime, or only those parts of it are used that are not vulnerable.	You run an image of a database server that also includes a vulnerable Perl interpreter, but you never use Perl.
The scanner thinks it found a component, because it was listed in the package manager’s metadata. However, the files/binaries of that component have been (manually) deleted, without properly updating the metadata.	In your `Dockerfile`, you included lines such as “`RUN rm <path-to-component>`” instead of “`RUN apt-get -y remove <component>`”.
There is a disagreement between vulnerability database maintainers and software maintainers regarding whether a component is actually vulnerable. Essentially, the software maintainers know about the vulnerability, but don’t plan to fix it anytime soon, for various reasons.	– CVE-2005-2541 is considered a high-severity vulnerability, but Debian considers it “intended behavior” (a feature). – The software maintainers consider the vulnerability to be only a minor problem and have assigned a low priority, so they might fix it only in a few months or years.
The scanner is unable to detect security backports that some Linux distributions make for certain tools.	CVE-2020-8169 indicates that `curl` versions `7.62.0` though `7.70.0` have a vulnerability that is fixed in `7.71.0`. However, Debian Buster’s `curl` package has a backport of the fix in version `7.64.0-4+deb10u2` (see security-tracker.debian.org and DSA-4881-1) which the scanner does not detect (the scanner detects version `7.64.0` and reports it as vulnerable).

Hands-on analysis of 8 top-downloaded Docker Hub images

To understand the impact of false negatives, the following table shows the result of an analysis of eight of the most popular images hosted on Docker Hub (determined by this search). For each image, a representative CVE of the primary component (e.g., the Python interpreter when analyzing the python image) was chosen, and a vulnerable version of the primary component was picked. Then, one (or more) image tags of that specific, vulnerable version were scanned with Trivy and Grype to determine whether these scanners would detect the primary component, and if so, whether the scanner would cross-reference the CVE as expected.

Image name	Scanned tag(s)	Tested CVE	Trivy detected component	Trivy found CVE	Grype detected component	Grype found CVE
fluent/fluent-bit	`3.0.0`	CVE-2024-4323	❌	❌	✅ (binary cataloger)	✅
memcached	`1.6.21-alpine`, `1.6.21-bookworm`	CVE-2023-46853	❌	❌	✅ (binary cataloger)	✅
nginx	`1.26.0-alpine`, `1.26.0-bookworm`	CVE-2024-7347	✅	❌	✅ (apk/deb cataloger)	✅/❌ ⚠ ¹⁾
redis	`7.4.1-alpine`, `7.4.1-bookworm`	CVE-2024-46981	❌	❌	✅ (binary cataloger)	✅
postgres	`17.2-alpine`, `17.2-bookworm`	CVE-2025-1094	Alpine: ❌ Bookworm: ✅	❌	✅ (binary cataloger)	Alpine: ✅ Bookworm: ❌
python	`3.12.3-alpine`, `3.12.3-bookworm`	CVE-2024-6232	❌ ²⁾	✅ ⚠ ²⁾	✅ ⚠ ³⁾(binary + deb cataloger)	✅ ⚠ ³⁾
node	`20.18.1-alpine`, `20.18.1-bookworm`	CVE-2025-23083	❌	❌	✅ (binary cataloger)	✅
mongo	`8.0.1`	CVE-2024-10921	✅	❌	✅ (deb cataloger)	❌

Remarks:

1) Grype’s internal vulnerability database claims there was no newer NGINX version that fixes the vulnerability (even though there is!). Thus, running “grype nginx:1.26.0-bookworm --only-fixed”, would not report CVE-2024-7347
2) In the Alpine image, Trivy does not detect Python. In the Bookworm image, Trivy misdetects Python with the identifier “pkg:deb/debian/libpython3.11-stdlib@3.11.2-6?arch=amd64&distro=debian-12.5”, and thus reports a large number of false positive CVEs affecting Python 3.11.2
3) In the Bookworm image, Grype detects Python twice! The binary cataloger detects the correct version (3.12.3), the deb cataloger incorrectly detects version 3.11.2 (same problem as Trivy, see note 2). Grype consequently reports a large number of false positive CVEs affecting Python 3.11.2

As the results show, Grype’s detection of the primary component is often better than Trivy’s, thanks to Grype’s binary cataloger, which (“coincidentally”) covers the primary components of those images picked for this test.

Mitigation options for false negatives

As the above analysis has shown, the likelihood is high that your scanner will produce false negatives for the primary component of an image.

To avoid such false negatives, the following mitigation options are not a good idea:

Just use Grype (or use multiple scanners in parallel): while Grype produces fewer false negatives than scanners like Trivy (which was also reported by Chainguard), Grype still does yield false negatives occasionally, and will produce more false positives, on average.
Wait and hope that scanner vendors fix the problem: various GitHub discussions indicate that this problem won’t be fixed by the scanner vendors anytime soon. To properly avoid the false negatives demonstrated above, image producers and scanner vendors would have to collaborate to define a standard where image producers embed metadata files (that universally identify non-packaged components like Python) that every scanner understands, helping the scanner with correctly cross-referencing the component.

The only remaining option is a manual analysis, where you repeat the process of the above example analysis with the images you use.

First, collect a list of all images you use, including:

Images your development teams use as base images, such as interpreted language runtimes, such as Python, .NET, Java, or Node.js
Third-party images you run in your infrastructure, e.g., reverse proxies, web servers, databases, message brokers, SSL certificate renewal bots, workflow engines, observability platforms, CI/CD runners, Kubernetes operators, etc.

To reduce the analysis efforts, it is advisable to specify which of these images are really “security-critical” and which ones are not, marking them “out of scope”.

Given your reduced list of images, repeat the following steps for every image to determine whether you are affected by false negatives:

For the image’s primary component, research 2-3 CVEs that are recent (e.g., up to 1 year old)
For each CVE, research a concrete version (major.minor.patch) that is affected by the CVE
To verify whether your scanner(s) can even identify the primary component, run your scanner(s) with specific command-line arguments to print the internally-generated SBOM to a file. In that SBOM file, search for the primary component’s name and version. Ignore entries with identifiers that start with “pkg:oci/…” or “pkg:docker/…” because these relate to the identifier of the scanned image, not of the actual component.
Verify that if you cannot find the component in the SBOM file, the scanner also should not find the CVE.
If you do find the component in the SBOM file and the scanner does report the expected CVE(s), this image is not affected by false negatives for the primary component.
If you find the component in the SBOM file, but the scanner does not find the expected CVE(s), you still need to double-check whether this is really a false negative (as it could also be a true negative):
- Indicators for false negatives (i.e., broken cross-referencing):
  - For Trivy, look for the CVE number in Trivy’s vulnerability database. If it is missing or only known for Linux distros other than the one used in your analyzed image, it is a false negative.
  - For Grype, the “grype db search” command searches Grype’s vulnerability database. Either provide the arguments --vuln CVE-<year>-<cve-number> or --pkg <name, e.g. python> to check whether Grype knows the CVE under the same identifier as the one from the SBOM file.
- Indicator for true negative (i.e., correct cross-referencing): the primary component was installed from the distro’s official package repository (check the image’s Dockerfile), and the scanner correctly identified the primary component (with the distro-specific version), and the CVE is known to be not vulnerable in that specific distro (example that shows that CVE-2024-4032 is not present in Python 3.12 in Ubuntu 24.04, because of a back-port of a security fix): if all these conditions hold, the scanner should not report the CVE (→ true negative)

Suppose you completed your analyses and found that your preferred image scanner is affected by false negatives for, say, 3 specific images. You then have these options:

Build custom tooling that you run regularly, which keeps track of those 3 specific images (and the deployed versions). Your tooling needs to look up vulnerabilities of these images by different means, e.g., by querying the osv.dev API with the component name and version.
Research whether you can buy or build alternative images that are hardened for security, containing a minimal (and frequently-patched) number of components, for which your scanner correctly identifies the primary component. Examples include free images, such as Google’s distroless, or commercial images, such as Bitnami, RapidFort, or Chainguard. Part 2 of this article series has more details.

Mitigation options for false positives

When scanners report a “positive” finding, you will need to invest time to triage whether the positive is true or false.

In general, it’s recommended to set your scanner’s CLI argument that omits reporting vulnerabilities for which no fixed version is available yet (such as Trivy’s “--ignore-unfixed”). This greatly reduces the number of reported positives. Irrespective of whether those positives are true or false, you could not do much either way, since there is no fixed version available.

If you determined that the reported finding is a false positive, you have the following options:

Use your scanner’s ignore list feature: scanners like Trivy or Grype offer two variants:
- Proprietary ignore files, where you specify a list of CVEs which the scanner then ignores for all images that it scans. Be careful with this approach, because a specific CVE might be a False Positive for image #1, but not for image #2.
- VEX files (using standards such as OpenVEX) that allow you to specify the “CVE + image” combination.
Talk to the maintainers: the most typical kind of false positive affects low-level system libraries that just need an update, or do not even need to be part of the image. Rather than maintaining false positive lists on your end for such cases, you can also try whether removing or updating the vulnerable component gets rid of the finding, and if so, report it to the official image maintainers (e.g. via a GitHub issue). For instance, see this issue where users asked the redis image maintainers to replace or rebuild the gosu binary, which contains many false positive vulnerabilities.

Finally, like for false negatives, you can check whether buying or building alternative images hardened for security is a viable option. These images will reduce both false negatives and false positives and thus significantly reduce headaches for your teams.

Conclusion

For any off-the-shelf image, there is a considerable chance that scanners like Trivy do not detect vulnerabilities in the primary component of the image. With no immediate fix (by the scanner vendors) in sight, you currently need to invest significant efforts in triaging such false negatives in the images you use.

It remains to be seen whether the community can improve the way images are built and scanned to mitigate this problem. For instance, images could include further metadata that assists scanners with finding vulnerabilities. Alternatively, image producers could provide a “distro-stable” variant of the image that installs the primary component from the Linux distro’s official package repository, automatically pushing new image updates whenever the package is refreshed in the distro’s package repository.

Introduction

What makes an image secure (or vulnerable)

How scanners work (including background on Linux distros)

False negatives

False positives

Hands-on analysis of 8 top-downloaded Docker Hub images

Mitigation options for false negatives

Mitigation options for false positives

Conclusion

Leave a Comment Cancel reply