Vulnerability Correlation Overview
Overview
Vulnerability correlation is a critical feature in Trustify, designed to automatically link security advisories to the software packages listed in a Software Bill of Materials (SBOM). This process enables developers and security teams to quickly identify which components in their software supply chain are affected by known vulnerabilities. By ingesting data from various advisory sources and mapping it to a rich data model, Trustify provides a clear and actionable view of an application’s security posture.
This document provides a technical deep-dive into how the vulnerability correlation logic works, from data ingestion to the database schema and the query mechanisms that tie everything together.
Data Ingestion
Trustify’s correlation process begins with the ingestion of security advisories from multiple sources. Each source has its own format and conventions, and Trustify has dedicated loaders to parse and normalize this data.
Advisory Sources
Trustify primarily supports two industry-standard advisory formats:
- Open Source Vulnerability (OSV): A format designed for describing vulnerabilities in open-source software. OSV advisories are typically package-centric, providing clear information about affected package ecosystems, names, and version ranges.
- Common Security Advisory Framework (CSAF): A more comprehensive format often used by vendors to describe vulnerabilities in their products. CSAF advisories are product-centric and can describe complex relationships between products and the components they contain.
Ingestion Process
When an advisory is ingested, the following steps occur:
- Parsing: The appropriate loader (OsvLoader or CsafLoader) parses the advisory file.
- Advisory & Vulnerability Creation:
  - An Advisory entity is created in the database to represent the advisory document itself, storing metadata like its identifier, issuer, and publication date.
  - Vulnerability identifiers (e.g., CVEs) are extracted from the advisory. For each unique identifier, a Vulnerability entity is created.
  - A link is established between the Advisory and its associated Vulnerability entities.
- Status Record Creation: This is the core of the ingestion-side correlation logic. The loader extracts information about which products or packages are affected, fixed, or not affected by the vulnerability. This information is used to create purl_status and product_status records.
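The advisory and vulnerability creation steps can be sketched with an in-memory stand-in for the database. This is only an illustration: the Store type, its fields, and the ingest method are hypothetical names, not Trustify's loader API or schema, and status record creation is left out.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical in-memory stand-in for the database (not Trustify's schema).
#[derive(Default)]
pub struct Store {
    /// Advisory identifier -> issuer (other metadata omitted for brevity).
    pub advisories: BTreeMap<String, String>,
    /// One entry per unique vulnerability identifier, e.g. a CVE.
    pub vulnerabilities: BTreeSet<String>,
    /// (advisory identifier, vulnerability identifier) links.
    pub advisory_vulnerability: BTreeSet<(String, String)>,
}

impl Store {
    /// Records the advisory, creates any vulnerability not seen before,
    /// and links the advisory to each vulnerability it mentions.
    pub fn ingest(&mut self, advisory_id: &str, issuer: &str, vulnerability_ids: &[&str]) {
        self.advisories
            .insert(advisory_id.to_string(), issuer.to_string());
        for &id in vulnerability_ids {
            // Sets deduplicate, so a repeated identifier creates nothing new.
            self.vulnerabilities.insert(id.to_string());
            self.advisory_vulnerability
                .insert((advisory_id.to_string(), id.to_string()));
        }
    }
}
```

Ingesting two advisories that mention the same CVE leaves two advisory entries, one vulnerability entry, and two links, which mirrors the "one Vulnerability entity per unique identifier" rule described above.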
PURL and Product Status
- purl_status: These records create a direct link between a vulnerability and a versioned package, identified by a Package URL (PURL). For OSV advisories this is straightforward, because the package field and the version range information map directly to a PURL and a version range. The loader parses these ranges and creates purl_status entries with a status (e.g., “affected”) and the corresponding version range. The same records are used for CSAF statuses that define PURL identities for components.
- product_status: These records are used for more abstract, product-level statuses, which are common in CSAF advisories. A CSAF advisory might state that “Red Hat Enterprise Linux 9” is affected. The CsafLoader uses a StatusCreator to resolve these product definitions, trace their relationships to underlying components (which may be identified by PURLs or CPEs), and create product_status records.
  - If the advisory provides a CPE for the product context, the StatusCreator captures this and stores it in the context_cpe_id field of the status record (both purl_status and product_status). This allows Trustify to handle both package-centric and product-centric advisory data with the correct context. OSV advisories do not typically define a product CPE context.
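To make the OSV case concrete, here is a minimal sketch of mapping an OSV package entry (ecosystem plus name) to a version-less PURL. The function name is hypothetical and this is not the OsvLoader implementation. It covers only three ecosystems and skips the percent-encoding and name normalization that the PURL specification requires.

```rust
/// Maps an OSV package (ecosystem + name) to a version-less PURL string.
/// Hypothetical helper: simplified, with no percent-encoding or normalization.
pub fn osv_package_to_base_purl(ecosystem: &str, name: &str) -> Option<String> {
    match ecosystem {
        "crates.io" => Some(format!("pkg:cargo/{}", name)),
        "RubyGems" => Some(format!("pkg:gem/{}", name)),
        // OSV writes Maven package names as "groupId:artifactId".
        "Maven" => {
            let (group, artifact) = name.split_once(':')?;
            Some(format!("pkg:maven/{}/{}", group, artifact))
        }
        // Ecosystems outside this sketch are reported as unmapped.
        _ => None,
    }
}
```

Returning None for unknown ecosystems keeps the caller in control of what to do with packages that cannot be expressed as a PURL.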
Database Schema
The correlation logic relies on a well-defined set of tables that model the relationships between advisories, vulnerabilities, and software components.
Core Entities
- advisory: Stores metadata about each ingested advisory.
- vulnerability: Stores information about each unique vulnerability (e.g., CVE).
- advisory_vulnerability: A join table linking advisories to the vulnerabilities they describe.
- base_purl, versioned_purl, qualified_purl: These tables work together to store and normalize Package URLs, separating the version-independent parts from the versioned parts.
- version_range: Stores version range information, including the versioning scheme (e.g., SemVer, Maven).
- purl_status: The central table for package-centric correlation. It links an advisory and vulnerability to a base_purl and a version_range, with a specific status (e.g., “affected”, “fixed”). It also contains an optional context_cpe_id to link the status to a specific product context.
- product_status: The central table for product-centric correlation. It links an advisory and vulnerability to a product and a version_range, with a specific status. It also contains an optional context_cpe_id to link the status to a specific product context.
Entity Relationships
The schema is designed to efficiently query for vulnerabilities. A purl_status record effectively creates a tuple of (Advisory, Vulnerability, Status, Base PURL, Version Range), which directly answers the question of whether a package is affected by a vulnerability according to a specific advisory.
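The tuple described above can be pictured as a plain record. The sketch below is an assumption-laden illustration: the real table stores foreign keys to other tables, whereas this hypothetical PurlStatus struct inlines readable strings, and its field and method names are invented for the example.

```rust
/// Status values named in this document; the real set may be larger.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Affected,
    Fixed,
    NotAffected,
}

/// Hypothetical flattened view of one purl_status row.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PurlStatus {
    pub advisory: String,
    pub vulnerability: String,
    pub status: Status,
    pub base_purl: String,
    /// Kept as opaque text here; the real schema references version_range.
    pub version_range: String,
    /// Optional product context; None models a NULL context_cpe_id.
    pub context_cpe: Option<String>,
}

impl PurlStatus {
    /// Returns (Advisory, Vulnerability, Status, Base PURL, Version Range).
    pub fn as_tuple(&self) -> (&str, &str, Status, &str, &str) {
        (
            self.advisory.as_str(),
            self.vulnerability.as_str(),
            self.status,
            self.base_purl.as_str(),
            self.version_range.as_str(),
        )
    }
}
```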
Correlation Logic (The Query Side)
With the data ingested and stored, the final piece of the puzzle is querying it to determine the vulnerability status of an SBOM’s components.
The Goal
The primary goal of the query-side logic is to answer the question: “Given a list of PURLs from an SBOM, what are all the known vulnerabilities that affect them?”
This logic is primarily located in the modules/fundamental/src/purl/model/details/purl.rs file, within the PurlDetails::from_entity function.
The PurlDetails::from_entity Function
This function is the heart of the correlation engine. When asked for the details of a specific PURL (including its version), it performs the following steps:
- Check for SBOM CPE Context: The logic first determines if the SBOM containing the PURL describes a specific product by checking if it is associated with a CPE.
- Query purl_status: It initiates a query on the purl_status table, filtering for records that match the base_purl_id of the package in question.
- Apply CPE Context Filter (if applicable): If the SBOM has a CPE context, the query is further filtered. It will only return purl_status records where the context_cpe_id either matches the SBOM’s CPE or is NULL. This ensures that only advisories relevant to the SBOM’s specific product context are considered. If the SBOM has no CPE context, this filtering step is skipped.
- Version Matching: The most critical step is the version check. The query uses a custom database function, VersionMatches, which takes the package’s version as input and compares it against the version_range stored for each purl_status record. This function understands different versioning schemes (like SemVer) and can correctly determine if a version falls within an affected range (e.g., >=1.2.0, <1.3.0). Only the statuses with matching version ranges are returned.
- Query product_status: To account for product-centric advisories, a second, more complex query is executed:
  - First, it finds all SBOMs that are known to contain the PURL being queried.
  - Then, it finds all product_status records that are associated with those SBOMs (via a link from product_version to sbom_id).
  - This query is also filtered by the SBOM’s CPE context, if one exists.
  - This effectively bridges the gap, allowing a PURL to inherit the vulnerability status of the product it belongs to, but only when the context matches.
- Data Aggregation: The results from both the purl_status and product_status queries are collected and aggregated. They are grouped by advisory to provide a clean, comprehensive list of all advisories that affect the given PURL.
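The purl_status side of these steps can be sketched over an in-memory list of rows. Everything here is an assumption made for illustration: StatusRow, version_matches, and correlate are hypothetical names, the range is reduced to a dotted-numeric "low inclusive, high exclusive" pair instead of the scheme-aware VersionMatches database function, CPE contexts are compared by exact equality, and the product_status query is omitted.

```rust
use std::collections::BTreeMap;

/// Hypothetical flattened purl_status row.
pub struct StatusRow {
    pub advisory: &'static str,
    pub vulnerability: &'static str,
    pub base_purl: &'static str,
    pub low_inclusive: &'static str,
    pub high_exclusive: &'static str,
    /// None models a NULL context_cpe_id.
    pub context_cpe: Option<&'static str>,
}

/// Parses "1.2.5" into [1, 2, 5]; returns None for non-numeric parts.
fn parse(version: &str) -> Option<Vec<u64>> {
    version.split('.').map(|part| part.parse::<u64>().ok()).collect()
}

/// Simplified stand-in for VersionMatches: low <= version < high.
fn version_matches(version: &str, row: &StatusRow) -> bool {
    match (parse(version), parse(row.low_inclusive), parse(row.high_exclusive)) {
        (Some(v), Some(low), Some(high)) => low <= v && v < high,
        _ => false,
    }
}

/// Filters rows by base PURL, CPE context, and version, then groups the
/// matching vulnerabilities by advisory.
pub fn correlate(
    rows: &[StatusRow],
    base_purl: &str,
    version: &str,
    sbom_cpe: Option<&str>,
) -> BTreeMap<&'static str, Vec<&'static str>> {
    let mut by_advisory: BTreeMap<&'static str, Vec<&'static str>> = BTreeMap::new();
    for row in rows {
        // The context filter applies only when both sides carry a CPE.
        let context_ok = match (sbom_cpe, row.context_cpe) {
            (Some(sbom), Some(context)) => sbom == context,
            _ => true,
        };
        if row.base_purl == base_purl && context_ok && version_matches(version, row) {
            by_advisory.entry(row.advisory).or_default().push(row.vulnerability);
        }
    }
    by_advisory
}
```

The match on the two optional CPEs encodes the rule from the list above: a row with no context always passes, and an SBOM with no context skips the filter entirely.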
Putting It All Together (Example Workflow)
Here is a concrete example using a real-world CSAF advisory and a corresponding SBOM.
The Data
- CSAF Advisory (cve-2023-0044.json): A Red Hat advisory for CVE-2023-0044. It states that the component io.quarkus/quarkus-vertx-http is known_affected within the product “Red Hat build of Quarkus”, which has a CPE of cpe:/a:redhat:quarkus:2.
- SBOM (quarkus-bom-2.13.8.Final-redhat-00004.json): An SPDX SBOM for the “quarkus-bom”. This SBOM describes the product it belongs to with the CPE cpe:/a:redhat:quarkus:2.13::el8 and lists pkg:maven/io.quarkus/quarkus-vertx-http@2.13.8.Final-redhat-00004 as one of its packages.
The Process
- Ingestion: The Red Hat CSAF advisory for CVE-2023-0044 is ingested into Trustify.
  - An Advisory entity is created for the document CVE-2023-0044.
  - A Vulnerability entity is created for CVE-2023-0044.
  - The CsafLoader processes the product_tree and vulnerabilities sections. It creates a product_status record for the component io.quarkus/quarkus-vertx-http with a status of known_affected. This status record is linked to the product context cpe:/a:redhat:quarkus:2.
- SBOM Analysis: The quarkus-bom SBOM is uploaded and processed.
  - Trustify parses the SBOM and stores all the packages, including pkg:maven/io.quarkus/quarkus-vertx-http@2.13.8.Final-redhat-00004.
  - It also records that this SBOM describes the product identified by cpe:/a:redhat:quarkus:2.13::el8.
- Correlation Query: A user requests the vulnerability status for the uploaded SBOM. For each package in the SBOM, Trustify’s correlation logic runs. When it gets to pkg:maven/io.quarkus/quarkus-vertx-http@2.13.8.Final-redhat-00004, the following happens:
  - The PurlDetails::from_entity function is called.
  - The logic identifies that the SBOM has a CPE context: cpe:/a:redhat:quarkus:2.13::el8.
  - The query against the product_status table finds the known_affected status for io.quarkus/quarkus-vertx-http related to CVE-2023-0044.
  - The query then filters this result based on the CPE context. It checks if the status’s context (cpe:/a:redhat:quarkus:2) is compatible with the SBOM’s context (cpe:/a:redhat:quarkus:2.13::el8). Since cpe:/a:redhat:quarkus:2.13 is a version of the product cpe:/a:redhat:quarkus:2, the contexts are compatible, and the status is considered a valid match.
  - The VersionMatches function would then be called to ensure the specific version 2.13.8.Final-redhat-00004 falls within the range specified by the advisory (in this specific CSAF, the range is implicit, but the logic still applies).
- Result: The user is shown that their SBOM is vulnerable to CVE-2023-0044. The details would indicate that the quarkus-vertx-http package is affected and that the vulnerability is relevant in the context of the “Red Hat build of Quarkus” product.
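The context compatibility check in this example can be illustrated with a small sketch. This document does not spell out how Trustify decides that two CPEs are compatible, so the rule below is an assumption chosen to reproduce the example: part, vendor, and product must be equal, and the advisory's version must be empty, equal to the SBOM's version, or a dot-delimited prefix of it. The Cpe struct and both function names are hypothetical.

```rust
/// The leading fields of a CPE 2.2 URI such as "cpe:/a:redhat:quarkus:2".
#[derive(Debug, PartialEq)]
pub struct Cpe<'a> {
    pub part: &'a str,
    pub vendor: &'a str,
    pub product: &'a str,
    pub version: &'a str,
}

/// Splits a CPE 2.2 URI into its first four fields; later fields are ignored.
pub fn parse_cpe(uri: &str) -> Option<Cpe<'_>> {
    let mut fields = uri.strip_prefix("cpe:/")?.split(':');
    Some(Cpe {
        part: fields.next()?,
        vendor: fields.next()?,
        product: fields.next()?,
        // A missing version field is treated as "any version".
        version: fields.next().unwrap_or(""),
    })
}

/// Assumed rule: same product, and the advisory version is empty, equal to,
/// or a dot-delimited prefix of the SBOM version.
pub fn context_compatible(advisory_cpe: &str, sbom_cpe: &str) -> bool {
    match (parse_cpe(advisory_cpe), parse_cpe(sbom_cpe)) {
        (Some(adv), Some(sbom)) => {
            let same_product = adv.part == sbom.part
                && adv.vendor == sbom.vendor
                && adv.product == sbom.product;
            let version_ok = adv.version.is_empty()
                || sbom.version == adv.version
                || sbom
                    .version
                    .strip_prefix(adv.version)
                    .map_or(false, |rest| rest.starts_with('.'));
            same_product && version_ok
        }
        _ => false,
    }
}
```

Under this rule, cpe:/a:redhat:quarkus:2 is compatible with cpe:/a:redhat:quarkus:2.13::el8, while a hypothetical version 21 of the same product would not match, because the prefix must end at a dot.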