cpe-guesser 2.0 released - Multi-Source CPE Imports, Better Ranking, and Greater Autonomy Beyond NVD
Overview
Version 2.0 brings major improvements to CPE import, ranking, and CVE v5 data handling. This release focuses on better import performance, broader format support, improved search relevance, and more robust indexing for vendor and product matching.
A notable change in this release is that cpe-guesser no longer depends on NVD as its only practical CPE source. In addition to the NVD feeds, it can use the Vulnerability-Lookup dumps available at https://vulnerability.circl.lu/dumps/, which provide additional CPE data and reduce reliance on a single upstream source.
Highlights
Improved search and ranking
- Improved search ranking using CPE rank scores.
- Enhanced server-side lookup ranking with rank:cpe scoring.
- Reset of CVE v5 rank state before each import to ensure consistent ranking behavior.
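To illustrate the idea behind rank-based ordering, here is a minimal in-memory sketch. It is not the project's implementation: the class TinyCpeIndex, its methods, and the scores are hypothetical, and plain dictionaries stand in for whatever the server actually stores in Valkey. It only shows the general pattern of intersecting per-word candidate sets and then sorting the survivors by a rank score.

```python
from collections import defaultdict


class TinyCpeIndex:
    """Hypothetical in-memory stand-in for a word index plus a rank table."""

    def __init__(self):
        self.words = defaultdict(set)  # word -> {"vendor:product", ...}
        self.rank = defaultdict(int)   # "vendor:product" -> rank score

    def add(self, vendor, product, score=1):
        key = f"{vendor}:{product}"
        # Index the vendor, the full product name, and each product token.
        for word in {vendor, product, *product.split("_")}:
            self.words[word.lower()].add(key)
        self.rank[key] += score

    def lookup(self, query):
        terms = [t.lower() for t in query]
        if not terms:
            return []
        # Only keep entries that match every query word.
        candidates = set.intersection(*(self.words.get(t, set()) for t in terms))
        # Highest rank first; ties are broken alphabetically for stable output.
        return sorted(candidates, key=lambda k: (-self.rank[k], k))


index = TinyCpeIndex()
index.add("microsoft", "outlook", score=50)
index.add("microsoft", "outlook_express", score=5)
index.add("mozilla", "thunderbird", score=20)
print(index.lookup(["outlook"]))
```

With these made-up scores, the query for "outlook" lists microsoft:outlook before microsoft:outlook_express, because the first entry carries the higher rank.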
CVE v5 import and indexing enhancements
- Added CVE v5 NDJSON rank importer.
- Added support for handling incomplete or multiline NDJSON records in CVEListV5Handler.
- Introduced optional CVE v5 word indexing.
- Added missing-word tracking for CVE v5 imports.
- Split missing-word tracking into separate vendor and product sets for more precise analysis.
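One common way to tolerate NDJSON records that span several lines is to buffer input until the accumulated text parses as JSON. The sketch below shows that technique in isolation. The function name iter_ndjson_records is hypothetical, and this is not a description of how CVEListV5Handler works internally. Raising an error on a truncated final record is a choice made for this sketch.

```python
import json


def iter_ndjson_records(lines):
    """Yield JSON values from NDJSON text whose records may span several lines."""
    buffer = ""
    for line in lines:
        if not buffer and not line.strip():
            continue  # skip blank lines between records
        buffer += line
        try:
            record = json.loads(buffer)
        except json.JSONDecodeError:
            continue  # record is incomplete so far: keep buffering
        buffer = ""
        yield record
    if buffer.strip():
        # The remaining text never became valid JSON (for example, a truncated file).
        raise ValueError("incomplete NDJSON record at end of input")


sample = ['{"a": 1}\n', '{"b":\n', ' 2}\n', '\n', '{"c": 3}']
print(list(iter_ndjson_records(sample)))
```

A limitation of this simple approach is that one corrupted line absorbs every record after it, so a real importer would likely cap the buffer size or resynchronize on lines that start a new object.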
Faster and more flexible CPE imports
- Parallelized NVD CPE imports for improved performance.
- Refactored import logic into reusable handler classes.
- Added NVDCPEHandler for importing the NVD CPE Dictionary 2.0 JSON format.
- Extended import support for tar archives and standalone JSON files.
- Continued support for legacy XML imports through XMLCPEHandler.
- Added logging of JSON file names found inside tar archives.
- Expanded the import model so cpe-guesser can integrate CPE data from additional sources, including Vulnerability-Lookup dumps, instead of relying solely on NVD feeds.
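As a rough illustration of importing from tar archives, the following sketch walks an archive with Python's standard tarfile module, logs each JSON member it finds, and extracts the vendor and product fields from a CPE 2.3 name. The function names, the logger name, and the archive layout are assumptions made for this example. They are not taken from NVDCPEHandler.

```python
import io
import json
import logging
import tarfile

logger = logging.getLogger("cpe_import_sketch")  # hypothetical logger name


def iter_json_from_tar(fileobj):
    """Yield (member_name, parsed_json) for every .json file in a tar archive."""
    # mode "r:*" lets tarfile detect gzip, bzip2, xz, or no compression.
    with tarfile.open(fileobj=fileobj, mode="r:*") as archive:
        for member in archive:
            if not member.isfile() or not member.name.endswith(".json"):
                continue
            logger.info("found JSON file in archive: %s", member.name)
            yield member.name, json.load(archive.extractfile(member))


def vendor_product(cpe_name):
    """Return (vendor, product) from a CPE 2.3 formatted string.

    Simplification: a plain split ignores escaped colons inside fields.
    """
    parts = cpe_name.split(":")
    # Layout: cpe:2.3:<part>:<vendor>:<product>:<version>:...
    return parts[3], parts[4]


print(vendor_product("cpe:2.3:a:openbsd:openssh:9.0:*:*:*:*:*:*:*"))
```

A standalone JSON file can skip the archive step and be passed directly to json.load before the same vendor and product extraction.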
Configuration and deployment improvements
- Improved configuration robustness by embedding default settings in code when configuration is missing or incomplete.
- Made the Valkey database number configurable.
- Fixed Docker deployment and docker-compose configuration to use Valkey correctly.
- Corrected settings.yaml structure issues.
- Added missing requirements and improved script executability in bin/.
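Falling back to embedded defaults is often implemented by merging whatever was loaded from the configuration file over a built-in dictionary. The sketch below shows one such merge. The section names, keys, and values in DEFAULTS are illustrative assumptions and may not match the project's actual settings.yaml. The function with_defaults is likewise hypothetical.

```python
import copy

# Hypothetical defaults: keys and values are for illustration only.
DEFAULTS = {
    "server": {"port": 8000},
    "valkey": {"host": "127.0.0.1", "port": 6379, "db": 8},
}


def with_defaults(loaded, defaults=DEFAULTS):
    """Overlay a possibly missing or partial config on top of built-in defaults."""
    merged = copy.deepcopy(defaults)  # never mutate the shared defaults
    for section, values in (loaded or {}).items():
        if values is None:
            continue  # an empty YAML section keeps its defaults
        if isinstance(values, dict) and isinstance(merged.get(section), dict):
            merged[section].update(values)  # override only the keys provided
        else:
            merged[section] = values
    return merged


# Only the database number is overridden; host and port come from the defaults.
print(with_defaults({"valkey": {"db": 3}})["valkey"])
```

Passing None, as would happen when the file is missing or empty, simply returns a copy of the defaults.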
Documentation and maintenance
- Updated README documentation.
- Added examples for the JSON format while keeping legacy format examples.
- Applied Black formatting across library code and regression/import tests.
- General linting and formatting cleanups.
Breaking / notable changes
- The project now defaults to the CPE Dictionary 2.0 feed.
- Import handling has been refactored significantly around dedicated handler classes.
- CLI import behavior was simplified by removing the redundant --update flag and improving boolean toggle handling.
- The project architecture is now better suited for multi-source CPE ingestion, reducing dependence on NVD as the single upstream source.
Contributors
Thanks to everyone who contributed to this release, including:
- Alexandre Dulaunoy
- Esa Jokinen
- Surya Kanagasabapathy
Upgrade notes
When upgrading to 2.0, review:
- Your import workflows, especially if you rely on legacy XML-only behavior.
- Your configuration files, although defaults now make startup more robust.
- Your Docker and Valkey setup if you deploy with containers.
- Your data ingestion pipeline if you want to take advantage of alternative CPE sources such as the Vulnerability-Lookup dumps.
Summary
cpe-guesser 2.0 is a substantial release that modernizes the import pipeline, adds support for current NVD CPE data formats, improves ranking quality, and makes deployments more robust and scalable. It also opens the door to a more autonomous and flexible ingestion model by supporting additional CPE sources beyond NVD.