The National Institute of Standards and Technology is exploring how automation could be used for vulnerability management to improve the process for describing cybersecurity weaknesses, according to an agency official leading an effort to build resources around a new framework.
“When you're developing an AI system, you need to train models. To train the models, you need to have comprehensively labeled data,” NIST computer scientist Irena Bojanova told Inside Cybersecurity. Bojanova is the author of NIST’s “Bugs Framework” released on Tuesday.
Bojanova started working on the vulnerability-focused framework at NIST in 2014. The framework details a formal method for making vulnerability descriptions more precise. The project specifically aims to enhance vulnerability management in the Common Weakness Enumeration and Common Vulnerabilities and Exposures lists, which are run by MITRE.
The initial framework will be followed by eight additional publications outlining specific features of the framework, including security concepts, models for different types of security bugs, weakness taxonomy, visual modeling of vulnerabilities and formalized principles for secure coding.
Bojanova emphasized the importance of providing AI models intended for use in vulnerability disclosure and vulnerability management with “well-defined vulnerability descriptions” throughout the model training process.
She said the “formal specification” outlined in the new framework provides the “comprehensively labeled weakness and vulnerability datasets that are needed to train the AI models.”
The CWE and CVE lists are less well suited to training AI models because they are built on “natural language descriptions” that can be unclear or “incomplete,” Bojanova said.
The goal of the framework is to create something “more comprehensive” and “better organized” than the CVE and CWE databases, Bojanova said, by ensuring precise communication through a “more detailed and multi-dimensional structure.”
To eliminate ambiguity, the framework offers a new mathematical language for vulnerability management that has its own set of “formal” grammar rules and “semantic rules of what a meaningful sentence looks like,” according to Bojanova.
Bojanova said the formal language offered in the framework is designed to outline the cause-and-effect relationship within weaknesses, between weaknesses and between vulnerabilities in a way that is “unambiguous.”
The language was built to avoid “synonyms” and “vague” uses of similar terms, Bojanova said, noting these features increase the usefulness of data for AI training.
The language in the framework represents individual weaknesses in units of “triples,” which consist of three factors: a “cause,” an impacted “operation” and a “consequence.” Causes are bugs or faults in the system, operations are specific system functions affected by the cause and consequences are the results of the flawed operation, according to the publication.
These triples can be combined to describe a “chain of weaknesses” that leads to an exploitable error, Bojanova said. A consequence of one weakness triple “becomes the fault for the next weakness,” Bojanova explained, “until eventually a final error is reached.”
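The triple-and-chain structure described above can be pictured with a short sketch. This is only an illustration under stated assumptions, not the framework's actual machine-readable format: the `Weakness` class, the helper functions and the example labels are all hypothetical and are not taken from the BF publication.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Weakness:
    """One weakness triple: a cause, an impacted operation, a consequence.

    Hypothetical structure for illustration; not the BF data format.
    """
    cause: str
    operation: str
    consequence: str


def is_valid_chain(chain: List[Weakness]) -> bool:
    """Return True if each consequence is the cause of the next weakness."""
    return all(a.consequence == b.cause for a, b in zip(chain, chain[1:]))


def final_error(chain: List[Weakness]) -> str:
    """Return the consequence of the last weakness in a valid chain."""
    if not chain or not is_valid_chain(chain):
        raise ValueError("not a valid chain of weaknesses")
    return chain[-1].consequence


# Made-up labels for illustration only; these are not official BF terms.
chain = [
    Weakness("missing bounds check", "validate index", "out-of-range index"),
    Weakness("out-of-range index", "read array element", "out-of-bounds read"),
]
```

In this sketch, `is_valid_chain(chain)` checks the linking rule Bojanova described, in which the consequence of one triple is the fault for the next, and `final_error(chain)` returns the last consequence in the chain.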
The framework uses the language to create “specifications” of vulnerabilities that can serve as “well-defined labels” in datasets based on CVE and CWE entries.
These specifications can be used to improve the CVE and CWE lists, Bojanova said, while also providing downstream enhancements to NIST’s National Vulnerability Database and the Cybersecurity and Infrastructure Security Agency’s Known Exploited Vulnerabilities catalog, which use CVE and CWE data.
AI models for vulnerability disclosure or vulnerability management could also be trained “via the machine-readable representations of the BF taxonomies, models, and the BFCWE and BFCVE datasets,” Bojanova explained.
The bug models will be further explored in one of the eight upcoming publications under the project. Another publication will give security professionals clearer guidance on applying secure coding principles based on formal methods.
-- Jacob Livesay (jlivesay@iwpnews.com)