CVE-Classification

Ian Jones

Purpose of this Project

There is a glut of CVEs that aren’t tagged properly when they’re written. We’re messy, we’re disorganized, and sometimes the translation of technical knowledge for a less technical reader gets left out.

To address this problem, this project is scoped to classify CVEs so that they can be tagged with their proper platform (Windows, macOS, Linux, Web, etc.) and, as a future state, with the type of vulnerability (SQL Injection, XSS, Privilege Escalation, etc.).

As part of this project, my goal is to identify the differences between a few classification methods. At the start, this will primarily be a direct comparison between term frequency-inverse document frequency (TF-IDF) features with logistic regression and a Transformer model such as DistilBERT.
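
To make the first of those two approaches concrete, here is a minimal sketch of TF-IDF features feeding a binary logistic regression, written with only the Python standard library so the mechanics stay visible. Everything specific in it is an assumption for illustration: the toy descriptions are invented rather than real CVE text, the task is cut down to two labels (Windows vs. Linux), and the function names and hyperparameters are hypothetical. A real pipeline would more likely lean on a library such as scikit-learn and cover more than two platforms.

```python
import math
from collections import Counter

# Hypothetical toy data: invented CVE-style descriptions with a platform label.
TRAIN = [
    ("buffer overflow in windows kernel driver allows local privilege escalation", "Windows"),
    ("windows smb service crash via crafted packet", "Windows"),
    ("windows registry handler permits elevation of privilege", "Windows"),
    ("linux kernel use after free in netfilter subsystem", "Linux"),
    ("race condition in linux ext4 filesystem driver", "Linux"),
    ("linux kernel bpf verifier allows out of bounds read", "Linux"),
]


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real text needs more care)."""
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency for every term seen in the corpus."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf(doc, idf):
    """L2-normalised TF-IDF vector as a sparse dict; unseen terms are dropped."""
    tf = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}


def score(vec, weights, bias):
    """Linear score: bias plus the dot product of weights and features."""
    return bias + sum(weights.get(t, 0.0) * v for t, v in vec.items())


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(vectors, targets, epochs=500, lr=0.5):
    """Binary logistic regression fitted by per-sample gradient descent."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for vec, y in zip(vectors, targets):
            err = sigmoid(score(vec, weights, bias)) - y  # gradient of the log loss
            for t, v in vec.items():
                weights[t] = weights.get(t, 0.0) - lr * err * v
            bias -= lr * err
    return weights, bias


def predict(text, idf, weights, bias):
    """Label a description as "Windows" (probability >= 0.5) or "Linux"."""
    p = sigmoid(score(tfidf(text, idf), weights, bias))
    return "Windows" if p >= 0.5 else "Linux"


docs = [doc for doc, _ in TRAIN]
idf = fit_idf(docs)
vectors = [tfidf(doc, idf) for doc in docs]
targets = [1.0 if label == "Windows" else 0.0 for _, label in TRAIN]
weights, bias = train_logreg(vectors, targets)

print(predict("crafted packet crashes windows smb service", idf, weights, bias))
print(predict("use after free in linux kernel netfilter", idf, weights, bias))
```

The point of the sketch is the contrast it sets up: here the model only ever sees weighted word counts, so word order and context are lost, which is exactly what a Transformer model is meant to capture.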

Datasets

As with training any model, this requires a training set and a test set of data, so I plan on training with