Analyzing CVE Database Using Unsupervised Topic Modelling

Cited by: 7
Authors
Vanamala, Mounika [1]
Yuan, Xiaohong [1]
Bandaru, Kanishka [2]
Affiliations
[1] North Carolina A&T State Univ, Dept Comp Sci, Greensboro, NC 27411 USA
[2] Birla Inst Technol & Sci, Comp Sci Engn, Hyderabad, India
Keywords
Probabilistic Topic Modeling; Latent Dirichlet Allocation; Topic Modelling; CVE; OWASP
DOI
10.1109/CSCI49370.2019.00019
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper describes our study of the vulnerability reports in the Common Vulnerabilities and Exposures (CVE) database, using topic modeling on the description texts of the vulnerabilities. By studying the 121,716 unique CVE entries reported from January 1999 to July 2019, we identified prevalent vulnerability types and discovered new trends in vulnerabilities. The topics found through topic modeling were mapped to the OWASP Top 10 vulnerabilities. The OWASP vulnerabilities A2:2017-Broken Authentication, A4:2017-XML External Entities (XXE), and A5:2017-Broken Access Control increased over the 20-year period, whereas A7:2017-Cross-Site Scripting (XSS) decreased steeply.
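The abstract does not say how topics were mapped to OWASP categories, so the following is only an illustrative sketch of one possible approach: matching a topic's top words against a keyword set per category, then counting matches per year to see a trend. The keyword sets, the function names `map_topic_to_owasp` and `yearly_counts`, and the `min_overlap` threshold are all assumptions made for this example and are not taken from the paper.

```python
from collections import Counter

# Hypothetical keyword sets, chosen for illustration only (not from the paper).
OWASP_KEYWORDS = {
    "A2:2017-Broken Authentication": {
        "authentication", "password", "session", "credentials", "login"},
    "A4:2017-XML External Entities (XXE)": {
        "xml", "external", "entity", "xxe", "dtd"},
    "A5:2017-Broken Access Control": {
        "access", "privileges", "bypass", "permissions", "restrictions"},
    "A7:2017-Cross-Site Scripting (XSS)": {
        "xss", "script", "html", "inject", "cross-site"},
}


def map_topic_to_owasp(top_words, min_overlap=2):
    """Return the category sharing the most keywords with a topic, or None."""
    words = {w.lower() for w in top_words}
    best, best_score = None, 0
    for category, keywords in OWASP_KEYWORDS.items():
        score = len(words & keywords)
        if score > best_score:
            best, best_score = category, score
    # Topics with too little overlap are left unmapped.
    return best if best_score >= min_overlap else None


def yearly_counts(entries):
    """Count mapped categories per year from (year, top_words) pairs."""
    counts = {}
    for year, top_words in entries:
        category = map_topic_to_owasp(top_words)
        if category is not None:
            counts.setdefault(year, Counter())[category] += 1
    return counts
```

In practice the top words would come from a fitted topic model such as Latent Dirichlet Allocation, and a manual review of each mapping would likely still be needed, since keyword overlap alone is a coarse signal.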
Pages: 72-77
Page count: 6