EXTRACTING EXPLOITS AND ATTACK VECTORS FROM CYBERSECURITY NEWS USING NLP

Cited by: 0

Authors:

Sandescu, Cristian [1]

Dinisor, Alexandra [2]

Vladescu, Cristina-Veronica [3]

Grigorescu, Octavian [4]

Corlatescu, Dragos [3]

Dascalu, Mihai [3]

Rughinis, Razvan [3]

Affiliations:

[1] CODA Intelligence SRL, Dept Software Dev, Timisoara, Romania

[2] Tech Univ Munich, Dept Informat, Munich, Germany

[3] Univ Politehn Bucuresti, Dept Comp Sci, Bucharest, Romania

[4] CODA Intelligence SRL, Software Dev, Timisoara, Romania

Source:

UNIVERSITY POLITEHNICA OF BUCHAREST SCIENTIFIC BULLETIN SERIES C-ELECTRICAL ENGINEERING AND COMPUTER SCIENCE | 2022, Vol. 84, No. 02

Keywords:

Zero-days Attack; Exploit; Attack Vector; Entity Labeling; spaCy;

DOI:

Not available

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Discipline classification codes:

0808; 0809

Abstract:

Cybersecurity has an immense impact on society as it enables the digital protection of individuals and enterprises against an increasing number of online threats. Moreover, the rate at which attackers discover and exploit critical vulnerabilities outperforms the vendors' capabilities to respond accordingly and provide security patches. As such, open-source intelligence data (OSINT) has become a valuable resource, from which details on zero-day vulnerabilities can be retrieved and timely actions can be taken before the patches become available. In this paper we propose a method to automatically label articles on vulnerabilities and cyberattacks from trusted sources. Using Named Entity Recognition, we extract essential information about new vulnerabilities, such as the exploit's public release and the environment in which the attack's exploitation is possible. Our balanced dataset contains 1095 samples out of which 250 entries are from cybersecurity articles; the rest of the articles were crawled and annotated from the U.S. Government's Vulnerability Database, whereas automated text augmentation techniques were also considered. Our model built on top of spaCy obtained an overall performance of 75% recall on the Exploit Available task. When considering the Attack Vector metric, the model achieved the following recalls: Network 72%, Local 78%, and Physical 92%.
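The abstract reports results as per-class recall (for example, separate values for the Network, Local, and Physical attack vectors). As a minimal sketch of how such a figure is commonly computed, assuming one gold label and one predicted label per sample, the snippet below divides the correctly predicted samples of each class by the number of gold samples of that class. The function name `per_class_recall` and the toy labels are hypothetical and are not taken from the paper or its dataset; the authors' actual evaluation code may differ.

```python
from collections import Counter


def per_class_recall(gold, predicted):
    """Return {label: recall}, where recall = correct predictions / gold samples."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    support = Counter(gold)  # number of gold samples per label
    hits = Counter(g for g, p in zip(gold, predicted) if g == p)  # correct per label
    return {label: hits[label] / support[label] for label in support}


# Hypothetical toy labels for illustration only (not data from the paper).
gold = ["Network", "Network", "Local", "Physical", "Local", "Network"]
pred = ["Network", "Local", "Local", "Physical", "Local", "Network"]

recalls = per_class_recall(gold, pred)
for label, value in sorted(recalls.items()):
    print(f"{label}: {value:.2f}")
```

Recall is a natural choice for this kind of task because a missed exploit or attack vector (a false negative) is usually more costly than a false alarm.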


Pages: 63-78

Page count: 16

