Predicting Vulnerable Software Components through N-gram Analysis and Statistical Feature Selection

Cited by: 48
Authors
Pang, Yulei [1]
Xue, Xiaozhen [2]
Namin, Akbar Siami [2]
Affiliations
[1] Southern Connecticut State Univ, Dept Math, New Haven, CT 06515 USA
[2] Texas Tech Univ, Dept Comp Sci, Lubbock, TX 79409 USA
Keywords
Vulnerability prediction; N-gram; Feature selection; Wilcoxon test
DOI
10.1109/ICMLA.2015.99
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Vulnerabilities need to be detected and removed from software. Although previous studies have demonstrated the usefulness of prediction techniques for deciding whether software components are vulnerable, improving the accuracy and effectiveness of these techniques remains a major open research challenge. This paper proposes a hybrid technique that combines N-gram analysis with feature selection algorithms to predict vulnerable software components, where features are defined as continuous sequences of tokens in source code files, i.e., Java class files. Machine learning-based feature selection algorithms are then employed to reduce the feature and search space. We evaluated the proposed technique on several Java Android applications, and the results demonstrated that it could predict vulnerable classes, i.e., software components, with high precision, accuracy, and recall.
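The pipeline outlined in the abstract (token N-grams as features, then statistical feature selection) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the regex tokenizer, the helper names (`tokenize`, `ngram_counts`, `rank_sum_score`, `select_features`), the toy snippets, and the scoring rule. The score is a scaled Mann-Whitney U statistic, which is equivalent to the Wilcoxon rank-sum statistic named in the keywords; the abstract does not say which variant of the Wilcoxon test, which N, or which selection threshold the paper uses.

```python
import re
from collections import Counter

# Assumed tokenizer: identifiers, integer literals, or any single symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokenize(source):
    """Split source text into a flat list of tokens."""
    return TOKEN_RE.findall(source)


def ngram_counts(source, n):
    """Count each run of n consecutive tokens in one source file."""
    tokens = tokenize(source)
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rank_sum_score(vulnerable, clean):
    """Scaled Mann-Whitney U: 0.0 means no separation, 0.5 means full separation."""
    u = sum(
        1.0 if v > c else 0.5 if v == c else 0.0
        for v in vulnerable
        for c in clean
    )
    return abs(u / (len(vulnerable) * len(clean)) - 0.5)


def select_features(files, labels, n, top_k):
    """Keep the top_k N-grams whose counts best separate the two classes."""
    counts = [ngram_counts(src, n) for src in files]
    vocabulary = sorted(set().union(*counts))
    scored = []
    for gram in vocabulary:
        vuln = [c[gram] for c, y in zip(counts, labels) if y]
        clean = [c[gram] for c, y in zip(counts, labels) if not y]
        scored.append((rank_sum_score(vuln, clean), gram))
    # Highest score first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [gram for _, gram in scored[:top_k]]


# Hypothetical toy data: two "vulnerable" and two "clean" Java-like snippets.
files = [
    "db.exec(query + input);",
    "db.exec(query + name);",
    "db.exec(stmt, args);",
    "db.exec(stmt, vals);",
]
labels = [True, True, False, False]
selected = select_features(files, labels, n=2, top_k=4)
```

In this sketch the selected N-grams would then serve as the reduced feature set for a classifier. A real study would more likely use a library test that also reports p-values, such as SciPy's rank-sum implementation, rather than the hand-rolled score used here.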
Pages: 543-548
Page count: 6