Multi-Class Document Classification Using Lexical Ontology-Based Deep Learning †

Cited by: 2
Authors
Yelmen, Ilkay [1, 2]
Gunes, Ali [1 ]
Zontul, Metin [3 ]
Affiliations
[1] Istanbul Aydin Univ, Fac Engn, Dept Comp Engn, TR-34295 Istanbul, Turkiye
[2] Turkcell Grp Co Digital Educ Technol Inc, TR-06800 Ankara, Turkiye
[3] Sivas Sci & Technol Univ, Fac Engn & Nat Sci, Dept Comp Engn, TR-58100 Sivas, Turkiye
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 10
Keywords
document classification; multi-class classification; word embeddings; WordNet; BERT; text classification
DOI
10.3390/app13106139
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
With the recent growth of the Internet, the volume of data has also increased. In particular, the growth of unstructured data makes data management difficult. Classification is needed before the data can be used for various purposes. Because manually classifying the ever-increasing volume of data for different kinds of analysis and evaluation is impractical, automatic classification methods are needed. In addition, achieving good performance on imbalanced, multi-class classification is challenging: as the number of classes increases, so does the number of decision boundaries a learning algorithm has to resolve. Therefore, this paper proposes an improved model that uses the WordNet lexical ontology and BERT to perform deeper learning on text features, thereby improving classification performance. Classification success was observed to increase when using WordNet's 11 general lexicographer files, which are based on synonym sets (synsets), syntactic categories, and logical groupings. WordNet was used for feature dimension reduction. In the experiments, word embedding methods were first used without dimension reduction, and Random Forest (RF), Support Vector Machine (SVM), and Multi-Layer Perceptron (MLP) algorithms were then employed for classification. These experiments were then repeated with dimension reduction performed by WordNet. In addition to the machine learning models, experiments were also conducted with the pretrained BERT model, with and without WordNet. The experimental results showed that, on an unstructured, seven-class, imbalanced dataset, the proposed model achieved the highest accuracy, 93.77%.
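The idea of using WordNet lexicographer files for feature dimension reduction can be illustrated with a minimal sketch. This is not the authors' implementation: the `LEXNAME` lookup table, the function names `reduce_features` and `to_vector`, and the sample sentence are all hypothetical and chosen only for illustration. In practice a comparable category label could be obtained from a WordNet interface (for example, NLTK synsets expose a `lexname()` method); a small hard-coded table is used here so the example runs without external data.

```python
from collections import Counter

# Hypothetical word -> lexicographer-file lookup (an assumption for illustration).
# A real pipeline would query WordNet instead of using a hard-coded table.
LEXNAME = {
    "dog": "noun.animal",
    "cat": "noun.animal",
    "horse": "noun.animal",
    "run": "verb.motion",
    "walk": "verb.motion",
    "company": "noun.group",
    "team": "noun.group",
}


def reduce_features(tokens, lookup=LEXNAME, unknown="unknown"):
    """Collapse word-level counts into counts per lexicographer category."""
    return Counter(lookup.get(tok.lower(), unknown) for tok in tokens)


def to_vector(counts, categories):
    """Project category counts onto a fixed, ordered list of categories."""
    return [counts.get(cat, 0) for cat in categories]


# Fixed feature order: known categories in sorted order, then the fallback bucket.
CATEGORIES = sorted(set(LEXNAME.values())) + ["unknown"]

doc = "The dog and the cat run while the horse walks".split()
counts = reduce_features(doc)
vector = to_vector(counts, CATEGORIES)

print(CATEGORIES)
print(vector)
```

Here a ten-token document is represented by four category counts instead of one feature per distinct word. The resulting vector could then be passed to a classifier such as RF, SVM, or MLP, as the abstract describes; how the paper combines these categories with word embeddings or BERT is not specified here.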
Pages: 22
Related papers
50 in total
  • [21] Ontology-based multi-classification learning for video concept detection
    Wu, Y
    Tseng, BL
    Smith, JR
    2004 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXP (ICME), VOLS 1-3, 2004, : 1003 - 1006
  • [22] Deep Graph Learning for DDoS Detection and Multi-Class Classification IDS
    Saunders, Braden J.
    De Grande, Robson E.
    Carvalho, Glaucio H. S.
    Woungang, Isaac
    2024 IEEE INTERNATIONAL CONFERENCE ON CYBER SECURITY AND RESILIENCE, CSR, 2024, : 96 - 100
  • [23] Performance Improvement of Deep Learning Based Multi-Class ECG Classification Model Using Limited Medical Dataset
    Choi, Sanghoon
    Seo, Hyo-Chang
    Cho, Min Soo
    Joo, Segyeong
    Nam, Gi-Byoung
    IEEE ACCESS, 2023, 11 : 53185 - 53194
  • [24] Deep learning-based image classification for online multi-coal and multi-class sorting
    Liu, Yang
    Zhang, Zelin
    Liu, Xiang
    Wang, Lei
    Xia, Xuhui
    COMPUTERS & GEOSCIENCES, 2021, 157
  • [25] Lazy Learning for Multi-class Classification Using Genetic Programming
    Jabeen, Hajira
    Baig, Abdul Rauf
    ADVANCED INTELLIGENT COMPUTING THEORIES AND APPLICATIONS: WITH ASPECTS OF ARTIFICIAL INTELLIGENCE, 2012, 6839 : 177 - +
  • [26] Multi-Class Object Classification using Deep Learning Models in Automotive Object Detection Scenarios
    Soumya, A.
    Cenkeramaddi, Linga Reddy
    Vishnu, Chalavadi
    Mohan, Krishna C.
    SIXTEENTH INTERNATIONAL CONFERENCE ON MACHINE VISION, ICMV 2023, 2024, 13072
  • [27] Using Lexical Chain in Ontology-Based Information Extraction
    Cong, Chunyu
    Gao, Rui
    Wang, Zhongying
    Meng, Xiao
    Proceedings of the 2nd International Conference on Electronics, Network and Computer Engineering (ICENCE 2016), 2016, 67 : 312 - 316
  • [28] Multi-Class Brain Lesion Classification Using Deep Transfer Learning With MobileNetV3
    Majeed, Ahmed Firas
    Salehpour, Pedram
    Farzinvash, Leili
    Pashazadeh, Saeid
    IEEE ACCESS, 2024, 12 : 155295 - 155308
  • [29] A Deep CNN based Multi-class Classification of Alzheimer's Disease using MRI
    Farooq, Ammarah
    Anwar, Syed Muhammad
    Awais, Muhammad
    Rehman, Saad
    2017 IEEE INTERNATIONAL CONFERENCE ON IMAGING SYSTEMS AND TECHNIQUES (IST), 2017, : 111 - 116
  • [30] Research on the Optimization of Multi-Class Land Cover Classification Using Deep Learning with Multispectral Images
    Li, Yichuan
    Yu, Junchuan
    Wang, Ming
    Xie, Minying
    Xi, Laidian
    Pang, Yunxuan
    Hou, Changhong
    LAND, 2024, 13 (05)