Multi-Class Document Classification Using Lexical Ontology-Based Deep Learning †

Cited by: 2
Authors
Yelmen, Ilkay [1 ,2 ]
Gunes, Ali [1 ]
Zontul, Metin [3 ]
Affiliations
[1] Istanbul Aydin Univ, Fac Engn, Dept Comp Engn, TR-34295 Istanbul, Turkiye
[2] Turkcell Grp Co Digital Educ Technol Inc, TR-06800 Ankara, Turkiye
[3] Sivas Sci & Technol Univ, Fac Engn & Nat Sci, Dept Comp Engn, TR-58100 Sivas, Turkiye
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 10
Keywords
document classification; multi-class classification; word embeddings; WordNet; BERT; TEXT CLASSIFICATION;
DOI
10.3390/app13106139
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
With the recent growth of the Internet, the volume of data has also increased. In particular, the growing amount of unstructured data makes data management difficult, and classification is needed before the data can be used for various purposes. Because manually classifying an ever-increasing volume of data for analysis and evaluation is impractical, automatic classification methods are needed. In addition, achieving good performance on imbalanced, multi-class classification is challenging: as the number of classes increases, so does the number of decision boundaries a learning algorithm has to learn. Therefore, this paper proposes an improved model that uses the WordNet lexical ontology and BERT to perform deeper learning on text features, thereby improving classification performance. Classification success was observed to increase when using 11 general WordNet lexicographer files, which are based on synonym sets (synsets), syntactic categories, and logical groupings. WordNet was used for feature dimension reduction. In the experiments, word embedding methods were first used without dimension reduction, and Random Forest (RF), Support Vector Machine (SVM), and Multi-Layer Perceptron (MLP) algorithms were then employed to perform classification. These experiments were then repeated with dimension reduction performed by WordNet. In addition to the machine learning models, experiments were also conducted with the pretrained BERT model, with and without WordNet. The experimental results showed that, on an unstructured, seven-class, imbalanced dataset, the proposed model obtained the highest accuracy, 93.77%.
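The abstract does not spell out how WordNet reduces the feature dimension, so the following is only a minimal sketch of the general idea: many distinct words collapse into a small number of lexicographer-file categories. The lookup table `TOY_LEXNAMES`, the function `reduce_features`, and the `"other"` bucket are hypothetical names introduced for illustration, not the authors' pipeline. With NLTK and its WordNet corpus installed, a comparable label could be read from `Synset.lexname()` instead of a hand-written table.

```python
from collections import Counter

# Hypothetical stand-in for a WordNet lookup: word -> lexicographer file name.
# This table is hand-written for illustration; it is an assumption, not the
# mapping used in the paper.
TOY_LEXNAMES = {
    "dog": "noun.animal",
    "cat": "noun.animal",
    "horse": "noun.animal",
    "bread": "noun.food",
    "apple": "noun.food",
    "run": "verb.motion",
    "walk": "verb.motion",
}


def reduce_features(tokens, lexnames=TOY_LEXNAMES, unknown="other"):
    """Count tokens per category so that many word features become a few."""
    return Counter(lexnames.get(tok.lower(), unknown) for tok in tokens)


doc = "The dog and the cat run while the horse eats bread".split()
features = reduce_features(doc)
print(dict(features))
```

Here eleven tokens are represented by counts over four categories. A real setup would still need to decide how to treat out-of-vocabulary words and words with several senses, which this sketch ignores.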
Pages: 22
Related papers
50 in total
  • [41] Detection and Multi-Class Classification of Invasive Knotweeds with Drones and Deep Learning Models
    Valicharla, Sruthi Keerthi
    Karimzadeh, Roghaiyeh
    Naharki, Kushal
    Li, Xin
    Park, Yong-Lak
    DRONES, 2024, 8 (07)
  • [42] A Novel Fused Multi-Class Deep Learning Approach for Chronic Wounds Classification
    Aldoulah, Zaid A.
    Malik, Hafiz
    Molyet, Richard
    APPLIED SCIENCES-BASEL, 2023, 13 (21)
  • [43] ERM learning algorithm for multi-class classification
    Wang, Cheng
    Guo, Zheng-Chu
    APPLICABLE ANALYSIS, 2012, 91 (07) : 1339 - 1349
  • [44] Progressive Learning Strategies for Multi-class Classification
    Er, Meng Joo
    Venkatesan, Rajasekar
    Wang, Ning
    Chien, Chiang-Ju
    2017 INTERNATIONAL AUTOMATIC CONTROL CONFERENCE (CACS), 2017
  • [45] Multi-Class Active Learning for Image Classification
    Joshi, Ajay J.
    Porikli, Fatih
    Papanikolopoulos, Nikolaos
    CVPR: 2009 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-4, 2009, : 2364 - +
  • [46] Combining Deep Learning and Multi-Class Discriminant Analysis for Granite Tiles Classification
    Filisbino, Tiene A.
    Simao, Lucas B.
    Giraldi, Gilson A.
    Thomaz, Carlos Eduardo
    2017 WORKSHOP OF COMPUTER VISION (WVC), 2017, : 19 - 24
  • [47] An active learning algorithm for multi-class classification
    Dongjiang Liu
    Yanbi Liu
    Pattern Analysis and Applications, 2019, 22 : 1051 - 1063
  • [48] Automated ontology-based annotation of scientific literature using deep learning
    Manda, Prashanti
    SayedAhmed, Saed
    Mohanty, Somya D.
    PROCEEDINGS OF THE INTERNATIONAL WORKSHOP ON SEMANTIC BIG DATA (SBD 2020), 2020
  • [49] MULTI-CLASS BRAIN TUMOR CLASSIFICATION AND SEGMENTATION USING HYBRID DEEP LEARNING NETWORK (HDLN) MODEL
    Kumar, Parasa Rishi
    Bonthu, Kavya
    Meghana, Boyapati
    Vani, Koneru Suvarna
    Chakrabarti, Prasun
    SCALABLE COMPUTING-PRACTICE AND EXPERIENCE, 2023, 24 (01): 69 - 80
  • [50] Xception-Fractalnet: Hybrid Deep Learning Based Multi-Class Classification of Alzheimer's Disease
    Aparna, Mudiyala
    Rao, Battula Srinivasa
    CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 74 (03): 6909 - 6932