Multi-Class Document Classification Using Lexical Ontology-Based Deep Learning

Cited by: 2
Authors
Yelmen, Ilkay [1, 2]
Gunes, Ali [1 ]
Zontul, Metin [3 ]
Affiliations
[1] Istanbul Aydin Univ, Fac Engn, Dept Comp Engn, TR-34295 Istanbul, Turkiye
[2] Turkcell Grp Co Digital Educ Technol Inc, TR-06800 Ankara, Turkiye
[3] Sivas Sci & Technol Univ, Fac Engn & Nat Sci, Dept Comp Engn, TR-58100 Sivas, Turkiye
Source
APPLIED SCIENCES-BASEL | 2023 / Volume 13 / Issue 10
Keywords
document classification; multi-class classification; word embeddings; WordNet; BERT; text classification
DOI
10.3390/app13106139
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
With the recent growth of the Internet, the volume of data has increased as well. The rise in unstructured data in particular makes data management difficult, and the data must be classified before it can be used for various purposes. Because manually classifying an ever-increasing volume of data for different kinds of analysis and evaluation is impractical, automatic classification methods are needed. Imbalanced, multi-class classification is especially challenging: as the number of classes increases, so does the number of decision boundaries a learning algorithm has to resolve. This paper therefore proposes an improved model that uses the WordNet lexical ontology together with BERT to learn the features of text more deeply, thereby improving classification performance. Classification success was observed to increase when using 11 general WordNet lexicographer files, which are based on synonym sets (synsets), syntactic categories, and logical groupings. WordNet was used for feature dimension reduction. In the experiments, word embedding methods were first used without dimension reduction, and Random Forest (RF), Support Vector Machine (SVM), and Multi-Layer Perceptron (MLP) algorithms were then employed for classification. These experiments were repeated with WordNet-based dimension reduction. In addition to the machine learning models, experiments were also conducted with the pretrained BERT model, with and without WordNet. On an unstructured, seven-class, imbalanced dataset, the proposed model achieved the highest accuracy, 93.77%.
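The idea of using WordNet lexicographer files for feature dimension reduction can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not describe the exact procedure, so the mapping below is a small hand-written, hypothetical stand-in for a real WordNet lookup (with NLTK, a category name of this kind could be obtained from `Synset.lexname()`), and the function names `reduce_features` and `to_vector` are assumptions made for illustration. The sketch replaces per-word features with counts over a handful of coarse categories.

```python
from collections import Counter

# Hypothetical stand-in for a WordNet lookup: word -> lexicographer file name.
# In a real setup this mapping would come from WordNet itself (assumption).
LEXNAME = {
    "dog": "noun.animal",
    "cat": "noun.animal",
    "horse": "noun.animal",
    "bread": "noun.food",
    "cheese": "noun.food",
    "run": "verb.motion",
    "walk": "verb.motion",
}


def reduce_features(tokens, lexname=LEXNAME, unknown="other"):
    """Count coarse categories instead of individual words."""
    return Counter(lexname.get(tok.lower(), unknown) for tok in tokens)


def to_vector(counts, categories):
    """Lay the category counts out as a fixed-length vector."""
    return [counts.get(c, 0) for c in categories]


doc = "The dog and the cat run while the horse eats bread".split()
counts = reduce_features(doc)

# Fixed category order shared by every document in the collection.
categories = sorted(set(LEXNAME.values()) | {"other"})
vector = to_vector(counts, categories)

print(categories)  # ['noun.animal', 'noun.food', 'other', 'verb.motion']
print(vector)      # [3, 1, 6, 1]
```

In this toy example the document is described by four category counts rather than nine distinct words. A vector of this kind could then be passed to a classifier such as RF, SVM, or MLP in place of, or alongside, word-level features.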
Pages: 22
Related papers
50 in total
  • [1] Multi-Class Document Image Classification using Deep Visual and Textual Features
    Sevim, Semih
    Ekinci, Ekin
    Omurca, Sevinc Ilhan
    Edinc, Eren Berk
    Eken, Suleyman
    Erdem, Turkucan
    Sayar, Ahmet
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE AND APPLICATIONS, 2022, 21 (02)
  • [2] Ontology-based MEDLINE document classification
    Camous, Fabrice
    Blott, Stephen
    Smeaton, Alan F.
    BIOINFORMATICS RESEARCH AND DEVELOPMENT, PROCEEDINGS, 2007, 4414 : 439 - +
  • [3] A deep learning based architecture for multi-class skin cancer classification
    Mushtaq, Snowber
    Singh, Omkar
    Multimedia Tools and Applications, 2024, 83 (39) : 87105 - 87127
  • [4] Multi-Ideology Multi-Class Extremism Classification Using Deep Learning Techniques
    Gaikwad, Mayur
    Ahirrao, Swati
    Kotecha, Ketan
    Abraham, Ajith
    IEEE ACCESS, 2022, 10 : 104829 - 104843
  • [5] Multi-class Document Classification Using Improved Word Embeddings
    Rabut, Benedict A.
    Fajardo, Arnel C.
    Medina, Ruji P.
    2019 2ND INTERNATIONAL CONFERENCE ON COMPUTING AND BIG DATA (ICCBD 2019), 2019, : 42 - 46
  • [6] Multi-Class Retinopathy classification in Fundus Image using Deep Learning Approaches
    Wankhade, Nisha R.
    Bhoyar, Kishor K.
    INTERNATIONAL JOURNAL OF NEXT-GENERATION COMPUTING, 2021, 12 (05) : 807 - 816
  • [7] DeepFood: Automatic Multi-Class Classification of Food Ingredients Using Deep Learning
    Pan, Lili
    Pouyanfar, Samira
    Chen, Hao
    Qin, Jiaohua
    Chen, Shu-Ching
    2017 IEEE 3RD INTERNATIONAL CONFERENCE ON COLLABORATION AND INTERNET COMPUTING (CIC), 2017, : 181 - 189
  • [8] An enhanced deep learning method for multi-class brain tumor classification using deep transfer learning
    Asif, Sohaib
    Zhao, Ming
    Tang, Fengxiao
    Zhu, Yusen
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (20) : 31709 - 31736
  • [9] An enhanced deep learning method for multi-class brain tumor classification using deep transfer learning
    Sohaib Asif
    Ming Zhao
    Fengxiao Tang
    Yusen Zhu
    Multimedia Tools and Applications, 2023, 82 : 31709 - 31736
  • [10] Deep Sparse Representation Learning for Multi-class Image Classification
    Arya, Amit Soni
    Thakur, Shreyanshu
    Mukhopadhyay, Sushanta
    PATTERN RECOGNITION AND MACHINE INTELLIGENCE, PREMI 2023, 2023, 14301 : 218 - 227