Subsequence Kernels-Based Arabic Text Classification

Authors
Nehar, Attia [2 ]
Benmessaoud, Abdelkader [1 ]
Cherroun, Hadda [1 ]
Ziadi, Djelloul [3 ]
Affiliations
[1] Univ Amar Telidji, Lab Informat & Math, Laghouat, Algeria
[2] Univ Ziane Achour, Djelfa, Algeria
[3] Normandie Univ, Lab LITIS, EA 4108, Rouen, France
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Kernel methods have enjoyed great success in machine learning, mainly because of their flexibility in handling the high-dimensional feature spaces of complex data such as graphs, trees, or text. In the field of text classification (TC), their performance has surpassed that of traditional algorithms. For textual data, several kernels have been introduced (P-spectrum, All-Subsequences, Gap-Weighted Subsequences, ...) to improve the performance of TC systems. In this paper, we build a system for Arabic TC that takes into account the order and co-occurrence of words within a text. Documents are represented by transducers, a particular kind of automaton; this representation allows an efficient implementation of the subsequence kernel. An empirical study evaluates the Arabic TC (ATC) system on the large SPA corpus. The results show an improvement in classification precision.
Pages: 206-213
Page count: 8