Subsequence Kernels-Based Arabic Text Classification

Cited by: 0
Authors
Nehar, Attia [2]
Benmessaoud, Abdelkader [1]
Cherroun, Hadda [1]
Ziadi, Djelloul [3]
Affiliations
[1] Univ Amar Telidji, Lab Informat & Math, Laghouat, Algeria
[2] Univ Ziane Achour, Djelfa, Algeria
[3] Normandie Univ, Lab LITIS, EA 4108, Rouen, France
Keywords
DOI
Not available
Chinese Library Classification
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Kernel methods have achieved great success in machine learning, mainly because they cope well with the high-dimensional feature spaces of complex data such as graphs, trees, and text. In text classification (TC), they have outperformed traditional algorithms. Several kernels have been introduced for textual data (p-spectrum, all-subsequences, gap-weighted subsequences, ...) to improve the performance of TC systems. In this paper, we build an Arabic TC (ATC) system that takes into account the order and co-occurrence of words within a text. Documents are represented by transducers, a specific kind of automaton; this representation allows an efficient implementation of the subsequence kernel. An empirical study evaluates the ATC system on the large SPA corpus. The results show an improvement in classification precision.
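To make the gap-weighted subsequence kernel named in the abstract concrete, here is a minimal Python sketch over word sequences. It uses the standard dynamic-programming recursion for this kernel, not the transducer-based implementation the paper describes. The function name `gap_weighted_kernel`, the parameters `n` (subsequence length) and `lam` (gap decay factor), and the sample sentences are assumptions made for illustration only.

```python
def gap_weighted_kernel(s, t, n, lam):
    """Gap-weighted subsequence kernel of order n between token sequences s and t.

    Each common subsequence of n tokens contributes lam ** (span in s + span in t),
    so subsequences with gaps are penalised. Illustrative sketch only; the names
    and parameters here are assumptions, not the paper's API.
    """
    ls, lt = len(s), len(t)
    # kp[i][a][b] holds the auxiliary value K'_i(s[:a], t[:b]).
    kp = [[[0.0] * (lt + 1) for _ in range(ls + 1)] for _ in range(n)]
    for a in range(ls + 1):
        for b in range(lt + 1):
            kp[0][a][b] = 1.0  # K'_0 is 1 for every pair of prefixes
    for i in range(1, n):
        for a in range(i, ls + 1):
            kpp = 0.0  # running value K''_i(s[:a], t[:b])
            for b in range(i, lt + 1):
                match = lam * lam * kp[i - 1][a - 1][b - 1] if s[a - 1] == t[b - 1] else 0.0
                kpp = lam * kpp + match
                kp[i][a][b] = lam * kp[i][a - 1][b] + kpp
    # Sum the contributions of every matching pair of final tokens.
    k = 0.0
    for a in range(n, ls + 1):
        for b in range(n, lt + 1):
            if s[a - 1] == t[b - 1]:
                k += lam * lam * kp[n - 1][a - 1][b - 1]
    return k


# Word-level usage: tokens are words, so word order and co-occurrence both matter.
doc_a = "kernel methods handle textual data".split()
doc_b = "kernel methods for textual data".split()
similarity = gap_weighted_kernel(doc_a, doc_b, n=2, lam=0.5)
```

This recursion runs in O(n · |s| · |t|) time. In practice the value is usually normalised by dividing by the square root of K(s, s) · K(t, t), so that document length does not dominate the similarity.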
Pages: 206-213
Page count: 8