Leveraging the meta-embedding for text classification in a resource-constrained language

Cited by: 14
Authors
Hossain, Md. Rajib [1]
Hoque, Mohammed Moshiul [1]
Siddique, Nazmul [2]
Affiliations
[1] Chittagong Univ Engn & Technol, Dept Comp Sci & Engn, Chittagong 4349, Bangladesh
[2] Ulster Univ, Sch Comp Engn & Intelligent Syst, Belfast, Northern Ireland
Keywords
Natural language processing; Text classification; Text corpora; Semantic feature extraction; Meta-embedding; Deep learning;
DOI
10.1016/j.engappai.2023.106586
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper proposes an intelligent text classification framework for a resource-constrained language such as Bengali, where text classification is challenging due to the lack of standard corpora, appropriate hyperparameter tuning methods, and pre-trained language-specific embeddings. The proposed framework, called AVG-M+CNN, comprises an average meta-embedding feature fusion module and a convolutional neural network module. This work also proposes an algorithm for automatic hyperparameter tuning and selection to enhance the performance of the AVG-M+CNN technique. All meta-embedding models are evaluated using intrinsic evaluators (semantic, syntactic, and relatedness word similarity, and analogy tasks) and extrinsic evaluators. The intrinsic evaluator evaluates 200 Bengali semantic, syntactic and relatedness word pairs. Spearman (ρ), Pearson (r) and cosine similarity correlations are used to evaluate 18 individual embedding and 9 meta-embedding models. The 3COSADD and 3COSMUL evaluators evaluate the 300 analogy tasks. The extrinsic evaluator evaluates a total of 156 classification models on four corpora: BARD, IndicNLP, Prothom-Alo and BTCC 11 (a newly developed corpus having eleven distinct categories). Among these, the AVG-M+CNN model achieves the highest accuracy on all four Bengali corpora: 95.92 ± 0.001% for BARD, 93.10 ± 0.001% for Prothom-Alo, 90.07 ± 0.001% for BTCC 11 and 87.44 ± 0.001% for IndicNLP.
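The abstract names two techniques that a small example may make more concrete: averaging several source embeddings into one meta-embedding, and scoring analogies with 3COSADD and 3COSMUL. The following is a minimal sketch, not the authors' implementation. The abstract does not say how vectors of different dimensionality are aligned, so the L2 normalization and zero-padding here are assumptions, as are the function names and the toy English vocabulary (placeholders standing in for Bengali words). The two analogy scores follow the commonly used definitions, with cosines shifted to [0, 1] for the multiplicative form.

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def unit(v):
    """L2-normalize a vector (assumed preprocessing, not stated in the abstract)."""
    n = norm(v)
    return [x / n for x in v] if n > 0 else list(v)

def cosine(u, v):
    # Assumes neither vector is all zeros.
    return sum(x * y for x, y in zip(u, v)) / (norm(u) * norm(v))

def average_meta_embedding(sources):
    """Average each word's vectors over the sources that contain it.

    Shorter vectors are zero-padded to the largest source dimension
    (an assumption made for this sketch).
    """
    dim = max(len(vec) for src in sources for vec in src.values())
    meta = {}
    for word in sorted(set().union(*sources)):
        rows = [unit(src[word]) + [0.0] * (dim - len(src[word]))
                for src in sources if word in src]
        meta[word] = [sum(col) / len(rows) for col in zip(*rows)]
    return meta

def three_cos_add(emb, a, a_star, b):
    """Answer 'a is to a_star as b is to ?' with the additive score."""
    def score(w):
        return (cosine(emb[w], emb[b]) - cosine(emb[w], emb[a])
                + cosine(emb[w], emb[a_star]))
    return max((w for w in emb if w not in (a, a_star, b)), key=score)

def three_cos_mul(emb, a, a_star, b, eps=1e-3):
    """Same question with the multiplicative score; cosines shifted to [0, 1]."""
    def shifted(u, v):
        return (cosine(u, v) + 1.0) / 2.0
    def score(w):
        return (shifted(emb[w], emb[b]) * shifted(emb[w], emb[a_star])
                / (shifted(emb[w], emb[a]) + eps))
    return max((w for w in emb if w not in (a, a_star, b)), key=score)

# Hypothetical toy sources with different dimensionality (4 and 3).
source_a = {"man": [1, 0, 0, 0], "woman": [0, 1, 0, 0],
            "king": [1, 0, 1, 0], "queen": [0, 1, 1, 0],
            "apple": [0, 0, 0, 1]}
source_b = {"man": [0.9, 0.1, 0], "woman": [0.1, 0.9, 0],
            "king": [0.9, 0.1, 1], "queen": [0.1, 0.9, 1]}

meta = average_meta_embedding([source_a, source_b])
answer_add = three_cos_add(meta, "man", "king", "woman")
answer_mul = three_cos_mul(meta, "man", "king", "woman")
```

On this toy data both scoring rules pick the same answer for the analogy. On real embeddings they can disagree, which is one reason to report both.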
Pages: 18