Deep Learning for Hindi Text Classification: A Comparison

Cited by: 9
Authors
Joshi, Ramchandra [1]
Goel, Purvi [2]
Joshi, Raviraj [2]
Affiliations
[1] Pune Inst Comp Technol, Dept Comp Engn, Pune, Maharashtra, India
[2] Indian Inst Technol Madras, Dept Comp Sci & Engn, Chennai, Tamil Nadu, India
Keywords
Natural language processing; Convolutional neural networks; Recurrent neural networks; Sentence embedding; Hindi text classification;
DOI
10.1007/978-3-030-44689-5_9
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Natural Language Processing (NLP), and natural language text analysis in particular, has seen great advances in recent times. The use of deep learning has transformed text processing techniques and achieved remarkable results. Deep learning architectures such as CNN, LSTM, and the more recent Transformer have been used to achieve state-of-the-art results on a variety of NLP tasks. In this work, we survey a host of deep learning architectures for text classification tasks. The work is specifically concerned with the classification of Hindi text. Research on the classification of Hindi, a morphologically rich and low-resource language written in Devanagari script, has been limited by the absence of a large labeled corpus. In this work, we used translated versions of English datasets to evaluate models based on CNN, LSTM, and Attention. Multilingual pre-trained sentence embeddings based on BERT and LASER are also compared to evaluate their effectiveness for the Hindi language. The paper also serves as a tutorial for popular text classification techniques.
Pages: 94-101
Number of pages: 8