Japanese Text Classification by Character-level Deep ConvNets and Transfer Learning

Cited by: 7
Authors
Sato, Minato [1]
Orihara, Ryohei [1]
Sei, Yuichi [1]
Tahara, Yasuyuki [1]
Ohsuga, Akihiko [1]
Affiliations
[1] Univ Electrocommun, Grad Sch Informat Syst, Tokyo, Japan
Keywords
Deep Learning; Temporal ConvNets; Transfer Learning; Text Classification; Sentiment Analysis
DOI
10.5220/0006193401750184
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The temporal (one-dimensional) convolutional neural network (temporal CNN, ConvNet) is an emerging technology for text understanding. The input to a ConvNet can be either a sequence of words or a sequence of characters. In the latter case, there is no need for language-dependent natural language processing such as morphological analysis. Past studies showed that character-level ConvNets worked well for news category classification and sentiment analysis/classification tasks on English and romanized Chinese text corpora. In this article, we apply character-level ConvNets to Japanese text understanding. Inspired by the success of transfer learning in image recognition, we also attempt to reuse meaningful representations that the ConvNets learn from a large-scale dataset. On news category classification and sentiment analysis/classification tasks with Japanese text corpora, the ConvNets outperformed N-gram-based classifiers. In addition, our ConvNets transfer learning frameworks worked well for a task similar to the one used for pre-training.
Pages: 175-184
Page count: 10
Related papers
50 in total
  • [1] Text Classification and Transfer Learning Based on Character-Level Deep Convolutional Neural Networks
    Sato, Minato
    Orihara, Ryohei
    Sei, Yuichi
    Tahara, Yasuyuki
    Ohsuga, Akihiko
    [J]. AGENTS AND ARTIFICIAL INTELLIGENCE (ICAART 2017), 2018, 10839 : 62 - 81
  • [2] A Compact Encoding for Efficient Character-level Deep Text Classification
    Marinho, Wemerson
    Marti, Luis
    Sanchez-Pi, Nayat
    [J]. 2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018
  • [3] Character-level Convolutional Networks for Text Classification
    Zhang, Xiang
    Zhao, Junbo
    LeCun, Yann
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28
  • [4] Character-level Neural Networks for Short Text Classification
    Liu, Jingxue
    Meng, Fanrong
    Zhou, Yong
    Liu, Bing
    [J]. 2017 INTERNATIONAL SMART CITIES CONFERENCE (ISC2), 2017
  • [5] Chinese text classification based on character-level CNN and SVM
    Wu, Huaiguang
    Li, Daiyi
    Cheng, Ming
    [J]. International Journal of Intelligent Information and Database Systems, 2019, 12 (03) : 212 - 228
  • [6] A Character-Level Deep Lifelong Learning Model for Named Entity Recognition in Vietnamese Text
    Ngoc-Vu Nguyen
    Thi-Lan Nguyen
    Cam-Van Nguyen Thi
    Mai-Vu Tran
    Quang-Thuy Ha
    [J]. INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2019, PT I, 2019, 11431 : 90 - 102
  • [7] A CHINESE CHARACTER-LEVEL AND WORD-LEVEL COMPLEMENTARY TEXT CLASSIFICATION METHOD
    Chen, Wentong
    Fan, Chunxiao
    Wu, Yuexin
    Lou, Zhixiong
    [J]. 2020 25TH INTERNATIONAL CONFERENCE ON TECHNOLOGIES AND APPLICATIONS OF ARTIFICIAL INTELLIGENCE (TAAI 2020), 2020 : 187 - 192
  • [8] Classification of Traditional Chinese Medicine Cases based on Character-level BERT and Deep Learning
    Song, Zihao
    Xie, Yonghong
    Huang, Wen
    Wang, Haoyu
    [J]. PROCEEDINGS OF 2019 IEEE 8TH JOINT INTERNATIONAL INFORMATION TECHNOLOGY AND ARTIFICIAL INTELLIGENCE CONFERENCE (ITAIC 2019), 2019 : 1383 - 1387
  • [9] Offensive Sentence Classification Using Character-Level CNN and Transfer Learning with Fake Sentences
    Seo, Suin
    Cho, Sung-Bae
    [J]. NEURAL INFORMATION PROCESSING (ICONIP 2017), PT II, 2017, 10635 : 532 - 539
  • [10] Joint Character-Level Convolutional and Generative Adversarial Networks for Text Classification
    Wang, Tianshi
    Liu, Li
    Zhang, Huaxiang
    Zhang, Long
    Chen, Xiuxiu
    [J]. COMPLEXITY, 2020, 2020