n-BiLSTM: BiLSTM with n-gram Features for Text Classification

Cited by: 0
Authors
Zhang, Yunxiang [1]
Rao, Zhuyi [1]
Affiliations
[1] Shenzhen Power Supply Bur Co Ltd, Shenzhen, Peoples R China
Keywords
text classification; n-gram; bidirectional long short-term memory; deep learning
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Text classification is widely used in fields such as e-commerce and log message analysis, and it is an essential module in text processing tasks. In this paper, we present a method for building an accurate and fast text classification system in both one-vs.-one and one-vs.-rest settings. Our approach, named n-BiLSTM, converts natural-language sentences into bag-of-words-like features using n-gram techniques and then feeds those features into a bidirectional LSTM. Together, the two components take better advantage of multi-scale feature representation and context information. Finally, the whole system is evaluated on two labeled movie review datasets, IMDB and SSTb, to test one-vs.-one and one-vs.-rest performance, respectively. The results show that our n-BiLSTM algorithm outperforms the basic LSTM and bidirectional LSTM algorithms.
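The abstract does not spell out how the n-gram features are built, so the following is only a minimal sketch of one plausible reading: contiguous word n-grams extracted at several scales (n = 1, 2, ...) and counted per sentence. The function names `word_ngrams` and `multi_scale_features`, the whitespace tokenizer, and the choice of word-level (rather than character-level) n-grams are all assumptions for illustration, not details taken from the paper.

```python
from collections import Counter
from typing import Dict, List, Tuple


def word_ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous word n-grams of length n (empty if too short)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def multi_scale_features(sentence: str, max_n: int = 3) -> Dict[int, Counter]:
    """Count n-grams for every scale n = 1..max_n (hypothetical helper).

    Assumption: lowercase + whitespace tokenization stands in for whatever
    preprocessing the authors actually used.
    """
    tokens = sentence.lower().split()
    return {n: Counter(word_ngrams(tokens, n)) for n in range(1, max_n + 1)}


# Unigram and bigram counts for a short review sentence.
features = multi_scale_features("the movie was not good at all", max_n=2)
print(features[2][("not", "good")])
```

In a full pipeline along the lines the abstract describes, the n-gram sequences would presumably be mapped to vectors and passed to a bidirectional LSTM; that step is omitted here because the abstract gives no architectural details.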
Pages: 1056-1059
Page count: 4