Neural Networks for Multi-lingual Multi-label Document Classification

Cited by: 0
Authors
Martinek, Jiri [1, 2]
Lenc, Ladislav [1, 2]
Kral, Pavel [1, 2]
Affiliations
[1] Univ West Bohemia, Dept Comp Sci & Engn, Fac Sci Appl, Plzen, Czech Republic
[2] Univ West Bohemia, Fac Sci Appl, NTIS, Plzen, Czech Republic
Keywords
Convolutional neural network; CNN; Document classification; Multi-label; Multi-lingual
DOI
10.1007/978-3-030-01418-6_8
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a novel approach to multi-lingual multi-label document classification based on neural networks. We apply the popular convolutional neural network architecture to this task in three different configurations. The first uses static word2vec embeddings that are left as is, the second initializes the embeddings with word2vec and fine-tunes them while learning on the available data, and the third initializes the embeddings randomly and then optimizes them for the classification task. The proposed method is evaluated on four languages from the Reuters corpus, namely English, German, Spanish and Italian. Experimental results show that the proposed approach is efficient, and the best obtained F-measure reaches 84%.
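The three embedding configurations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `EmbeddingLayer` and `build_embeddings`, the mode strings, the uniform initialization range, and the handling of words missing from the pretrained table are all assumptions made for illustration. The sketch only shows how the three setups differ in where the vectors come from and whether training may update them.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class EmbeddingLayer:
    """Hypothetical container: one vector per word plus a trainable flag."""
    vectors: Dict[str, List[float]]
    trainable: bool  # True if the classifier may update the vectors


def build_embeddings(
    vocab: List[str],
    dim: int,
    mode: str,
    pretrained: Optional[Dict[str, List[float]]] = None,
    seed: int = 0,
) -> EmbeddingLayer:
    """Build an embedding table for one of the three configurations.

    mode == "static":     pretrained (e.g. word2vec) vectors, kept frozen
    mode == "fine_tuned": pretrained vectors, updated during training
    mode == "random":     random vectors, updated during training
    """
    rng = random.Random(seed)

    def random_vector() -> List[float]:
        # Assumed initialization range; the abstract does not specify one.
        return [rng.uniform(-0.25, 0.25) for _ in range(dim)]

    if mode in ("static", "fine_tuned"):
        if pretrained is None:
            raise ValueError("pretrained vectors are required for this mode")
        # Assumption: words absent from the pretrained table get random vectors.
        vectors = {
            w: list(pretrained[w]) if w in pretrained else random_vector()
            for w in vocab
        }
        return EmbeddingLayer(vectors, trainable=(mode == "fine_tuned"))

    if mode == "random":
        return EmbeddingLayer({w: random_vector() for w in vocab}, trainable=True)

    raise ValueError(f"unknown mode: {mode}")
```

In a real model, the `trainable` flag would correspond to freezing or unfreezing the embedding weights of the network's first layer.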
Pages: 73-83
Number of pages: 11