Neural Domain Adaptation with Contextualized Character Embedding for Chinese Word Segmentation

Authors
Bao, Zuyi [1]
Li, Si [1]
Gao, Sheng [1]
Xu, Weiran [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
Keywords
Chinese word segmentation; Contextualized character embedding; Domain adaptation; Neural network
DOI
10.1007/978-3-319-73618-1_35
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Large-scale annotated newswire data is available for Chinese word segmentation. However, prior research shows that a segmenter's performance decreases significantly when a model trained on newswire is applied to other domains, such as patents and literature. The same character appearing in different words may occupy different positions and carry different meanings. In this paper, we introduce contextualized character embedding into neural domain adaptation for Chinese word segmentation. The contextualized character embedding aims to capture the dimensions of the embedding that are useful for the target domain. Experimental results show that the proposed method achieves performance competitive with previous domain adaptation methods for Chinese word segmentation.
Pages: 419-430
Page count: 12