Improve Word Mover's Distance with Part-of-Speech Tagging

Authors
Chen, Xiaojun [1]
Bai, Li [2]
Wang, Dakui [1]
Shi, Jinqiao [1]
Affiliations
[1] Inst Informat Engn, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Inst Informat Engn, Sch Cyber Secur, Beijing, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Word Mover's Distance (WMD) is a hyperparameter-free document distance metric that is easy to interpret and achieves unprecedented accuracy on document classification. Because WMD is built on word embeddings, it focuses largely on semantic rather than syntactic relationships, which limits how well it measures document distance. To strengthen the role of syntactic information, we propose WMD with Part-of-Speech (PWMD), a method that integrates part-of-speech (POS) information into the original WMD model. POS tags are a form of syntactic information and supply additional features that complement WMD when measuring document distance. PWMD provides two strategies for combining POS tags with WMD: "word level" and "document level". Comparative experiments show that PWMD yields better document distances than WMD.
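The idea of combining embedding distance with POS information can be illustrated with a minimal sketch. It is not the PWMD formulation from the paper, whose details this record does not give. It assumes toy 2-D embeddings and POS tags invented for the example, a hypothetical `pos_penalty` added to the word-to-word cost when two tags differ, and the special case of two documents with the same number of distinct, equally weighted words. In that case the transport problem behind WMD reduces to an assignment problem, which is solved here by brute force over permutations.

```python
from itertools import permutations
from math import dist

# Toy 2-D embeddings and POS tags, invented purely for illustration.
EMBEDDINGS = {
    "obama": (0.0, 0.0),
    "president": (0.0, 1.0),
    "speaks": (3.0, 0.0),
    "greets": (3.0, 1.0),
    "speech": (3.0, -1.0),
}
POS_TAGS = {
    "obama": "NOUN",
    "president": "NOUN",
    "speaks": "VERB",
    "greets": "VERB",
    "speech": "NOUN",
}


def word_cost(w1, w2, pos_penalty=0.0):
    """Euclidean embedding distance, plus a hypothetical penalty
    (an assumption of this sketch) when the POS tags differ."""
    cost = dist(EMBEDDINGS[w1], EMBEDDINGS[w2])
    if POS_TAGS[w1] != POS_TAGS[w2]:
        cost += pos_penalty
    return cost


def toy_wmd(doc1, doc2, pos_penalty=0.0):
    """Minimum transport cost between two equal-length documents of
    distinct words with uniform weights 1/n (assignment special case)."""
    if len(doc1) != len(doc2):
        raise ValueError("this sketch only handles documents of equal length")
    n = len(doc1)
    # Try every one-to-one matching of words and keep the cheapest.
    return min(
        sum(word_cost(a, b, pos_penalty) for a, b in zip(doc1, perm)) / n
        for perm in permutations(doc2)
    )


query = ["obama", "speaks"]
same_pos = ["president", "greets"]    # NOUN + VERB, like the query
mixed_pos = ["president", "speech"]   # NOUN + NOUN

plain = (toy_wmd(query, same_pos), toy_wmd(query, mixed_pos))
pos_aware = (
    toy_wmd(query, same_pos, pos_penalty=0.5),
    toy_wmd(query, mixed_pos, pos_penalty=0.5),
)
print(plain, pos_aware)
```

With these toy values the embedding-only distance cannot tell the two candidate documents apart, while the POS-aware cost ranks the document whose tags match the query as closer. Real documents of unequal length or with repeated words need a general optimal transport solver instead of the permutation search.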
Pages: 3722-3728
Page count: 7