Paraphrase Identification Based on Weighted URAE, Unit Similarity and Context Correlation Feature

Cited by: 4
Authors
Zhou, Jie [1]
Liu, Gongshen [1]
Sun, Huanrong [2]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, Shanghai, Peoples R China
[2] SJTU Shanghai Songheng Informat Content Anal Join, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Paraphrase identification; Recursive Autoencoders; Phrase embedding; Sentence embedding; Deep learning; Semantic feature;
DOI
10.1007/978-3-319-99501-4_4
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a deep learning model that adapts to both sentence-level and article-level paraphrase identification. It combines a pairwise unit similarity feature with a semantic context correlation feature. In this model, sentences are represented by word and phrase embeddings, while articles are represented by sentence embeddings. These phrase and sentence embeddings are learned from parse trees through Weighted Unfolding Recursive Autoencoders (WURAE), an unsupervised learning algorithm. A unit similarity matrix is then calculated by matching the pairwise lists of embeddings, and the pairwise unit similarity feature is extracted from it through CNN and k-max pooling layers. In addition, the semantic context correlation feature is captured by a combination of CNN and LSTM: the CNN layers learn collocation information between adjacent units, while the LSTM extracts long-term dependency features of the text from the CNN output. The model is evaluated on a widely used English sentence paraphrase corpus, MSRPC, and on a Chinese article paraphrase corpus. The results show that deep semantic features of text can be extracted based on WURAE, unit similarity and the context correlation feature. We release our WURAE code, the deep learning model for paraphrase identification, and pre-trained phrase and sentence embedding data for use by the community.
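As a rough illustration of the unit similarity matrix and k-max pooling steps named in the abstract, here is a minimal NumPy sketch. It is not the authors' released code. The function names are hypothetical, cosine similarity is an assumption (the abstract does not say which similarity measure is used), and the CNN layers that sit between the matrix and the pooling step in the paper are omitted.

```python
import numpy as np


def unit_similarity_matrix(units_a, units_b, eps=1e-8):
    """Pairwise similarity between two lists of unit embeddings.

    units_a has shape (m, d) and units_b has shape (n, d); the result has
    shape (m, n). Cosine similarity is an assumption made for this sketch.
    """
    a = np.asarray(units_a, dtype=float)
    b = np.asarray(units_b, dtype=float)
    # Normalize each embedding to unit length, guarding against zero vectors.
    a = a / np.maximum(np.linalg.norm(a, axis=1, keepdims=True), eps)
    b = b / np.maximum(np.linalg.norm(b, axis=1, keepdims=True), eps)
    return a @ b.T


def k_max_pooling(values, k):
    """Keep the k largest entries of a 1-D array in their original order."""
    x = np.asarray(values, dtype=float)
    if k < 1:
        raise ValueError("k must be at least 1")
    if k >= x.size:
        return x.copy()
    # Indices of the k largest values, sorted back into sequence order.
    top = np.sort(np.argpartition(x, -k)[-k:])
    return x[top]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sentence_a = rng.normal(size=(5, 16))  # 5 units, 16-dim embeddings
    sentence_b = rng.normal(size=(7, 16))  # 7 units, 16-dim embeddings
    sim = unit_similarity_matrix(sentence_a, sentence_b)
    # Fixed-length feature from a variable-size matrix: pool the flattened scores.
    feature = k_max_pooling(sim.ravel(), k=4)
    print(sim.shape, feature.shape)
```

Because k-max pooling always returns k values, text pairs of different lengths yield a feature vector of the same size, which is what allows one model to handle both sentence pairs and article pairs.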
Pages: 41-53
Page count: 13
Related papers
50 entries in total
  • [31] Feature reduction for imbalanced data classification using similarity-based feature clustering with adaptive weighted K-nearest neighbors
    Sun, Lin
    Zhang, Jiuxiao
    Ding, Weiping
    Xu, Jiucheng
    INFORMATION SCIENCES, 2022, 593 : 591 - 613
  • [32] Feature-Based Correlation and Topological Similarity for Interbeat Interval Estimation Using Ultrawideband Radar
    Sakamoto, Takuya
    Imasaka, Ryohei
    Taki, Hirofumi
    Sato, Toru
    Yoshioka, Mototaka
    Inoue, Kenichi
    Fukuda, Takeshi
    Sakai, Hiroyuki
    IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, 2016, 63 (04) : 747 - 757
  • [33] EEG-based biometric identification using frequency-weighted power feature
    Monsy, Jijomon Chettuthara
    Vinod, Achutavarrier Prasad
    IET BIOMETRICS, 2020, 9 (06) : 251 - 258
  • [34] A Similarity-Based Burst Bubble Recognition Using Weighted Normalized Cross Correlation and Chamfer Distance
    Zhang, Hu
    Tang, Zhaohui
    Xie, Yongfang
    Gao, Xiaoliang
    Chen, Qing
    Gui, Weihua
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2020, 16 (06) : 4077 - 4089
  • [35] A confidence map and pixel-based weighted correlation for PRNU-based camera identification
    Chan, Lit-Hung
    Law, Ngai-Fong
    Siu, Wan-Chi
    DIGITAL INVESTIGATION, 2013, 10 (03) : 215 - 225
  • [36] Feature Similarity and Frequency-Based Weighted Visual Words Codebook Learning Scheme for Human Action Recognition
    Nazir, Saima
    Yousaf, Muhammad Haroon
    Velastin, Sergio A.
    IMAGE AND VIDEO TECHNOLOGY (PSIVT 2017), 2018, 10749 : 326 - 336
  • [37] Quality improvement of motion-compensated frame interpolation by self-similarity based context feature
    Ran Li
    Peinan Hao
    Fengyuan Sun
    Yanling Li
    Lei You
    Multimedia Tools and Applications, 2022, 81 : 24301 - 24318
  • [38] Quality improvement of motion-compensated frame interpolation by self-similarity based context feature
    Li, Ran
    Hao, Peinan
    Sun, Fengyuan
    Li, Yanling
    You, Lei
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (17) : 24301 - 24318
  • [39] A Feature Correlation-based Fusion Method for Fingerprint and Palmprint Identification Systems
    Soviany, Sorin
    Puscoci, Sorin
    2013 E-HEALTH AND BIOENGINEERING CONFERENCE (EHB), 2013,
  • [40] Auto-correlation Based Feature Extraction Approach for EEG Alcoholism Identification
    Sadiq, Muhammad Tariq
    Siuly, Siuly
    Rehman, Ateeq Ur
    Wang, Hua
    HEALTH INFORMATION SCIENCE, HIS 2021, 2021, 13079 : 47 - 58