Word n-gram attention models for sentence similarity and inference

Cited by: 29
Authors
Lopez-Gazpio, I. [1]
Maritxalar, M. [1 ]
Lapata, M. [2 ]
Agirre, E. [1 ]
Affiliations
[1] Univ Basque Country, UPV EHU, Comp Sci Fac, IXA NLP Grp, Manuel Lardizabal 1, Donostia San Sebastian 20018, Basque Country, Spain
[2] Univ Edinburgh, Sch Informat, Inst Language Cognit & Computat, 10 Crichton St, Edinburgh EH8 9AB, Midlothian, Scotland
Keywords
Attention models; Deep learning; Natural language understanding; Natural language inference; Semantic textual similarity
DOI
10.1016/j.eswa.2019.04.054
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Semantic Textual Similarity and Natural Language Inference are two popular natural language understanding tasks used to benchmark sentence representation models, in which two sentences are paired. In such tasks sentences are represented as bags of words, sequences, trees or convolutions, but the attention model is based on word pairs. In this article we introduce the use of word n-grams in the attention model. Our results on five datasets show an error reduction of up to 41% with respect to the word-based attention model. The improvements are especially relevant in low-data regimes and, in the case of natural language inference, on the recently released hard subset of Natural Language Inference datasets. (C) 2019 Elsevier Ltd. All rights reserved.
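As a rough illustration of the abstract's core idea, attention computed over word n-grams rather than single words, the following NumPy sketch may help. It is not the paper's architecture: the averaging composition for n-gram vectors and the dot-product softmax alignment are assumptions made purely for illustration.

import numpy as np

def ngram_vectors(word_vecs, n):
    # Compose each run of n consecutive word vectors into one n-gram
    # vector by averaging (an assumed composition; the paper may differ).
    return np.stack([word_vecs[i:i + n].mean(axis=0)
                     for i in range(len(word_vecs) - n + 1)])

def soft_alignment(a, b):
    # Row-wise softmax over dot-product scores: each segment of
    # sentence A attends over all segments of sentence B.
    scores = a @ b.T
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)

# Toy sentence pair: 5 and 6 words with 50-dimensional embeddings.
rng = np.random.default_rng(0)
sent_a = rng.normal(size=(5, 50))
sent_b = rng.normal(size=(6, 50))

word_attn = soft_alignment(sent_a, sent_b)               # word-pair attention, shape (5, 6)
bigram_attn = soft_alignment(ngram_vectors(sent_a, 2),
                             ngram_vectors(sent_b, 2))   # bigram attention, shape (4, 5)
print(word_attn.shape, bigram_attn.shape)

Attending over n-grams in this way lets contiguous multi-word segments of the two sentences be aligned as units rather than strictly word by word.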
Pages: 1-11
Page count: 11
Related Papers (50 in total)
  • [1] Contributions to language understanding: n-gram attention and alignments for interpretable similarity and inference
    Lopez-Gazpio, Inigo
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2019, (62): 99-102
  • [2] Document classification using n-gram and word semantic similarity
    Ren, Mei-Ying
    Kang, Sinjae
    International Journal of Future Generation Communication and Networking, 2015, 8 (08): 111-118
  • [3] Bag-Of-Word normalized n-gram models
    Sethy, Abhinav
    Ramabhadran, Bhuvana
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008: 1594-1597
  • [4] N-gram similarity and distance
    Kondrak, Grzegorz
    STRING PROCESSING AND INFORMATION RETRIEVAL, PROCEEDINGS, 2005, 3772: 115-126
  • [5] Dependency-based n-gram models for general purpose sentence realisation
    Guo, Yuqing
    Wang, Haifeng
    Van Genabith, Josef
    NATURAL LANGUAGE ENGINEERING, 2011, 17: 455-483
  • [6] Collaborative Attention Network with Word and N-Gram Sequences Modeling for Sentiment Classification
    Bao, Junwei
    Zhang, Liang
    Han, Bo
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2019: TEXT AND TIME SERIES, PT IV, 2019, 11730: 79-92
  • [7] Evaluating n-gram Models for a Bilingual Word Sense Disambiguation Task
    Pinto, David
    Vilarino, Darnes
    Balderas, Carlos
    Tovar, Mireya
    Beltran, Beatriz
    COMPUTACION Y SISTEMAS, 2011, 15 (02): 209-220
  • [8] Chinese new word identification using N-gram and PPM Models
    Li, Dun
    Tu, Wei
    Shi, Lei
    EMERGING SYSTEMS FOR MATERIALS, MECHANICS AND MANUFACTURING, 2012, 109: 612-616
  • [9] N-gram approach for a URL Similarity Measure
    Singh, Neetu
    Chaudhari, Narendra S.
    2016 1ST INDIA INTERNATIONAL CONFERENCE ON INFORMATION PROCESSING (IICIP), 2016
  • [10] Generalized N-gram measures for melodic similarity
    Frieler, Klaus
    Data Science and Classification, 2006: 289-298