A Multi-Layer System for Semantic Textual Similarity

Cited by: 3
Authors
Ngoc Phuoc An Vo [1]
Popescu, Octavian [2]
Affiliations
[1] Xerox Res Ctr Europe, Meylan, France
[2] IBM Corp, TJ Watson Res, Yorktown Hts, NY USA
Source
KDIR: PROCEEDINGS OF THE 8TH INTERNATIONAL JOINT CONFERENCE ON KNOWLEDGE DISCOVERY, KNOWLEDGE ENGINEERING AND KNOWLEDGE MANAGEMENT - VOL. 1 | 2016
Keywords
Machine Learning; Natural Language Processing (NLP); Semantic Textual Similarity (STS);
DOI
10.5220/0006045800560067
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Building a system able to cope with the various phenomena that fall under the umbrella of semantic similarity is far from trivial. It is almost always the case that a system's performance does not vary consistently or predictably from corpus to corpus. We analyzed the source of this variance and found that it is related to the word-pair similarity distribution among the topics in the various corpora. We then used this insight to construct a 4-module system that takes into consideration not only string and semantic word similarity, but also word alignment and sentence structure. The system consistently achieves an accuracy that is very close to the state of the art, or reaches a new state of the art. The system is based on a multi-layer architecture and is able to deal with heterogeneous corpora which may not have been generated by the same distribution.
Pages: 56 - 67
Page count: 12
Related papers
50 in total
  • [31] Probabilistic Soft Logic for Semantic Textual Similarity
    Beltagy, Islam
    Erk, Katrin
    Mooney, Raymond
    PROCEEDINGS OF THE 52ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2014, : 1210 - 1219
  • [32] MedSTS: a resource for clinical semantic textual similarity
    Wang, Yanshan
    Afzal, Naveed
    Fu, Sunyang
    Wang, Liwei
    Shen, Feichen
    Rastegar-Mojarad, Majid
    Liu, Hongfang
    LANGUAGE RESOURCES AND EVALUATION, 2020, 54 (01) : 57 - 72
  • [33] Learning Semantic Textual Similarity from Conversations
    Yang, Yinfei
    Yuan, Steve
    Cer, Daniel
    Kong, Sheng-yi
    Constant, Noah
    Pilar, Petr
    Ge, Heming
    Sung, Yun-Hsuan
    Strope, Brian
    Kurzweil, Ray
    REPRESENTATION LEARNING FOR NLP, 2018, : 164 - 174
  • [34] Overview of the Evaluation of Semantic Similarity and Textual Inference
    Fonseca, Erick Rocha
    dos Santos, Leandro Borges
    Criscuolo, Marcelo
    Aluisio, Sandra Maria
    LINGUAMATICA, 2016, 8 (02) : 3 - 13
  • [35] Linguistic analysis of datasets for semantic textual similarity
    Wang, Chunlin
    Castellon, Irene
    Comelles, Elisabet
    DIGITAL SCHOLARSHIP IN THE HUMANITIES, 2020, 35 (02) : 471 - 484
  • [36] Efficient Textual Similarity using Semantic MinHashing
    Nawaz, Waqas
    Baig, Maryam
    Khan, Kifayat Ullah
    2024 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING, IEEE BIGCOMP 2024, 2024, : 262 - 269
  • [37] MedSTS: a resource for clinical semantic textual similarity
    Yanshan Wang
    Naveed Afzal
    Sunyang Fu
    Liwei Wang
    Feichen Shen
    Majid Rastegar-Mojarad
    Hongfang Liu
    Language Resources and Evaluation, 2020, 54 : 57 - 72
  • [38] Interpretable Semantic Textual Similarity for Indonesian Sentence
    Rajagukguk, Rio Chandra
    Khodra, Masayu Leylia
    2018 5TH INTERNATIONAL CONFERENCE ON ADVANCED INFORMATICS: CONCEPTS, THEORY AND APPLICATIONS (ICAICTA 2018), 2018, : 147 - 152
  • [39] Textual entailment beyond semantic similarity information
    Vazquez, Sonia
    Kozareva, Zornitsa
    Montoyo, Andres
    MICAI 2006: ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2006, 4293 : 900 - +
  • [40] Collective Human Opinions in Semantic Textual Similarity
    Wang, Yuxia
    Tao, Shimin
    Xie, Ning
    Yang, Hao
    Baldwin, Timothy
    Verspoor, Karin
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2023, 11 : 997 - 1013