Comparison of Sentence Similarity Measures for Russian Paraphrase Identification

Cited by: 0
Authors
Pronoza, Ekaterina [1]
Yagunova, Elena [1]
Affiliation
[1] St Petersburg State Univ, St Petersburg, Russia
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we analyze and compare different types of sentence similarity measures applied to the problem of sentential paraphrase identification. We work with Russian, and all experiments are conducted on a Russian paraphrase corpus that we have collected, and are still collecting, from news headlines. Apart from the similarity measures, we also analyze the corpus itself. As a result of this research, we disprove the supposition that it is more difficult to distinguish between precise and loose paraphrases than between loose paraphrases and non-paraphrases. We also offer recommendations for applying different similarity measures to the identification of paraphrases derived from news texts.
Pages: 74-82
Number of pages: 9