The Impact of Sentence Embeddings in Turkish Paraphrase Detection

Cited by: 0
Authors
Karaoglan, Bahar [1]
Yorgancioglu, Hakki Engin [1]
Kisla, Tarik [2]
Kumova Metin, Senem [3]
Affiliations
[1] Ege Univ, Uluslararasi Bilgisayar Enstitusu, Izmir, Turkey
[2] Ege Univ, Bilgisayar & Ogret Teknol Egitimi Bolumu, Izmir, Turkey
[3] Izmir Econ Univ, Yazilim Muhendisligi Bolumu, Izmir, Turkey
Keywords
paraphrasing; paraphrase corpus; word embedding; sentence embedding
DOI
Not available
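Chinese Library Classification (CLC) number
TM [Electrical technology]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809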
Abstract
Recent studies have shown that word embeddings perform well in several natural language processing (NLP) tasks. Although paraphrase identification in Turkish has been well studied with traditional statistical NLP methods, to the best of our knowledge no study has employed word and/or sentence embeddings. In this paper, three well-known methods for building sentence embeddings from word embeddings are examined: "using average vector for word embeddings" (AWE), "concatenated vectors for word embeddings" (CWE), and "word mover's distance word embeddings" (WMDWE). Their effect on paraphrase identification performance is measured. The results are presented comparatively for English (MSRP) and Turkish (PARDER and TuPC) paraphrase corpora. The study does not cover the optimization of the parameters used in training the word embeddings, nor does it consider features specific to the Turkish language. Despite this naive approach, the test results obtained on the PARDER corpus are encouraging and suggest that a more detailed study incorporating such improvements may yield more convincing performance values.
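The three ways of comparing sentences named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The toy 3-dimensional vectors, the function names, the zero-padding scheme for concatenation, and the use of cosine similarity are all assumptions made for illustration. The distance function is the relaxed word mover's distance, a cheaper lower bound that stands in here for the full word mover's distance, which requires solving an optimal transport problem.

```python
import numpy as np

# Hypothetical toy word vectors (3 dimensions); real embeddings would be
# trained on a large corpus and have hundreds of dimensions.
TOY_VECTORS = {
    "kedi": np.array([1.0, 0.0, 0.0]),    # "cat"
    "pisi": np.array([0.9, 0.1, 0.0]),    # "kitty"
    "uyuyor": np.array([0.0, 1.0, 0.0]),  # "is sleeping"
    "uyudu": np.array([0.0, 0.9, 0.1]),   # "slept"
}


def known_vectors(tokens, vectors):
    """Return the vectors of the tokens that exist in the vocabulary."""
    return [vectors[t] for t in tokens if t in vectors]


def average_embedding(tokens, vectors, dim=3):
    """AWE-style sentence vector: the mean of the word vectors."""
    found = known_vectors(tokens, vectors)
    return np.mean(found, axis=0) if found else np.zeros(dim)


def concatenated_embedding(tokens, vectors, max_len=4, dim=3):
    """CWE-style sentence vector: word vectors laid end to end.

    Assumption: sentences are truncated or zero-padded to max_len words
    so that every sentence vector has the same length.
    """
    out = np.zeros(max_len * dim)
    for i, vec in enumerate(known_vectors(tokens, vectors)[:max_len]):
        out[i * dim:(i + 1) * dim] = vec
    return out


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is zero)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


def relaxed_wmd(tokens_a, tokens_b, vectors):
    """Relaxed word mover's distance with uniform word weights.

    Each word moves entirely to its nearest word in the other sentence;
    the larger of the two directional costs is a lower bound on the
    full word mover's distance.
    """
    a = np.array(known_vectors(tokens_a, vectors))
    b = np.array(known_vectors(tokens_b, vectors))
    if a.size == 0 or b.size == 0:
        raise ValueError("each sentence needs at least one known word")
    # Pairwise Euclidean distances, shape (len(a), len(b)).
    dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return float(max(dists.min(axis=1).mean(), dists.min(axis=0).mean()))


s1 = ["kedi", "uyuyor"]
s2 = ["pisi", "uyudu"]
awe_score = cosine_similarity(average_embedding(s1, TOY_VECTORS),
                              average_embedding(s2, TOY_VECTORS))
cwe_score = cosine_similarity(concatenated_embedding(s1, TOY_VECTORS),
                              concatenated_embedding(s2, TOY_VECTORS))
rwmd_score = relaxed_wmd(s1, s2, TOY_VECTORS)
print(awe_score, cwe_score, rwmd_score)
```

In a paraphrase detection setting, scores like these could be thresholded or passed to a classifier as features; the abstract does not specify which decision procedure the study uses.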
Pages: 4