Detecting Paraphrases for Portuguese using Word and Sentence Embeddings

Cited by: 3
Authors
Souza, Marlo [1]
Sanches, Leandro M. P. [1]
Affiliation
[1] Univ Fed Bahia, Salvador, BA, Brazil
Source
LINGUAMATICA | 2018, Vol. 10, No. 02
Keywords
Paraphrase Identification; Semantic Textual Similarity; Sentence Embeddings;
DOI
10.21814/lm.10.2.286
Chinese Library Classification
H0 [Linguistics];
Discipline classification code
030303 ; 0501 ; 050102 ;
Abstract
Paraphrase detection (or identification) is the task of determining whether two or more sentences of arbitrary length have the same meaning. Methods for this task have many potential applications in Natural Language Processing systems. This work investigates combining different methods of sentence representation in a vector space model of language with linear classifiers for paraphrase identification in Portuguese. The results obtained are inferior to those obtained for the related task of recognizing textual entailment in the ASSIN evaluation for Portuguese. We point out, however, that this work investigates the application of sentence embeddings to paraphrase detection; as such, other features usually explored in systems for this task may be trivially incorporated into our method to improve performance.
Pages: 31-44
Page count: 14
Related papers
50 in total
  • [21] Sentence Selection Strategies for Distilling Word Embeddings from BERT
    Wang, Yixiao
    Bouraoui, Zied
    Espinosa-Anke, Luis
    Schockaert, Steven
    LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 2591 - 2600
  • [22] A Bidirectional LSTM Approach with Word Embeddings for Sentence Boundary Detection
    Xu, Chenglin
    Xie, Lei
    Xiao, Xiong
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2018, 90 (07): : 1063 - 1075
  • [23] A Bidirectional LSTM Approach with Word Embeddings for Sentence Boundary Detection
    Chenglin Xu
    Lei Xie
    Xiong Xiao
    Journal of Signal Processing Systems, 2018, 90 : 1063 - 1075
  • [24] On Character vs Word Embeddings as Input for English Sentence Classification
    Hammerton, James
    Vintro, Merce
    Kapetanakis, Stelios
    Sama, Michele
    INTELLIGENT SYSTEMS AND APPLICATIONS, VOL 1, 2019, 868 : 550 - 566
  • [25] Joint Model Using Character and Word Embeddings for Detecting Internet Slang Words
    Liu, Yihong
    Seki, Yohei
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2021, 13133 LNCS : 18 - 33
  • [26] Joint Model Using Character and Word Embeddings for Detecting Internet Slang Words
    Liu, Yihong
    Seki, Yohei
    TOWARDS OPEN AND TRUSTWORTHY DIGITAL SOCIETIES, ICADL 2021, 2021, 13133 : 18 - 33
  • [27] Semi-supervised Learning of Dialogue Acts Using Sentence Similarity Based on Word Embeddings
    Yang, Xiaohao
    Liu, Jia
    Chen, Zhenfeng
    Wu, Weilan
    2014 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING (ICALIP), VOLS 1-2, 2014, : 882 - 886
  • [28] Hybrid query expansion using lexical resources and word embeddings for sentence retrieval in question answering
    Esposito, Massimo
    Damiano, Emanuele
    Minutolo, Aniello
    De Pietro, Giuseppe
    Fujita, Hamido
    INFORMATION SCIENCES, 2020, 514 : 88 - 105
  • [29] Portuguese word embeddings for the oil and gas industry: Development and evaluation
    Magalhaes Gomes, Diogo da Silva
    Cordeiro, Fabio Correa
    Consoli, Bernardo Scapini
    Santos, Nikolas Lacerda
    Moreira, Viviane Pereira
    Vieira, Renata
    Moraes, Silvia
    Evsukoff, Alexandre Goncalves
    COMPUTERS IN INDUSTRY, 2021, 124
  • [30] Improving POS Tagging Across Portuguese Variants with Word Embeddings
    Fonseca, Erick Rocha
    Aluisio, Sandra Maria
    COMPUTATIONAL PROCESSING OF THE PORTUGUESE LANGUAGE (PROPOR 2016), 2016, 9727 : 227 - 232