Doc2Vec, SBERT, InferSent, and USE: Which embedding technique for noun phrases?

Cited by: 3
Authors
Ajallouda, Lahbib [1 ]
Najmani, Kawtar [2 ]
Zellou, Ahmed [1 ]
Benlahmar, El Habib [2 ]
Affiliations
[1] Mohammed V Univ, SPM ENSIAS, Rabat, Morocco
[2] Fac Sci Ben Msik, Dept Math & Comp Sci, Casablanca, Morocco
Keywords
Phrase embedding techniques; Natural language processing; Noun phrases; Empirical study;
DOI
10.1109/IRASET52964.2022.9738300
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Phrase embedding is a technique for representing phrases in a vector space. Considerable effort has gone into developing this technique to improve various natural language processing (NLP) applications. Many studies have evaluated phrase embeddings, but most of them focus on intrinsic or extrinsic evaluation regardless of the type of phrase (noun phrases, verb phrases, etc.). In the literature, there is no study that evaluates the embedding of noun phrases specifically, even though this type of phrase is used by many NLP applications, such as automatic key-phrase extraction (AKE), information retrieval, and question answering. In this article, we present an empirical study that compares the most common phrase embedding techniques in order to determine which is most suitable for representing noun phrases. The dataset used in the comparison consists of the noun phrases from the Inspec and SemEval2010 datasets, to which we have added their manually defined synonyms.
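The abstract does not spell out the comparison protocol, so the following is only a minimal sketch of one plausible setup: score each embedding technique by the mean cosine similarity between a noun phrase and its synonym. Everything in it is an assumption made for illustration. The two embedders (`bag_of_words`, `char_trigrams`) are toy stand-ins, not Doc2Vec, SBERT, InferSent, or USE, and the phrase pairs in `PAIRS` are invented examples, not entries from the Inspec or SemEval2010 data.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Sparse vector: feature name -> weight.
Vector = Dict[str, float]


def bag_of_words(phrase: str) -> Vector:
    """Toy stand-in embedder (hypothetical): lowercase word counts."""
    return dict(Counter(phrase.lower().split()))


def char_trigrams(phrase: str) -> Vector:
    """Toy stand-in embedder (hypothetical): padded character trigram counts."""
    text = f"  {phrase.lower()}  "
    return dict(Counter(text[i:i + 3] for i in range(len(text) - 2)))


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_synonym_similarity(
    embed: Callable[[str], Vector],
    pairs: List[Tuple[str, str]],
) -> float:
    """Average similarity between each noun phrase and its synonym."""
    return sum(cosine(embed(a), embed(b)) for a, b in pairs) / len(pairs)


# Invented (noun phrase, synonym) pairs, for illustration only.
PAIRS = [
    ("neural network", "neural net"),
    ("information retrieval", "document retrieval"),
    ("key-phrase extraction", "keyphrase extraction"),
]

EMBEDDERS = {"bag_of_words": bag_of_words, "char_trigrams": char_trigrams}

scores = {name: mean_synonym_similarity(fn, PAIRS) for name, fn in EMBEDDERS.items()}
best = max(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:.3f}")
print("highest mean synonym similarity:", best)
```

To try a real technique under the same scheme, one would replace a toy embedder with a function that returns that model's vector for a phrase; a dense vector can be wrapped as a dictionary keyed by dimension index so that `cosine` still applies.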
Pages: 548-552
Page count: 5