Evaluation of Word Embedding via Domain Keywords

Authors
Fu, Qunchao [1, 2]
Li, Zongyang [1, 2]
Han, Xu [1, 2]
Wang, Cong [1, 2]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Software Engn, Beijing 100876, Peoples R China
[2] BUPT, Minist Educ, Key Lab Trustworthy Distributed Comp & Serv, Beijing, Peoples R China
Keywords
Word embedding; Intrinsic evaluations; Domain keywords
DOI
Not available
Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract
Word embeddings learned without supervision have proven to be very effective, providing semantic and syntactic information for most NLP tasks. The most common intrinsic evaluations of word embeddings are built around word similarity. However, these evaluations often correlate poorly with how well the embeddings perform as features in actual downstream tasks. We present VECDS (Vector Domain Score), which is based on the domain keywords of the corresponding downstream evaluation task, such as high-frequency words or keywords extracted by humans. Domain keywords are more important for downstream tasks than other common vocabulary.
Pages: 290-294
Number of pages: 5