Visual Exploration of Semantic Relationships in Neural Word Embeddings

Cited by: 61

Authors:

Liu, Shusen [1]

Bremer, Peer-Timo [1]

Thiagarajan, Jayaraman J. [1]

Srikumar, Vivek [3]

Wang, Bei [2]

Livnat, Yarden [2]

Pascucci, Valerio [2]

Affiliations:

[1] Lawrence Livermore Natl Lab, Livermore, CA 94550 USA

[2] Univ Utah, SCI Inst, Salt Lake City, UT 84112 USA

[3] Univ Utah, Sch Comp, Salt Lake City, UT 84112 USA

Source:

IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS | 2018 / Vol. 24 / No. 01

Funding:

U.S. National Science Foundation;

Keywords:

Natural Language Processing; Word Embedding; High-Dimensional Data; DIMENSIONALITY REDUCTION; VISUALIZATION; QUALITY;

DOI:

10.1109/TVCG.2017.2745141

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Constructing distributed representations for words through neural language models and using the resulting vector spaces for analysis has become a crucial component of natural language processing (NLP). However, despite their widespread application, little is known about the structure and properties of these spaces. To gain insights into the relationship between words, the NLP community has begun to adapt high-dimensional visualization techniques. In particular, researchers commonly use t-distributed stochastic neighbor embeddings (t-SNE) and principal component analysis (PCA) to create two-dimensional embeddings for assessing the overall structure and exploring linear relationships (e.g., word analogies), respectively. Unfortunately, these techniques often produce mediocre or even misleading results and cannot address domain-specific visualization challenges that are crucial for understanding semantic relationships in word embeddings. Here, we introduce new embedding techniques for visualizing semantic and syntactic analogies, and the corresponding tests to determine whether the resulting views capture salient structures. Additionally, we introduce two novel views for a comprehensive study of analogy relationships. Finally, we augment t-SNE embeddings to convey uncertainty information in order to allow a reliable interpretation. Combined, the different views address a number of domain-specific tasks difficult to solve with existing tools.
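As a rough illustration of the PCA-based two-dimensional view that the abstract mentions for linear relationships such as word analogies, the sketch below projects a handful of toy word vectors to 2D and compares the offsets between paired words. Everything here is an assumption for illustration: the vectors are synthetic rather than taken from a trained embedding, the helper name `pca_2d` is hypothetical, and this is plain PCA, not the new embedding techniques the paper introduces.

```python
import numpy as np

def pca_2d(X):
    """Project the rows of X onto their top two principal components."""
    Xc = X - X.mean(axis=0)  # center the data
    # SVD of the centered data: rows of Vt are the principal directions
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T

# Synthetic toy "embedding": four base words and four partner words that
# differ from them by one shared offset plus small noise (all hypothetical).
rng = np.random.default_rng(0)
dim = 50
# Scaled so the shared relation dominates the variance; with a weaker offset,
# the top two components may not align with it and the 2D view can mislead.
offset = 3.0 * rng.normal(size=dim)
bases = rng.normal(size=(4, dim))
partners = bases + offset + 0.05 * rng.normal(size=(4, dim))
vectors = np.vstack([bases, partners])

proj = pca_2d(vectors)
deltas = proj[4:] - proj[:4]  # 2D offset for each word pair

# Cosine similarity between every pair of 2D offsets; values near 1 mean the
# analogy pairs appear as roughly parallel arrows in the projection.
unit = deltas / np.linalg.norm(deltas, axis=1, keepdims=True)
cosines = unit @ unit.T
```

Inspecting `cosines` gives a quick check of whether the analogy pairs stay parallel after projection, which is the kind of property a 2D view needs to preserve before it can be trusted for reading off linear relationships.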

Pages: 553-562

Page count: 10
