Examining the effect of whitening on static and contextualized word embeddings

Cited by: 4
Authors
Sasaki, Shota [1 ,2 ]
Heinzerling, Benjamin [1 ,2 ]
Suzuki, Jun [1 ,2 ]
Inui, Kentaro [1 ,2 ]
Affiliations
[1] RIKEN, Sendai, Miyagi 9808579, Japan
[2] Tohoku Univ, Sendai, Miyagi 9808579, Japan
Keywords
Static word embeddings; Contextualized word embeddings; Whitening; Frequency bias;
DOI
10.1016/j.ipm.2023.103272
CLC number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
Static word embeddings (SWE) and contextualized word embeddings (CWE) are the foundation of modern natural language processing. However, these embeddings suffer from spatial bias in the form of anisotropy, which has been demonstrated to reduce their performance. One method for alleviating this anisotropy is the "whitening" transformation. Whitening is a standard method in signal processing and other areas; however, its effect on SWE and CWE is not well understood. In this study, we conduct experiments to elucidate the effect of whitening on SWE and CWE. The results indicate that whitening predominantly removes word frequency bias in SWE, whereas in CWE it predominantly removes biases other than word frequency bias.
Pages: 10
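The abstract refers to applying a whitening transformation to embedding matrices. As a point of reference, below is a minimal sketch of a standard ZCA-style whitening step in NumPy. It illustrates the general technique only, not the paper's exact procedure; the function and variable names are illustrative.

```python
import numpy as np

def whiten(embeddings: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Apply a ZCA-style whitening transformation to an embedding matrix.

    embeddings: array of shape (num_words, dim).
    Returns embeddings whose empirical covariance is (approximately) the identity.
    """
    # Center the embeddings so the mean vector is zero.
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    # Empirical covariance matrix (dim x dim).
    cov = centered.T @ centered / centered.shape[0]
    # Eigendecomposition of the symmetric covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Whitening matrix: rotate, rescale each direction to unit variance, rotate back.
    W = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return centered @ W

if __name__ == "__main__":
    # Toy example: anisotropic random "embeddings" of dimension 300.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(10_000, 300)) @ rng.normal(size=(300, 300))
    Xw = whiten(X)
    # After whitening, the covariance is close to the identity matrix.
    print(np.allclose(np.cov(Xw, rowvar=False), np.eye(300), atol=1e-2))
```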