Examining the effect of whitening on static and contextualized word embeddings

Cited by: 4
Authors
Sasaki, Shota [1 ,2 ]
Heinzerling, Benjamin [1 ,2 ]
Suzuki, Jun [1 ,2 ]
Inui, Kentaro [1 ,2 ]
Affiliations
[1] RIKEN, Sendai, Miyagi 9808579, Japan
[2] Tohoku Univ, Sendai, Miyagi 9808579, Japan
Keywords
Static word embeddings; Contextualized word embeddings; Whitening; Frequency bias;
DOI
10.1016/j.ipm.2023.103272
CLC number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
Static word embeddings (SWE) and contextualized word embeddings (CWE) are the foundation of modern natural language processing. However, these embeddings suffer from spatial bias in the form of anisotropy, which has been demonstrated to reduce their performance. One method for alleviating this anisotropy is the "whitening" transformation. Whitening is a standard method in signal processing and other areas; however, its effect on SWE and CWE is not well understood. In this study, we conduct experiments to elucidate the effect of whitening on SWE and CWE. The results indicate that whitening predominantly removes word frequency bias in SWE, whereas in CWE it predominantly removes biases other than word frequency bias.
Pages: 10
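The abstract refers to applying a whitening transformation to embedding matrices. As a point of reference, below is a minimal sketch of a standard ZCA-style whitening step in NumPy. It illustrates the general technique only, not the paper's exact procedure; the function and variable names are illustrative.

```python
import numpy as np

def whiten(embeddings: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Apply a ZCA-style whitening transformation to an embedding matrix.

    embeddings: array of shape (num_words, dim).
    Returns embeddings whose empirical covariance is (approximately) the identity.
    """
    # Center the embeddings so the mean vector is zero.
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    # Empirical covariance matrix (dim x dim).
    cov = centered.T @ centered / centered.shape[0]
    # Eigendecomposition of the symmetric covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Whitening matrix: rotate, rescale each direction to unit variance, rotate back.
    W = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return centered @ W

if __name__ == "__main__":
    # Toy example: anisotropic random "embeddings" of dimension 300.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(10_000, 300)) @ rng.normal(size=(300, 300))
    Xw = whiten(X)
    # After whitening, the covariance is close to the identity matrix.
    print(np.allclose(np.cov(Xw, rowvar=False), np.eye(300), atol=1e-2))
```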