Filtering Communities in Word Co-Occurrence Networks to Foster the Emergence of Meaning

Cited by: 0
Authors
Beranger, Anna [1]
Dugue, Nicolas [1]
Guillot, Simon [1]
Prouteau, Thibault [1]
Affiliations
[1] Univ Mans, LIUM, Ave Olivier Messiaen, F-72000 Le Mans, France
Keywords
Word co-occurrence networks; community detection; word embedding; linguistics; interpretability
DOI
10.1007/978-3-031-53468-3_32
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With SINr, we introduced a way to design graph and word embeddings based on community detection. Unlike deep learning approaches, it requires little compute and was shown to be state of the art for interpretability in the context of word embeddings. In this paper, we investigate how filtering the communities detected on word co-occurrence networks can improve the performance of the approach. Community detection algorithms tend to uncover communities whose sizes follow a power-law distribution. Consequently, the number of activations per dimension in SINr also follows a power law: a few dimensions are activated by many words, and many dimensions are activated by a few words. By filtering this distribution, removing part of its head and tail, we show improvements on intrinsic evaluation of the embeddings while dividing their dimensionality by five. In addition, we show that these results are stable across several runs, thus defining a subset of distinctive features to describe a given corpus.
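The head-and-tail filtering described in the abstract can be illustrated with a minimal sketch. It assumes the embedding is a dense words-by-communities matrix in which a non-zero entry counts as an activation. The function name `filter_dimensions` and the two thresholds are hypothetical, chosen only for illustration; the abstract does not state the criteria the authors actually use.

```python
import numpy as np

def filter_dimensions(embeddings, min_activations, max_activations):
    """Keep only the dimensions (columns) whose activation count is within bounds.

    Hypothetical helper: an "activation" is assumed to be a non-zero entry,
    and both thresholds are illustrative, not values from the paper.
    """
    # Number of words activating each dimension (non-zero entries per column).
    activations = np.count_nonzero(embeddings, axis=0)
    # Drop the head (too many activations) and the tail (too few).
    keep = (activations >= min_activations) & (activations <= max_activations)
    return embeddings[:, keep], np.flatnonzero(keep)

# Toy matrix: 4 words (rows) x 4 community dimensions (columns).
toy = np.array([
    [0.5, 0.0, 0.2, 0.0],
    [0.3, 0.0, 0.0, 0.0],
    [0.1, 0.4, 0.6, 0.0],
    [0.9, 0.0, 0.0, 0.0],
])

filtered, kept = filter_dimensions(toy, min_activations=2, max_activations=3)
print(kept)            # indices of the dimensions that survive the filter
print(filtered.shape)  # same number of words, fewer dimensions
```

In this toy case, the column activated by every word (the head) and the columns activated by at most one word (the tail) are removed, which shows how such a filter reduces dimensionality without touching the word rows.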
Pages: 377-388
Page count: 12