Language Modeling with Sparse Product of Sememe Experts

Cited by: 0
Authors
Gu, Yihong [1 ,2 ]
Yan, Jun [1 ,3 ]
Zhu, Hao [1 ,2 ]
Liu, Zhiyuan [1 ,2 ]
Xie, Ruobing [4 ]
Sun, Maosong [1 ,2 ]
Lin, Fen [4 ]
Lin, Leyu [4 ]
Affiliations
[1] Tsinghua Univ, Inst Artificial Intelligence, State Key Lab Intelligent Technol & Syst, Beijing, Peoples R China
[2] Tsinghua Univ, Dept CST, Beijing, Peoples R China
[3] Tsinghua Univ, Dept EE, Beijing, Peoples R China
[4] Tencent, WeChat Search Applicat Dept, Search Prod Ctr, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
N/A
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Most language modeling methods rely on large-scale data to statistically learn the sequential patterns of words. In this paper, we argue that words are atomic language units but not necessarily atomic semantic units. Inspired by HowNet, we use sememes, the minimum semantic units in human languages, to represent the implicit semantics behind words for language modeling, named Sememe-Driven Language Model (SDLM). More specifically, to predict the next word, SDLM first estimates the sememe distribution given textual context. Afterwards, it regards each sememe as a distinct semantic expert, and these experts jointly identify the most probable senses and the corresponding word. In this way, SDLM enables language models to work beyond word-level manipulation to fine-grained sememe-level semantics, and offers us more powerful tools to fine-tune language models and improve the interpretability as well as the robustness of language models. Experiments on language modeling and the downstream application of headline generation demonstrate the effectiveness of SDLM. Source code and data used in the experiments can be accessed at https://github.com/thunlp/SDLM-pytorch.
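The prediction pipeline the abstract describes (context → sememe distribution → sparse product of sememe experts → word distribution) can be sketched roughly as follows. This is a minimal toy illustration, not the paper's actual model: the sememe distribution and expert scores are random stand-ins for what SDLM would compute from context, and the top-k sparsification and multiplicative combination are assumptions about the general product-of-experts scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sememes, vocab = 6, 10

# Hypothetical sememe distribution given the textual context
# (in SDLM this would come from an RNN/encoder over the context).
q = rng.dirichlet(np.ones(n_sememes))

# Each sememe acts as an expert that scores every word in the vocabulary
# (random logits here, purely for illustration).
expert_logits = rng.normal(size=(n_sememes, vocab))

# Sparse product of experts: keep only the k most probable sememes,
# then combine the active experts' scores additively in log space
# (i.e., multiplicatively in probability space), weighted by q.
k = 3
top = np.argsort(q)[-k:]
log_p = sum(q[s] * expert_logits[s] for s in top)

# Normalize into a word distribution (softmax with max-subtraction
# for numerical stability).
p = np.exp(log_p - log_p.max())
p /= p.sum()
# p is now a proper probability distribution over the toy vocabulary.
```

The sparsity (only the top-k sememe experts participate) is what keeps the product tractable and interpretable: the predicted word can be traced back to the handful of sememes that voted for it.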
Pages: 4642-4651
Page count: 10