Asymptotic properties of mixture-of-experts models

Cited: 3
Authors:
Olteanu, M. [1 ]
Rynkiewicz, J. [1 ]
Affiliations:
[1] Univ Paris 01, SAMM, EA 4543, F-75013 Paris, France
Keywords:
Mixture of experts; Likelihood ratio statistic test; Asymptotic statistic; LIKELIHOOD RATIO
DOI:
10.1016/j.neucom.2010.12.007
CLC number:
TP18 [Artificial Intelligence Theory]
Subject classification codes:
081104; 0812; 0835; 1405
Abstract:
The statistical properties of the likelihood ratio test statistic (LRTS) for mixture-of-experts models are addressed in this paper. This question is essential when estimating the number of experts in the model. Our purpose is to extend the existing results for simple mixture models (Liu and Shao, 2003 [8]) and mixtures of multilayer perceptrons (Olteanu and Rynkiewicz, 2008 [9]). We first study a simple example that embodies all the difficulties arising in such models. We find that in the most general case the LRTS diverges, but that, under additional assumptions, its behavior can be fully characterized. (C) 2011 Elsevier B.V. All rights reserved.
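As context for the abstract: the LRTS compares the maximized log-likelihoods of a smaller and a larger nested model. The minimal Python sketch below illustrates the nonstandard behavior the paper studies, using a two-component Gaussian mixture as a stand-in (not the authors' mixture-of-experts setup); the data, the unit-variance components, the restart grid, and the use of scipy are all assumptions made here for illustration. Under the null hypothesis of one component, the extra parameters are not identifiable, so the LRTS need not follow the usual chi-square limit.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=500)   # data drawn under the null: one component

# H0: single Gaussian N(mu, 1); the MLE of mu is the sample mean.
ll0 = norm.logpdf(x, loc=x.mean(), scale=1.0).sum()

# H1: two-component mixture p * N(m1, 1) + (1 - p) * N(m2, 1).
def neg_ll(theta):
    p, m1, m2 = theta
    dens = p * norm.pdf(x, m1, 1.0) + (1.0 - p) * norm.pdf(x, m2, 1.0)
    return -np.log(dens).sum()

# A few restarts: the mixture likelihood surface is multimodal.
fits = [
    minimize(neg_ll, x0=[0.5, a, b], method="L-BFGS-B",
             bounds=[(1e-3, 1.0 - 1e-3), (-5.0, 5.0), (-5.0, 5.0)])
    for a, b in [(-1.0, 1.0), (-0.5, 2.0), (0.5, -2.0)]
]
ll1 = -min(fit.fun for fit in fits)

# LRTS = 2 * (max log-lik under H1 - max log-lik under H0).
# Under H0 the triple (p, m1, m2) is not identifiable (m1 = m2 makes p
# arbitrary), so the classical chi-square limit does not apply; this
# non-identifiability is exactly what drives the paper's analysis.
lrts = 2.0 * (ll1 - ll0)
print(f"LRTS = {lrts:.3f}")
```

Repeating this over many simulated datasets and inspecting the empirical distribution of `lrts` makes the divergence phenomenon the abstract mentions visible in practice.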
Pages: 1444-1449
Page count: 6
Related Papers
10 of 50 items shown
  • [1] A Universal Approximation Theorem for Mixture-of-Experts Models
    Nguyen, Hien D.
    Lloyd-Jones, Luke R.
    McLachlan, Geoffrey J.
    [J]. NEURAL COMPUTATION, 2016, 28 (12) : 2585 - 2593
  • [2] Spatial Mixture-of-Experts
    Dryden, Nikoli
    Hoefler, Torsten
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [3] GLaM: Efficient Scaling of Language Models with Mixture-of-Experts
    Du, Nan
    Huang, Yanping
    Dai, Andrew M.
    Tong, Simon
    Lepikhin, Dmitry
    Xu, Yuanzhong
    Krikun, Maxim
    Zhou, Yanqi
    Yu, Adams Wei
    Firat, Orhan
    Zoph, Barret
    Fedus, Liam
    Bosma, Maarten
    Zhou, Zongwei
    Wang, Tao
    Wang, Yu Emma
    Webster, Kellie
    Pellat, Marie
    Robinson, Kevin
    Meier-Hellstern, Kathleen
    Duke, Toju
    Dixon, Lucas
    Zhang, Kun
    Le, Quoc V.
    Wu, Yonghui
    Chen, Zhifeng
    Cui, Claire
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [4] New estimation and feature selection methods in mixture-of-experts models
    Khalili, Abbas
    [J]. CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE, 2010, 38 (04): : 519 - 539
  • [5] Hierarchical mixture-of-experts models for count variables with excessive zeros
    Park, Myung Hyun
    Kim, Joseph H. T.
    [J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2022, 51 (12) : 4072 - 4096
  • [6] Adaptive mixture-of-experts models for data glove interface with multiple users
    Yoon, Jong-Won
    Yang, Sung-Ihk
    Cho, Sung-Bae
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2012, 39 (05) : 4898 - 4907
  • [7] Mixture-of-Experts with Expert Choice Routing
    Zhou, Yanqi
    Lei, Tao
    Liu, Hanxiao
    Du, Nan
    Huang, Yanping
    Zhao, Vincent Y.
    Dai, Andrew
    Chen, Zhifeng
    Le, Quoc
    Laudon, James
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [8] Janus: A Unified Distributed Training Framework for Sparse Mixture-of-Experts Models
    Liu, Juncai
    Wang, Jessie Hui
    Jiang, Yimin
    [J]. PROCEEDINGS OF THE 2023 ACM SIGCOMM 2023 CONFERENCE, SIGCOMM 2023, 2023, : 486 - 498
  • [9] Efficient Routing in Sparse Mixture-of-Experts
    Shamsolmoali, Pourya
    [J]. Institute of Electrical and Electronics Engineers Inc.
  • [10] MoDE: A Mixture-of-Experts Model with Mutual Distillation among the Experts
    Xie, Zhitian
    Zhang, Yinger
    Zhuang, Chenyi
    Shi, Qitao
    Liu, Zhining
    Gu, Jinjie
    Zhang, Guannan
    [J]. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 14, 2024, : 16067 - 16075