Asymptotic properties of mixture-of-experts models

被引：3

作者：

Olteanu, M. ^{[1
]}

Rynkiewicz, J. ^{[1
]}

机构：

[1] Univ Paris 01, SAMM, EA 4543, F-75013 Paris, France

来源：

NEUROCOMPUTING | 2011年 / 74卷 / 09期

关键词：

Mixture of experts; Likelihood ratio statistic test; Asymptotic statistic; LIKELIHOOD RATIO;

D O I：

10.1016/j.neucom.2010.12.007

中图分类号：

TP18 [人工智能理论];

学科分类号：

081104 ; 0812 ; 0835 ; 1405 ;

摘要：

The statistical properties of the likelihood ratio test statistic (LRTS) for mixture-of-expert models are addressed in this paper. This question is essential when estimating the number of experts in the model. Our purpose is to extend the existing results for simple mixture models (Liu and Shao, 2003 [8]) and mixtures of multilayer perceptrons (Olteanu and Rynkiewicz, 2008 [9]). In this paper we first study a simple example which embodies all the difficulties arising in such models. We find that in the most general case the LRTS diverges but, with additional assumptions, the behavior of such models can be totally explicated. (C) 2011 Elsevier B.V. All rights reserved.

引用

页码：1444 / 1449

页数：6

共 50 条

[1] A Universal Approximation Theorem for Mixture-of-Experts Models
Nguyen, Hien D.
Lloyd-Jones, Luke R.
McLachlan, Geoffrey J.
[J]. NEURAL COMPUTATION, 2016, 28 (12) : 2585 - 2593
[2] Spatial Mixture-of-Experts
Dryden, Nikoli
Hoefler, Torsten
[J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
[3] GLaM: Efficient Scaling of Language Models with Mixture-of-Experts
Du, Nan
Huang, Yanping
Dai, Andrew M.
Tong, Simon
Lepikhin, Dmitry
Xu, Yuanzhong
Krikun, Maxim
Zhou, Yanqi
Yu, Adams Wei
Firat, Orhan
Zoph, Barret
Fedus, Liam
Bosma, Maarten
Zhou, Zongwei
Wang, Tao
Wang, Yu Emma
Webster, Kellie
Pellat, Marie
Robinson, Kevin
Meier-Hellstern, Kathleen
Duke, Toju
Dixon, Lucas
Zhang, Kun
Le, Quoc V.
Wu, Yonghui
Chen, Zhifeng
Cui, Claire
[J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
[4] New estimation and feature selection methods in mixture-of-experts models
Khalili, Abbas
[J]. CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE, 2010, 38 (04): : 519 - 539
[5] Hierarchical mixture-of-experts models for count variables with excessive zeros
Park, Myung Hyun
Kim, Joseph H. T.
[J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2022, 51 (12) : 4072 - 4096
[6] Adaptive mixture-of-experts models for data glove interface with multiple users
Yoon, Jong-Won
Yang, Sung-Ihk
Cho, Sung-Bae
[J]. EXPERT SYSTEMS WITH APPLICATIONS, 2012, 39 (05) : 4898 - 4907
[7] Mixture-of-Experts with Expert Choice Routing
Zhou, Yanqi
Lei, Tao
Liu, Hanxiao
Du, Nan
Huang, Yanping
Zhao, Vincent Y.
Dai, Andrew
Chen, Zhifeng
Le, Quoc
Laudon, James
[J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
[8] Janus: A Unified Distributed Training Framework for Sparse Mixture-of-Experts Models
Liu, Juncai
Wang, Jessie Hui
Jiang, Yimin
[J]. PROCEEDINGS OF THE 2023 ACM SIGCOMM 2023 CONFERENCE, SIGCOMM 2023, 2023, : 486 - 498
[9] Efficient Routing in Sparse Mixture-of-Experts
[J]. Shamsolmoali, Pourya (pshams55@gmail.com), 1600, Institute of Electrical and Electronics Engineers Inc.
[10] MoDE: A Mixture-of-Experts Model with Mutual Distillation among the Experts
Xie, Zhitian
Zhang, Yinger
Zhuang, Chenyi
Shi, Qitao
Liu, Zhining
Gu, Jinjie
Zhang, Guannan
[J]. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 14, 2024, : 16067 - 16075

← 1 2 3 4 5 →