38 entries in total
- [1] Bayesian shrinkage in mixture-of-experts models: identifying robust determinants of class membership [J]. Advances in Data Analysis and Classification, 2019, 13 : 1019 - 1051
- [2] Asymptotic properties of mixture-of-experts models [J]. NEUROCOMPUTING, 2011, 74 (09) : 1444 - 1449
- [5] A Universal Approximation Theorem for Mixture-of-Experts Models [J]. NEURAL COMPUTATION, 2016, 28 (12) : 2585 - 2593
- [6] GLaM: Efficient Scaling of Language Models with Mixture-of-Experts [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
- [7] New estimation and feature selection methods in mixture-of-experts models [J]. CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE, 2010, 38 (04) : 519 - 539
- [10] Janus: A Unified Distributed Training Framework for Sparse Mixture-of-Experts Models [J]. PROCEEDINGS OF THE 2023 ACM SIGCOMM 2023 CONFERENCE, SIGCOMM 2023, 2023 : 486 - 498