50 items in total
- [1] GLaM: Efficient Scaling of Language Models with Mixture-of-Experts. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
- [4] A mixture-of-experts framework for adaptive Kalman filtering. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 1997, 27 (03): 452-464
- [5] Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models. PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024: 6159-6172
- [7] On the Benefits of Learning to Route in Mixture-of-Experts Models. 2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023: 9376-9396
- [9] Spatial Mixture-of-Experts. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022