Spatial Mixture-of-Experts

Cited by: 0
Authors
Dryden, Nikoli [1 ]
Hoefler, Torsten [1 ]
Affiliations
[1] Swiss Fed Inst Technol, Zurich, Switzerland
Funding
EU Horizon 2020;
Keywords
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification
081104; 0812; 0835; 1405;
Abstract
Many data have an underlying dependence on spatial location, whether weather on the Earth, a simulation on a mesh, or a registered image. Yet this structure is rarely taken advantage of, and it violates common assumptions made by many neural network layers, such as translation equivariance. Further, many works that do incorporate locality fail to capture fine-grained structure. To address this, we introduce the Spatial Mixture-of-Experts (SMOE) layer, a sparsely-gated layer that learns spatial structure in the input domain and routes experts at a fine-grained level to utilize it. We also develop new techniques for training SMOEs, including a self-supervised routing loss and damping of expert errors. Finally, we show strong results for SMOEs on numerous tasks, and set new state-of-the-art results for medium-range weather prediction and post-processing of ensemble weather forecasts.
Pages: 17
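
The abstract describes the mechanism at a high level: a gate assigns each spatial position to an expert, so different regions of the input are handled by different parameters. Below is a minimal, hypothetical PyTorch sketch of per-position top-1 expert routing. The name `SpatialMoE2d` and the input-dependent gate are illustrative assumptions; the paper's actual layer, with its learned location-based routing, self-supervised routing loss, and expert-error damping, is not reproduced here.

```python
# Minimal, hypothetical sketch of per-position ("spatial") top-1 expert
# routing, loosely following the abstract. NOT the paper's SMOE layer:
# its learned location-based routing, self-supervised routing loss, and
# expert-error damping are omitted.
import torch
import torch.nn as nn


class SpatialMoE2d(nn.Module):
    """Route each spatial position to one of `num_experts` 1x1-conv experts."""

    def __init__(self, channels: int, num_experts: int):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=1)
            for _ in range(num_experts)
        )
        # Gate produces one logit per expert at every spatial location.
        self.gate = nn.Conv2d(channels, num_experts, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W)
        logits = self.gate(x)                      # (N, E, H, W)
        top1 = logits.argmax(dim=1, keepdim=True)  # hard top-1 route per pixel
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = (top1 == e).to(x.dtype)         # (N, 1, H, W) selection mask
            out = out + expert(x) * mask           # keep expert output only where routed
        return out


if __name__ == "__main__":
    layer = SpatialMoE2d(channels=8, num_experts=4)
    y = layer(torch.randn(2, 8, 16, 16))
    print(y.shape)  # torch.Size([2, 8, 16, 16])
```

For simplicity the sketch evaluates every expert densely and masks the results; an efficient implementation would gather only the positions routed to each expert. The hard argmax gate also receives no gradient, which is precisely the kind of difficulty the paper's routing-loss techniques are designed to address.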