AutoSpeech: Neural Architecture Search for Speaker Recognition

Cited by: 13
Authors
Ding, Shaojin [1 ]
Chen, Tianlong [2 ]
Gong, Xinyu [1 ,2 ]
Zha, Weiwei [3 ]
Wang, Zhangyang [2 ]
Affiliations
[1] Texas A&M Univ, Dept Comp Sci & Engn, College Stn, TX 77843 USA
[2] Univ Texas Austin, Dept Elect & Comp Engn, Austin, TX 78712 USA
[3] Univ Sci & Technol China, Sch Software Engn, Beijing, Peoples R China
Source
Keywords
speaker recognition; neural architecture search;
DOI
10.21437/Interspeech.2020-1258
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104 ; 100213 ;
Abstract
Speaker recognition systems based on Convolutional Neural Networks (CNNs) are often built with off-the-shelf backbones such as VGG-Net or ResNet. However, these backbones were originally proposed for image classification, and therefore may not be a natural fit for speaker recognition. Due to the prohibitive complexity of manually exploring the design space, we propose the first neural architecture search approach for speaker recognition tasks, named AutoSpeech. Our algorithm first identifies the optimal operation combination in a neural cell and then derives a CNN model by stacking the neural cell multiple times. The final speaker recognition model can be obtained by training the derived CNN model through the standard scheme. To evaluate the proposed approach, we conduct experiments on both speaker identification and speaker verification tasks using the VoxCeleb1 dataset. Results demonstrate that the CNN architectures derived by the proposed approach significantly outperform current speaker recognition systems based on VGG-M, ResNet-18, and ResNet-34 backbones, while enjoying lower model complexity.
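The two steps named in the abstract (choosing an operation combination for a cell, then stacking that cell) can be illustrated with a minimal sketch. The abstract does not state the search strategy, so this sketch assumes each cell edge carries one learned score per candidate operation and keeps the top-scoring one. The operation names, edge indices, scores, and the cell count of 8 are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical candidate operations; the paper's actual search space may differ.
CANDIDATE_OPS = ["sep_conv_3x3", "dil_conv_3x3", "max_pool_3x3", "skip_connect"]


def softmax(scores):
    """Turn raw per-operation scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def derive_cell(edge_scores):
    """Step 1 (assumed form): keep the highest-weighted operation on each edge."""
    cell = {}
    for edge, scores in edge_scores.items():
        probs = softmax(scores)
        best = max(range(len(probs)), key=probs.__getitem__)
        cell[edge] = CANDIDATE_OPS[best]
    return cell


def stack_cells(cell, num_cells):
    """Step 2: build the network description by repeating the same cell."""
    return [dict(cell) for _ in range(num_cells)]


# Made-up scores for three edges (source node, target node) of one cell.
edge_scores = {
    (0, 2): [0.1, 1.3, -0.4, 0.2],
    (1, 2): [0.9, 0.0, 0.3, -1.0],
    (0, 3): [-0.2, 0.1, 0.0, 2.0],
}

cell = derive_cell(edge_scores)
network = stack_cells(cell, num_cells=8)
```

Here `network` is only a list of per-cell operation assignments; turning it into a trainable CNN and training it "through the standard scheme" is outside the scope of this sketch.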
Pages: 916 - 920
Number of pages: 5
Related papers
50 in total
  • [1] EfficientTDNN: Efficient Architecture Search for Speaker Recognition
    Wang, Rui
    Wei, Zhihua
    Duan, Haoran
    Ji, Shouling
    Long, Yang
    Hong, Zhen
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 2267 - 2279
  • [2] Efficient neural architecture search for emotion recognition
    Verma, Monu
    Mandal, Murari
    Reddy, Satish Kumar
    Meedimale, Yashwanth Reddy
    Vipparthi, Santosh Kumar
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 224
  • [3] NEURAL ARCHITECTURE SEARCH FOR SPEECH EMOTION RECOGNITION
    Wu, Xixin
    Hu, Shoukang
    Wu, Zhiyong
    Liu, Xunying
    Meng, Helen
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 6902 - 6906
  • [4] Neural Architecture Search for Lightweight Neural Network in Food Recognition
    Tan, Ren Zhang
    Chew, XinYing
    Khaw, Khai Wah
    MATHEMATICS, 2021, 9 (11)
  • [5] AutoMER: Spatiotemporal Neural Architecture Search for Microexpression Recognition
    Verma, Monu
    Reddy, M. Satish Kumar
    Meedimale, Yashwanth Reddy
    Mandal, Murari
    Vipparthi, Santosh Kumar
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (11) : 6116 - 6128
  • [6] Evolutionary Neural Architecture Search for Facial Expression Recognition
    Deng, Shuchao
    Lv, Zeqiong
    Galvan, Edgar
    Sun, Yanan
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2023, 7 (05) : 1405 - 1419
  • [7] Binarized Neural Architecture Search for Efficient Object Recognition
    Hanlin Chen
    Li’an Zhuo
    Baochang Zhang
    Xiawu Zheng
    Jianzhuang Liu
    Rongrong Ji
    David Doermann
    Guodong Guo
    International Journal of Computer Vision, 2021, 129 : 501 - 516
  • [8] Automatic Modulation Recognition Using Neural Architecture Search
    Wei, Shengyun
    Zou, Shun
    Liao, Feifan
    Lang, Weimin
    Wu, Wenhui
    2019 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE BIG DATA AND INTELLIGENT SYSTEMS (HPBD&IS), 2019, : 151 - 156
  • [9] Teacher Guided Neural Architecture Search for Face Recognition
    Wang, Xiaobo
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 2817 - 2825
  • [10] Binarized Neural Architecture Search for Efficient Object Recognition
    Chen, Hanlin
    Zhuo, Li'an
    Zhang, Baochang
    Zheng, Xiawu
    Liu, Jianzhuang
    Ji, Rongrong
    Doermann, David
    Guo, Guodong
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2021, 129 (02) : 501 - 516