ECAPA++: Fine-grained Deep Embedding Learning for TDNN Based Speaker Verification

Authors
Liu, Bei [1]
Qian, Yanmin [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, X-LANCE Lab, MoE Key Lab Artificial Intelligence, AI Inst, Shanghai, Peoples R China
Source
INTERSPEECH 2023
Keywords
speaker verification; time-delay neural network; ECAPA; ResNet; system fusion; recognition
DOI
10.21437/Interspeech.2023-777
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
In this paper, we aim to bridge the performance gap between TDNN and 2D CNN based speaker verification systems. Specifically, three types of architectural enhancements to ECAPA-TDNN are proposed: 1) following a depth-first design to significantly increase network depth while maintaining its complexity; 2) introducing recursive convolution to better capture fine-grained speaker information; 3) proposing a pyramid-based multi-path feature enhancement module to yield more discriminative speaker representations. Experiments on VoxCeleb show that our final model, named ECAPA++, achieves 25%, 23% and 24% relative improvements on Vox1-O, E and H respectively, with 2.4x fewer parameters and 2.3x fewer FLOPs than the previous best TDNN-based system. Meanwhile, it is comparable to the state-of-the-art ResNet-based systems with higher computational efficiency. In addition, further performance gains can be achieved by fusing ECAPA++ and ResNet-based systems.
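As a reading aid for the "relative improvement" figures in the abstract, the following is a minimal sketch of how such a percentage is commonly computed. It assumes the improvement is a relative reduction of a lower-is-better error metric such as equal error rate; the abstract does not state which metric is used. The function name and the numeric values are hypothetical and are not results from the paper.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Relative reduction of a lower-is-better error metric.

    Assumption: improvement = (baseline - new) / baseline.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - new) / baseline


if __name__ == "__main__":
    # Hypothetical error rates chosen for illustration only,
    # NOT values reported in the paper.
    baseline_error = 1.00
    new_error = 0.75
    print(f"{relative_improvement(baseline_error, new_error):.0%}")
```

Under this reading, a 25% relative improvement means the new system's error is three quarters of the baseline's, whatever the absolute values are.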
Pages: 3132-3136
Page count: 5