Exploring a Unified Attention-Based Pooling Framework for Speaker Verification

Cited by: 0
Authors
Liu, Yi [1]
He, Liang [1]
Liu, Weiwei [2]
Liu, Jia [1]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Tsinghua Natl Lab Informat Sci & Technol, Beijing 100084, Peoples R China
[2] Chinese Peoples Liberat Army, 62315 Unit, Beijing 100842, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
speaker verification; speaker embedding; attention mechanism; multi-head attention;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
The pooling layer is an essential component of neural-network-based speaker verification. Most current speaker verification networks use average pooling to derive utterance-level speaker representations. Average pooling treats every frame as equally important, which is suboptimal because speaker-discriminative power varies across speech segments. In this paper, we present a unified attention-based pooling framework and combine it with multi-head attention. Experiments on the Fisher and NIST SRE 2010 datasets show that using outputs from lower layers to compute the attention weights can outperform average pooling and achieve better results than the vanilla attention method. Multi-head attention further improves performance.
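The contrast the abstract draws between average pooling and attention-based pooling can be illustrated with a minimal sketch. This is not the authors' implementation: the dot-product scoring function, the `query` vectors, and the concatenation-based multi-head variant are all assumptions made for illustration, and the function names are hypothetical. The only ideas taken from the abstract are that average pooling weights every frame equally, that attention pooling weights frames unequally, and that the features used to compute the weights (`keys`) may come from a lower layer than the features being pooled (`values`).

```python
import math


def average_pooling(frames):
    """Mean over time: every frame receives the same weight 1/T."""
    num_frames = len(frames)
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / num_frames for d in range(dim)]


def softmax(scores):
    """Numerically stable softmax over a list of scalar scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pooling(values, keys, query):
    """Weighted mean over time.

    values: frame-level features to be pooled (T x D_v).
    keys:   features used to score each frame (T x D_k); these may be
            taken from a lower layer than `values`.
    query:  hypothetical learned vector (D_k); dot-product scoring is
            an assumption, not the paper's stated scoring function.
    """
    scores = [sum(k_i * q_i for k_i, q_i in zip(k, query)) for k in keys]
    weights = softmax(scores)
    dim = len(values[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return pooled, weights


def multi_head_attention_pooling(values, keys, queries):
    """One simple multi-head variant (assumption): each head has its own
    query, and the per-head pooled vectors are concatenated."""
    pooled_all = []
    for query in queries:
        pooled, _ = attention_pooling(values, keys, query)
        pooled_all.extend(pooled)
    return pooled_all
```

With an all-zero query every frame gets the same score, so the attention weights become uniform and `attention_pooling` reduces to `average_pooling`; in that sense average pooling is a special case of the weighted form.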
Pages: 200-204
Number of pages: 5