Speaker Verification Employing Combinations of Self-Attention Mechanisms

Cited by: 5
Authors
Bae, Ara [1 ]
Kim, Wooil [1 ]
Affiliation
[1] Incheon Natl Univ, Dept Comp Sci & Engn, Incheon 22012, South Korea
Keywords
speaker verification; self-attention; attention combinations;
DOI
10.3390/electronics9122201
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
One of the most recent speaker recognition methods that demonstrates outstanding performance in noisy environments extracts the speaker embedding using an attention mechanism instead of average or statistics pooling. In the attention approach, speaker recognition performance is further improved by employing multiple heads rather than a single head. In this paper, we propose advanced methods that extract a new embedding by compensating for the disadvantages of the single-head and multi-head attention methods. The combination of single-head and split-based multi-head attention achieves a 5.39% Equal Error Rate (EER). When the single-head and projection-based multi-head attention methods are combined, the system achieves a 4.45% EER, the best performance in this work. Our experimental results demonstrate that the attention mechanism reflects the speaker's properties more effectively than average or statistics pooling, and that the speaker verification system can be further improved by combining different attention techniques.
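The abstract contrasts average pooling with single-head attentive pooling and a split-based multi-head variant, and proposes combining their outputs. The following NumPy sketch illustrates the idea under stated assumptions: the shapes, the tanh scoring function, and the simple concatenation of embeddings are illustrative choices, not the authors' exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def average_pooling(frames):
    """Baseline: (T, D) frame-level features -> (D,) utterance embedding."""
    return frames.mean(axis=0)

def attentive_pooling(frames, W, v):
    """Single-head attention pooling: score each frame, take weighted mean."""
    scores = np.tanh(frames @ W) @ v      # (T,) one relevance score per frame
    alpha = softmax(scores)               # attention weights, sum to 1
    return alpha @ frames                 # (D,) weighted utterance embedding

def split_multihead_pooling(frames, Ws, vs):
    """Split-based multi-head: split the feature dimension across heads,
    pool each chunk with its own attention head, concatenate the results."""
    chunks = np.split(frames, len(Ws), axis=1)
    return np.concatenate([attentive_pooling(c, W, v)
                           for c, W, v in zip(chunks, Ws, vs)])

T, D, H, n_heads = 50, 8, 4, 2            # frames, feature dim, hidden dim, heads
frames = rng.standard_normal((T, D))      # stand-in for frame-level features

# Single-head parameters (hypothetical sizes)
W1 = rng.standard_normal((D, H))
v1 = rng.standard_normal(H)

# Split-based multi-head parameters: each head sees D // n_heads features
Ws = [rng.standard_normal((D // n_heads, H)) for _ in range(n_heads)]
vs = [rng.standard_normal(H) for _ in range(n_heads)]

emb_single = attentive_pooling(frames, W1, v1)        # (D,)
emb_multi = split_multihead_pooling(frames, Ws, vs)   # (D,)
# "Combination" here is modeled as concatenating the two embeddings.
emb_combined = np.concatenate([emb_single, emb_multi])  # (2D,)
```

In a real system the pooled embedding would feed a speaker classifier or a scoring back-end; here the point is only that each pooling variant maps variable-length frame features to a fixed-size vector, so their outputs can be combined.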
Pages: 1-11
Page count: 11
Related Papers
50 items total
  • [31] A Static Sign Language Recognition Method Enhanced with Self-Attention Mechanisms
    Wang, Yongxin
    Jiang, He
    Sun, Yutong
    Xu, Longqi
    SENSORS, 2024, 24 (21)
  • [32] TRILINGUAL SEMANTIC EMBEDDINGS OF VISUALLY GROUNDED SPEECH WITH SELF-ATTENTION MECHANISMS
    Ohishi, Yasunori
    Kimura, Akisato
    Kawanishi, Takahito
    Kashino, Kunio
    Harwath, David
    Glass, James
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 4352 - 4356
  • [33] COMPARISON OF LOW COMPLEXITY SELF-ATTENTION MECHANISMS FOR ACOUSTIC EVENT DETECTION
    Komatsu, Tatsuya
    Scheibler, Robin
    2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2021, : 1139 - 1143
  • [34] Speaker-Utterance Dual Attention for Speaker and Utterance Verification
    Liu, Tianchi
    Das, Rohan Kumar
    Madhavi, Maulik
    Shen, Shengmei
    Li, Haizhou
    INTERSPEECH 2020, 2020, : 4293 - 4297
  • [35] Design Resources Recommendation Based on Word Vectors and Self-Attention Mechanisms
    Sun Q.
    Deng C.
    Gu Z.
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2024, 36 (01): : 63 - 72
  • [36] Pruning Self-Attention for Zero-Shot Multi-Speaker Text-to-Speech
    Yoon, Hyungchan
    Kim, Changhwan
    Song, Eunwoo
    Yoon, Hyun-Wook
    Kang, Hong-Goo
    INTERSPEECH 2023, 2023, : 4299 - 4303
  • [37] Self-Attention for Cyberbullying Detection
    Pradhan, Ankit
    Yatam, Venu Madhav
    Bera, Padmalochan
    2020 INTERNATIONAL CONFERENCE ON CYBER SITUATIONAL AWARENESS, DATA ANALYTICS AND ASSESSMENT (CYBER SA 2020), 2020,
  • [38] On the Integration of Self-Attention and Convolution
    Pan, Xuran
    Ge, Chunjiang
    Lu, Rui
    Song, Shiji
    Chen, Guanfu
    Huang, Zeyi
    Huang, Gao
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 805 - 815
  • [39] On The Computational Complexity of Self-Attention
    Keles, Feyza Duman
    Wijewardena, Pruthuvi Mahesakya
    Hegde, Chinmay
    INTERNATIONAL CONFERENCE ON ALGORITHMIC LEARNING THEORY, VOL 201, 2023, 201 : 597 - 619
  • [40] The Lipschitz Constant of Self-Attention
    Kim, Hyunjik
    Papamakarios, George
    Mnih, Andriy
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139