Speaker Verification Employing Combinations of Self-Attention Mechanisms

Cited by: 5
Authors
Bae, Ara [1 ]
Kim, Wooil [1 ]
Affiliation
[1] Incheon Natl Univ, Dept Comp Sci & Engn, Incheon 22012, South Korea
Keywords
speaker verification; self-attention; attention combinations;
DOI
10.3390/electronics9122201
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
One of the most recent speaker recognition methods that demonstrates outstanding performance in noisy environments extracts the speaker embedding using an attention mechanism instead of average or statistics pooling. In the attention approach, speaker recognition performance is further improved by employing multiple heads rather than a single head. In this paper, we propose advanced methods that extract a new embedding by compensating for the disadvantages of the single-head and multi-head attention methods. The combination of single-head and split-based multi-head attention achieves a 5.39% Equal Error Rate (EER). When the single-head and projection-based multi-head attention methods are combined, the system achieves a 4.45% EER, the best performance in this work. Our experimental results demonstrate that the attention mechanism reflects the speaker's properties more effectively than average or statistics pooling, and that the speaker verification system can be further improved by combining different attention techniques.
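The abstract contrasts average pooling with single-head attentive pooling and a split-based multi-head variant, and proposes combining their outputs. The following NumPy sketch illustrates the idea under stated assumptions: the shapes, the tanh scoring function, and the simple concatenation of embeddings are illustrative choices, not the authors' exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def average_pooling(frames):
    """Baseline: (T, D) frame-level features -> (D,) utterance embedding."""
    return frames.mean(axis=0)

def attentive_pooling(frames, W, v):
    """Single-head attention pooling: score each frame, take weighted mean."""
    scores = np.tanh(frames @ W) @ v      # (T,) one relevance score per frame
    alpha = softmax(scores)               # attention weights, sum to 1
    return alpha @ frames                 # (D,) weighted utterance embedding

def split_multihead_pooling(frames, Ws, vs):
    """Split-based multi-head: split the feature dimension across heads,
    pool each chunk with its own attention head, concatenate the results."""
    chunks = np.split(frames, len(Ws), axis=1)
    return np.concatenate([attentive_pooling(c, W, v)
                           for c, W, v in zip(chunks, Ws, vs)])

T, D, H, n_heads = 50, 8, 4, 2            # frames, feature dim, hidden dim, heads
frames = rng.standard_normal((T, D))      # stand-in for frame-level features

# Single-head parameters (hypothetical sizes)
W1 = rng.standard_normal((D, H))
v1 = rng.standard_normal(H)

# Split-based multi-head parameters: each head sees D // n_heads features
Ws = [rng.standard_normal((D // n_heads, H)) for _ in range(n_heads)]
vs = [rng.standard_normal(H) for _ in range(n_heads)]

emb_single = attentive_pooling(frames, W1, v1)        # (D,)
emb_multi = split_multihead_pooling(frames, Ws, vs)   # (D,)
# "Combination" here is modeled as concatenating the two embeddings.
emb_combined = np.concatenate([emb_single, emb_multi])  # (2D,)
```

In a real system the pooled embedding would feed a speaker classifier or a scoring back-end; here the point is only that each pooling variant maps variable-length frame features to a fixed-size vector, so their outputs can be combined.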
Pages: 1-11
Page count: 11
Related Papers
50 items total
  • [31] A Static Sign Language Recognition Method Enhanced with Self-Attention Mechanisms
    Wang, Yongxin
    Jiang, He
    Sun, Yutong
    Xu, Longqi
    SENSORS, 2024, 24 (21)
  • [32] TRILINGUAL SEMANTIC EMBEDDINGS OF VISUALLY GROUNDED SPEECH WITH SELF-ATTENTION MECHANISMS
    Ohishi, Yasunori
    Kimura, Akisato
    Kawanishi, Takahito
    Kashino, Kunio
    Harwath, David
    Glass, James
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 4352 - 4356
  • [33] COMPARISON OF LOW COMPLEXITY SELF-ATTENTION MECHANISMS FOR ACOUSTIC EVENT DETECTION
    Komatsu, Tatsuya
    Scheibler, Robin
    2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2021, : 1139 - 1143
  • [34] Speaker-Utterance Dual Attention for Speaker and Utterance Verification
    Liu, Tianchi
    Das, Rohan Kumar
    Madhavi, Maulik
    Shen, Shengmei
    Li, Haizhou
    INTERSPEECH 2020, 2020, : 4293 - 4297
  • [35] Design Resources Recommendation Based on Word Vectors and Self-Attention Mechanisms
    Sun Q.
    Deng C.
    Gu Z.
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2024, 36 (01): : 63 - 72
  • [36] Pruning Self-Attention for Zero-Shot Multi-Speaker Text-to-Speech
    Yoon, Hyungchan
    Kim, Changhwan
    Song, Eunwoo
    Yoon, Hyun-Wook
    Kang, Hong-Goo
    INTERSPEECH 2023, 2023, : 4299 - 4303
  • [37] Self-Attention for Cyberbullying Detection
    Pradhan, Ankit
    Yatam, Venu Madhav
    Bera, Padmalochan
    2020 INTERNATIONAL CONFERENCE ON CYBER SITUATIONAL AWARENESS, DATA ANALYTICS AND ASSESSMENT (CYBER SA 2020), 2020,
  • [38] On the Integration of Self-Attention and Convolution
    Pan, Xuran
    Ge, Chunjiang
    Lu, Rui
    Song, Shiji
    Chen, Guanfu
    Huang, Zeyi
    Huang, Gao
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 805 - 815
  • [39] On The Computational Complexity of Self-Attention
    Keles, Feyza Duman
    Wijewardena, Pruthuvi Mahesakya
    Hegde, Chinmay
    INTERNATIONAL CONFERENCE ON ALGORITHMIC LEARNING THEORY, VOL 201, 2023, 201 : 597 - 619
  • [40] The Lipschitz Constant of Self-Attention
    Kim, Hyunjik
    Papamakarios, George
    Mnih, Andriy
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139