Self-Attention Encoding and Pooling for Speaker Recognition

Cited by: 35
Authors
Safari, Pooyan [1 ]
India, Miquel [1 ]
Hernando, Javier [1 ]
Affiliations
[1] Univ Politecn Cataluna, TALP Res Ctr, Barcelona, Spain
Source
INTERSPEECH 2020
Keywords
Self-Attention Encoding; Self-Attention Pooling; Speaker Verification; Speaker Embedding;
DOI
10.21437/Interspeech.2020-1446
CLC numbers
R36 [Pathology]; R76 [Otorhinolaryngology];
Subject classification codes
100104; 100213
Abstract
The computing power of mobile devices limits end-user applications in terms of storage size, processing, memory and energy consumption. These limitations motivate researchers to design more efficient deep models. On the other hand, self-attention networks based on the Transformer architecture have attracted remarkable interest due to their high parallelization capability and strong performance on a variety of Natural Language Processing (NLP) tasks. Inspired by the Transformer, we propose a tandem Self-Attention Encoding and Pooling (SAEP) mechanism to obtain a discriminative speaker embedding from variable-length speech utterances. SAEP is a stack of identical blocks that rely solely on self-attention and position-wise feed-forward networks to create a vector representation of speakers. This approach encodes short-term speaker spectral features into speaker embeddings to be used in text-independent speaker verification. We have evaluated this approach on both the VoxCeleb1 and VoxCeleb2 datasets. The proposed architecture outperforms the baseline x-vector and shows competitive performance against other convolution-based benchmarks, with a significant reduction in model size. It employs 94%, 95%, and 73% fewer parameters than ResNet-34, ResNet-50, and x-vector, respectively. This indicates that the proposed fully attention-based architecture is more efficient at extracting time-invariant features from speaker utterances.
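A minimal PyTorch sketch of the tandem encoding-and-pooling idea described in the abstract is given below: a stack of identical self-attention plus position-wise feed-forward blocks, followed by attention-weighted pooling over frames to produce a single speaker embedding from a variable-length utterance. All hyper-parameters (feature dimension, model width, number of heads and blocks) and module names are illustrative assumptions, not the configuration reported in the paper.

import torch
import torch.nn as nn

class SAEPBlock(nn.Module):
    # One encoder block: multi-head self-attention + position-wise feed-forward,
    # each with a residual connection and layer normalization.
    # (Hyper-parameters here are illustrative, not the paper's.)
    def __init__(self, d_model=256, n_heads=4, d_ff=512, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.drop = nn.Dropout(dropout)

    def forward(self, x):                          # x: (batch, frames, d_model)
        a, _ = self.attn(x, x, x)
        x = self.norm1(x + self.drop(a))
        x = self.norm2(x + self.drop(self.ff(x)))
        return x

class SelfAttentionPooling(nn.Module):
    # Collapse a variable number of frames into one vector using learned
    # attention weights over time.
    def __init__(self, d_model=256):
        super().__init__()
        self.score = nn.Linear(d_model, 1)

    def forward(self, x):                          # x: (batch, frames, d_model)
        w = torch.softmax(self.score(x), dim=1)    # (batch, frames, 1)
        return (w * x).sum(dim=1)                  # (batch, d_model) speaker embedding

class SAEP(nn.Module):
    # Tandem encoder + pooling: project short-term spectral features,
    # run them through the self-attention blocks, then pool over frames.
    def __init__(self, feat_dim=40, d_model=256, n_blocks=2):
        super().__init__()
        self.proj = nn.Linear(feat_dim, d_model)
        self.blocks = nn.Sequential(*[SAEPBlock(d_model) for _ in range(n_blocks)])
        self.pool = SelfAttentionPooling(d_model)

    def forward(self, feats):                      # feats: (batch, frames, feat_dim)
        return self.pool(self.blocks(self.proj(feats)))

# Usage: embedding = SAEP()(torch.randn(8, 300, 40))  -> shape (8, 256)

Because the pooling step is a weighted sum over however many frames the utterance contains, the resulting embedding has a fixed size regardless of utterance length, which is what allows text-independent verification on variable-length speech.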
Pages: 941-945
Number of pages: 5
Related Papers
50 items in total
  • [41] UniFormer: Unifying Convolution and Self-Attention for Visual Recognition
    Li, Kunchang
    Wang, Yali
    Zhang, Junhao
    Gao, Peng
    Song, Guanglu
    Liu, Yu
    Li, Hongsheng
    Qiao, Yu
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (10) : 12581 - 12600
  • [42] Self Multi-Head Attention for Speaker Recognition
    India, Miquel
    Safari, Pooyan
    Hernando, Javier
    INTERSPEECH 2019, 2019, : 4305 - 4309
  • [43] GCNSA: DNA storage encoding with a graph convolutional network and self-attention
    Cao, Ben
    Wang, Bin
    Zhang, Qiang
    ISCIENCE, 2023, 26 (03)
  • [44] Progressive Self-Attention Network with Unsymmetrical Positional Encoding for Sequential Recommendation
    Zhu, Yuehua
    Huang, Bo
    Jiang, Shaohua
    Yang, Muli
    Yang, Yanhua
    Zhong, Wenliang
    PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22), 2022, : 2029 - 2033
  • [45] Phrase-level Self-Attention Networks for Universal Sentence Encoding
    Wu, Wei
    Wang, Houfeng
    Liu, Tianyu
    Ma, Shuming
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 3729 - 3738
  • [46] Video person re-identification with global statistic pooling and self-attention distillation
    Lin, Gaojie
    Zhao, Sanyuan
    Shen, Jianbing
    NEUROCOMPUTING, 2021, 453 : 777 - 789
  • [47] An Aerial Target Recognition Algorithm Based on Self-Attention and LSTM
    Liang, Futai
    Chen, Xin
    He, Song
    Song, Zihao
    Lu, Hao
    CMC-COMPUTERS MATERIALS & CONTINUA, 2024, 81 (01): : 1101 - 1121
  • [48] Pedestrian Attribute Recognition Based on Dual Self-attention Mechanism
    Fan, Zhongkui
    Guan, Ye-peng
    COMPUTER SCIENCE AND INFORMATION SYSTEMS, 2023, 20 (02) : 793 - 812
  • [49] Using Self-Attention LSTMs to Enhance Observations in Goal Recognition
    Amado, Leonardo
    Licks, Gabriel Paludo
    Marcon, Matheus
    Pereira, Ramon Fraga
    Meneguzzi, Felipe
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [50] Neural Named Entity Recognition Using a Self-Attention Mechanism
    Zukov-Gregoric, Andrej
    Bachrach, Yoram
    Minkovsky, Pasha
    Coope, Sam
    Maksak, Bogdan
    2017 IEEE 29TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2017), 2017, : 652 - 656