Speaker verification using attentive multi-scale convolutional recurrent network

Cited by: 5
Authors
Li, Yanxiong [1 ,2 ]
Jiang, Zhongjie [1 ]
Cao, Wenchang [1 ]
Huang, Qisheng [1 ]
Affiliations
[1] South China Univ Technol, Sch Elect & Informat Engn, Guangzhou, Peoples R China
[2] South China Univ Technol, Sch Elect & Informat Engn, Room 303,Qingqing Arts & Sci Bldg Bldg 37,381 Wush, Guangzhou, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Speaker embedding; Speaker verification; Attentive mechanism; Multi-scale convolutional recurrent network; Dilated convolution; NEURAL-NETWORK; RESIDUAL NETWORKS; RECOGNITION; SINGLE
DOI
10.1016/j.asoc.2022.109291
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a speaker verification method based on an Attentive Multi-scale Convolutional Recurrent Network (AMCRN). The proposed AMCRN captures both local spatial information and global sequential information from the input speech recordings. In the proposed method, the log Mel spectrum is extracted from each speech recording and fed to the AMCRN to learn a speaker embedding. In the testing stage, the learned speaker embedding is fed to a back-end classifier (such as a cosine similarity metric) for scoring. The proposed method is compared with state-of-the-art speaker verification methods on three public datasets selected from two large-scale speech corpora (VoxCeleb1 and VoxCeleb2). Experimental results show that our method outperforms the baseline methods in terms of equal error rate and minimum detection cost function, and has advantages over most of the baseline methods in terms of computational complexity and memory requirement. In addition, our method generalizes well across truncated speech segments of different durations, and the speaker embedding learned by the AMCRN generalizes better across two back-end classifiers. (C) 2022 Elsevier B.V. All rights reserved.
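The scoring and evaluation steps mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `cosine_score` and `equal_error_rate` are hypothetical, the embeddings are assumed to be plain vectors already produced by some embedding network, and the equal error rate is approximated by a simple threshold sweep over the observed scores.

```python
import numpy as np

def cosine_score(enroll, test):
    """Cosine similarity between an enrollment and a test embedding (hypothetical helper)."""
    enroll = np.asarray(enroll, dtype=float)
    test = np.asarray(test, dtype=float)
    return float(enroll @ test / (np.linalg.norm(enroll) * np.linalg.norm(test)))

def equal_error_rate(scores, labels):
    """Approximate EER: the operating point where false-accept and false-reject rates are closest.

    scores: similarity score per trial; labels: True for same-speaker (target) trials.
    Assumes at least one target and one non-target trial.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best_gap, eer = np.inf, 1.0
    # Sweep every observed score as a candidate decision threshold.
    for threshold in np.unique(scores):
        accept = scores >= threshold
        far = np.mean(accept[~labels])   # non-target trials wrongly accepted
        frr = np.mean(~accept[labels])   # target trials wrongly rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return float(eer)
```

In use, each trial pairs an enrollment embedding with a test embedding, `cosine_score` gives the trial score, and `equal_error_rate` summarizes all trial scores against their ground-truth labels. A finer estimate would interpolate between thresholds; the sweep above is kept deliberately simple.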
Pages: 11