Neural Acoustic-Phonetic Approach for Speaker Verification With Phonetic Attention Mask

Cited by: 7
Authors
Liu, Tianchi [1 ,2 ]
Das, Rohan Kumar [3 ]
Lee, Kong Aik [1 ]
Li, Haizhou [2 ]
Affiliations
[1] ASTAR, Inst Infocomm Res, Singapore 138632, Singapore
[2] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore 117583, Singapore
[3] Fortemedia, Singapore 138589, Singapore
Funding
National Research Foundation, Singapore;
Keywords
Phonetics; Training; Mel frequency cepstral coefficient; Generators; Speech recognition; Task analysis; Databases; Speaker verification; text-dependent; attention; masking; phonetic information; prompted digit recognition; RECOGNITION;
DOI
10.1109/LSP.2022.3143036
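Chinese Library Classification number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification code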
0808 ; 0809 ;
Abstract
The traditional acoustic-phonetic approach makes use of both spectral and phonetic information when comparing the voices of speakers. While phonetic units are not equally informative, the phonetic context of speech plays an important role in speaker verification (SV). In this paper, we propose a neural acoustic-phonetic approach that learns to dynamically assign differentiated weights to spectral features for SV. Such differentiated weights form a phonetic attention mask (PAM). The neural acoustic-phonetic framework consists of two training pipelines, one for SV and another for speech recognition. Through the PAM, we leverage the phonetic information for SV. We evaluate the proposed neural acoustic-phonetic framework on the RSR2015 database Part III corpus, which consists of random digit strings. We show that the proposed framework with PAM consistently outperforms the baseline, with equal error rate reductions of 13.45% and 10.20% for female and male data, respectively.
Pages: 782-786
Page count: 5
Related papers
50 items in total
  • [21] ACOUSTIC-PHONETIC ISSUES IN SPEECH-PERCEPTION
    SAMUEL, AG
    TARTTER, VC
    ANNUAL REVIEW OF ANTHROPOLOGY, 1986, 15 : 247 - 273
  • [22] Acoustic-phonetic features for the automatic classification of fricatives
    Abdelatty Ali, Ahmed M.
    Van Der Spiegel, Jan
    Mueller, Paul
    1600, Acoustical Society of America (109):
  • [23] Acoustic-phonetic features for the automatic classification of fricatives
    Ali, AMA
    Van der Spiegel, J
    Mueller, P
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2001, 109 (05): : 2217 - 2235
  • [24] A fuzzy acoustic-phonetic decoder for speech recognition
    Oppizzi, O
    Fournier, D
    Gilles, P
    Meloni, H
    ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 2270 - 2273
  • [25] SYSTEM FOR ACOUSTIC-PHONETIC ANALYSIS OF CONTINUOUS SPEECH
    WEINSTEIN, CJ
    MCCANDLESS, SS
    MONDSHEIN, LF
    ZUE, VW
    IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1975, AS23 (01): : 54 - 67
  • [26] Acoustic-Phonetic Analysis for Speech Recognition: A Review
    Sarma, Biswajit Dev
    Prasanna, S. R. Mahadeva
    IETE TECHNICAL REVIEW, 2018, 35 (03) : 305 - 327
  • [27] Introducing phonetic information to speaker embedding for speaker verification
    Yi Liu
    Liang He
    Jia Liu
    Michael T. Johnson
    EURASIP Journal on Audio, Speech, and Music Processing, 2019
  • [28] Introducing phonetic information to speaker embedding for speaker verification
    Liu, Yi
    He, Liang
    Liu, Jia
    Johnson, Michael T.
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2019, 2019 (01)
  • [29] Acoustic-Phonetic Features for Refining the Explicit Speech Segmentation
    Selmini, Antonio Marcos
    Violaro, Fabio
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1853 - 1856
  • [30] ACOUSTIC-PHONETIC EXPERIMENT FACILITY FOR STUDY OF CONTINUOUS SPEECH
    SCHWARTZ, RM
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1975, 58 : S105 - S105