Articulatory movement features for short-duration text-dependent speaker verification

Cited by: 5
Authors
Zhang Y. [2]
Long Y. [2]
Shen X. [1]
Wei H. [2]
Yang M. [2]
Ye H. [2]
Mao H. [2]
Affiliations
[1] College of Humanities and Communications, Shanghai Normal University, Shanghai
[2] Department of Electronical and Information Engineering, Shanghai Normal University, Shanghai
Keywords
Articulatory movement features; Dynamic time warping; Speaker verification; Text-dependent
DOI
10.1007/s10772-017-9447-8
Abstract
During speech production, the position and movement of articulators such as the tongue, jaw, and lips are mainly captured by articulatory movement features (AMFs). This paper investigates the use of AMFs for short-duration text-dependent speaker verification. AMFs directly characterize the relative motion trajectories of an individual speaker's articulators and are rarely affected by the external environment. We therefore expect AMFs to characterize differences in speaker identity better than traditional acoustic features such as mel-frequency cepstral coefficients (MFCCs). Speaker similarity scores computed with the dynamic time warping (DTW) algorithm are used to make the verification decisions. Experimental results show that AMFs bring significant performance gains over traditional MFCC features on the short-duration text-dependent speaker verification task. © 2017, Springer Science+Business Media, LLC.
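As an illustration of the DTW-based scoring mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the local distance, the path constraints, the score normalization, or the decision threshold. The Euclidean frame distance, the division by the summed sequence lengths, and the names `dtw_distance` and `verify` are all assumptions made here for illustration.

```python
import math


def dtw_distance(a, b):
    """Length-normalized DTW distance between two feature sequences.

    `a` and `b` are lists of frames; each frame is a list of floats of the
    same dimension (for example, one AMF or MFCC vector per frame).
    Assumptions: Euclidean local distance, unconstrained warping path,
    normalization by len(a) + len(b).
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")

    # cost[i][j] is the cheapest accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b frame is repeated
                cost[i][j - 1],      # b advances, a frame is repeated
                cost[i - 1][j - 1],  # both advance
            )

    return cost[n][m] / (n + m)


def verify(enrolled, test, threshold):
    """Accept the claimed identity when the DTW distance is small enough.

    `threshold` is a hypothetical tuning parameter; in practice it would be
    chosen on held-out data.
    """
    return dtw_distance(enrolled, test) <= threshold
```

In this sketch a smaller distance means the test utterance follows the enrolled utterance of the same phrase more closely, so the decision rule accepts when the distance falls at or below the threshold.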
Pages: 753-759
Number of pages: 6