End Point Detection Using Speech-Specific Knowledge for Text-Dependent Speaker Verification

被引:2
|
作者
Bhukya, Ramesh K. [1 ]
Sarma, Biswajit Dev [1 ]
Prasanna, S. R. Mahadeva [1 ]
机构
[1] Indian Inst Technol Guwahati, Electro Med & Speech Technol Lab, Dept Elect & Elect Engn, Gauhati 781039, India
关键词
End point detection; Vowel-like region; Glottal activity; Dominant resonant frequency; Foreground speech segmentation; Speech duration knowledge; RECOGNITION; VOWEL; ALGORITHM; SPECTRUM;
D O I
10.1007/s00034-018-0827-3
中图分类号
TM [电工技术]; TN [电子技术、通信技术];
学科分类号
0808 ; 0809 ;
摘要
This paper proposes a method using speech-specific knowledge to detect the begin and end points of speech under degraded condition. The method is based on vowel-like region (VLR) detection and uses both excitation source and vocal tract system information. Existing method for VLR detection uses excitation source information. Vocal tract system information from dominant resonant frequency is used to eliminate spurious VLRs in background noise. Foreground speech segmentation using excitation and vocal tract system information is carried out to remove spurious VLRs in the background speech region. Better localization of the end points is done using more detailed information about excitation source in terms of glottal activity to detect the sonorant consonants and missed VLRs. To include an unvoiced consonant, obstruent region detection is done at the beginning of the first VLR and at the end of last VLR. Detected begin and end points are evaluated by comparing with manually marked end points as well as by conducting the text-dependent speaker verification experiments. The proposed method performs better than some of the existing techniques.
引用
收藏
页码:5507 / 5539
页数:33
相关论文
共 50 条
  • [1] End Point Detection Using Speech-Specific Knowledge for Text-Dependent Speaker Verification
    Ramesh K. Bhukya
    Biswajit Dev Sarma
    S. R. Mahadeva Prasanna
    [J]. Circuits, Systems, and Signal Processing, 2018, 37 : 5507 - 5539
  • [2] Addressing Text-Dependent Speaker Verification Using Singing Speech
    Shi, Yan
    Zhou, Juanjuan
    Long, Yanhua
    Li, Yijie
    Mao, Hongwei
    [J]. APPLIED SCIENCES-BASEL, 2019, 9 (13):
  • [3] End-to-End Text-Dependent Speaker Verification
    Heigold, Georg
    Moreno, Ignacio
    Bengio, Samy
    Shazeer, Noam
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 5115 - 5119
  • [4] Towards Goat Detection in Text-Dependent Speaker Verification
    Toledo-Ronen, Orith
    Aronowitz, Hagai
    Hoory, Ron
    Pelecanos, Jason
    Nahamoo, David
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 16 - +
  • [5] END-TO-END ATTENTION BASED TEXT-DEPENDENT SPEAKER VERIFICATION
    Zhang, Shi-Xiong
    Chen, Zhuo
    Zhao, Yong
    Li, Jinyu
    Gong, Yifan
    [J]. 2016 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2016), 2016, : 171 - 178
  • [6] End-to-end text-dependent speaker verification using novel distance measures
    Dey, Subhadeep
    Madikeri, Srikanth
    Motlicek, Petr
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3598 - 3602
  • [7] Using Phoneme Recognition and Text-dependent Speaker Verification to Improve Speaker Segmentation for Chinese Speech
    Wang, Gang
    Wu, Xiaojun
    Zheng, Thomas Fang
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 1457 - 1460
  • [8] Text-dependent speaker verification system
    Qin, Bing
    Chen, Huipeng
    Li, Guangqi
    Liu, Songbo
    [J]. Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology, 2000, 32 (04): : 16 - 18
  • [9] Speaker-dependent Dictionary-based Speech Enhancement for Text-Dependent Speaker Verification
    Thomsen, Nicolai Baek
    Thomsen, Dennis Alexander Lehmann
    Tan, Zheng-Hua
    Lindberg, Borge
    Jensen, Soren Holdt
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 1839 - 1843
  • [10] Text-dependent speaker recognition using speaker specific compensation
    Laxman, S
    Sastry, PS
    [J]. IEEE TENCON 2003: CONFERENCE ON CONVERGENT TECHNOLOGIES FOR THE ASIA-PACIFIC REGION, VOLS 1-4, 2003, : 384 - 387