ENABLING IMPROVED SPEAKER RECOGNITION BY VOICE QUALITY ESTIMATION

Authors
Bartos, Anthony L. [1]
Nelson, Douglas J. [2]
Affiliations
[1] Assurance Technol Corp, Chantilly, VA 20151 USA
[2] Natl Secur Agcy, Ft George G Meade, MD 20755 USA
Keywords
SAD; Speech Activity Detection; VAD; Voice Activity Detection; SID; Speaker ID; LID; Language ID; EER; Equal Error Rate; VQE; Voice Quality Estimate; SNR; Signal to Noise Ratio
DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Subject classification codes
0808; 0809
Abstract
Presented is a method to mitigate noise and interference in automated speaker identification (SID). The process uses the MIT/LL SID module without modification. Speaker models are built for a lattice of signal-to-noise ratio (SNR) levels. The SNR of the received signal is estimated in two steps: speech activity detection first identifies the portions of the signal that actually contain speech, and a voice quality estimation process is then applied to estimate the SNR. The speaker models matching the SNR of the received signal are dynamically loaded, and conventional SID is applied. In training, the SNR of each training signal is estimated, and noise is added to the signal to produce a signal at the desired SNR. Each training signal can therefore be used to train models at any SNR level less than or equal to its original SNR. The process has been fully implemented and is completely automated.
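The two SNR-dependent steps in the abstract, degrading a training signal to a lower target SNR and choosing which SNR-specific model set to load, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the samples have already been reduced to speech-active portions, that the added noise is white Gaussian noise uncorrelated with the signal, and that the "matching" model is the lattice level nearest to the estimated SNR. The function names `added_noise_power`, `degrade_to_snr`, and `select_model_snr` are hypothetical.

```python
import math
import random


def added_noise_power(total_power, snr_in_db, snr_target_db):
    """Power of the extra noise needed to lower the SNR to snr_target_db.

    total_power is the mean power of the speech-active samples (speech plus
    the noise already present); snr_in_db is the estimated SNR of that signal.
    """
    if snr_target_db > snr_in_db:
        raise ValueError("target SNR must not exceed the estimated SNR")
    r_in = 10 ** (snr_in_db / 10)          # linear speech-to-noise power ratio
    r_target = 10 ** (snr_target_db / 10)
    speech_power = total_power * r_in / (1 + r_in)
    noise_power = total_power / (1 + r_in)
    # Total noise required at the target SNR, minus the noise already present.
    return max(0.0, speech_power / r_target - noise_power)


def degrade_to_snr(samples, snr_in_db, snr_target_db, rng):
    """Add white Gaussian noise (an assumption) to reach the target SNR."""
    total_power = sum(x * x for x in samples) / len(samples)
    sigma = math.sqrt(added_noise_power(total_power, snr_in_db, snr_target_db))
    return [x + rng.gauss(0.0, sigma) for x in samples]


def select_model_snr(estimated_snr_db, lattice_db):
    """Pick the lattice level closest to the estimated SNR (assumed rule)."""
    return min(lattice_db, key=lambda level: abs(level - estimated_snr_db))


if __name__ == "__main__":
    rng = random.Random(0)
    clean_ish = [math.sin(0.05 * n) for n in range(8000)]
    noisy = degrade_to_snr(clean_ish, snr_in_db=20.0, snr_target_db=10.0, rng=rng)
    print(len(noisy), select_model_snr(12.3, [0, 5, 10, 15, 20]))
```

Because noise can only be added, never removed, the sketch rejects any target SNR above the estimated SNR, which mirrors the abstract's restriction to levels less than or equal to the original.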
Pages: 595-599
Page count: 5