Speech compression with preservation of speaker identity

Authors
Leis, J
Phythian, M
Sridharan, S
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification code
070206 ; 082403 ;
Abstract
Although much effort has been directed recently towards speech compression at rates below 4 kb/s, the primary metric for comparison has, understandably, been the amount of spectral distortion in the decompressed speech. However, an aspect which is becoming important in some applications is the ability to identify the original speaker from the coded speech algorithmically. We investigate here the effect of speech compression using multistage vector quantization of the short-term (formant) filter parameters on text-independent speaker identification. It is demonstrated that in cases where the speech is stored in a compressed database for retrieval, the speaker model should be constructed from the raw speech before spectral compression. Additionally, Gaussian models of sufficiently high order are able to reduce the negative effects of spectral vector quantization upon speaker identification accuracy.
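The multistage vector quantization mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's codebook sizes, training procedure, or spectral parameter representation, so everything below is an assumption made for illustration: plain k-means training, two stages of 16 codewords each, and synthetic 10-dimensional vectors standing in for short-term filter parameters. All function names are hypothetical.

```python
import numpy as np

def nearest_codeword(data, codebook):
    """Index of the closest codeword (squared Euclidean distance) per vector."""
    dists = ((data[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def train_codebook(data, size, iters=20, seed=0):
    """Plain k-means codebook; an assumed stand-in for the paper's training."""
    rng = np.random.default_rng(seed)
    codebook = data[rng.choice(len(data), size, replace=False)].copy()
    for _ in range(iters):
        nearest = nearest_codeword(data, codebook)
        for k in range(size):
            members = data[nearest == k]
            if len(members):  # keep the old codeword if a cell is empty
                codebook[k] = members.mean(axis=0)
    return codebook

def train_msvq(data, sizes):
    """Train one codebook per stage, each on the residual left by earlier stages."""
    codebooks, residual = [], data.copy()
    for stage, size in enumerate(sizes):
        cb = train_codebook(residual, size, seed=stage)
        residual = residual - cb[nearest_codeword(residual, cb)]
        codebooks.append(cb)
    return codebooks

def msvq_encode(data, codebooks):
    """Sequential search: quantize, subtract, pass the residual to the next stage."""
    indices, residual = [], data.copy()
    for cb in codebooks:
        idx = nearest_codeword(residual, cb)
        residual = residual - cb[idx]
        indices.append(idx)
    return np.stack(indices, axis=1)

def msvq_decode(indices, codebooks):
    """Reconstruct each vector as the sum of the selected codewords."""
    return sum(cb[indices[:, s]] for s, cb in enumerate(codebooks))

# Synthetic data only: 500 random 10-dimensional "frames" (not real speech).
rng = np.random.default_rng(42)
frames = rng.normal(size=(500, 10))

codebooks = train_msvq(frames, sizes=(16, 16))   # assumed sizes: 4 + 4 bits per frame
codes = msvq_encode(frames, codebooks)
decoded = msvq_decode(codes, codebooks)

# Mean squared error after the first stage alone and after both stages.
err_one_stage = np.mean((frames - msvq_decode(codes[:, :1], codebooks[:1])) ** 2)
err_two_stage = np.mean((frames - decoded) ** 2)
print(f"MSE, 1 stage: {err_one_stage:.4f}; 2 stages: {err_two_stage:.4f}")
```

Each stage quantizes only what the previous stages failed to capture, so the reconstruction error on the training vectors shrinks as stages are added. In the setting the abstract describes, a speaker model trained on vectors like `decoded` rather than on the uncompressed parameters would see this quantization error; the sketch does not attempt to reproduce the paper's identification experiments.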
Pages: 1711-1714
Page count: 4