Speech compression with preservation of speaker identity

Cited by: 0

Authors:

Leis, J

Phythian, M

Sridharan, S

Affiliations:

Source:

1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS | 1997

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Although much effort has been directed recently towards speech compression at rates below 4 kb/s, the primary metric for comparison has, understandably, been the amount of spectral distortion in the decompressed speech. However, an aspect which is becoming important in some applications is the ability to identify the original speaker from the coded speech algorithmically. We investigate here the effect of speech compression using multistage vector quantization of the short-term (formant) filter parameters on text-independent speaker identification. It is demonstrated that in cases where the speech is stored in a compressed database for retrieval, the speaker model should be constructed from the raw speech before spectral compression. Additionally, Gaussian models of sufficiently high order are able to reduce the negative effects of spectral vector quantization upon speaker identification accuracy.
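The multistage vector quantization named in the abstract can be illustrated with a minimal sketch: each stage quantizes the residual left by the stages before it, and the decoder sums the selected codewords. The codebooks, the input vector, and the function names below are hypothetical toy values chosen for illustration, and the greedy stage-by-stage search is an assumption; nothing here is taken from the paper itself.

```python
# Illustrative two-stage (multistage) vector quantizer in pure Python.
# Codebooks and the input vector are made-up toy values, not data from the paper.

def nearest(codebook, vec):
    """Return the index of the codeword closest to vec (squared Euclidean distance)."""
    def dist(codeword):
        return sum((a - b) ** 2 for a, b in zip(codeword, vec))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))

def msvq_encode(codebooks, vec):
    """Greedy search: each stage codes the residual left by the previous stages."""
    residual = list(vec)
    indices = []
    for codebook in codebooks:
        i = nearest(codebook, residual)
        indices.append(i)
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices

def msvq_decode(codebooks, indices):
    """Reconstruct the vector as the sum of the selected codewords from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for codebook, i in zip(codebooks, indices):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out

# Hypothetical toy codebooks: a coarse first stage and a finer second stage.
STAGE1 = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
STAGE2 = [[-0.25, 0.0], [0.0, 0.0], [0.25, 0.25]]
CODEBOOKS = [STAGE1, STAGE2]

x = [1.2, 1.3]
idx = msvq_encode(CODEBOOKS, x)      # one index per stage
x_hat = msvq_decode(CODEBOOKS, idx)  # quantized approximation of x
print(idx, x_hat)
```

Only the per-stage indices need to be stored or transmitted, which is why the reconstructed spectral parameters differ from the originals and can affect a speaker model built from them.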


Pages: 1711-1714

Page count: 4
