Age-Based Automatic Voice Conversion Using Blood Relation for Voice Impaired

Cited by: 3

Authors:

Padmini, Palli [1]

Paramasivam, C. [1]

Lal, G. Jyothish [2]

Alharbi, Sadeen [3]

Bhowmick, Kaustav [4]

Affiliations:

[1] Amrita Vishwa Vidyapeetham, Amrita Sch Engn, Dept Elect & Commun Engn, Bengaluru, India

[2] Amrita Vishwa Vidyapeetham, Amrita Sch Engn, Ctr Computat Engn & Networking CEN, Coimbatore, Tamil Nadu, India

[3] King Saud Univ, Coll Comp & Informat Sci, Dept Software Engn, Riyadh, Saudi Arabia

[4] PES Univ, Dept Elect & Commun Engn, Bengaluru, India

Source:

CMC-COMPUTERS MATERIALS & CONTINUA | 2022 / Vol. 70 / Issue 02

Keywords:

Blood relations; KFCG; LBG; MFCC; vector quantization; correlation; speech samples; same-gender; dissimilar gender; voice conversion; PSOLA; SVM; ALGORITHM; SPEECH; PREVALENCE; CHILDREN; DELAY;

DOI:

10.32604/cmc.2022.020065

Chinese Library Classification number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

The present work presents a statistical method to translate human voices across age groups, based on commonalities in the voices of blood relations. The age-translated voices have been naturalized by extracting the blood-relation features, e.g., pitch, duration, and energy, using Mel Frequency Cepstrum Coefficients (MFCC), for social compatibility of the voice-impaired. The system has been demonstrated using standard English and an Indian language. The voice samples for resynthesis were derived from 12 families, with member ages ranging from 8 to 80 years. The voice-age translation, performed using the Pitch Synchronous Overlap and Add (PSOLA) approach by modulation of extracted voice features, was validated by a perception test. The translated and resynthesized voices were correlated using the Linde, Buzo, Gray (LBG) and Kekre's Fast Codebook Generation (KFCG) algorithms. For translated voice targets, a strong (θ ≲ 93% and θ ≲ 96%) correlation was found with blood relatives, whereas a weak (θ ≲ 78% and θ ≲ 80%) correlation range was found between different families and between different genders within the same family. The study further subcategorized the sampling and synthesis of the voices into similar or dissimilar gender groups, using a support vector machine (SVM) to choose between available voice samples. Finally, accuracies of ~96%, ~93%, and ~94% were obtained in the identification of the gender of the voice sample, the age group of the samples, and the correlation between the original and converted voice samples, respectively. The results obtained were close to the natural voice sample features and are envisaged to facilitate a near-natural voice for the speech-impaired.
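
The abstract names LBG vector quantization as one of the two codebook methods but gives no implementation details. As a rough illustration only, the following is a minimal sketch of the generic LBG splitting procedure, not the authors' code. The function names (`lbg_codebook`, `mean_distortion`), the additive split offset `eps`, the stopping tolerance, and the toy 2-D vectors standing in for MFCC frames are all assumptions made for this example.

```python
def _centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def _sqdist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lbg_codebook(vectors, size, eps=0.01, tol=1e-4, max_iter=100):
    """Grow a codebook by repeated splitting (size assumed to be a power of two)."""
    codebook = [_centroid(vectors)]  # start from the global centroid
    while len(codebook) < size:
        # Split each codevector into a pair offset by +eps and -eps
        # (additive variant, assumed here so zero-valued components also split).
        codebook = [[c + s * eps for c in cv] for cv in codebook for s in (1, -1)]
        prev = float("inf")
        for _ in range(max_iter):
            # Nearest-neighbour assignment of every training vector.
            clusters = [[] for _ in codebook]
            distortion = 0.0
            for v in vectors:
                dists = [_sqdist(v, cv) for cv in codebook]
                best = min(range(len(codebook)), key=dists.__getitem__)
                clusters[best].append(v)
                distortion += dists[best]
            distortion /= len(vectors)
            # Centroid update; an empty cell keeps its previous codevector.
            codebook = [_centroid(c) if c else cv
                        for c, cv in zip(clusters, codebook)]
            # Stop once the relative improvement in distortion is small.
            if prev - distortion <= tol * max(distortion, 1e-12):
                break
            prev = distortion
    return codebook


def mean_distortion(vectors, codebook):
    """Average squared distance from each vector to its nearest codevector."""
    return sum(min(_sqdist(v, cv) for cv in codebook) for v in vectors) / len(vectors)


# Toy data: two well-separated groups standing in for two speakers' feature frames.
frames = [[0, 0], [0, 1], [1, 0], [1, 1],
          [10, 10], [10, 11], [11, 10], [11, 11]]
codebook = lbg_codebook(frames, size=2)
print(sorted(codebook), mean_distortion(frames, codebook))
```

In a speaker-comparison setting, one plausible use is to train a codebook on one voice's feature frames and then compare `mean_distortion` for another voice's frames against it, with lower distortion suggesting a closer match. How the paper converts such distances into its reported correlation percentages is not stated in the abstract.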

Pages: 4027-4051

Number of pages: 25

