Age-Based Automatic Voice Conversion Using Blood Relation for Voice Impaired

Cited by: 3
Authors
Padmini, Palli [1]
Paramasivam, C. [1]
Lal, G. Jyothish [2]
Alharbi, Sadeen [3]
Bhowmick, Kaustav [4]
Affiliations
[1] Amrita Vishwa Vidyapeetham, Amrita Sch Engn, Dept Elect & Commun Engn, Bengaluru, India
[2] Amrita Vishwa Vidyapeetham, Amrita Sch Engn, Ctr Computat Engn & Networking CEN, Coimbatore, Tamil Nadu, India
[3] King Saud Univ, Coll Comp & Informat Sci, Dept Software Engn, Riyadh, Saudi Arabia
[4] PES Univ, Dept Elect & Commun Engn, Bengaluru, India
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2022 / Vol. 70 / Issue 02
Keywords
Blood relations; KFCG; LBG; MFCC; vector quantization; correlation; speech samples; same-gender; dissimilar gender; voice conversion; PSOLA; SVM; ALGORITHM; SPEECH; PREVALENCE; CHILDREN; DELAY;
DOI
10.32604/cmc.2022.020065
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
This work presents a statistical method to translate human voices across age groups, based on commonalities in the voices of blood relations. The age-translated voices were naturalized by extracting blood-relation features (e.g., pitch, duration, energy) using Mel Frequency Cepstrum Coefficients (MFCC), for social compatibility of the voice-impaired. The system was demonstrated using standard English and an Indian language. The voice samples for resynthesis were derived from 12 families, with member ages ranging from 8 to 80 years. The voice-age translation, performed with the Pitch Synchronous Overlap and Add (PSOLA) approach by modulating the extracted voice features, was validated by a perception test. The translated and resynthesized voices were correlated using the Linde, Buzo, Gray (LBG) and Kekre's Fast Codebook Generation (KFCG) algorithms. For translated voice targets, a strong correlation (θ ≲ 93% and θ ≲ 96%) was found with blood relatives, whereas a weak correlation range (θ ≲ 78% and θ ≲ 80%) was found between different families and between different genders within the same family. The study further subcategorized the sampling and synthesis of the voices into similar-gender or dissimilar-gender groups, using a support vector machine (SVM) to choose between the available voice samples. Finally, accuracies of ~96%, ~93%, and ~94% were obtained in identifying the gender of the voice sample, identifying the age group of the samples, and correlating the original and converted voice samples, respectively. The results obtained were close to the natural voice sample features and are envisaged to facilitate a near-natural voice for the speech-impaired.
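The abstract names LBG vector quantization as one of the two codebook algorithms used to compare voices. The following is a minimal sketch of LBG codebook generation by centroid splitting, not the authors' implementation. It assumes the inputs are fixed-length feature vectors (for example, MFCC frames). The function names, the additive split offset `eps`, the stopping rule, and the use of mean squared distortion as a comparison score are all illustrative assumptions; this record does not state how the reported correlation percentages were computed.

```python
import math
from typing import List, Sequence

Vector = List[float]


def _mean(vectors: Sequence[Sequence[float]]) -> Vector:
    """Component-wise mean of a non-empty set of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def _sqdist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lbg_codebook(vectors: Sequence[Sequence[float]], size: int,
                 eps: float = 0.01, tol: float = 1e-6,
                 max_iter: int = 100) -> List[Vector]:
    """Build a codebook of `size` codewords (a power of two) by LBG splitting."""
    if size < 1 or size & (size - 1):
        raise ValueError("size must be a power of two")
    codebook = [_mean(vectors)]  # start from the global centroid
    while len(codebook) < size:
        # Split every codeword into two slightly offset copies (assumed offset).
        codebook = [[c + s * eps for c in cw] for cw in codebook for s in (1, -1)]
        prev = math.inf
        for _ in range(max_iter):
            clusters: List[List[Sequence[float]]] = [[] for _ in codebook]
            distortion = 0.0
            for v in vectors:
                i = min(range(len(codebook)), key=lambda k: _sqdist(v, codebook[k]))
                clusters[i].append(v)
                distortion += _sqdist(v, codebook[i])
            # Move each codeword to the centroid of its cluster; keep empty ones.
            codebook = [_mean(c) if c else cw for c, cw in zip(clusters, codebook)]
            if prev - distortion <= tol * max(distortion, 1e-12):
                break
            prev = distortion
    return codebook


def mean_distortion(vectors: Sequence[Sequence[float]],
                    codebook: Sequence[Sequence[float]]) -> float:
    """Average squared distance from each vector to its nearest codeword."""
    return sum(min(_sqdist(v, cw) for cw in codebook) for v in vectors) / len(vectors)


# Toy two-dimensional data standing in for real feature frames.
toy = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
book = lbg_codebook(toy, size=2)
print(sorted(book), mean_distortion(toy, book))
```

Under these assumptions, a lower mean distortion of one speaker's frames against another speaker's codebook would indicate more similar voices; any mapping from distortion to a percentage score would need to be taken from the paper itself.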
Pages: 4027 - 4051
Number of pages: 25
Related papers
50 in total
  • [1] Age-based automatic voice conversion using blood relation for voice impaired
    Padmini, Palli
    Paramasivam, C.
    Lal, G. Jyothish
    Alharbi, Sadeen
    Bhowmick, Kaustav
    Computers, Materials and Continua, 2022, 70 (02): : 4027 - 4051
  • [2] Voice Timbre Control Based on Perceived Age in Singing Voice Conversion
    Kobayashi, Kazuhiro
    Toda, Tomoki
    Doi, Hironori
    Nakano, Tomoyasu
    Goto, Masataka
    Neubig, Graham
    Sakti, Sakriani
    Nakamura, Satoshi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2014, E97D (06): : 1419 - 1428
  • [3] Improvements of Voice Timbre Control Based on Perceived Age in Singing Voice Conversion
    Kobayashi, Kazuhiro
    Toda, Tomoki
    Nakano, Tomoyasu
    Goto, Masataka
    Nakamura, Satoshi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2016, E99D (11): : 2767 - 2777
  • [4] AUTOMATIC VOICE ANSWERBACK USING TEXT TO SPEECH CONVERSION BY RULE
    DENES, PB
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1978, 64 : S162 - S163
  • [5] Geriatric Dysphonia: Characteristics of Diagnoses in Age-Based Cohorts in a Tertiary Voice Clinic
    Applebaum, Jeremy
    Harun, Aisha
    Davis, Ashley
    Hillel, Alexander T.
    Best, Simon R. A.
    Akst, Lee M.
    ANNALS OF OTOLOGY RHINOLOGY AND LARYNGOLOGY, 2019, 128 (05): : 384 - 390
  • [6] Automatic source speaker selection for voice conversion
    Turk, Oytun
    Arslan, Levent M.
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2009, 125 (01): : 480 - 491
  • [7] Controllable voice conversion based on quantization of voice factor scores
    Isako, Takumi
    Onishi, Kotaro
    Kishida, Takuya
    Nakashika, Toru
    PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 1444 - 1448
  • [8] Electrolarynx System Using Voice Conversion Based on WaveRNN
    Urabe, Etsuro
    Hirakawa, Rin
    Kawano, Hideaki
    Nakashi, Kenichi
    Nakatoh, Yoshihisa
    2020 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS (ICCE), 2020, : 726 - 727
  • [9] IMPROVING VOICE QUALITY OF HMM-BASED SPEECH SYNTHESIS USING VOICE CONVERSION METHOD
    Jiao, Yishan
    Xie, Xiang
    Na, Xingyu
    Tu, Ming
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [10] Automatic age detection in normal and pathological voice
    Gomez-Garcia, J-A.
    Moro-Velazquez, L.
    Godino-Llorente, J-I.
    Castellanos-Dominguez, G.
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 3739 - 3743