Age-Based Automatic Voice Conversion Using Blood Relation for Voice Impaired

Cited by: 3
Authors
Padmini, Palli [1]
Paramasivam, C. [1]
Lal, G. Jyothish [2]
Alharbi, Sadeen [3]
Bhowmick, Kaustav [4]
Affiliations
[1] Amrita Vishwa Vidyapeetham, Amrita Sch Engn, Dept Elect & Commun Engn, Bengaluru, India
[2] Amrita Vishwa Vidyapeetham, Amrita Sch Engn, Ctr Computat Engn & Networking CEN, Coimbatore, Tamil Nadu, India
[3] King Saud Univ, Coll Comp & Informat Sci, Dept Software Engn, Riyadh, Saudi Arabia
[4] PES Univ, Dept Elect & Commun Engn, Bengaluru, India
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2022, Vol. 70, No. 2
Keywords
Blood relations; KFCG; LBG; MFCC; vector quantization; correlation; speech samples; same-gender; dissimilar gender; voice conversion; PSOLA; SVM; ALGORITHM; SPEECH; PREVALENCE; CHILDREN; DELAY;
DOI
10.32604/cmc.2022.020065
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This work presents a statistical method for translating human voices across age groups, based on commonalities in the voices of blood relations. The age-translated voices were naturalized by extracting blood-relation features (e.g., pitch, duration, and energy) using Mel Frequency Cepstrum Coefficients (MFCC), for the social compatibility of the voice-impaired. The system was demonstrated using standard English and an Indian language. The voice samples for resynthesis were derived from 12 families, with member ages ranging from 8 to 80 years. The voice-age translation, performed with the Pitch Synchronous Overlap and Add (PSOLA) approach by modulating the extracted voice features, was validated by a perception test. The translated and resynthesized voices were correlated using the Linde, Buzo, Gray (LBG) and Kekre's Fast Codebook Generation (KFCG) algorithms. For translated voice targets, a strong correlation (θ ≲ 93% and θ ≲ 96%) was found with blood relatives, whereas a weak correlation (θ ≲ 78% and θ ≲ 80%) was found between different families and between different genders within the same family. The study further subcategorized the sampling and synthesis of the voices into similar- or dissimilar-gender groups, using a support vector machine (SVM) to choose between the available voice samples. Finally, accuracies of ~96%, ~93%, and ~94% were obtained in identifying the gender of the voice sample, the age group of the samples, and the correlation between the original and converted voice samples, respectively. The results were close to the features of natural voice samples and are envisaged to facilitate a near-natural voice for the speech-impaired.
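The abstract names LBG vector quantization as one of the two codebook methods used to compare voices. The following is a minimal sketch of the general LBG idea (split every codeword, then refine by nearest-neighbour reassignment), not the authors' implementation. The function names, the additive split offset `eps`, the stopping tolerance, and the toy two-dimensional "feature vectors" are all assumptions made for illustration; the abstract does not say how codebook distortion is mapped to the reported correlation percentages.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def nearest_index(vector, codebook):
    """Index of the codeword closest to the given vector."""
    return min(range(len(codebook)), key=lambda k: sq_dist(vector, codebook[k]))

def mean_distortion(vectors, codebook):
    """Average squared distance from each vector to its nearest codeword."""
    total = sum(sq_dist(v, codebook[nearest_index(v, codebook)]) for v in vectors)
    return total / len(vectors)

def lbg_codebook(vectors, size=4, eps=0.01, tol=1e-6, max_iter=100):
    """Grow a codebook by splitting; the size doubles each round (1, 2, 4, ...)."""
    codebook = [centroid(vectors)]
    while len(codebook) < size:
        # Split every codeword into two slightly offset copies (assumed additive offset).
        codebook = [[c + sign * eps for c in code]
                    for code in codebook for sign in (1, -1)]
        previous = float("inf")
        for _ in range(max_iter):
            # Assign each vector to its nearest codeword.
            cells = [[] for _ in codebook]
            for v in vectors:
                cells[nearest_index(v, codebook)].append(v)
            # Move each codeword to the centroid of its cell; keep it if the cell is empty.
            codebook = [centroid(cell) if cell else code
                        for cell, code in zip(cells, codebook)]
            current = mean_distortion(vectors, codebook)
            if previous - current <= tol:
                break
            previous = current
    return codebook

# Toy data standing in for per-frame features (hypothetical, not from the paper).
rng = random.Random(0)
speaker_a = [[rng.gauss(m, 0.5), rng.gauss(m, 0.5)] for m in (0, 10) for _ in range(50)]
speaker_b = [[rng.gauss(x, 0.5), rng.gauss(y, 0.5)] for x, y in ((3, 7), (7, 3)) for _ in range(50)]

codebook_a = lbg_codebook(speaker_a, size=2)
print("distortion on own data:  ", mean_distortion(speaker_a, codebook_a))
print("distortion on other data:", mean_distortion(speaker_b, codebook_a))
```

In this toy setup, a codebook trained on one speaker fits that speaker's vectors more tightly than another speaker's, which is the kind of comparison a codebook-based similarity score relies on.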
Pages: 4027-4051
Page count: 25
Related papers
50 in total
  • [21] STATISTICAL VOICE CONVERSION BASED ON WAVENET
    Niwa, Jumpei
    Yoshimura, Takenori
    Hashimoto, Kei
    Oura, Keiichiro
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5289 - 5293
  • [22] VTLN-based voice conversion
    Sündermann, D
    Ney, H
    PROCEEDINGS OF THE 3RD IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY, 2003, : 556 - 559
  • [23] Sequential Voice Conversion Using Grid-Based Approximation
    Benisty, Hadas
    Malah, David
    Crammer, Koby
    2014 IEEE 28TH CONVENTION OF ELECTRICAL & ELECTRONICS ENGINEERS IN ISRAEL (IEEEI), 2014,
  • [24] Cepstrum Liftering based Voice Conversion using RBF and GMM
    Nirmal, Jagannath
    Kachare, Pramod
    Patnaik, Suprava
    Zaveri, Mukesh
    2013 INTERNATIONAL CONFERENCE ON COMMUNICATIONS AND SIGNAL PROCESSING (ICCSP), 2013, : 570 - 575
  • [25] Evaluation of speaker de-identification based on voice gender and age conversion
    Pribil, Jiri
    Pribilova, Anna
    Matousek, Jindrich
    JOURNAL OF ELECTRICAL ENGINEERING-ELEKTROTECHNICKY CASOPIS, 2018, 69 (02): : 138 - 147
  • [26] GMM-Based Speaker Gender and Age Classification After Voice Conversion
    Pribil, Jiri
    Pribilova, Anna
    Matousek, Jindrich
    2016 FIRST INTERNATIONAL WORKSHOP ON SENSING, PROCESSING AND LEARNING FOR INTELLIGENT MACHINES (SPLINE), 2016,
  • [27] MULTI VOICE TEXT TO SPEECH SYNTHESIS BASED ON THE INSTANTANEOUS PARAMETRIC VOICE CONVERSION
    Azarov, Elias
    Petrovsky, Alexander
    Zubrycki, Piotr
    SPA 2010: SIGNAL PROCESSING ALGORITHMS, ARCHITECTURES, ARRANGEMENTS, AND APPLICATIONS CONFERENCE PROCEEDINGS, 2010, : 78 - 82
  • [28] THE ANALYSIS OF THE OCCURRENCE OF VOICE DISORDERS IN RELATION TO OBSERVING THE PRINCIPLES OF VOICE HYGIENE AND THE AGE OF UNIVERSITY TEACHERS
    Vitaskova, Katerina
    Krajci, Anna
    Kytnarova, Lucie
    EDULEARN18: 10TH INTERNATIONAL CONFERENCE ON EDUCATION AND NEW LEARNING TECHNOLOGIES, 2018, : 11125 - 11132
  • [29] Voice conversion using Viterbi algorithm based on Gaussian mixture model
    Jian Zhi-Hua
    Yang Zhen
    2007 INTERNATIONAL SYMPOSIUM ON INTELLIGENT SIGNAL PROCESSING AND COMMUNICATION SYSTEMS, VOLS 1 AND 2, 2007, : 40 - 43
  • [30] Robust Voice conversion systems using MFDWC
    Farhid, M.
    Tinati, M. A.
    2008 INTERNATIONAL SYMPOSIUM ON TELECOMMUNICATIONS, VOLS 1 AND 2, 2008, : 778 - 781