Towards a voice conversion system based on frame selection

Times cited: 0
Authors
Dutoit, T. [1]
Holzapfel, A. [2]
Jottrand, M. [1]
Moinet, A. [1]
Perez, J. [3]
Stylianou, Y. [2]
Affiliations
[1] Fac Polytech Mons, Mons, Belgium
[2] Univ Crete, Iraklion, Greece
[3] Univ Politecn Cataluna, E-08028 Barcelona, Spain
Keywords
voice conversion; frame selection; voice mapping
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper addresses the conversion of a given speaker's voice (the source speaker) into another identified voice (the target speaker). We assume that a large number of speech samples are available from both the source and target voices, at least some of which are parallel. The proposed system is built on a mapping function between source and target spectral envelopes, followed by a frame selection algorithm that produces the final spectral envelopes. Converted speech is produced by a basic linear prediction (LP) analysis of the source followed by LP synthesis using the converted spectral envelopes. We compared three types of conversion: without mapping; with mapping and using the excitation of the source speaker; and with mapping and using the excitation of the target speaker. Results show that the combination of mapping and frame selection provides the best results, and underline the value of working on methods to convert the LP excitation.
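The abstract does not give the details of the frame selection step. As a rough illustration only, the sketch below shows one common way such a step could be set up: each mapped frame is matched to a frame from a target-speaker database by a dynamic-programming (Viterbi) search that balances closeness to the mapped frame against smoothness between consecutive selections. The function names `sq_dist` and `select_frames`, the squared-distance costs, and the `concat_weight` parameter are all assumptions made for this example and are not taken from the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_frames(mapped, target_db, concat_weight=1.0):
    """Pick one target-database frame per mapped frame (hypothetical sketch).

    mapped        : list of feature vectors output by an assumed mapping step
    target_db     : list of feature vectors taken from the target speaker
    concat_weight : assumed weight of the smoothness (concatenation) cost
    Returns the list of selected indices into target_db.
    """
    n, m = len(mapped), len(target_db)
    if n == 0 or m == 0:
        return []

    # cost[j] = best cumulative cost of any path ending in target frame j
    cost = [sq_dist(mapped[0], target_db[j]) for j in range(m)]
    back = []  # back[t-1][j] = best predecessor of frame j at time t

    for t in range(1, n):
        new_cost, pointers = [], []
        for j in range(m):
            # Best predecessor i: cumulative cost plus a smoothness penalty
            # between the two consecutively selected target frames.
            best_i = min(
                range(m),
                key=lambda i: cost[i]
                + concat_weight * sq_dist(target_db[i], target_db[j]),
            )
            transition = cost[best_i] + concat_weight * sq_dist(
                target_db[best_i], target_db[j]
            )
            # Target cost: distance between the mapped frame and candidate j.
            new_cost.append(transition + sq_dist(mapped[t], target_db[j]))
            pointers.append(best_i)
        cost = new_cost
        back.append(pointers)

    # Backtrack from the cheapest final frame.
    j = min(range(m), key=lambda k: cost[k])
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path


mapped = [[0.0], [1.0], [2.0]]
target_db = [[0.1], [0.9], [2.2], [5.0]]
print(select_frames(mapped, target_db, concat_weight=0.0))
```

With `concat_weight=0.0` the search reduces to a per-frame nearest-neighbour lookup. Larger values trade accuracy against the mapped envelope for smoother transitions between selected frames. The search costs on the order of n·m² distance evaluations, so a practical system would likely prune candidates first.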
Pages: 513 / +
Number of pages: 2
Related papers
50 items in total
  • [1] AN IMPROVED FRAME-UNIT-SELECTION BASED VOICE CONVERSION SYSTEM WITHOUT PARALLEL TRAINING DATA
    Xie, Feng-Long
    Li, Xin-Hui
    Liu, Bo
    Zheng, Yi-Bin
    Meng, Li
    Lu, Li
    Soong, Frank K.
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7754 - 7758
  • [2] Improving Segmental GMM Based Voice Conversion Method with Target Frame Selection
    Gu, Hung-Yan
    Tsai, Sung-Fung
    [J]. 2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014, : 483 - 487
  • [3] Voice Conversion using K-Histograms and Frame Selection
    Jose Uriz, Alejandro
    Daniel Agueero, Pablo
    Bonafonte, Antonio
    Carlos Tulli, Juan
    [J]. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 1607 - +
  • [4] Voice conversion: Wavelet based residual selection
    Kachare, Pramod
    Cheeran, Alice
    Nirmal, Jagganath
    Zaveri, Mukesh
    [J]. 2015 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2015, : 1513 - 1518
  • [5] Voice Conversion Method Combining Segmental GMM Mapping with Target Frame Selection
    Gu, Hung-Yan
    Tsai, Sung-Feng
    [J]. JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2015, 31 (02) : 609 - 626
  • [6] Frame Correlation Based Autoregressive GMM Method for Voice Conversion
    Li, Xian
    Wang, Zeng-fu
    [J]. 2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014, : 221 - 225
  • [7] HMM-BASED SEQUENCE-TO-FRAME MAPPING FOR VOICE CONVERSION
    Qiao, Yu
    Saito, Daisuke
    Minematsu, Nobuaki
    [J]. 2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4830 - 4833
  • [8] First steps towards new Czech voice conversion system
    Hanzlicek, Zdenek
    Matousek, Jindrich
    [J]. TEXT, SPEECH AND DIALOGUE, PROCEEDINGS, 2006, 4188 : 383 - 390
  • [9] Text-independent voice conversion based on unit selection
    Suendermann, David
    Hoege, Harald
    Bonafonte, Antonio
    Ney, Hermann
    Black, Alan
    Narayanan, Shri
    [J]. 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, Vols 1-13, 2006, : 81 - 84
  • [10] Conversion function clustering and selection for expressive voice conversion
    Hsia, Chi-Chun
    Wu, Chung-Hsien
    Wu, Jian-Qi
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 689 - +