Towards a voice conversion system based on frame selection

Cited by: 0
Authors
Dutoit, T. [1]
Holzapfel, A. [2]
Jottrand, M. [1]
Moinet, A. [1]
Perez, J. [3]
Stylianou, Y. [2]
Affiliations
[1] Fac Polytech Mons, Mons, Belgium
[2] Univ Crete, Iraklion, Greece
[3] Univ Politecn Cataluna, E-08028 Barcelona, Spain
Keywords
voice conversion; frame selection; voice mapping
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper addresses the conversion of a given speaker's voice (the source speaker) into another identified voice (the target speaker). We assume that a large amount of speech from both the source and target voices is available, with at least part of it being parallel. The proposed system is built on a mapping function between source and target spectral envelopes, followed by a frame selection algorithm that produces the final spectral envelopes. Converted speech is produced by a basic linear prediction (LP) analysis of the source and LP synthesis using the converted spectral envelopes. We compared three types of conversion: without mapping; with mapping, using the excitation of the source speaker; and with mapping, using the excitation of the target speaker. Results show that the combination of mapping and frame selection provides the best results, and underline the value of working on methods to convert the LP excitation.
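The abstract does not specify how frames are selected, so the following is only a minimal sketch of one common way to do it, not the authors' method. It assumes each spectral envelope is a plain list of floats, that both the target cost (mapped frame vs. candidate) and the concatenation cost (consecutive chosen candidates) are Euclidean distances, and that the best sequence is found by a Viterbi-style dynamic program. The names `select_frames`, `mapped`, `candidates`, and `concat_weight` are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def select_frames(mapped, candidates, concat_weight=1.0):
    """Choose one target-database frame per mapped frame (assumed costs).

    mapped        -- mapped source envelopes, one list of floats per frame
    candidates    -- target-speaker envelopes available for selection
    concat_weight -- assumed weight on the frame-to-frame continuity cost
    Returns the indices of the selected candidates, one per mapped frame.
    """
    n, m = len(mapped), len(candidates)
    if n == 0 or m == 0:
        return []

    # Accumulated cost of ending at candidate j after the first frame.
    cost = [dist(mapped[0], c) for c in candidates]
    back = []  # back[t-1][j] = best predecessor of candidate j at frame t

    for t in range(1, n):
        new_cost, pointers = [], []
        for j in range(m):
            # Best predecessor under accumulated cost plus concatenation cost.
            best_i = min(
                range(m),
                key=lambda i: cost[i]
                + concat_weight * dist(candidates[i], candidates[j]),
            )
            total = (
                cost[best_i]
                + concat_weight * dist(candidates[best_i], candidates[j])
                + dist(mapped[t], candidates[j])  # target cost
            )
            new_cost.append(total)
            pointers.append(best_i)
        cost = new_cost
        back.append(pointers)

    # Backtrack from the cheapest final candidate.
    j = min(range(m), key=lambda k: cost[k])
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path
```

With `concat_weight=0` the search reduces to picking the nearest candidate for each frame independently; a large weight favors staying on similar consecutive frames. The search is quadratic in the number of candidates per frame, so a real system would likely prune the candidate set first.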
Pages: 513 / +
Page count: 2
Related papers
50 in total
  • [31] Improving the Efficiency of Dysarthria Voice Conversion System Based on Data Augmentation
    Zheng, Wei-Zhong
    Han, Ji-Yan
    Chen, Chen-Yu
    Chang, Yuh-Jer
    Lai, Ying-Hui
    [J]. IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, 2023, 31 : 4613 - 4623
  • [32] Data augmentation based non-parallel voice conversion with frame-level speaker disentangler
    Chen, Bo
    Xu, Zhihang
    Yu, Kai
    [J]. SPEECH COMMUNICATION, 2022, 136 : 14 - 22
  • [33] A key frame selection-based facial expression recognition system
    Guo, S. M.
    Pan, Y. A.
    Liao, Y. C.
    Hsu, C. Y.
    Tsai, J. S. H.
    Chang, C. I.
    [J]. ICICIC 2006: FIRST INTERNATIONAL CONFERENCE ON INNOVATIVE COMPUTING, INFORMATION AND CONTROL, VOL 3, PROCEEDINGS, 2006, : 341 - +
  • [34] Towards the creation of reliable voice control system based on a fuzzy approach
    Savchenko, Andrey V.
    Savchenko, Liudmila V.
    [J]. PATTERN RECOGNITION LETTERS, 2015, 65 : 145 - 151
  • [35] Towards Bandwidth Efficient TDMA Frame Structure for Voice Traffic in MANETs
    Vattikuti, Naresh
    Dasari, Mallesham
    Sindhwal, Himanshu
    Tamma, Bheemarjuna Reddy
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ELECTRONICS, COMPUTING AND COMMUNICATION TECHNOLOGIES (CONECCT), 2015,
  • [36] AN INTELLIGENT FRAME SYSTEM FOR CULTIVAR SELECTION
    BOLTE, JP
    HANNAWAY, DB
    SHULER, PE
    BALLERSTEDT, PJ
    [J]. AI APPLICATIONS, 1991, 5 (03): : 21 - 31
  • [37] STATISTICAL VOICE CONVERSION BASED ON WAVENET
    Niwa, Jumpei
    Yoshimura, Takenori
    Hashimoto, Kei
    Oura, Keiichiro
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5289 - 5293
  • [38] VTLN-based voice conversion
    Sündermann, D
    Ney, H
    [J]. PROCEEDINGS OF THE 3RD IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY, 2003, : 556 - 559
  • [39] Controllable voice conversion based on quantization of voice factor scores
    Isako, Takumi
    Onishi, Kotaro
    Kishida, Takuya
    Nakashika, Toru
    [J]. PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 1444 - 1448
  • [40] A system for voice conversion based on probabilistic classification and a harmonic plus noise model
    Stylianou, Y
    Cappe, O
    [J]. PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998, : 281 - 284