Diverse style oriented many-to-many emotional voice conversion

Cited: 0
Authors
Zhou, Jian [1 ]
Luo, Xiangyu [1 ]
Wang, Huabin [1 ]
Zheng, Wenming [2 ]
Tao, Liang [1 ]
Affiliations
[1] Key Laboratory of Intelligent Computing and Signal Processing, Anhui University, Hefei 230601, China
[2] Key Laboratory of Child Development and Learning Science of Ministry of Education, Southeast University, Nanjing 210096, China
Source
Shengxue Xuebao/Acta Acustica, 2024, Vol. 49, No. 6
Keywords
Network coding; Speech enhancement
DOI
10.12395/0371-0025.2023192
Abstract
To address the issues of insufficient emotion separation and limited diversity of emotional expression in existing generative adversarial network (GAN)-based emotional voice conversion methods, this paper proposes a many-to-many emotional voice conversion method aimed at style diversification. The method is built on a GAN with a dual-generator structure, where a consistency loss applied to the latent representations of the two generators preserves speech content and speaker characteristics, thereby improving the similarity between the converted emotion and the target emotion. In addition, an emotion mapping network and an emotion feature encoder supply the generators with diversified representations of the same emotion category. Experimental results show that the proposed method produces speech whose emotion is closer to the target and exhibits a richer variety of emotional styles. © 2024 Science Press. All rights reserved.
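The dual-generator latent consistency idea described in the abstract can be illustrated with a minimal sketch. The paper's actual implementation is not available here, so every module name (Encoder, Decoder, MappingNetwork), every dimension, and the use of an L1 consistency term below are hypothetical stand-ins loosely modeled on StarGANv2-style emotion style codes, not the authors' code.

# Hypothetical sketch: two encoder-decoder generators whose latents are tied
# by a consistency loss, plus a mapping network that turns noise + an emotion
# label into a diverse style code. All shapes and names are assumptions.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps a mel-spectrogram to a latent content/speaker representation."""
    def __init__(self, n_mels=80, latent_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_mels, 256, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(256, latent_dim, kernel_size=5, padding=2),
        )
    def forward(self, mel):            # mel: (B, n_mels, T)
        return self.net(mel)           # latent: (B, latent_dim, T)

class Decoder(nn.Module):
    """Re-synthesizes a mel-spectrogram from the latent, conditioned on a style code."""
    def __init__(self, n_mels=80, latent_dim=128, style_dim=64):
        super().__init__()
        self.style_proj = nn.Linear(style_dim, latent_dim)
        self.net = nn.Sequential(
            nn.Conv1d(latent_dim, 256, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(256, n_mels, kernel_size=5, padding=2),
        )
    def forward(self, z, style):       # z: (B, latent_dim, T); style: (B, style_dim)
        z = z + self.style_proj(style).unsqueeze(-1)  # broadcast style over time
        return self.net(z)

class MappingNetwork(nn.Module):
    """Turns random noise and an emotion label into a diverse style code,
    so the same emotion category can yield many different styles."""
    def __init__(self, noise_dim=16, n_emotions=4, style_dim=64):
        super().__init__()
        self.emb = nn.Embedding(n_emotions, noise_dim)
        self.net = nn.Sequential(
            nn.Linear(2 * noise_dim, 128), nn.ReLU(), nn.Linear(128, style_dim))
    def forward(self, noise, emotion):
        return self.net(torch.cat([noise, self.emb(emotion)], dim=-1))

enc_a, enc_b = Encoder(), Encoder()    # the two generators' encoders
dec_a = Decoder()
mapper = MappingNetwork()

mel = torch.randn(2, 80, 100)                  # dummy batch of mel-spectrograms
emotion = torch.tensor([1, 3])                 # target emotion labels
style = mapper(torch.randn(2, 16), emotion)    # diverse style codes per emotion

z_a, z_b = enc_a(mel), enc_b(mel)
# The dual-generator tie: penalize disagreement between the two latents so that
# content and speaker identity are preserved across the emotion transfer.
latent_consistency = nn.functional.l1_loss(z_a, z_b)
converted = dec_a(z_a, style)                  # emotion-converted mel-spectrogram
print(latent_consistency.item(), converted.shape)

In a full system this consistency term would be combined with the usual adversarial and reconstruction losses; the point of the sketch is only how tying the two generators' latents can keep content and speaker identity fixed while the style code varies the emotion.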
Pages: 1297-1303
Related Papers
50 in total
  • [11] Many-to-Many Voice Conversion Using Conditional Cycle-Consistent Adversarial Networks
    Lee, Shindong
    Ko, BongGu
    Lee, Keonnyeong
    Yoo, In-Chul
    Yook, Dongsuk
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6279 - 6283
  • [12] Choosing only the best voice imitators: Top-K many-to-many voice conversion with StarGAN
    Fernandez-Martin, Claudio
    Colomer, Adrian
    Panariello, Claudio
    Naranjo, Valery
    SPEECH COMMUNICATION, 2024, 156
  • [13] Many-to-many Voice Conversion Based on Multiple Non-negative Matrix Factorization
    Aihara, Ryo
    Takiguchi, Tetsuya
    Ariki, Yasuo
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2749 - 2753
  • [14] Non-Parallel Many-to-Many Voice Conversion Using Local Linguistic Tokens
    Wang, Chao
    Yu, Yibiao
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 5929 - 5933
  • [15] Many-to-many Cross-lingual Voice Conversion with a Jointly Trained Speaker Embedding Network
    Zhou, Yi
    Tian, Xiaohai
    Das, Rohan Kumar
    Li, Haizhou
    2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 1282 - 1287
  • [16] Many-to-Many and Completely Parallel-Data-Free Voice Conversion Based on Eigenspace DNN
    Hashimoto, Tetsuya
    Saito, Daisuke
    Minematsu, Nobuaki
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (02) : 332 - 341
  • [17] Text-Free Non-Parallel Many-to-Many Voice Conversion Using Normalising Flows
    Merritt, Thomas
    Ezzerg, Abdelhamid
    Bilinski, Piotr
    Proszewska, Magdalena
    Pokora, Kamil
    Barra-Chicote, Roberto
    Korzekwa, Daniel
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 6782 - 6786
  • [18] A Survey on Generative Adversarial Networks based Models for Many-to-many Non-parallel Voice Conversion
    Alaa, Yasmin
    Alfonse, Marco
    Aref, Mostafa M.
    5TH INTERNATIONAL CONFERENCE ON COMPUTING AND INFORMATICS (ICCI 2022), 2022, : 221 - 226
  • [19] Improving the Speaker Identity of Non-Parallel Many-to-Many Voice Conversion with Adversarial Speaker Recognition
    Ding, Shaojin
    Zhao, Guanlong
    Gutierrez-Osuna, Ricardo
    INTERSPEECH 2020, 2020, : 776 - 780
  • [20] Many-to-Many Unsupervised Speech Conversion From Nonparallel Corpora
    Lee, Yun Kyung
    Kim, Hyun Woo
    Park, Jeon Gue
    IEEE ACCESS, 2021, 9 : 27278 - 27286