Singing voice synthesis based on deep neural networks

Cited by: 55

Authors:

Nishimura, Masanari [1]

Hashimoto, Kei [1]

Oura, Keiichiro [1]

Nankaku, Yoshihiko [1]

Tokuda, Keiichi [1]

Affiliation:

[1] Nagoya Inst Technol, Dept Sci & Engn Simulat, Nagoya, Aichi, Japan

Source:

17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES | 2016

Funding:

Japan Science and Technology Agency (JST);

Keywords:

Singing voice synthesis; Neural network; DNN; Acoustic model; HMM;

DOI:

10.21437/Interspeech.2016-1027

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

Singing voice synthesis techniques have been proposed based on a hidden Markov model (HMM). In these approaches, the spectrum, excitation, and duration of singing voices are simultaneously modeled with context-dependent HMMs and waveforms are generated from the HMMs themselves. However, the quality of the synthesized singing voices still has not reached that of natural singing voices. Deep neural networks (DNNs) have largely improved on conventional approaches in various research areas including speech recognition, image recognition, speech synthesis, etc. The DNN-based text-to-speech (TTS) synthesis can synthesize high quality speech. In the DNN-based TTS system, a DNN is trained to represent the mapping function from contextual features to acoustic features, which are modeled by decision tree-clustered context dependent HMMs in the HMM-based TTS system. In this paper, we propose singing voice synthesis based on a DNN and evaluate its effectiveness. The relationship between the musical score and its acoustic features is modeled in frames by a DNN. For the sparseness of pitch context in a database, a musical-note-level pitch normalization and linear-interpolation techniques are used to prepare the excitation features. Subjective experimental results show that the DNN-based system outperformed the HMM-based system in terms of naturalness.
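The abstract mentions musical-note-level pitch normalization and linear interpolation for preparing the excitation features, but does not spell out the procedure. The following is a minimal sketch of one plausible reading, not the authors' implementation: unvoiced frames (marked `None`) are filled by linear interpolation of log F0, and each frame's log F0 is then expressed as an offset from the log pitch of the score note it belongs to. All function names, the MIDI note representation, and the A4 = 440 Hz tuning are assumptions made for illustration.

```python
import math


def midi_to_hz(note):
    """Convert a MIDI note number to Hz (assumes A4 = note 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def interpolate_unvoiced(log_f0):
    """Fill unvoiced frames (None) by linear interpolation between voiced ones.

    Leading and trailing unvoiced frames copy the nearest voiced value.
    """
    voiced = [i for i, v in enumerate(log_f0) if v is not None]
    if not voiced:
        raise ValueError("no voiced frames to interpolate from")
    out = list(log_f0)
    for i, value in enumerate(log_f0):
        if value is not None:
            continue
        prev = max((j for j in voiced if j < i), default=None)
        nxt = min((j for j in voiced if j > i), default=None)
        if prev is None:
            out[i] = log_f0[nxt]
        elif nxt is None:
            out[i] = log_f0[prev]
        else:
            w = (i - prev) / (nxt - prev)
            out[i] = (1.0 - w) * log_f0[prev] + w * log_f0[nxt]
    return out


def normalize_by_note(log_f0, frame_notes):
    """Subtract each frame's score pitch (log Hz) from its log F0."""
    return [lf - math.log(midi_to_hz(n)) for lf, n in zip(log_f0, frame_notes)]


def denormalize_by_note(offsets, frame_notes):
    """Invert normalize_by_note: add the score pitch back to each offset."""
    return [d + math.log(midi_to_hz(n)) for d, n in zip(offsets, frame_notes)]


# Hypothetical 4-frame example: two frames on A4 (69), two on A5 (81),
# with the middle two frames unvoiced.
log_f0 = [math.log(440.0), None, None, math.log(880.0)]
frame_notes = [69, 69, 81, 81]
filled = interpolate_unvoiced(log_f0)
offsets = normalize_by_note(filled, frame_notes)
```

Under this reading, a model would predict the per-frame offsets rather than absolute log F0, so a note pitch that is rare in the training database can still be generated by adding the predicted offset back to the score pitch with `denormalize_by_note`.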

Pages: 2478-2482

Page count: 5

50 records in total

[21] Exploring Channel Properties to Improve Singing Voice Detection with Convolutional Neural Networks
Gui, Wenming
Li, Yukun
Zang, Xian
Zhang, Jinglan
APPLIED SCIENCES-BASEL, 2021, 11 (24)
[22] Phoneme-to-audio alignment with recurrent neural networks for speaking and singing voice
Teytaut, Yann
Roebel, Axel
INTERSPEECH 2021, 2021: 61-65
[23] Improving Singing Voice Separation Using Curriculum Learning on Recurrent Neural Networks
Kang, Seungtae
Park, Jeong-Sik
Jang, Gil-Jin
APPLIED SCIENCES-BASEL, 2020, 10 (07)
[24] Detection of Glottic Neoplasm Based on Voice Signals Using Deep Neural Networks
Wang, Chi-Te
Chuang, Zong-Ying
Hung, Chao-Hsiang
Tsao, Yu
Fang, Shih-Hau
IEEE SENSORS LETTERS, 2022, 6 (03)
[25] Computationally-efficient voice activity detection based on deep neural networks
Xiong, Yan
Berisha, Visar
Chakrabarti, Chaitali
2021 IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS 2021), 2021: 64-69
[26] VOICE SOURCE MODELLING USING DEEP NEURAL NETWORKS FOR STATISTICAL PARAMETRIC SPEECH SYNTHESIS
Raitio, Tuomo
Lu, Heng
Kane, John
Suni, Antti
Vainio, Martti
King, Simon
Alku, Paavo
2014 PROCEEDINGS OF THE 22ND EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2014: 2290-2294
[27] Multichannel Singing Voice Separation by Deep Neural Network Informed DOA Constrained CMNMF
Munoz-Montoro, Antonio J.
Politis, Archontis
Drossos, Konstantinos
Carabias-Orti, Julio J.
2020 IEEE 22ND INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP), 2020
[28] A Survey on Recent Deep Learning-driven Singing Voice Synthesis Systems
Cho, Yin-Ping
Yang, Fu-Rong
Chang, Yung-Chuan
Cheng, Ching-Ting
Wang, Xiao-Han
Liu, Yi-Wen
2021 4TH IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND VIRTUAL REALITY (AIVR 2021), 2021: 319-323
[29] Mandarin singing voice synthesis using an HNM based scheme
Gu, Hung-Yan
Liau, Huang-Liang
CISP 2008: FIRST INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, VOL 5, PROCEEDINGS, 2008: 347-351
[30] Joint Detection and Classification of Singing Voice Melody Using Convolutional Recurrent Neural Networks
Kum, Sangeun
Nam, Juhan
APPLIED SCIENCES-BASEL, 2019, 9 (07)