Neural Speech Embeddings for Speech Synthesis Based on Deep Generative Networks

Citations: 0

Authors:

Lee, Seo-Hyun [1]

Lee, Young-Eun [1]

Kim, Soowon [2]

Ko, Byung-Kwan [2]

Kim, Jun-Young [2]

Affiliations:

[1] Korea Univ, Dept Brain & Cognit Engn, Seoul, South Korea

[2] Korea Univ, Dept Artificial Intelligence, Seoul, South Korea

Source:

2024 12TH INTERNATIONAL WINTER CONFERENCE ON BRAIN-COMPUTER INTERFACE, BCI 2024 | 2024

Keywords:

brain-computer interface; deep neural networks; electroencephalogram; generative adversarial network; imagined speech; speech synthesis; COMMUNICATION; IMAGERY;

DOI:

10.1109/BCI60775.2024.10480503

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Brain-to-speech technology represents a fusion of interdisciplinary applications encompassing the fields of artificial intelligence, brain-computer interfaces, and speech synthesis. Intention decoding and speech synthesis based on neural representation learning directly connect neural activity to the means of human linguistic communication, which may greatly enhance the naturalness of communication. With recent discoveries in representation learning and the development of speech synthesis technologies, direct translation of brain signals into speech has shown great promise. In particular, the processed input features and neural speech embeddings given to the neural network play a significant role in overall performance when deep generative models are used to generate speech from brain signals. In this paper, we introduce current brain-to-speech technology and the possibility of speech synthesis from brain signals, which may ultimately facilitate innovation in nonverbal communication. We also perform a comprehensive analysis of the neural features and neural speech embeddings underlying neurophysiological activation during speech, which may play a significant role in speech synthesis work.

Pages: 4

Related records: 50 in total

[1] Multi-mode Neural Speech Coding Based on Deep Generative Networks
Xiao, Wei
Liu, Wenzhe
Wang, Meng
Yang, Shan
Shi, Yupeng
Kang, Yuyong
Su, Dan
Shang, Shidong
Yu, Dong
INTERSPEECH 2023, 2023, : 819 - 823
[2] Learning and Modeling Unit Embeddings Using Deep Neural Networks for Unit-Selection-Based Mandarin Speech Synthesis
Zhou, Xiao
Ling, Zhen-Hua
Dai, Li-Rong
ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2020, 19 (03)
[3] A Study on Tailor-Made Speech Synthesis Based on Deep Neural Networks
Yamada, Shuhei
Nose, Takashi
Ito, Akinori
ADVANCES IN INTELLIGENT INFORMATION HIDING AND MULTIMEDIA SIGNAL PROCESSING, VOL 1, 2017, 63 : 159 - 166
[4] Speech bandwidth expansion based on Deep Neural Networks
Wang, Yingxue
Zhao, Shenghui
Liu, Wenbo
Li, Ming
Kuang, Jingming
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2593 - 2597
[5] Mongolian Speech Recognition Based on Deep Neural Networks
Zhang, Hui
Bao, Feilong
Gao, Guanglai
CHINESE COMPUTATIONAL LINGUISTICS AND NATURAL LANGUAGE PROCESSING BASED ON NATURALLY ANNOTATED BIG DATA (CCL 2015), 2015, 9427 : 180 - 188
[6] Czech Speech Synthesis with Generative Neural Vocoder
Vit, Jakub
Hanzlicek, Zdenek
Matousek, Jindrich
TEXT, SPEECH, AND DIALOGUE (TSD 2019), 2019, 11697 : 307 - 315
[7] STATISTICAL PARAMETRIC SPEECH SYNTHESIS USING DEEP NEURAL NETWORKS
Zen, Heiga
Senior, Andrew
Schuster, Mike
2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7962 - 7966
[8] Automatic Speech Recognition with Deep Neural Networks for Impaired Speech
Espana-Bonet, Cristina
Fonollosa, Jose A. R.
ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES, IBERSPEECH 2016, 2016, 10077 : 97 - 107
[9] Robust Speech Recognition with Speech Enhanced Deep Neural Networks
Du, Jun
Wang, Qing
Gao, Tian
Xu, Yong
Dai, Lirong
Lee, Chin-Hui
15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 616 - 620
[10] The Representation of Speech in Deep Neural Networks
Scharenborg, Odette
van der Gouw, Nikki
Larson, Martha
Marchiori, Elena
MULTIMEDIA MODELING, MMM 2019, PT II, 2019, 11296 : 194 - 205