Relating EEG to continuous speech using deep neural networks: a review

Cited by: 20
Authors
Puffay, Corentin [1, 2]
Accou, Bernd [1, 2]
Bollens, Lies [1, 2]
Monesi, Mohammad Jalilpour [1, 2]
Vanthornhout, Jonas [1]
Van Hamme, Hugo [2]
Francart, Tom [1]
Affiliations
[1] Katholieke Univ Leuven, Dept Neurosci, ExpORL, Leuven, Belgium
[2] Katholieke Univ Leuven, Dept Elect Engn ESAT, PSI, Leuven, Belgium
Keywords
EEG; deep learning; speech; auditory neuroscience; ENTRAINMENT; FREQUENCY; BRAIN; LEVEL;
DOI
10.1088/1741-2552/ace73f
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering];
Subject classification code
0831;
Abstract
Objective. When a person listens to continuous speech, a corresponding response is elicited in the brain and can be recorded using electroencephalography (EEG). Linear models are presently used to relate the EEG recording to the corresponding speech signal. The ability of linear models to find a mapping between these two signals is used as a measure of neural tracking of speech. Such models are limited as they assume linearity in the EEG-speech relationship, which omits the nonlinear dynamics of the brain. As an alternative, deep learning models have recently been used to relate EEG to continuous speech. Approach. This paper reviews and comments on deep-learning-based studies that relate EEG to continuous speech in single- or multiple-speaker paradigms. We point out recurrent methodological pitfalls and the need for a standard benchmark of model analysis. Main results. We gathered 29 studies. The main methodological issues we found are biased cross-validations, data leakage leading to over-fitted models, and data sizes that are disproportionate to the model's complexity. In addition, we address requirements for a standard benchmark model analysis, such as public datasets, common evaluation metrics, and good practices for the match-mismatch task. Significance. We present a review paper summarizing the main deep-learning-based studies that relate EEG to speech while addressing methodological pitfalls and important considerations for this newly expanding field. Our study is particularly relevant given the growing application of deep learning in EEG-speech decoding.
Pages: 28
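
The abstract notes that linear models are used to relate EEG to the speech signal and that the quality of the learned mapping serves as a measure of neural tracking. The sketch below illustrates one common form of that idea under stated assumptions; it is not the reviewed authors' code. It simulates EEG from a synthetic speech envelope, fits a ridge-regularized linear decoder on time-lagged EEG, and reports the Pearson correlation on a held-out contiguous segment. The function names (`lag_matrix`, `fit_ridge`, `simulate`, `evaluate`), the simulated data, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def lag_matrix(eeg, n_lags):
    """Stack EEG samples from t to t + n_lags - 1 for every time step t."""
    n_times, n_channels = eeg.shape
    n_out = n_times - n_lags + 1
    lagged = np.empty((n_out, n_channels * n_lags))
    for lag in range(n_lags):
        lagged[:, lag * n_channels:(lag + 1) * n_channels] = eeg[lag:lag + n_out]
    return lagged


def fit_ridge(X, y, alpha):
    """Closed-form ridge regression: w = (X^T X + alpha I)^-1 X^T y."""
    gram = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ y)


def pearson(a, b):
    """Pearson correlation between two 1-D arrays."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))


def simulate(n_times=6000, n_channels=8, n_lags=16, noise=1.0, seed=0):
    """Toy data (assumption): EEG = causally filtered envelope + white noise."""
    rng = np.random.default_rng(seed)
    # Smooth synthetic "envelope": moving average of white noise.
    envelope = np.convolve(rng.standard_normal(n_times), np.ones(8) / 8, mode="same")
    kernels = rng.standard_normal((n_channels, n_lags))
    eeg = np.empty((n_times, n_channels))
    for ch in range(n_channels):
        # Keep the first n_times samples so EEG at t depends on envelope at t, t-1, ...
        eeg[:, ch] = np.convolve(envelope, kernels[ch])[:n_times]
    eeg += noise * rng.standard_normal(eeg.shape)
    return eeg, envelope


def evaluate(eeg, envelope, n_lags=16, alpha=1.0, train_fraction=0.8):
    """Fit on the first contiguous part, score on the remaining part."""
    split = int(train_fraction * len(envelope))
    # Split BEFORE building lags so no lag window straddles train and test.
    X_train = lag_matrix(eeg[:split], n_lags)
    y_train = envelope[:split][:X_train.shape[0]]
    X_test = lag_matrix(eeg[split:], n_lags)
    y_test = envelope[split:][:X_test.shape[0]]
    # Normalization statistics come from the training part only.
    mean, std = X_train.mean(axis=0), X_train.std(axis=0)
    w = fit_ridge((X_train - mean) / std, y_train - y_train.mean(), alpha)
    prediction = ((X_test - mean) / std) @ w
    return pearson(prediction, y_test)


if __name__ == "__main__":
    eeg, envelope = simulate()
    _, unrelated = simulate(seed=1)
    print("matched envelope:   r =", evaluate(eeg, envelope))
    print("unrelated envelope: r =", evaluate(eeg, unrelated))
```

Splitting into contiguous segments before lagging, and computing normalization statistics on the training part only, are small guards against the data-leakage issue the abstract mentions. Scoring against an unrelated envelope gives a crude chance-level reference for the correlation.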