Hi, KIA: A Speech Emotion Recognition Dataset for Wake-Up Words

Cited by: 0
Authors
Kim, Taesu [1]
Doh, SeungHeon [2]
Lee, Gyunpyo [1]
Jeon, Hyungseok [3]
Nam, Juhan [2]
Suk, Hyeon-Jeong [1]
Affiliations
[1] Korea Adv Inst Sci & Technol, Dept Ind Design, Daejeon, South Korea
[2] Korea Adv Inst Sci & Technol, Grad Sch Culture Technol, Daejeon, South Korea
[3] Hyundai Motor Co, KIA Design Studio, Hwaseong, South Korea
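Funding
National Research Foundation, Singapore
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract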
A wake-up word (WUW) is a short phrase used to activate a speech recognition system so that it can receive the user's speech input. WUW utterances carry not only the lexical information needed to wake up the system but also non-lexical information such as speaker identity and emotion. In particular, recognizing the user's emotional state may enrich voice communication. However, few datasets label the emotional state of WUW utterances. In this paper, we introduce Hi, KIA, a new WUW dataset consisting of 488 Korean-accented emotional utterances collected from four male and four female speakers, where each utterance is labeled with one of four emotional states: angry, happy, sad, or neutral. We present the step-by-step procedure used to build the dataset, covering scenario selection, post-processing, and human validation for label agreement. We also provide two classification models for WUW speech emotion recognition trained on the dataset. One is based on traditional handcrafted features and the other is a transfer-learning approach using a pre-trained neural network. These classification models could serve as benchmarks in further research.
Pages: 1590-1595
Number of pages: 6