Hi, KIA: A Speech Emotion Recognition Dataset for Wake-Up Words

Citations: 0
Authors
Kim, Taesu [1 ]
Doh, SeungHeon [2 ]
Lee, Gyunpyo [1 ]
Jeon, Hyungseok [3 ]
Nam, Juhan [2 ]
Suk, Hyeon-Jeong [1 ]
Affiliations
[1] Korea Adv Inst Sci & Technol, Dept Ind Design, Daejeon, South Korea
[2] Korea Adv Inst Sci & Technol, Grad Sch Culture Technol, Daejeon, South Korea
[3] Hyundai Motor Co, KIA Design Studio, Hwaseong, South Korea
Funding
National Research Foundation of Singapore
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Theory of artificial intelligence]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A wake-up word (WUW) is a short phrase used to activate a speech recognition system so that it can receive the user's speech input. WUW utterances carry not only the lexical information that wakes up the system but also non-lexical information such as speaker identity and emotion. In particular, recognizing the user's emotional state can enrich voice communication. However, few datasets label the emotional state of WUW utterances. In this paper, we introduce Hi, KIA, a new WUW dataset consisting of 488 Korean-accented emotional utterances collected from four male and four female speakers, each labeled with one of four emotional states: anger, happy, sad, or neutral. We present the step-by-step procedure used to build the dataset, covering scenario selection, post-processing, and human validation for label agreement. We also provide two classification models for WUW speech emotion recognition using the dataset: one based on traditional handcrafted features and the other a transfer-learning approach using a pre-trained neural network. These models can serve as benchmarks for further research.
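The handcrafted-feature baseline mentioned in the abstract can be sketched roughly as follows. This is an illustrative toy example, not the authors' actual pipeline: it uses two simple features (RMS energy and zero-crossing rate, both common handcrafted audio features) and a nearest-centroid classifier on synthetic signals; all function names and the toy data are hypothetical.

```python
# Illustrative sketch (not the paper's code): emotion classification of
# short utterances from handcrafted features. Each utterance is a list
# of audio samples; features are RMS energy and zero-crossing rate, and
# the classifier assigns the nearest class centroid in feature space.
import math
import random

def handcrafted_features(samples):
    """Return (rms_energy, zero_crossing_rate) for one utterance."""
    n = len(samples)
    rms = math.sqrt(sum(s * s for s in samples) / n)
    zcr = sum(1 for a, b in zip(samples, samples[1:])
              if (a < 0) != (b < 0)) / (n - 1)
    return (rms, zcr)

def train_centroids(dataset):
    """dataset: {label: [utterance, ...]} -> {label: mean feature vector}."""
    centroids = {}
    for label, utterances in dataset.items():
        feats = [handcrafted_features(u) for u in utterances]
        centroids[label] = tuple(sum(f[i] for f in feats) / len(feats)
                                 for i in range(2))
    return centroids

def classify(centroids, samples):
    """Predict the label whose centroid is closest to the feature vector."""
    f = handcrafted_features(samples)
    return min(centroids, key=lambda lab: math.dist(f, centroids[lab]))

# Toy stand-ins for utterances: "anger" = loud, noisy signals
# (high energy, many zero crossings); "neutral" = quiet tones.
random.seed(0)
def noisy(amp):
    return [amp * random.uniform(-1, 1) for _ in range(1000)]
def tone(amp):
    return [amp * math.sin(2 * math.pi * 5 * t / 1000) for t in range(1000)]

data = {"anger": [noisy(0.9) for _ in range(5)],
        "neutral": [tone(0.2) for _ in range(5)]}
model = train_centroids(data)
print(classify(model, noisy(0.9)))  # high energy + high ZCR -> "anger"
print(classify(model, tone(0.2)))   # low energy + low ZCR -> "neutral"
```

A real handcrafted-feature system would extract richer descriptors (e.g. MFCCs, pitch, and spectral statistics) from recorded waveforms and train a standard classifier such as an SVM; the structure, however, is the same: per-utterance feature extraction followed by a supervised model.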
Pages: 1590-1595
Number of pages: 6