This paper describes the development of a corpus of multimodal emotional behaviors. To date, many databases of multimodal affective behavior have been developed, and they can be divided into spontaneous and acted behavior databases. Acted behavior databases make it easy to collect utterances with a balanced distribution of emotions; however, it has been pointed out that acted speech differs from spontaneous speech. In this work, we aim to collect acted multimodal emotional utterances that sound as natural as possible. To this end, we first collected scenes from tweets, taking emotional balance into account. We then performed a preliminary corpus collection, which demonstrated that a variety of emotional utterances could be collected. Next, we collected the corpus using a crowdsourcing platform. Finally, we evaluated the naturalness of the collected speech by comparing it with that of a read-speech database (JTES) and a spontaneous speech database (SMOC). The collected corpus was rated as more natural than JTES, which indicates that the recording program was effective in collecting a corpus of natural-sounding emotional behavior.