Large-Scale Multimodal Movie Dialogue Corpus

Cited by: 3

Authors:

Yasuhara, Ryu [1]

Inoue, Masashi [1]

Suga, Ikuya [1]

Kosaka, Tetsuo [1]

Affiliations:

[1] Yamagata Univ, 3-16,4 Jyonan, Yonezawa, Yamagata, Japan

Source:

ICMI'16: PROCEEDINGS OF THE 18TH ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION | 2016

Keywords:

Dialogue; Multimodal; Corpus; Movie; Film; VAD; DNN;

DOI:

10.1145/2993148.2998523

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present an outline of our newly created multimodal dialogue corpus that is constructed from public domain movies. Dialogues in movies are useful sources for analyzing human communication patterns. In addition, they can be used to train machine-learning-based dialogue processing systems. However, the movie files are processing intensive and they contain large portions of non-dialogue segments. Therefore, we created a corpus that contains only dialogue segments from movies. The corpus contains 165,368 dialogue segments taken from 1,722 movies. These dialogues are automatically segmented by using deep neural network-based voice activity detection with filtering rules. Our corpus can reduce the human workload and machine-processing effort required to analyze human dialogue behavior by using movies.

Pages: 414-415

Page count: 2
