Large-Scale Multimodal Movie Dialogue Corpus

Cited by: 3
Authors
Yasuhara, Ryu [1]
Inoue, Masashi [1]
Suga, Ikuya [1]
Kosaka, Tetsuo [1]
Affiliations
[1] Yamagata Univ, 3-16,4 Jyonan, Yonezawa, Yamagata, Japan
Keywords
Dialogue; Multimodal; Corpus; Movie; Film; VAD; DNN
DOI
10.1145/2993148.2998523
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We present an outline of our newly created multimodal dialogue corpus, which is constructed from public domain movies. Dialogues in movies are useful sources for analyzing human communication patterns. In addition, they can be used to train machine-learning-based dialogue processing systems. However, movie files are expensive to process, and they contain large portions of non-dialogue segments. Therefore, we created a corpus that contains only dialogue segments from movies. The corpus contains 165,368 dialogue segments taken from 1,722 movies. These dialogues are automatically segmented using deep neural network-based voice activity detection with filtering rules. Our corpus can reduce the human workload and machine-processing effort required to analyze human dialogue behavior in movies.
Pages: 414-415
Page count: 2