COIN: A Large-scale Dataset for Comprehensive Instructional Video Analysis

Cited by: 104
Authors
Tang, Yansong [1]
Ding, Dajun [2]
Rao, Yongming [1]
Zheng, Yu [1]
Zhang, Danyang [1]
Zhao, Lili [2]
Lu, Jiwen [1]
Zhou, Jie [1]
Affiliations
[1] Tsinghua Univ, Dept Automat, Beijing, Peoples R China
[2] Meitu Inc, Xiamen, Fujian, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
RECOGNITION;
DOI
10.1109/CVPR.2019.00130
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
There are substantial instructional videos on the Internet, which enables us to acquire knowledge for completing various tasks. However, most existing datasets for instructional video analysis have the limitations in diversity and scale, which makes them far from many real-world applications where more diverse activities occur. Moreover, it still remains a great challenge to organize and harness such data. To address these problems, we introduce a large-scale dataset called "COIN" for COmprehensive INstructional video analysis. Organized with a hierarchical structure, the COIN dataset contains 11,827 videos of 180 tasks in 12 domains (e.g., vehicles, gadgets, etc.) related to our daily life. With a new developed toolbox, all the videos are annotated effectively with a series of step descriptions and the corresponding temporal boundaries. Furthermore, we propose a simple yet effective method to capture the dependencies among different steps, which can be easily plugged into conventional proposal-based action detection methods for localizing important steps in instructional videos. In order to provide a benchmark for instructional video analysis, we evaluate plenty of approaches on the COIN dataset under different evaluation criteria. We expect the introduction of the COIN dataset will promote the future in-depth research on instructional video analysis for the community.
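The hierarchical organization described in the abstract (domains, then tasks, then videos annotated with step descriptions and temporal boundaries) can be pictured with a small sketch. This is only an illustration under stated assumptions: the class names, field names, helper functions, and the toy entry are hypothetical and do not reflect the actual COIN annotation format or toolbox. Only the domain name "vehicles" comes from the abstract.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Step:
    """One annotated step (hypothetical schema, not the COIN file format)."""
    description: str  # free-text step description
    start: float      # temporal boundary start, in seconds
    end: float        # temporal boundary end, in seconds


@dataclass
class Video:
    """One instructional video with its ordered step annotations."""
    video_id: str
    steps: List[Step] = field(default_factory=list)


# domain name -> task name -> annotated videos
Hierarchy = Dict[str, Dict[str, List[Video]]]


def count_videos(hierarchy: Hierarchy) -> int:
    """Total number of videos across all domains and tasks."""
    return sum(len(videos)
               for tasks in hierarchy.values()
               for videos in tasks.values())


def count_steps(hierarchy: Hierarchy) -> int:
    """Total number of annotated steps across all videos."""
    return sum(len(video.steps)
               for tasks in hierarchy.values()
               for videos in tasks.values()
               for video in videos)


# Toy entry with made-up task name, video id, and timestamps.
toy: Hierarchy = {
    "vehicles": {
        "hypothetical task": [
            Video("vid_001", [
                Step("first step", 0.0, 5.0),
                Step("second step", 5.0, 12.5),
            ]),
        ],
    },
}

print(count_videos(toy), count_steps(toy))
```

With the real dataset, the counts reported in the abstract (11,827 videos, 180 tasks, 12 domains) would correspond to the sizes of the three levels of such a mapping.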
Pages: 1207-1216
Page count: 10