ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering

Cited by: 0
Authors
Yu, Zhou [1]
Xu, Dejing [2]
Yu, Jun [1]
Yu, Ting [1]
Zhao, Zhou [2]
Zhuang, Yueting [2]
Tao, Dacheng [3]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Comp Sci & Technol, Key Lab Complex Syst Modeling & Simulat, Hangzhou, Zhejiang, Peoples R China
[2] Zhejiang Univ, Coll Comp Sci, Hangzhou, Zhejiang, Peoples R China
[3] Univ Sydney, FEIT, SIT, UBTECH Sydney AI Ctr, Sydney, NSW, Australia
Funding
Australian Research Council; National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent developments in modeling language and vision have been successfully applied to image question answering. It is both crucial and natural to extend this research direction to the video domain for video question answering (VideoQA). Compared to the image domain, where large-scale and fully annotated benchmark datasets exist, VideoQA datasets are small in scale and automatically generated, among other limitations. These limitations restrict their applicability in practice. Here we introduce ActivityNet-QA, a fully annotated and large-scale VideoQA dataset. The dataset consists of 58,000 QA pairs on 5,800 complex web videos derived from the popular ActivityNet dataset. We present a statistical analysis of our ActivityNet-QA dataset and conduct extensive experiments on it by comparing existing VideoQA baselines. Moreover, we explore various video representation strategies to improve VideoQA performance, especially for long videos.
Pages: 9127-9134
Number of pages: 8