ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering

Cited by: 0

Authors:

Yu, Zhou [1]

Xu, Dejing [2]

Yu, Jun [1]

Yu, Ting [1]

Zhao, Zhou [2]

Zhuang, Yueting [2]

Tao, Dacheng [3]

Affiliations:

[1] Hangzhou Dianzi Univ, Sch Comp Sci & Technol, Key Lab Complex Syst Modeling & Simulat, Hangzhou, Zhejiang, Peoples R China

[2] Zhejiang Univ, Coll Comp Sci, Hangzhou, Zhejiang, Peoples R China

[3] Univ Sydney, FEIT, SIT, UBTECH Sydney AI Ctr, Sydney, NSW, Australia

Source:

THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2019

Funding:

Australian Research Council; National Natural Science Foundation of China;

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recent developments in modeling language and vision have been successfully applied to image question answering. It is both crucial and natural to extend this research direction to the video domain for video question answering (VideoQA). Compared to the image domain, where large-scale and fully annotated benchmark datasets exist, VideoQA datasets suffer from limitations such as small scale and automatic generation. These limitations restrict their applicability in practice. Here we introduce ActivityNet-QA, a fully annotated and large-scale VideoQA dataset. The dataset consists of 58,000 QA pairs on 5,800 complex web videos derived from the popular ActivityNet dataset. We present a statistical analysis of our ActivityNet-QA dataset and conduct extensive experiments on it by comparing existing VideoQA baselines. Moreover, we explore various video representation strategies to improve VideoQA performance, especially for long videos.


Pages: 9127-9134

Page count: 8
