AttractionDetailsQA: An Attraction Details Focused on Chinese Question Answering Dataset

Cited by: 1

Authors:

Huang, Weiming [1,2]

Xu, Shiting [3]

Wang, Yuhan [4]

Jin, Fan [1,2]

Chang, Qingling [1,2]

Affiliations:

[1] Wuyi Univ, Fac Intelligent Mfg, Jiangmen 529000, Peoples R China

[2] China Germany Artificial Intelligence Inst Jiangm, Jiangmen 529000, Peoples R China

[3] Zhuhai 4DAGE Technol Co Ltd, Zhuhai 519000, Peoples R China

[4] Jiangsu Univ Sci & Technol, Sch Naval Architecture & Ocean Engn, Zhenjiang 212003, Jiangsu, Peoples R China

Source:

IEEE ACCESS | 2022 / Vol. 10

Keywords:

Annotations; Data models; Question answering (information retrieval); Manuals; Layout; Benchmark testing; Tourism industry; Attraction detail dataset; question-answering pair generation;

DOI:

10.1109/ACCESS.2022.3181188

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

With the growing number of domestic tourists and the spread of digital upgrades at attractions, it is crucial to develop a question-answering (QA) system about attraction details. However, there is little work on attraction QA, and the main bottleneck is the lack of available datasets. While previous QA datasets, such as CNN/Daily Mail and NewsQA, usually focus on the news domain, we present the first large-scale dataset for QA over attraction details. To ensure that the collected data are useful, we gather data only from public travel information websites. Unlike other QA datasets such as SQuAD, which is labeled manually, we built the dataset through a combination of manual annotation and a question-answer pair generation (QAG) annotation model. The resulting dataset covers 2,808 attractions with a total of 18,245 QA pairs, spanning seven types of attraction details: location, time, component, area, layout, rating, and character. The dataset is available at https://github.com/wyman130/AttractionDetailsQA. Because QAG has not been studied much for attraction details, we ran experiments with several QAG models on this dataset and obtained benchmark results. This provides a basis for subsequent improvements to the dataset and for research on QAG in attraction details.


Pages: 86215 - 86221

Page count: 7

50 items in total

[31] Question Classification for Chinese Cuisine Question Answering System
Xia, Ling
Teng, Zhi
Ren, Fuji
IEEJ TRANSACTIONS ON ELECTRICAL AND ELECTRONIC ENGINEERING, 2009, 4 (06) : 689 - 695
[32] Improvisation of Dataset Efficiency in Visual Question Answering Domain
Mohamed, Sheerin Sitara Noor
Srinivasan, Kavitha
STATISTICS AND APPLICATIONS, 2022, 20 (02) : 279 - 289
[33] Dataset bias: A case study for visual question answering
Das A.
Anjum S.
Gurari D.
Proceedings of the Association for Information Science and Technology, 2019, 56 (01) : 58 - 67
[34] EgoVQA - An Egocentric Video Question Answering Benchmark Dataset
Fan, Chenyou
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 4359 - 4366
[35] RuBQ 2.0: An Innovated Russian Question Answering Dataset
Rybin, Ivan
Korablinov, Vladislav
Efimov, Pavel
Braslavski, Pavel
SEMANTIC WEB, ESWC 2021, 2021, 12731 : 532 - 547
[36] Building a benchmark dataset for the Kurdish news question answering
Saeed, Ari M.
DATA IN BRIEF, 2024, 57
[37] A dataset for medical instructional video classification and question answering
Gupta, Deepak
Attal, Kush
Demner-Fushman, Dina
SCIENTIFIC DATA, 2023, 10 (01)
[38] OVQA: A Clinically Generated Visual Question Answering Dataset
Huang, Yefan
Wang, Xiaoli
Liu, Feiyan
Huang, Guofeng
PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22), 2022, : 2924 - 2938
[39] A Large Visual Question Answering Dataset for Cultural Heritage
Asprino, Luigi
Bulla, Luana
Marinucci, Ludovica
Mongiovi, Misael
Presutti, Valentina
MACHINE LEARNING, OPTIMIZATION, AND DATA SCIENCE (LOD 2021), PT II, 2022, 13164 : 193 - 197
[40] DermaVQA: A Multilingual Visual Question Answering Dataset for Dermatology
Yim, Wen-wai
Fu, Yujuan
Sun, Zhaoyi
Ben Abacha, Asma
Yetisgen, Meliha
Xia, Fei
MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2024, PT V, 2024, 15005 : 209 - 219
