AttractionDetailsQA: An Attraction Details Focused on Chinese Question Answering Dataset

Cited by: 1
Authors
Huang, Weiming [1,2]
Xu, Shiting [3]
Wang, Yuhan [4]
Jin, Fan [1,2]
Chang, Qingling [1,2]
Affiliations
[1] Wuyi Univ, Fac Intelligent Mfg, Jiangmen 529000, Peoples R China
[2] China Germany Artificial Intelligence Inst Jiangm, Jiangmen 529000, Peoples R China
[3] Zhuhai 4DAGE Technol Co Ltd, Zhuhai 519000, Peoples R China
[4] Jiangsu Univ Sci & Technol, Sch Naval Architecture & Ocean Engn, Zhenjiang 212003, Jiangsu, Peoples R China
Source
IEEE ACCESS | 2022 / Vol. 10
Keywords
Annotations; Data models; Question answering (information retrieval); Manuals; Layout; Benchmark testing; Tourism industry; Attraction detail dataset; question-answering pair generation
DOI
10.1109/ACCESS.2022.3181188
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
With the growing number of domestic tourists and the spread of digital upgrades at attractions, it is crucial to develop a question-answering (QA) system for attraction details. However, there is little work on attraction QA, and the main bottleneck is the lack of available datasets. Whereas previous QA datasets usually focus on the news domain, such as CNN/Daily Mail and NewsQA, we present the first large-scale dataset for QA over attraction details. To ensure that the collected data are useful, we gather them only from public travel information websites. Unlike QA datasets such as SQuAD, which are labeled manually, our dataset was built by combining manual annotation with annotation by a question-answer pair generation (QAG) model. The resulting dataset covers 2,808 attractions with a total of 18,245 QA pairs spanning seven types of attraction details: location, time, component, area, layout, rating, and character. The dataset is available at https://github.com/wyman130/AttractionDetailsQA. Because QAG has received little study for attraction details, we evaluated several QAG models on this dataset to establish a benchmark. This provides a basis for subsequent improvements to the dataset and for research on QAG in attraction details.
Pages: 86215-86221
Page count: 7