BjTT: A Large-Scale Multimodal Dataset for Traffic Prediction

Cited by: 0
Authors
Zhang, Chengyang [1]
Zhang, Yong [1]
Shao, Qitan [1]
Feng, Jiangtao [1]
Li, Bo [1]
Lv, Yisheng [2]
Piao, Xinglin [1]
Yin, Baocai [1]
Affiliations
[1] Beijing Univ Technol, Beijing Inst Artificial Intelligence, Sch Informat Sci & Technol, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
[2] Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China
Keywords
Roads; Social networking (online); Transportation; Data collection; Task analysis; Blogs; Meteorology; Traffic prediction; large-scale; new dataset; FLOW; NETWORKS; MODELS;
DOI
10.1109/TITS.2024.3440650
Chinese Library Classification (CLC) number
TU [Building Science]
Discipline classification code
0813
Abstract
Traffic prediction plays a significant role in Intelligent Transportation Systems (ITS). Although many datasets have been introduced to support the study of traffic prediction, most of them only provide time-series traffic data. However, urban transportation systems are always susceptible to various factors, including unusual weather and traffic accidents. Therefore, relying solely on historical data for traffic prediction greatly limits the accuracy of the prediction. In this paper, we introduce Beijing Text-Traffic (BjTT), a large-scale multimodal dataset for traffic prediction. BjTT comprises over 32,000 time-series traffic records, capturing velocity and congestion levels on more than 1,200 roads within the 5th ring area of Beijing. Meanwhile, each piece of traffic data is coupled with a text describing the traffic system (including time, location, and events). We detail the data collection and processing procedures and present a statistical analysis of the BjTT dataset. Furthermore, we conduct comprehensive experiments on the dataset with state-of-the-art traffic prediction methods and text-guided generative models, which reveal the unique characteristics of the BjTT. The dataset is available at https://github.com/ChyaZhang/BjTT.
Pages: 18992-19003
Page count: 12