Airport Visibility Classification Based on Multimodal Fusion of Image-Tabular Data

Citations: 0
Authors
Wang, Laijun [1 ]
Cui, Zhiwei [1 ,2 ]
Dong, Shi [1 ,2 ]
Wang, Ning [1 ]
Affiliations
[1] Changan Univ, Coll Transportat Engn, Xian 710064, Peoples R China
[2] Changan Univ, Engn Res Ctr Highway Infrastruct Digitalizat, Minist Educ PRC, Xian 710064, Peoples R China
Source
IEEE ACCESS | 2024, Vol. 12
Keywords
Airport visibility classification; multimodal fusion; FT-Transformer; EfficientNet; multilayer perceptron; PREDICTION; EFFICIENT;
DOI
10.1109/ACCESS.2024.3482969
Chinese Library Classification (CLC): TP [Automation Technology, Computer Technology]
Discipline Code: 0812
Abstract
Low visibility is a major cause of flight delays and cancellations at airports. Accurate prediction of airport visibility is therefore vital for preventing significant losses to airlines and avoiding catastrophic aviation accidents. Although many unimodal prediction methods have been developed, there is still room for improvement in airport visibility classification. Therefore, this study proposed an airport visibility classification model that employs multimodal fusion of image and tabular data to improve classification performance. First, an enhanced image processing method was designed to extract more detailed visibility-related features from airport images. Next, EfficientNet-B1 and the Feature Tokenizer Transformer (FT-Transformer) were used to extract features from the images and tabular data, respectively. These features were then combined to classify airport visibility using a Multimodal Fusion Multilayer Perceptron with a focal loss function, validated through 5-fold stratified cross-validation. Experimental results on 1864 pairs of image-tabular data from Nanjing Lukou International Airport showed that the model achieved an accuracy of 93.83% and a macro-F1 (Ma-F1) of 91.64%. Comparisons with various image feature extraction methods indicated that EfficientNet-B1 with integrated images provided the best performance and a shorter running time. A comparison between multimodal fusion and unimodal image classification showed that the accuracy and Ma-F1 of the unimodal image model were 8.56% and 12.85% lower, respectively, than those of the multimodal model. In addition, ablation experiments confirmed the effectiveness of the enhanced image processing and multimodal fusion modules.
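The pipeline described in the abstract (an EfficientNet-B1 image branch, an FT-Transformer tabular branch, a fusion multilayer perceptron, and a focal loss) can be sketched in PyTorch as below. This is a minimal illustrative sketch, not the authors' implementation: the class names (FusionClassifier, FeatureTokenizer, FocalLoss), the token width, hidden sizes, the assumed four visibility classes, and the gamma value are placeholders; the tabular branch is a simplified FT-Transformer-style encoder for numeric features only; and the enhanced image-processing step and the 5-fold stratified cross-validation are omitted.

# Minimal sketch of the image-tabular fusion idea described in the abstract.
# All layer sizes, class counts, and hyperparameters here are assumptions,
# not the settings reported in the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import efficientnet_b1


class FeatureTokenizer(nn.Module):
    """Maps each numeric tabular feature to a learned embedding (FT-Transformer style)."""
    def __init__(self, n_features: int, d_token: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(n_features, d_token))
        self.bias = nn.Parameter(torch.zeros(n_features, d_token))
        self.cls = nn.Parameter(torch.randn(1, 1, d_token))  # learnable [CLS] token

    def forward(self, x):                                     # x: (batch, n_features)
        tokens = x.unsqueeze(-1) * self.weight + self.bias    # (batch, n_features, d_token)
        cls = self.cls.expand(x.size(0), -1, -1)
        return torch.cat([cls, tokens], dim=1)                # prepend [CLS]


class FusionClassifier(nn.Module):
    def __init__(self, n_tab_features: int, n_classes: int = 4, d_token: int = 64):
        super().__init__()
        # Image branch: EfficientNet-B1 backbone with its classifier head removed
        # (weights=None here; pretrained weights could be loaded instead).
        self.image_net = efficientnet_b1(weights=None)
        img_dim = self.image_net.classifier[1].in_features    # 1280 for B1
        self.image_net.classifier = nn.Identity()
        # Tabular branch: feature tokenizer + small Transformer encoder.
        self.tokenizer = FeatureTokenizer(n_tab_features, d_token)
        layer = nn.TransformerEncoderLayer(d_model=d_token, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Fusion MLP over the concatenated image and tabular representations.
        self.fusion = nn.Sequential(
            nn.Linear(img_dim + d_token, 256), nn.ReLU(),
            nn.Dropout(0.3), nn.Linear(256, n_classes),
        )

    def forward(self, image, tabular):
        img_feat = self.image_net(image)                          # (batch, 1280)
        tab_feat = self.encoder(self.tokenizer(tabular))[:, 0]    # [CLS] output
        return self.fusion(torch.cat([img_feat, tab_feat], dim=1))


class FocalLoss(nn.Module):
    """Multiclass focal loss: down-weights easy examples via (1 - p_t)^gamma."""
    def __init__(self, gamma: float = 2.0):
        super().__init__()
        self.gamma = gamma

    def forward(self, logits, targets):
        ce = F.cross_entropy(logits, targets, reduction="none")  # per-sample CE
        pt = torch.exp(-ce)                                       # estimated p_t
        return ((1 - pt) ** self.gamma * ce).mean()


if __name__ == "__main__":
    model = FusionClassifier(n_tab_features=10)                  # 10 tabular features assumed
    images = torch.randn(2, 3, 240, 240)                         # 240x240 is EfficientNet-B1's default input size
    tabular = torch.randn(2, 10)
    loss = FocalLoss()(model(images, tabular), torch.tensor([0, 2]))
    print(loss.item())

The two branches are fused by simple concatenation of the image embedding with the tabular [CLS] token before the MLP classifier, which mirrors the late-fusion design the abstract describes; the focal loss down-weights well-classified samples, which is useful when the visibility classes are imbalanced.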
Pages: 155082-155097
Number of pages: 16
Related Papers (50 records)
  • [1] YouTube thumbnail design recommendation systems using image-tabular multimodal data for Thai's YouTube thumbnail
    Pornpanvattana, Anyamanee
    Lertakkakorn, Metpiya
    Pookpanich, Peerat
    Vitheethum, Khodchapan
    Siriborvornratanakul, Thitirat
    SOCIAL NETWORK ANALYSIS AND MINING, 2024, 14 (01)
  • [2] Multimodal AutoML for Image, Text and Tabular Data
    Erickson, Nick
    Shi, Xingjian
    Sharpnack, James
    Smola, Alex
    PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022, : 4786 - 4787
  • [3] TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data
    Du, Siyi
    Zheng, Shaoming
    Wang, Yinsong
    Bai, Wenjia
    O'Regan, Declan P.
    Qin, Chen
    COMPUTER VISION - ECCV 2024, PT XV, 2025, 15073 : 478 - 496
  • [4] Image classification based on data fusion
    Zhao, ZG
    Chen, XQ
    PROCEEDINGS OF THE 3RD WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-5, 2000, : 2675 - 2678
  • [5] Weather Visibility Prediction Based on Multimodal Fusion
    Zhang, Chuang
    Wu, Ming
    Chen, Jinyu
    Chen, Kaiyan
    Zhang, Chi
    Xie, Chao
    Huang, Bin
    He, Zichen
    IEEE ACCESS, 2019, 7 : 74776 - 74786
  • [6] Multimodal Taste Classification of Chinese Recipe Based on Image and Text Fusion
    Chen Yawei
    Cao Min
    Gao Wenjing
    2020 5TH INTERNATIONAL CONFERENCE ON SMART GRID AND ELECTRICAL AUTOMATION (ICSGEA 2020), 2020, : 203 - 208
  • [7] Neutrosophic-CNN-based image and text fusion for multimodal classification
    Wajid, Mohd Anas
    Zafar, Aasim
    Terashima-Marin, Hugo
    Saif Wajid, Mohammad
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 45 (01) : 1039 - 1055
  • [8] MCFT: Multimodal Contrastive Fusion Transformer for Classification of Hyperspectral Image and LiDAR Data
    Feng, Yining
    Jin, Jiarui
    Yin, Yin
    Song, Chuanming
    Wang, Xianghai
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62
  • [9] MDFNet: application of multimodal fusion method based on skin image and clinical data to skin cancer classification
    Chen, Qian
    Li, Min
    Chen, Chen
    Zhou, Panyun
    Lv, Xiaoyi
    Chen, Cheng
    JOURNAL OF CANCER RESEARCH AND CLINICAL ONCOLOGY, 2023, 149 (07) : 3287 - 3299