Using Scalable Data Mining for Predicting Flight Delays

Cited by: 62
Authors
Belcastro, Loris [1]
Marozzo, Fabrizio [1]
Talia, Domenico [1]
Trunfio, Paolo [1]
Affiliations
[1] Univ Calabria, DIMES, Arcavacata Di Rende, CS, Italy
Keywords
Design; Algorithms; Performance; Cloud computing; big data; flight delay; scalability; open data; PROPAGATION;
DOI
10.1145/2888402
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Flight delays are frequent all over the world (about 20% of airline flights arrive more than 15 min late) and are estimated to cost billions of dollars annually. This makes the prediction of flight delays a primary issue for airlines and travelers. The main goal of this work is to implement a predictor of the arrival delay of a scheduled flight due to weather conditions. The predicted arrival delay takes into account both flight information (origin airport, destination airport, scheduled departure and arrival times) and weather conditions at the origin and destination airports according to the flight timetable. Airline flight and weather observation datasets were analyzed and mined using parallel algorithms implemented as MapReduce programs executed on a Cloud platform. The results show high accuracy in predicting delays above a given threshold. For instance, with a delay threshold of 15 min, we achieve an accuracy of 74.2% and a recall of 71.8% on delayed flights, while with a threshold of 60 min, the accuracy is 85.8% and the delay recall is 86.9%. Furthermore, the experimental results demonstrate the scalability of the predictor, achieved by performing data preparation and mining tasks as MapReduce applications on the Cloud.
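The two figures quoted in the abstract, accuracy and recall on delayed flights at a given threshold, can be illustrated with a minimal Python sketch. It is not the authors' MapReduce implementation: the function names, the sample delays, and the predicted labels are all hypothetical, and it assumes a flight counts as delayed when its arrival delay is at or above the threshold.

```python
from typing import Sequence, Tuple


def is_delayed(arrival_delay_min: float, threshold_min: float) -> bool:
    # Assumption: "delayed" means the arrival delay is at or above the threshold.
    return arrival_delay_min >= threshold_min


def accuracy_and_delay_recall(
    actual: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[float, float]:
    """Return (accuracy, recall on the delayed class) for boolean labels."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    # Accuracy: share of flights whose predicted label matches the actual one.
    correct = sum(a == p for a, p in zip(actual, predicted))
    # Recall: share of actually delayed flights that were predicted as delayed.
    true_positives = sum(a and p for a, p in zip(actual, predicted))
    actual_positives = sum(actual)
    accuracy = correct / len(actual)
    recall = true_positives / actual_positives if actual_positives else 0.0
    return accuracy, recall


# Made-up arrival delays in minutes, for illustration only.
delays = [5, 20, 70, 0, 16, 30]
actual = [is_delayed(d, 15) for d in delays]
# Hypothetical classifier output for the same six flights.
predicted = [False, True, False, False, True, True]
accuracy, recall = accuracy_and_delay_recall(actual, predicted)
```

Raising the threshold (for example from 15 to 60 min) changes which flights fall in the delayed class, which is why the abstract reports a separate accuracy and recall pair for each threshold.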
Pages: 20