CGAN-based synthetic multivariate time-series generation: a solution to data scarcity in solar flare forecasting

Cited by: 5
Authors
Chen, Yang [1]
Kempton, Dustin J. [1]
Ahmadzadeh, Azim [1]
Wen, Junzhi [1]
Ji, Anli [1]
Angryk, Rafal A. [1]
Affiliations
[1] Georgia State Univ, Atlanta, GA 30302 USA
Source
NEURAL COMPUTING & APPLICATIONS | 2022 / Vol. 34 / Issue 16
Funding
National Science Foundation (USA);
Keywords
Multivariate time series; Class imbalance; Generative adversarial network; Flare forecasting;
DOI
10.1007/s00521-022-07361-8
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
One of the major bottlenecks in refining supervised algorithms is data scarcity. This can have a number of causes, often rooted in extremely expensive and lengthy data collection processes. In natural domains such as heliophysics, it may take decades to collect samples large enough for machine learning purposes. Inspired by the massive success of generative adversarial networks (GANs) in generating synthetic images, in this study we employed the conditional GAN (CGAN) on a recently released benchmark dataset tailored for solar flare forecasting. Our goal is to generate synthetic multivariate time-series data that (1) are statistically similar to the real data and (2) improve the performance of flare prediction when used to remedy the scarcity of strong flares. To evaluate the generated samples, first, we used the Kullback-Leibler divergence and adversarial accuracy measures to quantify the similarity between the real and synthetic data in terms of their descriptive statistics. Second, we evaluated the impact of the generated samples by training a predictive model on their descriptive statistics, which resulted in a significant improvement (over 1100% in TSS and 350% in HSS). Third, we used the generated time series to examine their high-dimensional contribution to mitigating the scarcity of strong flares, where we also observed significant improvements in TSS (4%, 7%, and 31%) and HSS (75%, 35%, and 72%) compared to oversampling, undersampling, and synthetic oversampling methods, respectively. We believe our findings can open new doors toward more robust and accurate flare forecasting models.
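The abstract reports improvements in TSS and HSS without defining them in this record. As a rough illustration only, the sketch below computes the true skill statistic and a Heidke skill score from the counts of a binary confusion matrix. The function names `tss` and `hss` are hypothetical, and the HSS form used here (often called HSS2 in the flare forecasting literature) is an assumption; the paper itself may use a different variant.

```python
def tss(tp: int, fn: int, fp: int, tn: int) -> float:
    """True skill statistic: recall minus false alarm rate."""
    return tp / (tp + fn) - fp / (fp + tn)


def hss(tp: int, fn: int, fp: int, tn: int) -> float:
    """Heidke skill score (HSS2 form, assumed here): skill relative to chance."""
    numerator = 2 * (tp * tn - fn * fp)
    denominator = (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts for an imbalanced problem: 10 strong flares, 90 non-flares.
    counts = dict(tp=8, fn=2, fp=10, tn=80)
    print(f"TSS = {tss(**counts):.3f}")
    print(f"HSS = {hss(**counts):.3f}")
```

Both scores equal 1 for a perfect forecast. TSS does not depend on the class-imbalance ratio, which is one reason it is commonly reported alongside HSS for rare-event problems such as strong flares.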
Pages: 13339-13353
Number of pages: 15