CarbonNet: Enterprise-Level Carbon Emission Prediction with Large-Scale Datasets

Cited by: 0
Authors
Tang, Jinghua [1]
Fang, Nan [1]
Yang, Lanqing [2]
Pei, Yuqiao [2]
Wang, Ran [2]
Ding, Dian [2]
Lu, Yu [2]
Xue, Guangtao [2]
Affiliations
[1] Shanghai Voicecomm Technol Co Ltd, Shanghai, Peoples R China
[2] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
Keywords
Enterprise-level carbon emissions prediction; Factor analysis; Big data mining; CO2; EMISSIONS; PROJECTIONS; CHINA;
DOI
10.1007/978-981-97-5615-5_33
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Precise prediction of carbon emissions is crucial for combating global climate change and fostering sustainable development. Conventional carbon emission forecasting usually relies on limited national- or regional-level records, which are coarse-grained and compromise accuracy. Furthermore, data collection involves lengthy statistical cycles and high costs, so forecasts cannot provide timely feedback. Existing methods also target specific fields, which makes it hard to construct universal models. To overcome these challenges, we propose CarbonNet, a novel firm-level carbon emission prediction scheme. To build a large-scale firm-level dataset, we crawled and combined carbon emission data and reporting data (e.g., financial statements) for 3346 companies over 31 years, covering 688 data fields. A preprocessing scheme aggregates data that come from different sources, use different statistical intervals, and contain many outliers. A factor-analysis-based feature extraction scheme supports a generalized forecasting model for different types of companies. A machine learning scheme handles big data mining and long-term forecasting. We evaluated CarbonNet on real-world datasets. Results show that it achieves a median relative error of 0.25, outperforming others by 22%. The corresponding carbon emissions dataset has been made publicly available to advance related research.
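The abstract reports a median relative error but does not define it. As a minimal sketch, assuming the common definition (the median over all samples of |predicted - actual| / |actual|), the metric could be computed as follows. The function name and the sample values are hypothetical and are not taken from the paper or its dataset.

```python
import statistics


def median_relative_error(actual, predicted):
    """Median of |predicted - actual| / |actual| over all pairs.

    Assumed definition; the paper may compute the metric differently.
    Pairs whose actual value is zero are skipped to avoid division by zero.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    errors = [abs(p - a) / abs(a) for a, p in zip(actual, predicted) if a != 0]
    if not errors:
        raise ValueError("no non-zero actual values to evaluate")
    return statistics.median(errors)


# Hypothetical emissions for four firms (illustrative numbers only).
actual = [100.0, 200.0, 400.0, 800.0]
predicted = [110.0, 150.0, 500.0, 1200.0]
score = median_relative_error(actual, predicted)
```

Under this definition the median is less sensitive than the mean to a few firms with very large errors, which may matter for data that contain many outliers.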
Pages: 411-422
Number of pages: 12