Combining Multi-Source Data and Machine Learning Approaches to Predict Winter Wheat Yield in the Conterminous United States

Cited by: 89
Authors
Wang, Yumiao [1, 2]
Zhang, Zhou [1]
Feng, Luwei [1, 2]
Du, Qingyun [2, 3, 4, 5]
Runge, Troy [1]
Affiliations
[1] Univ Wisconsin, Biol Syst Engn, Madison, WI 53706 USA
[2] Wuhan Univ, Sch Resources & Environm Sci, Wuhan 430079, Peoples R China
[3] Wuhan Univ, Key Lab GIS, Minist Educ, Wuhan 430079, Peoples R China
[4] Wuhan Univ, Key Lab Digital Mapping & Land Informat Applicat, Natl Adm Surveying Mapping & Geoinformat, Wuhan 430079, Peoples R China
[5] Wuhan Univ, Collaborat Innovat Ctr Geospatial Technol, Wuhan 430079, Peoples R China
Funding
National Institute of Food and Agriculture (USA);
Keywords
Winter wheat; yield prediction; machine learning; multi-source data; CONUS; MODIS-NDVI; VEGETATION INDEXES; NEURAL-NETWORKS; MAIZE YIELD; MODEL; SATELLITE; TEMPERATURE; PERFORMANCE; SIMULATION; RESPONSES;
DOI
10.3390/rs12081232
Chinese Library Classification number
X [Environmental Science, Safety Science];
Subject classification code
08 ; 0830 ;
Abstract
Winter wheat (Triticum aestivum L.) is one of the most important cereal crops, supplying essential food for the world population. Because the United States (U.S.) is a major producer and exporter of wheat to the world market, accurate and timely forecasting of U.S. wheat yield is fundamental to national crop management as well as global food security. Previous studies have mainly focused on developing empirical models using only satellite remote sensing images, while other yield determinants have not yet been adequately explored. In addition, these models are based on traditional statistical regression algorithms, while more advanced machine learning approaches have not been explored. To address these issues, this study used advanced machine learning algorithms to establish within-season yield prediction models for winter wheat using multi-source data. Specifically, yield driving factors were extracted from four different data sources: satellite images, climate data, soil maps, and historical yield records. Subsequently, two linear regression methods, ordinary least squares (OLS) and least absolute shrinkage and selection operator (LASSO), and four well-known machine learning methods, support vector machine (SVM), random forest (RF), Adaptive Boosting (AdaBoost), and deep neural network (DNN), were applied and compared for estimating the county-level winter wheat yield in the Conterminous United States (CONUS) within the growing season. Our models were trained on data from 2008 to 2016 and evaluated on data from 2017 and 2018. The results demonstrated that the machine learning approaches performed better than the linear regression models, with the best performance achieved by the AdaBoost model (R² = 0.86, RMSE = 0.51 t/ha, MAE = 0.39 t/ha).
Additionally, the results showed that combining data from multiple sources outperformed single-source satellite data, with the highest accuracy obtained when all four data sources were considered in the model development. Finally, the prediction accuracy was also evaluated against timeliness within the growing season: reliable predictions (R² > 0.84) could be achieved 2.5 months before harvest when the multi-source data were combined.
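The evaluation protocol described in the abstract (training on earlier years, testing on held-out later years, and reporting R², RMSE, and MAE) can be illustrated with a minimal sketch. This is not the authors' code: the record layout, the field names `year` and `yield_t_ha`, the helper names, and all numeric values below are hypothetical and chosen only to show how the three metrics and a year-based split might be computed.

```python
import math


def split_by_year(records, train_years, test_years):
    """Temporal hold-out: each record goes to train or test by its year."""
    train = [r for r in records if r["year"] in train_years]
    test = [r for r in records if r["year"] in test_years]
    return train, test


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the target (here t/ha)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error, in the units of the target (here t/ha)."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


# Made-up county-year records (yield in t/ha); NOT data from the paper.
records = [
    {"year": 2015, "yield_t_ha": 3.0},
    {"year": 2016, "yield_t_ha": 3.4},
    {"year": 2017, "yield_t_ha": 2.0},
    {"year": 2017, "yield_t_ha": 4.0},
    {"year": 2018, "yield_t_ha": 3.0},
    {"year": 2018, "yield_t_ha": 5.0},
]

# Train on 2008-2016, evaluate on 2017-2018, mirroring the split in the abstract.
train, test = split_by_year(records, range(2008, 2017), {2017, 2018})

y_true = [r["yield_t_ha"] for r in test]
y_pred = [2.5, 3.5, 3.0, 4.5]  # hypothetical model outputs for the test records

print(f"R2={r2_score(y_true, y_pred):.2f}  "
      f"RMSE={rmse(y_true, y_pred):.2f} t/ha  "
      f"MAE={mae(y_true, y_pred):.2f} t/ha")
```

Splitting by whole years, rather than shuffling county-year records at random, keeps the test years entirely unseen during training, which matches the forecasting setting the abstract describes.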
Pages: 21