Prediction of moisture content in cigar tobacco leaves during the drying process based on random forest feature selection

Authors
Xing Z. [1]
Zhang K. [1]
Liu X. [2]
Ma M. [3]
Liu B. [1]
Ding S. [1]
Shi Y. [4]
An J. [1]
Gao H. [1]
Shi X. [1]
Affiliations
[1] College of Tobacco Science, Henan Agricultural University, National Tobacco Cultivation and Physiology and Biochemistry Research Center, Key Laboratory for Tobacco Cultivation of Tobacco Industry, Zhengzhou
[2] Huaihua Tobacco Company of Hunan Province, Huaihua
[3] Zhengzhou Tobacco Research Institute, CNTC, Zhengzhou
[4] Cigar Research Institute, Anhui Zhongyan Industrial Co., Ltd., Anhui
Source
Nongye Gongcheng Xuebao/Transactions of the Chinese Society of Agricultural Engineering | 2024, Vol. 40, No. 07
Keywords
cigar leaf; feature selection; image processing; moisture content; random forest
DOI
10.11975/j.issn.1002-6819.202311082
Abstract
Airing is one of the most important stages in the production of cigar leaves, and the appearance quality developed during airing serves as an indicator of intrinsic quality. For proper browning, the temperature and humidity inside the drying chamber can be adjusted in real time according to the moisture content of the leaves. At present, however, leaf moisture content is usually judged from manual experience, which is subjective and inaccurate. Computer vision has been used in recent years to assess the quality of agricultural products because of its simplicity and flexibility. The random forest (RF) model, a bagging-based ensemble machine learning method, handles high-dimensional data efficiently, with high precision and fast training and prediction. In this study, prediction models for the moisture content of cigar leaves were established using RF machine learning, with the cigar tobacco variety "Yunxue-2" as the research object. First, images of cigar leaves were collected during the airing process so that the apparent features that determine moisture content could be extracted. Color thresholding and OTSU segmentation were combined to obtain the leaf region of interest (ROI). Features were then extracted in four dimensions: color, contour, texture, and location. Correlation coefficient analysis was used to eliminate highly correlated features within each feature dimension, in order to prevent "dimension explosion". Next, the out-of-bag (OOB) data were used to determine the average decrease in the coefficient of determination (Decr2), and the image features were ranked by this importance measure. The prediction accuracy and runtime of the RF model were compared under different numbers of features.
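The segmentation and correlation-filtering steps described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it applies Otsu's method to gray levels only (the study combined it with a color threshold), and the function names, the feature names, and the 0.9 correlation cutoff are all hypothetical.

```python
from statistics import mean


def otsu_threshold(gray_values, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    hist = [0] * levels
    for v in gray_values:
        hist[v] += 1
    total = len(gray_values)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]            # number of pixels in the lower class
        sum0 += t * hist[t]      # sum of gray levels in the lower class
        w1 = total - w0          # number of pixels in the upper class
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0  # a constant feature carries no correlation information
    return cov / (sx * sy)


def drop_correlated(features, threshold=0.9):
    """Greedily keep a feature only if |r| with every kept feature <= threshold.

    `features` maps a feature name to its list of per-image values.
    The 0.9 default is an assumed cutoff, not a value from the study.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept
```

In use, `otsu_threshold` would be applied to the gray levels of one image and pixels above (or below) the returned level taken as the leaf mask, while `drop_correlated` would be run separately on the features of each dimension (color, contour, texture, location), matching the within-dimension filtering described above.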
Seven image features closely related to the moisture content of cigar tobacco leaves were selected as the optimal feature subset. The original and optimal feature sets were then used to evaluate the RF, support vector regression (SVR), and back propagation neural network (BPNN) models, with a genetic algorithm (GA) optimizing the hyperparameters of each model. Combining the three models with the two feature sets gave six model-feature combination schemes. Five-fold cross-validation was used to compare their prediction accuracy and generalization, and the six schemes were then verified on a test dataset collected during drying. The results demonstrated that the combination of color, contour, texture, and location features of cigar tobacco leaf images effectively characterized the changes in appearance and morphology under moisture loss. In five-fold cross-validation, SVR and BPNN performed better with the optimal image feature subset than with the original feature set, whereas RF performed better on the original image feature set, indicating that RF can cope with the information redundancy of high-dimensional data. The best performance on the test set was achieved by the GA-SVR model combined with the optimal image feature subset, with R² and MSE values of 0.980 and 0.001, respectively, and the shortest runtime (0.128 s). In summary, the image features of cigar tobacco leaves can be used to accurately predict the moisture content of different parts throughout the entire drying process. The findings can also provide a theoretical basis for the intelligent drying of cigar tobacco leaves. © 2024 Chinese Society of Agricultural Engineering. All rights reserved.
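The evaluation protocol above (five-fold cross-validation scored by R² and MSE) can be illustrated with a short, dependency-free sketch. Everything here is an assumption for illustration: the helper names are hypothetical, folds are contiguous and unshuffled to keep the example deterministic, and a one-feature least-squares line stands in for the RF, SVR, and BPNN models of the study.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        val = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, val
        start = stop


def cross_validate(fit, X, y, k=5):
    """Return one (R2, MSE) pair per fold.

    `fit(train_X, train_y)` must return a function mapping one sample
    to a prediction; it is a placeholder for any regression model.
    """
    scores = []
    for train, val in k_fold_indices(len(y), k):
        predict = fit([X[i] for i in train], [y[i] for i in train])
        y_val = [y[i] for i in val]
        y_hat = [predict(X[i]) for i in val]
        scores.append((r2_score(y_val, y_hat), mse(y_val, y_hat)))
    return scores


def fit_line(xs, ys):
    """Toy stand-in model: ordinary least squares on a single feature."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (v - my) for x, v in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return lambda x: slope * x + intercept
```

Calling `cross_validate(fit_line, X, y, k=5)` returns five (R², MSE) pairs; averaging them per scheme gives the kind of comparison described above. A real pipeline would normally shuffle the samples before splitting and would keep the final test set entirely outside the cross-validation loop.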
Pages: 343-354
Number of pages: 11