Product Length Predictions with Machine Learning: An Integrated Approach Using Extreme Gradient Boosting

Authors
Thakur A. [1]
Kumar A. [1]
Mishra S.K. [1]
Behera S.K. [2]
Sethi J. [3]
Sahu S.S. [4]
Swain S.K. [1]
Affiliations
[1] Department of Electrical and Electronics Engineering, Birla Institute of Technology Mesra, Ranchi
[2] Department of Electronics and Telecommunication Engineering, DRIEMS Autonomous Engineering College, Tangi, Odisha, Cuttack
[3] Department of Electronics and Instrumentation Engineering, Odisha University of Technology and Research, Techno Campus, Ghatikia, Odisha, Bhubaneswar
[4] Department of Electronics and Communication Engineering, Birla Institute of Technology Mesra, Jharkhand, Ranchi
Keywords
Interquartile Range (IQR); Natural Language Processing (NLP); Predictive Modeling; Term Frequency-Inverse Document Frequency (TF-IDF); XGBoost
DOI
10.1007/s42979-024-02999-8
Abstract
This study introduces a novel machine learning approach for predicting product length from heterogeneous data (numeric, textual, and categorical), extracting information from each data type to improve prediction accuracy. The approach combines text vectorization using Term Frequency-Inverse Document Frequency (TF-IDF), target encoding of categorical data, and the eXtreme Gradient Boosting (XGBoost) algorithm. The method begins with thorough data preparation, in which outliers are removed and missing values are filled in. Text from product titles, descriptions, and bullet points is then converted into numerical form with TF-IDF, which captures the weighted frequency of words as feature vectors to which the learning algorithm can be applied. Training uses RandomizedSearchCV to optimize the XGBoost model's hyperparameters, with the TF-IDF vectors and target-encoded product type IDs as inputs. This allows the model to handle variability and uncertainty in product length prediction. The techniques used make the method adaptable and enable accurate prediction of product length in e-commerce, which can support inventory management across diverse products. The predictions can also help optimize supply chain operations, improve demand forecasting across a variety of products, and aid strategic planning for procurement and stock levels. © The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd. 2024.
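The preprocessing steps named in the abstract and keywords (IQR-based outlier removal, TF-IDF vectorization, and target encoding) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the toy catalogue, the field names (`title`, `type_id`, `length`), the whitespace tokenizer, the `ln(N/df) + 1` IDF variant, and the smoothing constant are all assumptions made for illustration. The resulting features would then be passed to a gradient-boosted regressor such as XGBoost, tuned for example with RandomizedSearchCV; that step is omitted here.

```python
import math
import statistics
from collections import Counter, defaultdict

# Hypothetical toy catalogue; the last row carries a mis-entered length.
rows = [
    {"title": "steel ruler 30 cm", "type_id": 1, "length": 30.0},
    {"title": "steel ruler 15 cm", "type_id": 1, "length": 15.0},
    {"title": "steel ruler 45 cm", "type_id": 1, "length": 45.0},
    {"title": "cotton bath towel", "type_id": 2, "length": 140.0},
    {"title": "cotton hand towel", "type_id": 2, "length": 60.0},
    {"title": "cotton beach towel", "type_id": 2, "length": 120.0},
    {"title": "usb cable", "type_id": 3, "length": 100.0},
    {"title": "usb cable", "type_id": 3, "length": 9000.0},
]


def iqr_filter(rows, key, k=1.5):
    """Keep rows whose `key` value lies within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles([r[key] for r in rows], n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [r for r in rows if low <= r[key] <= high]


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Weight = (term count / document length) * (ln(N / df) + 1),
    one common TF-IDF variant (assumed; the paper's exact variant is not stated).
    """
    tokenized = [d.lower().split() for d in docs]
    doc_freq = Counter(t for tokens in tokenized for t in set(tokens))
    n_docs = len(docs)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            t: (c / len(tokens)) * (math.log(n_docs / doc_freq[t]) + 1.0)
            for t, c in counts.items()
        })
    return vectors


def target_encode(rows, cat_key, target_key, smoothing=1.0):
    """Map each category to its target mean, shrunk toward the global mean."""
    global_mean = statistics.fmean(r[target_key] for r in rows)
    sums, counts = defaultdict(float), Counter()
    for r in rows:
        sums[r[cat_key]] += r[target_key]
        counts[r[cat_key]] += 1
    return {
        c: (sums[c] + smoothing * global_mean) / (counts[c] + smoothing)
        for c in counts
    }


kept = iqr_filter(rows, "length")                    # drop length outliers
vectors = tfidf([r["title"] for r in kept])          # sparse text features
encoding = target_encode(kept, "type_id", "length")  # numeric category feature
# One (text features, encoded product type) pair per remaining product.
features = [(vec, encoding[r["type_id"]]) for vec, r in zip(vectors, kept)]
```

In a real pipeline the target encoding should be fitted on the training split only, since it is computed from the value being predicted and would otherwise leak into validation data.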
Related papers
50 in total
  • [1] Extreme Learning Machine Enhanced Gradient Boosting for Credit Scoring
    Zou, Yao
    Gao, Changchun
    ALGORITHMS, 2022, 15 (05)
  • [2] Lithium-Ion Battery Estimation in Online Framework Using Extreme Gradient Boosting Machine Learning Approach
    Jafari, Sadiqa
    Shahbazi, Zeinab
    Byun, Yung-Cheol
    Lee, Sang-Joon
    MATHEMATICS, 2022, 10 (06)
  • [3] Time Series Analysis and Forecasting of Solar Generation in Spain Using eXtreme Gradient Boosting: A Machine Learning Approach
    Saigustia, Candra
    Pijarski, Pawel
    ENERGIES, 2023, 16 (22)
  • [4] Using extreme gradient boosting (XGBoost) machine learning to predict construction cost overruns
    Coffie, G. H.
    Cudjoe, S. K. F.
    INTERNATIONAL JOURNAL OF CONSTRUCTION MANAGEMENT, 2024, 24 (16) : 1742 - 1750
  • [5] Machine Learning Diffuse Optical Tomography Using Extreme Gradient Boosting and Genetic Programming
    Hauptman, Ami
    Balasubramaniam, Ganesh M.
    Arnon, Shlomi
    BIOENGINEERING-BASEL, 2023, 10 (03)
  • [6] Extreme Gradient Boosting-Based Machine Learning Approach for Green Building Cost Prediction
    Alshboul, Odey
    Shehadeh, Ali
    Almasabha, Ghassan
    Almuflih, Ali Saeed
    SUSTAINABILITY, 2022, 14 (11)
  • [7] Hybrid machine learning approach for construction cost estimation: an evaluation of extreme gradient boosting model
    Ali Z.H.
    Burhan A.M.
    Asian Journal of Civil Engineering, 2023, 24 (7) : 2427 - 2442
  • [8] Mortality predictors in patients with COVID-19 pneumonia: a machine learning approach using eXtreme Gradient Boosting model
    Casillas, N.
    Torres, A. M.
    Moret, M.
    Gómez, A.
    Rius-Peris, J. M.
    Mateo, J.
    INTERNAL AND EMERGENCY MEDICINE, 2022, 17 (07) : 1929 - 1939
  • [9] On using eXtreme Gradient Boosting (XGBoost) Machine Learning algorithm for Home Network Traffic Classification
    Cherif, Iyad Lahsen
    Kortebi, Abdesselem
    2019 WIRELESS DAYS (WD), 2019