Product Length Predictions with Machine Learning: An Integrated Approach Using Extreme Gradient Boosting

Authors
Thakur A. [1]
Kumar A. [1]
Mishra S.K. [1]
Behera S.K. [2]
Sethi J. [3]
Sahu S.S. [4]
Swain S.K. [1]
Affiliations
[1] Department of Electrical and Electronics Engineering, Birla Institute of Technology Mesra, Ranchi
[2] Department of Electronics and Telecommunication Engineering, DRIEMS Autonomous Engineering College, Tangi, Odisha, Cuttack
[3] Department of Electronics and Instrumentation Engineering, Odisha University of Technology and Research, Techno Campus, Ghatikia, Odisha, Bhubaneswar
[4] Department of Electronics and Communication Engineering, Birla Institute of Technology Mesra, Jharkhand, Ranchi
Keywords
Interquartile Range (IQR); Natural Language Processing (NLP); Predictive Modeling; Term Frequency-Inverse Document Frequency (TF-IDF); XGBoost
DOI
10.1007/s42979-024-02999-8
Abstract
The study introduces a novel machine learning approach for predicting product length from heterogeneous data (numeric, textual, and categorical), extracting information from each data type to improve prediction accuracy. The approach combines text vectorization using Term Frequency-Inverse Document Frequency (TF-IDF), target encoding of categorical features, and the eXtreme Gradient Boosting (XGBoost) algorithm. The method begins with thorough data preparation, in which outliers are removed and missing values are filled in. Text from product titles, descriptions, and bullet points is then converted into TF-IDF feature vectors, which capture the weighted frequency of words and give the algorithm numerical input to work with. Training uses RandomizedSearchCV to optimize the XGBoost model's hyperparameters, with TF-IDF vectors and target-encoded product type IDs as inputs. This allows the model to handle variability and uncertainty in product length predictions. These techniques make the method adaptable and enable accurate prediction of product length in e-commerce, which can help with inventory management across diverse products. The approach could also help optimize supply chain operations, improve demand forecasting across a variety of products, and aid strategic planning for procurement and stock levels. © The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd. 2024.
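The preprocessing steps named in the abstract (IQR-based outlier removal, target encoding of product type IDs, and TF-IDF vectorization of text) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which column the IQR rule is applied to, how target encoding is smoothed, or how TF-IDF is configured. The field names (`title`, `type_id`, `length`), the toy rows, the 1.5 multiplier, and the smoothing constant are all assumptions made for illustration. The sketch uses only the Python standard library and stops before model training; in the described method the resulting features would be passed to an XGBoost regressor tuned with RandomizedSearchCV.

```python
import math
from collections import Counter, defaultdict
from statistics import mean, quantiles


def iqr_filter(rows, key="length", k=1.5):
    """Keep rows whose value lies within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles([r[key] for r in rows], n=4)
    spread = k * (q3 - q1)
    return [r for r in rows if q1 - spread <= r[key] <= q3 + spread]


def target_encode(rows, cat_key="type_id", target_key="length", smoothing=1.0):
    """Map each category to a mean of the target, shrunk toward the global mean."""
    global_mean = mean(r[target_key] for r in rows)
    sums, counts = defaultdict(float), Counter()
    for r in rows:
        sums[r[cat_key]] += r[target_key]
        counts[r[cat_key]] += 1
    return {
        c: (sums[c] + smoothing * global_mean) / (counts[c] + smoothing)
        for c in counts
    }


def tfidf(docs):
    """Return one {term: weight} dict per document, using a smoothed IDF."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(t for tokens in tokenized for t in set(tokens))
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append(
            {t: (cnt / len(tokens)) * idf[t] for t, cnt in term_freq.items()}
        )
    return vectors


# Hypothetical toy data; the last row is a deliberate outlier.
rows = [
    {"title": "steel water bottle", "type_id": "A", "length": 10},
    {"title": "glass water bottle", "type_id": "A", "length": 12},
    {"title": "travel mug steel", "type_id": "A", "length": 11},
    {"title": "cotton bath towel", "type_id": "B", "length": 13},
    {"title": "cotton hand towel", "type_id": "B", "length": 12},
    {"title": "small hand towel", "type_id": "B", "length": 11},
    {"title": "large bath towel", "type_id": "B", "length": 14},
    {"title": "steel bottle bulk pack", "type_id": "A", "length": 500},
]

clean = iqr_filter(rows)                      # drop outliers on the target
encoding = target_encode(clean)               # type_id -> smoothed mean length
vectors = tfidf([r["title"] for r in clean])  # sparse TF-IDF per title

for row, vec in zip(clean, vectors):
    print(row["type_id"], round(encoding[row["type_id"]], 2), sorted(vec))
```

In a real pipeline the target encoding would be fitted on training folds only, since computing it on the full dataset leaks the target into the features.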