Towards Automated Lithology Classification in NATM Tunnel: A Data-Driven Solution for Multi-dimensional Imbalanced Data

Cited by: 0
Authors
Li, Yang [1, 2]
Chen, Jiayao [1, 2, 4]
Fang, Qian [1, 2]
Zhang, Dingli [1, 2]
Huang, Wengui [3]
Affiliations
[1] Beijing Jiaotong Univ, Sch Civil Engn, Beijing 100044, Peoples R China
[2] Beijing Jiaotong Univ, Key Lab Urban Underground Engn, Minist Educ, Beijing 100044, Peoples R China
[3] Teesside Univ, Sch Comp Engn & Digital Technol, Middlesbrough TS1 3BA, England
[4] East China Jiaotong Univ, State Key Lab Performance Monitoring & Protecting, Nanchang, Jiangxi, Peoples R China
Keywords
New Austrian tunneling method; Measurement-while-drilling; Lithology classification; Machine learning; Multi-dimensional imbalanced data; ROCK STRENGTH PARAMETERS; RANDOM FORESTS; PREDICTION; SYSTEM; RECOGNITION; TECHNOLOGY; TESTS; MODEL; INDEX;
DOI
10.1007/s00603-024-04287-6
Chinese Library Classification number
P5 [Geology]
Discipline classification code
0709; 081803
Abstract
To fully grasp the lithology of unexcavated tunnel geology, a correlation database using measurement-while-drilling (MWD) information from the NATM tunnel excavation process was established, resulting in a multi-dimensional imbalanced dataset consisting of 7216 entries. By integrating borehole imaging and expert interpretation, drilling parameters were aligned with lithology data. A hybrid ensemble model, combining adaptive synthetic sampling (ADASYN), grid search (GS) hyperparameter optimization, and eXtreme gradient boosting (XGBoost), is proposed for intelligent lithology classification. Various machine learning models, incorporating hyperparameter optimization and oversampling algorithms, were employed, cumulatively generating 12 classifiers for Macro F1 performance comparison. Comprehensive analysis showed that the GS-ADASYN-XGBoost algorithm outperformed the other hybrid models in classifying different lithologies. Water pressure was identified as the key feature influencing lithology classification, followed by water flow. With the oversampling proportion set to 0.2, the ADASYN method effectively optimized the data imbalance ratio, significantly enhancing classifier performance. This improvement was most notable for the least represented lithology category, chlorite, with an increase of 1.27 times compared to no oversampling. The proposed model provides valuable insights for geological interpretation of the tunnel face.
A hybrid GS-ADASYN-XGBoost model is proposed for classifying lithologies.
A database with 7216 MWD entries from NATM tunnel excavation is established.
Borehole imaging and expert interpretation align drilling parameters with lithology.
Multi-dimensional data imbalance is effectively optimized by ADASYN.
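The abstract ranks classifiers by Macro F1, which averages the per-class F1 scores with equal weight so that a rare class such as chlorite counts as much as a common one. The following is a minimal pure-Python sketch of that metric, not the authors' code; the class name "granite" and the sample counts are hypothetical and chosen only to show how an imbalanced dataset can yield high accuracy but low Macro F1.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all observed labels."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced sample: 8 majority-class and 2 minority-class labels.
y_true = ["granite"] * 8 + ["chlorite"] * 2
# A classifier that always predicts the majority class.
y_pred = ["granite"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
print(f"accuracy = {accuracy:.3f}, macro F1 = {score:.3f}")
```

In this toy case the always-majority classifier never identifies the minority class, so its F1 for that class is zero and the Macro F1 falls well below the accuracy. That gap is the reason a class-balanced metric suits an imbalanced lithology dataset.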
Pages: 2349-2366
Number of pages: 18