On the training sample size and classification performance: An experimental evaluation in seismic facies classification

Cited by: 1
Authors
Babikir, Ismailalwali [1]
Elsaadany, Mohamed [1]
Sajid, Muhammad [2]
Laudon, Carolan [3]
Affiliations
[1] Univ Teknol PETRONAS, Ctr Subsurface Imaging, Seri Iskandar, Malaysia
[2] PETRONAS, Kuala Lumpur, Malaysia
[3] Geophys Insights, Houston, TX USA
Keywords
Seismic facies classification; Seismic attributes; Supervised machine learning; Model performance; Training sample size
DOI
10.1016/j.geoen.2023.211809
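Chinese Library Classification (CLC)
TE [Petroleum and natural gas industry]; TK [Energy and power engineering]
Subject classification codes
0807; 0820
Abstract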
Machine learning algorithms (MLAs) perform better when enough high-quality training data is provided. However, training data is often scarce in seismic facies classification and in many other supervised learning applications. Labeling data for seismic facies classification is time-consuming and requires considerable effort from domain experts. This study investigates the effect of training data size on the performance of three popular supervised MLAs used for seismic facies classification. We labeled slices from two seismic datasets that differ in geologic environment and classification complexity. The AN Field in the Malay Basin represents a simple classification problem with three classes, whereas a more complex six-class classification is defined for the Dangerous Grounds (DG) dataset offshore Sabah. The labeled data were repeatedly reduced by half, resulting in eight training subsets of varying sizes. We trained and evaluated support vector machine (SVM), random forest (RF), and neural network (NN) models using a 10-fold cross-validation (CV) procedure. Performance metrics were computed to study how performance changes with training data size. The experimental results show that, for the DG dataset, where the classification is complex because of the heterogeneous geology and the larger number of classes, the larger the training subset, the better the classification performance. For the simple classification scenario of the AN dataset, however, the classifiers reached a performance plateau when trained on limited samples. We found that the NN model is the best performer on large datasets. The RF classifier performed well on both datasets and proved robust when trained on limited samples of the DG data. The SVM performed best where there was a clear margin of separation between the defined classes (the AN data). In contrast, it performed poorly on the DG data and showed a performance decline on the larger AN subsets.
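The evaluation protocol summarized in the abstract (halve the labeled data repeatedly to get eight subsets, then score each subset with 10-fold cross-validation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say whether the subsets are nested or how samples are drawn, so random nested halving is assumed here. The function names (`halving_subsets`, `kfold_splits`, `cross_val_accuracy`, `majority_class`) are hypothetical, the data are synthetic, and a trivial majority-class predictor stands in for the SVM, RF, and NN models.

```python
import random
from collections import Counter


def halving_subsets(indices, n_subsets=8, seed=0):
    """Return nested subsets: the shuffled full set, then half of each previous one.

    Assumption: random, nested halving (the abstract does not specify this).
    """
    rng = random.Random(seed)
    current = list(indices)
    rng.shuffle(current)
    subsets = []
    for _ in range(n_subsets):
        subsets.append(list(current))
        current = current[: len(current) // 2]  # keep half of the previous subset
    return subsets


def kfold_splits(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous folds of near-equal size."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def cross_val_accuracy(fit_predict, X, y, k=10):
    """Mean accuracy of fit_predict(X_train, y_train, X_test) over k folds."""
    scores = []
    for train_idx, test_idx in kfold_splits(len(X), k):
        preds = fit_predict(
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            [X[i] for i in test_idx],
        )
        correct = sum(p == y[i] for p, i in zip(preds, test_idx))
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


def majority_class(X_train, y_train, X_test):
    """Placeholder classifier: always predicts the most common training label."""
    label = Counter(y_train).most_common(1)[0][0]
    return [label] * len(X_test)


if __name__ == "__main__":
    # Synthetic stand-in data with three classes (not the AN or DG data).
    rng = random.Random(1)
    X = [[rng.random(), rng.random()] for _ in range(1280)]
    y = [rng.choice([0, 0, 1, 2]) for _ in range(1280)]
    for subset in halving_subsets(range(len(X))):
        acc = cross_val_accuracy(
            majority_class, [X[i] for i in subset], [y[i] for i in subset]
        )
        print(len(subset), round(acc, 3))
```

To compare real classifiers, any function with the same `fit_predict(X_train, y_train, X_test)` shape could replace `majority_class`, for example a thin wrapper around a library SVM, random forest, or neural network model.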
Pages: 14