The Predictability of Tree-based Machine Learning Algorithms in the Big Data Context

Cited by: 3
Authors
Qolipour, F. [1]
Ghasemzadeh, M. [1]
Mohammad-Karimi, N. [1]
Affiliation
[1] Yazd Univ, Dept Comp Engn, Yazd, Iran
Source
INTERNATIONAL JOURNAL OF ENGINEERING | 2021 / Vol. 34 / No. 01
Keywords
Stock Market; Big Data; Prediction; Machine Learning; Tree-based Algorithms; Ensemble Algorithms; PRICE; DIRECTION; RETURNS;
DOI
10.5829/ije.2021.34.01a.10
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
This research examines the predictive ability of ensemble and single tree-based machine learning algorithms during periods of recession and prosperity for two companies listed on the Tehran Stock Exchange, in the context of big data. Economic managers and the academic community need prediction models with higher accuracy and shorter execution time, and predicting a company's downturn in the stock market is particularly important. Machine learning algorithms must therefore be able to predict the sign of the stock return correctly on both downturn and boom days. Addressing this challenge improves the quality of stock purchases and, in turn, increases profitability. The proposed solution relies on tree-based machine learning algorithms in a big data setting. It uses the decision tree, a traditional single-tree learning algorithm, together with two modern ensemble tree-based algorithms, random forest and gradient boosted tree, to predict the stock return sign during recession and prosperity. The models were implemented with machine learning tools in the Python programming language and the PySpark library, which is designed for big data processing. The research data are the share records of two companies on the Tehran Stock Exchange. The results show that the ensemble learning algorithms performed better than the single-tree algorithm. Additionally, adding 23 technical features to the initial data and then applying the PCA feature reduction method gave the best performance among the configurations tested. The study also concludes that the initial data alone do not offer adequate resolution or generalizability, during either prosperity or recession.
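The abstract does not define the return-sign label precisely or list the 23 technical features. As a minimal sketch only, the Python below assumes the label is 1 when the next day's close is higher than the current close (0 otherwise) and uses a trailing simple moving average as a stand-in for one technical feature. The function names and the price series are hypothetical and are not taken from the paper, which reports a PySpark implementation.

```python
from typing import List, Optional


def return_sign_labels(closes: List[float]) -> List[int]:
    """Label day t with 1 if the close rises on day t+1, else 0.

    Assumed labelling rule; the last day has no next-day close, so it gets no label.
    """
    return [1 if nxt > cur else 0 for cur, nxt in zip(closes, closes[1:])]


def simple_moving_average(closes: List[float], window: int) -> List[Optional[float]]:
    """Trailing mean of the last `window` closes; None until enough history exists."""
    if window <= 0:
        raise ValueError("window must be positive")
    out: List[Optional[float]] = []
    for i in range(len(closes)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(closes[i + 1 - window : i + 1]) / window)
    return out


# Made-up closing prices for illustration, not data from the study.
closes = [100.0, 102.0, 101.0, 101.0, 105.0]
labels = return_sign_labels(closes)
sma3 = simple_moving_average(closes, 3)
```

Rows whose feature value is still `None`, and the final row without a label, would need to be dropped before training any classifier on such a table.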
Pages: 82 - 89
Page count: 8
Related papers
50 in total
  • [1] The predictability of tree-based machine learning algorithms in the big data context
    Qolipour F.
    Ghasemzadeh M.
    Mohammad-Karimi N.
    [J]. International Journal of Engineering, Transactions A: Basics, 2021, 34 (01): 82 - 89
  • [2] Embedding covariate adjustments in tree-based automated machine learning for biomedical big data analyses
    Manduchi, Elisabetta
    Fu, Weixuan
    Romano, Joseph D.
    Ruberto, Stefano
    Moore, Jason H.
    [J]. BMC BIOINFORMATICS, 2020, 21 (01)
  • [3] Embedding covariate adjustments in tree-based automated machine learning for biomedical big data analyses
    Elisabetta Manduchi
    Weixuan Fu
    Joseph D. Romano
    Stefano Ruberto
    Jason H. Moore
    [J]. BMC Bioinformatics, 21
  • [4] Scaling tree-based automated machine learning to biomedical big data with a feature set selector
    Le, Trang T.
    Fu, Weixuan
    Moore, Jason H.
    [J]. BIOINFORMATICS, 2020, 36 (01) : 250 - 256
  • [5] Determining the Happiness Class of Countries with Tree-Based Algorithms in Machine Learning
    Dogruel, Merve
    Kara, Selin Soner
    [J]. ACTA INFOLOGICA, 2023, 7 (02): 243 - 252
  • [6] Land subsidence modelling using tree-based machine learning algorithms
    Rahmati, Omid
    Falah, Fatemeh
    Naghibi, Seyed Amir
    Biggs, Trent
    Soltani, Milad
    Deo, Ravinesh C.
    Cerda, Artemi
    Mohammadi, Farnoush
    Dieu Tien Bui
    [J]. SCIENCE OF THE TOTAL ENVIRONMENT, 2019, 672 : 239 - 252
  • [7] Malware Detection Method using Tree-based Machine Learning Algorithms
    Okada, Satoshi
    Matsuda, Wataru
    Fujimoto, Mariko
    Mitsunaga, Takuho
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON COMPUTING (ICOCO), 2021: 103 - 108
  • [8] Confirming the statistically significant superiority of tree-based machine learning algorithms over their counterparts for tabular data
    Uddin, Shahadat
    Lu, Haohui
    [J]. PLOS ONE, 2024, 19 (04)
  • [9] A Comparative Analysis of Tree-based Machine Learning Algorithms for Breast Cancer Detection
    A'la, Fiddin Yusfida
    Permanasari, Adhistya Erna
    Setiawan, Noor Akhmad
    [J]. PROCEEDINGS OF 2019 12TH INTERNATIONAL CONFERENCE ON INFORMATION & COMMUNICATION TECHNOLOGY AND SYSTEM (ICTS), 2019: 55 - 59
  • [10] Option Return Predictability with Machine Learning and Big Data
    Bali, Turan G.
    Beckmeyer, Heiner
    Morke, Mathis
    Weigert, Florian
    [J]. REVIEW OF FINANCIAL STUDIES, 2023, 36 (09): 3548 - 3602