Integrating Multiple Data Sources for Stock Prediction

Authors
Wu, Di [1]
Fung, Gabriel Pui Cheong [2]
Yu, Jeffrey Xu [1]
Liu, Zheng [1]
Affiliations
[1] Chinese Univ Hong Kong, Dept Sys Eng & Eng Mgt, Hong Kong, Hong Kong, Peoples R China
[2] Univ Queensland, Sch Info Tech & Elec Engn, Brisbane, Qld 4072, Australia
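DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract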
In many real-world applications, decisions are made by collecting and weighing information from multiple data sources. Take the stock market as an example. We never base a decision on a single piece of advice, but always rely on a collection of information, such as stock price movements, exchange volumes, and market indices, as well as news articles, expert comments, and special announcements (e.g., an increase in stamp duty). Yet modeling the stock market is difficult because: (1) the process governing market states (up and down) is stochastic, which makes it hard to capture with a deterministic approach; and (2) the market state is invisible but is influenced by visible market information, such as stock prices and news articles. In this paper, we model the stock market process using a Non-homogeneous Hidden Markov Model (NHMM), which takes multiple sources of information into account when making a prediction. Our model contains three major elements: (1) external events, which denote events happening within the stock market (e.g., a drop in the US interest rate); (2) the observed market state, which denotes the current market status (e.g., a rise in the stock price); and (3) the hidden market state, which conceptually exists but is invisible to market participants. Specifically, we model external events using the information contained in news articles, and we model the observed market state using historical stock prices. Based on these two pieces of observable information and the previous hidden market state, we aim to identify the current hidden market state so as to predict the immediate market movement. Extensive experiments were conducted to evaluate our work. The encouraging results indicate that the proposed approach is practically sound and effective.
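The three elements described in the abstract can be illustrated with a minimal filtering sketch. This is not the authors' implementation: the two-state encoding (0 = down, 1 = up), the logistic link between a news score and the transition probabilities, the `bias` and `weight` values, and the `EMISSION` table are all hypothetical choices made only to show how a non-homogeneous HMM lets an external event change the transition probabilities at each time step.

```python
import math


def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def transition_matrix(news_score, bias=(-1.0, 1.0), weight=(2.0, 2.0)):
    """Time-varying transition matrix driven by an external event.

    news_score is a hypothetical sentiment value in [-1, 1].
    Rows index the previous hidden state, columns the next one.
    The bias and weight values are illustrative assumptions.
    """
    rows = []
    for prev in (0, 1):
        p_up = sigmoid(bias[prev] + weight[prev] * news_score)
        rows.append([1.0 - p_up, p_up])
    return rows


# Hypothetical emission table: EMISSION[hidden_state][observed_move],
# where observed_move is 0 (price fell) or 1 (price rose).
EMISSION = [[0.7, 0.3], [0.3, 0.7]]


def filter_step(belief, news_score, observed_move):
    """One forward-filtering update of the hidden-state belief."""
    trans = transition_matrix(news_score)
    # Predict: propagate the previous belief through the transition matrix.
    predicted = [sum(belief[i] * trans[i][j] for i in (0, 1)) for j in (0, 1)]
    # Correct: weight by the likelihood of the observed price move.
    unnorm = [predicted[j] * EMISSION[j][observed_move] for j in (0, 1)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def run_filter(news_scores, observed_moves, prior=(0.5, 0.5)):
    """Filter a whole sequence and return the belief after each step."""
    belief = list(prior)
    history = []
    for score, move in zip(news_scores, observed_moves):
        belief = filter_step(belief, score, move)
        history.append(belief)
    return history


if __name__ == "__main__":
    beliefs = run_filter([0.8, 0.5, -0.6], [1, 1, 0])
    for step, b in enumerate(beliefs, start=1):
        print(f"step {step}: P(down)={b[0]:.3f}, P(up)={b[1]:.3f}")
```

In an ordinary (homogeneous) HMM the transition matrix would be fixed; here it is recomputed from the news score at every step, which is the property the "non-homogeneous" label refers to. In practice the parameters would be learned from data rather than set by hand.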
Pages: 77 / +
Page count: 2