Integrating Multiple Data Sources for Stock Prediction

Authors
Wu, Di [1]
Fung, Gabriel Pui Cheong [2]
Yu, Jeffrey Xu [1]
Liu, Zheng [1]
Affiliations
[1] Chinese Univ Hong Kong, Dept Sys Eng & Eng Mgt, Hong Kong, Hong Kong, Peoples R China
[2] Univ Queensland, Sch Info Tech & Elec Engn, Brisbane, Qld 4072, Australia
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In many real-world applications, decisions are made by collecting and weighing information from multiple data sources. Take the stock market as an example. We never base a decision on a single piece of advice, but always rely on a collection of information, such as stock price movements, trading volumes, and market indices, as well as news articles, expert comments, and special announcements (e.g., an increase in stamp duty). Yet modeling the stock market is difficult because (1) the process governing market states (up and down) is stochastic and hard to capture with a deterministic approach, and (2) the market state is invisible but is influenced by visible market information, such as stock prices and news articles. In this paper, we model the stock market process using a Non-homogeneous Hidden Markov Model (NHMM), which takes multiple sources of information into account when making a prediction. Our model contains three major elements: (1) the external event, which denotes events happening within the stock market (e.g., a drop in the US interest rate); (2) the observed market state, which denotes the current market status (e.g., a rise in the stock price); and (3) the hidden market state, which conceptually exists but is invisible to market participants. Specifically, we model external events using the information contained in news articles, and model the observed market state using historical stock prices. Based on these two pieces of observable information and the previous hidden market state, we aim to identify the current hidden market state so as to predict the immediate market movement. Extensive experiments were conducted to evaluate our work. The encouraging results indicate that the proposed approach is practically sound and effective.
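To make the three elements of the abstract concrete, the following is a minimal Python sketch of filtering in a non-homogeneous hidden Markov model, where the transition probabilities between hidden market states depend on the external event at each step. It is an illustration under stated assumptions, not the authors' implementation: the two hidden states, the event categories (`neutral`, `good_news`, `bad_news`), the observation symbols (`rise`, `fall`), all probability values, and the function names are hypothetical, and the record does not say how the paper parameterizes or trains its model.

```python
from typing import Dict, Optional, Sequence

STATES = ("up", "down")  # hidden market states (assumed to be two)

# Hypothetical transition tables, one per external-event category.
# TRANSITIONS[event][previous_state][current_state] is a probability.
# Making the table depend on the event is what makes the chain non-homogeneous.
TRANSITIONS: Dict[str, Dict[str, Dict[str, float]]] = {
    "neutral":   {"up": {"up": 0.7, "down": 0.3}, "down": {"up": 0.3, "down": 0.7}},
    "good_news": {"up": {"up": 0.9, "down": 0.1}, "down": {"up": 0.6, "down": 0.4}},
    "bad_news":  {"up": {"up": 0.4, "down": 0.6}, "down": {"up": 0.1, "down": 0.9}},
}

# Hypothetical emission table: EMISSIONS[hidden_state][observed_price_move].
EMISSIONS: Dict[str, Dict[str, float]] = {
    "up":   {"rise": 0.8, "fall": 0.2},
    "down": {"rise": 0.3, "fall": 0.7},
}


def filter_step(belief: Dict[str, float], event: str, observation: str) -> Dict[str, float]:
    """Update the belief over hidden states with one (event, observation) pair."""
    trans = TRANSITIONS[event]
    unnormalized = {}
    for state in STATES:
        # Predict: propagate the previous belief through the event-specific transitions.
        predicted = sum(belief[prev] * trans[prev][state] for prev in STATES)
        # Correct: weight by how likely the observed price move is in this state.
        unnormalized[state] = predicted * EMISSIONS[state][observation]
    total = sum(unnormalized.values())
    return {state: value / total for state, value in unnormalized.items()}


def run_filter(events: Sequence[str], observations: Sequence[str],
               prior: Optional[Dict[str, float]] = None) -> Dict[str, float]:
    """Run the filter over aligned event and observation sequences."""
    belief = dict(prior) if prior else {s: 1.0 / len(STATES) for s in STATES}
    for event, observation in zip(events, observations):
        belief = filter_step(belief, event, observation)
    return belief


def predict_rise(belief: Dict[str, float], next_event: str) -> float:
    """Probability that the next observed price move is a rise, given the next event."""
    trans = TRANSITIONS[next_event]
    next_state = {s: sum(belief[p] * trans[p][s] for p in STATES) for s in STATES}
    return sum(next_state[s] * EMISSIONS[s]["rise"] for s in STATES)


if __name__ == "__main__":
    belief = run_filter(["neutral", "good_news", "bad_news"], ["rise", "rise", "fall"])
    print(belief)
    print(predict_rise(belief, "good_news"))
```

In this sketch the news-derived event selects the transition table and the price history supplies the observations, mirroring the roles the abstract assigns to the two information sources. In practice the tables would be estimated from data rather than set by hand.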
Pages: 77 / +
Page count: 2