A Genetic Algorithm for Dynamic Modelling and Prediction of Activity in Document Streams

Cited by: 0
Authors
Araujo, Lourdes [1]
Julian Merelo, Juan [1]
Affiliations
[1] Univ Nacl Educ Distancia, Dpto Lenguajes & Sistemas Informat, ETSI Informat, E-28040 Madrid, Spain
Keywords
Online text streams; evolutionary algorithms; event stream; modelling; buzz detection
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents an evolutionary algorithm for modeling the arrival dates of document streams, where a document stream is any time-stamped collection of documents, such as newscasts, e-mails, scientific journal archives and weblog postings. The goal is to find a frequency curve that fits the data while circumventing the unavoidable noise. Classical dynamic programming algorithms are limited by memory and efficiency requirements, which can be a problem when dealing with long streams. This suggests exploring alternative search methods which, although they do not guarantee optimality, are far more efficient. Experiments have shown that the designed evolutionary algorithm is able to reach high-quality solutions in a short time. We have also explored different approaches to infer whether new arrivals increase or decrease interest in the topic the document stream is about. In particular, we present a variant of the evolutionary algorithm that can very quickly fit a stream extended with new data by taking advantage of the fit obtained for the original substream. These mechanisms can be used for real-time detection of changes in the trend of interest in a topic, an important application of this kind of model.
Pages: 1896-1903
Number of pages: 8
Related papers
50 in total
  • [1] SFFS-PC-NN optimized by genetic algorithm for dynamic prediction of financial distress with longitudinal data streams
    Sun, Jie
    He, Kai-Yu
    Li, Hui
    [J]. KNOWLEDGE-BASED SYSTEMS, 2011, 24 (07) : 1013 - 1023
  • [2] Genetic algorithm for burst detection and activity tracking in event streams
    Araujo, Lourdes
    Cuesta, Jose A.
    Merelo, Juan J.
    [J]. PARALLEL PROBLEM SOLVING FROM NATURE - PPSN IX, PROCEEDINGS, 2006, 4193 : 302 - 311
  • [3] An efficient algorithm for modelling and dynamic prediction of network traffic
    Fan, Wenjie
    Zhang, Hong
    Li, Kuan-Ching
    Zhang, Shunxiang
    Marino, Mario Donato
    Jiang, Hai
    [J]. INTERNATIONAL JOURNAL OF COMPUTATIONAL SCIENCE AND ENGINEERING, 2018, 16 (03) : 311 - 320
  • [4] A Gradient-based Algorithm for trend and outlier prediction in dynamic data streams
    Sun, Dawei
    Lee, Vincent C. S.
    Lu, Ye
    [J]. PROCEEDINGS OF THE 2017 12TH IEEE CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA), 2017, : 1978 - 1983
  • [5] Genetic functions-based modelling for pier scour depth prediction in coarse bed streams
    Khan, Mujahid
    Tufail, Mohammad
    Azamathulla, Hazi Md.
    Ahmad, Irshad
    Muhammad, Noor
    [J]. PROCEEDINGS OF THE INSTITUTION OF CIVIL ENGINEERS-WATER MANAGEMENT, 2018, 171 (05) : 225 - 240
  • [6] Application of Genetic Algorithm in Document Clustering
    Wei Jian-Xiang
    Liu Huai
    Sun Yue-hong
    Su Xin-Ning
    [J]. 2009 INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND COMPUTER SCIENCE, VOL 1, PROCEEDINGS, 2009, : 145 - +
  • [7] Parallel Dynamic Data Driven Genetic Algorithm for Forest Fire Prediction
    Denham, Monica
    Cortes, Ana
    Margalef, Tomas
    [J]. RECENT ADVANCES IN PARALLEL VIRTUAL MACHINE AND MESSAGE PASSING INTERFACE, PROCEEDINGS, 2009, 5759 : 323 - 324
  • [8] Improved CHAID Algorithm for Document Structure Modelling
    Belaid, A.
    Moinel, T.
    Rangoni, Y.
    [J]. DOCUMENT RECOGNITION AND RETRIEVAL XVII, 2010, 7534
  • [9] Multi-scale genetic dynamic modelling I: an algorithm to compute generators
    Kirkilionis, Markus
    Janus, Ulrich
    Sbano, Luca
    [J]. THEORY IN BIOSCIENCES, 2011, 130 (03) : 165 - 182