A Genetic Algorithm for Dynamic Modelling and Prediction of Activity in Document Streams

Authors
Araujo, Lourdes [1]
Julian Merelo, Juan [1]
Affiliations
[1] Univ Nacl Educ Distancia, Dpto Lenguajes & Sistemas Informat, ETSI Informat, E-28040 Madrid, Spain
Keywords
Online text streams; evolutionary algorithms; event stream; modelling; buzz detection
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents an evolutionary algorithm for modeling the arrival dates of document streams. A document stream is any time-stamped collection of documents, such as newscasts, e-mails, scientific journal archives and weblog postings. The goal is to find a frequency curve that fits the data while circumventing the unavoidable noise. Classical dynamic programming algorithms are limited by memory and efficiency requirements, which can be a problem when dealing with long streams. This suggests exploring alternative search methods that, although they do not guarantee optimality, are far more efficient. Experiments have shown that the designed evolutionary algorithm is able to reach high-quality solutions in a short time. We have also explored different approaches to infer whether new arrivals increase or decrease interest in the topic the document stream is about. In particular, we present a variant of the evolutionary algorithm that can very quickly fit a stream extended with new data by taking advantage of the fit obtained for the original substream. These mechanisms can be used for real-time detection of changes in the trend of interest in a topic, an important application of this kind of model.
Pages: 1896-1903
Number of pages: 8