Mining lake time series using symbolic representation

Cited by: 6
Authors
Ruan, Guangchen [1]
Hanson, Paul C. [2]
Dugan, Hilary A. [2]
Plale, Beth [1]
Affiliations
[1] Indiana Univ, Sch Informat & Comp, 919 E 10th St, Bloomington, IN 47408 USA
[2] Univ Wisconsin, Ctr Limnol, 680 North Pk St, Madison, WI 53706 USA
Funding
National Science Foundation (USA)
Keywords
Lake time series; Symbolic representation; Mining; Evolution; Model
DOI
10.1016/j.ecoinf.2017.03.001
Chinese Library Classification number
Q14 [Ecology (Bioecology)]
Discipline classification codes
071012; 0713
Abstract
Sensor networks deployed in lakes and reservoirs, when combined with simulation models and expert knowledge from the global community, are creating deeper understanding of the ecological dynamics of lakes. However, the amount of data and the complex patterns in the data demand substantial compute resources and efficient data mining algorithms, both of which are beyond the realm of traditional limnological research. This paper adapts methods from computer science for application to data-intensive ecological questions, in order to provide ecologists with an approachable methodology that facilitates knowledge discovery in lake ecology. We apply a state-of-the-art time series mining technique based on symbolic representation (SAX) to high-frequency time series of phycocyanin (PHYCO) and chlorophyll (CHLORO) fluorescence, both of which are indicators of algal biomass in lakes, as well as to model predictions of algal biomass (MODEL). We use data mining techniques to demonstrate that MODEL predicts PHYCO better than it predicts CHLORO. All time series have high redundancy, resulting in a relatively small subset of unique patterns. However, MODEL is much less complex than either PHYCO or CHLORO and fails to reproduce high biomass periods indicative of algal blooms. We develop a set of tools in R to enable motif discovery and anomaly detection within a single lake time series, and the study of relationships among multiple lake time series through distance metrics, clustering and classification. Furthermore, to improve computation times, we provision web services to launch the R tools remotely on high-performance computing (HPC) resources. Comprehensive experimental results on observational and simulated lake data demonstrate the effectiveness of our approach. © 2017 Elsevier B.V. All rights reserved.
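For readers unfamiliar with SAX, the following is a minimal Python sketch of the general technique as it is commonly described: z-normalize the series, average it over equal-width segments (piecewise aggregate approximation), then map each segment mean to a letter using breakpoints that split a standard normal distribution into equiprobable regions. It is not the authors' R tooling; the function names, the equal-width segment restriction, and the sample values are assumptions made for illustration only.

```python
import bisect
import math
from statistics import NormalDist


def znormalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    if std == 0:
        # A constant series has no shape; map it to all zeros.
        return [0.0] * n
    return [(x - mean) / std for x in series]


def paa(series, n_segments):
    """Piecewise aggregate approximation: mean of each equal-width segment.

    Assumption for brevity: the series length is a multiple of n_segments.
    """
    n = len(series)
    if n_segments <= 0 or n % n_segments != 0:
        raise ValueError("series length must be a positive multiple of n_segments")
    width = n // n_segments
    return [sum(series[i * width:(i + 1) * width]) / width
            for i in range(n_segments)]


def sax_breakpoints(alphabet_size):
    """Cut points that divide N(0, 1) into alphabet_size equiprobable regions."""
    if not 2 <= alphabet_size <= 26:
        raise ValueError("alphabet_size must be between 2 and 26")
    normal = NormalDist()
    return [normal.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]


def sax_word(series, n_segments, alphabet_size):
    """Convert a numeric series into a SAX word over the letters a, b, c, ..."""
    cuts = sax_breakpoints(alphabet_size)
    means = paa(znormalize(series), n_segments)
    letters = "abcdefghijklmnopqrstuvwxyz"
    # bisect_right counts the breakpoints at or below the segment mean,
    # which is the index of the region the mean falls into.
    return "".join(letters[bisect.bisect_right(cuts, m)] for m in means)


if __name__ == "__main__":
    # Hypothetical readings: a low period followed by a high period.
    readings = [1, 1, 1, 1, 5, 5, 5, 5]
    print(sax_word(readings, n_segments=2, alphabet_size=3))  # prints "ac"
```

Because each series is reduced to a short string, tasks such as motif discovery and anomaly detection can operate on repeated or rare words rather than on the raw high-frequency values.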
Pages: 10-22
Page count: 13