Matrix Profile II: Exploiting a Novel Algorithm and GPUs to Break the One Hundred Million Barrier for Time Series Motifs and Joins

Authors
Zhu, Yan [1]
Zimmerman, Zachary [1]
Shakibay Senobari, Nader [2]
Yeh, Chin-Chia Michael [1]
Funning, Gareth [2]
Mueen, Abdullah [3]
Brisk, Philip [1]
Keogh, Eamonn [1]
Affiliations
[1] Univ Calif Riverside, Dept Comp Sci & Engn, Riverside, CA 92521 USA
[2] Univ Calif Riverside, Dept Earth Sci, Riverside, CA 92521 USA
[3] Univ New Mexico, Dept Comp Sci, Albuquerque, NM 87131 USA
Funding
National Science Foundation (USA);
Keywords
Time series; joins; motifs; GPUs;
DOI
10.1109/ICDM.2016.126
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Time series motifs have been in the literature for about fifteen years, but have only recently begun to receive significant attention in the research community. This is perhaps due to the growing realization that they implicitly offer solutions to a host of time series problems, including rule discovery, anomaly detection, density estimation, semantic segmentation, etc. Recent work has improved the scalability to the point where exact motifs can be computed on datasets with up to a million data points in tenable time. However, in some domains, for example seismology, there is an insatiable need to address even larger datasets. In this work we show that a combination of a novel algorithm and a high-performance GPU allows us to significantly improve the scalability of motif discovery. We demonstrate the scalability of our ideas by finding the full set of exact motifs on a dataset with one hundred million subsequences, by far the largest dataset ever mined for time series motifs. Furthermore, we demonstrate that our algorithm can produce actionable insights in seismology and other domains.
Pages: 739-748
Page count: 10