Effective Detection of Rare Anomalies from Massive Waveform Data Using Heterogeneous Clustering

Cited by: 1
Authors
Goto, Masaharu [1]
Chikamatsu, Kiyoshi [1]
Kobayashi, Naoki [1]
Ren, Gang [2]
Ogihara, Mitsunori [2]
Affiliations
[1] Ctr Excellence Keysight Technol Int Japan, Elect Ind Solut Grp, Hachioji, Tokyo, Japan
[2] Univ Miami, Inst Data Sci & Comp, Dept Comp Sci, Coral Gables, FL 33124 USA
Keywords
Clustering; Massive waveform data; Waveform analysis; Real-time signal processing; Long-duration waveform; Measurement instruments;
DOI
10.1109/BigData50022.2020.9377945
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Today's measurement instruments are capable of capturing and processing massive amounts of waveform data. High-sampling-rate analog-to-digital converters (ADCs) and low-cost storage make it relatively easy to collect "big measurement data" at massive scale. More and more measurement instrument users acquire terabyte-scale waveform data, which are essential for detecting and predicting hard-to-find failures. However, conventional analysis techniques focus on small fragments of signals and lag well behind the processing demands of today's test and measurement data assets. Most of these techniques cannot cope with the massive data volume or the complexity of the analysis tasks. A previous report by the authors introduced a heterogeneous waveform clustering framework to break these technical barriers. The present paper demonstrates the effectiveness of the proposed framework with real-world application examples at terabyte data scale. The framework consists of real-time tagging for pre-sorting incoming data, quick clustering for summarizing data overviews from long-duration recordings, and detail clustering for deeper analyses. The tagging process is the critical performance link for satisfying the processing-time and hardware constraints. We share a theoretical analysis of the degrees of freedom involved in the waveform and the tagging results. The data are pre-sorted into a tag database with highly efficient retrieval characteristics, allowing the system to provide results quickly and flexibly. Three real-world waveform analysis examples are demonstrated: power line voltage, mechanical relay stick error, and Bluetooth device current consumption. Our framework allows efficient and robust exploration of complex signal signatures for detecting extremely rare anomalies. The detected anomaly patterns not only have straightforward engineering uses but also demonstrate predictive analysis power for related signal events.
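The abstract names the stages of the framework but does not say how tagging is computed. The sketch below is only an illustration of the general idea of pre-sorting waveform segments by a coarse tag and then looking for sparsely populated tags. The tag definition (quantized segment minimum and maximum), the function names, and the parameters `seg_len`, `step`, and `max_count` are all assumptions made for this example and are not taken from the paper.

```python
import math
from collections import defaultdict

def tag_segment(segment, step=0.5):
    """Hypothetical tag: the segment's min and max, quantized to `step`."""
    lo, hi = min(segment), max(segment)
    return (round(lo / step), round(hi / step))

def build_tag_db(samples, seg_len, step=0.5):
    """Pre-sort fixed-length segments into a tag -> [segment index] map."""
    db = defaultdict(list)
    for start in range(0, len(samples) - seg_len + 1, seg_len):
        tag = tag_segment(samples[start:start + seg_len], step)
        db[tag].append(start // seg_len)
    return db

def rare_tags(db, max_count=1):
    """Tags with at most `max_count` members are candidates for inspection."""
    return {tag: idx for tag, idx in db.items() if len(idx) <= max_count}

# Synthetic input: 50 periods of a sine wave with one injected glitch.
seg_len = 100
samples = [math.sin(2 * math.pi * n / seg_len) for n in range(50 * seg_len)]
samples[3725] += 2.0  # single spike inside segment 37

db = build_tag_db(samples, seg_len)
print(rare_tags(db))
```

On this synthetic input, the 49 clean periods share one tag and only the segment containing the glitch ends up under a tag of its own. Because the lookup is a dictionary access, retrieving all segments for a given tag does not require rescanning the raw samples, which is the property the abstract attributes to the tag database.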
Pages: 1513-1522
Page count: 10