Effective Detection of Rare Anomalies from Massive Waveform Data Using Heterogeneous Clustering

Cited by: 1
Authors
Goto, Masaharu [1]
Chikamatsu, Kiyoshi [1]
Kobayashi, Naoki [1]
Ren, Gang [2]
Ogihara, Mitsunori [2]
Affiliations
[1] Ctr Excellence Keysight Technol Int Japan, Elect Ind Solut Grp, Hachioji, Tokyo, Japan
[2] Univ Miami, Inst Data Sci & Comp, Dept Comp Sci, Coral Gables, FL 33124 USA
Keywords
Clustering; Massive waveform data; Waveform analysis; Real-time signal processing; Long-duration waveform; Measurement instruments;
DOI
10.1109/BigData50022.2020.9377945
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Today's measurement instruments are capable of capturing and processing massive amounts of waveform data. High-sampling-rate analog-to-digital converters (ADCs) and low-cost storage make it relatively easy to collect "big measurement data" at massive scale. More and more measurement instrument users acquire terabyte-scale waveform data, which are essential for detecting and predicting hard-to-find failures. However, conventional analysis techniques focus on small fragments of signals and lag far behind the processing demands of today's test and measurement data assets. Most of these techniques are inadequate for coping with the massive data volume and the complexity of the analysis tasks. A previous report by the authors introduced a heterogeneous waveform clustering framework to break these technical barriers. The present paper demonstrates the effectiveness of the proposed framework with real-world application examples at terabyte data scale. The framework consists of real-time tagging for pre-sorting incoming data, quick clustering for summarizing data overviews from long-duration recordings, and detail clustering for deeper analyses. The tagging process is the critical performance link for satisfying the processing time and hardware constraints. We share a theoretical analysis of the degrees of freedom involved in the waveform and the tagging results. The data are pre-sorted into a tag database with highly efficient retrieval characteristics, allowing the system to provide results quickly and flexibly. Three real-world waveform analysis examples are demonstrated, namely power line voltage, mechanical relay stick error, and Bluetooth device current consumption. Our framework allows efficient and robust exploration of complex signal signatures for detecting extremely rare anomalies. The detected anomaly patterns not only have straightforward engineering uses, but also demonstrate predictive power for related signal events.
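The abstract names the stages (tagging, quick clustering, detail clustering) but not their internals. The following is only a minimal sketch of the general idea of pre-sorting segments into a tag database and then looking for rarely seen tags. The tag definition (quantized minimum and maximum of a segment), the function names `make_tag`, `build_tag_db`, and `rare_tags`, and the synthetic sine waveform are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import defaultdict


def make_tag(segment, step=0.25):
    """Hypothetical tag: the segment's min and max, quantized to `step`."""
    lo = math.floor(min(segment) / step)
    hi = math.floor(max(segment) / step)
    return (lo, hi)


def build_tag_db(waveform, seg_len, step=0.25):
    """Pre-sort fixed-length segments into a tag -> [segment start] map."""
    db = defaultdict(list)
    for start in range(0, len(waveform) - seg_len + 1, seg_len):
        segment = waveform[start:start + seg_len]
        db[make_tag(segment, step)].append(start)
    return db


def rare_tags(db, max_count=1):
    """Overview step: keep only tags seen at most `max_count` times."""
    return {tag: starts for tag, starts in db.items()
            if len(starts) <= max_count}


# Synthetic data (assumption): 100 sine periods, one sample spiked.
seg_len = 50
waveform = [math.sin(2 * math.pi * i / seg_len)
            for i in range(100 * seg_len)]
waveform[37 * seg_len + 10] += 2.0  # injected anomaly

db = build_tag_db(waveform, seg_len)
anomalies = rare_tags(db)
print(anomalies)
```

In this toy setup the 99 unmodified periods share a single tag, while the spiked period ends up alone under its own tag, so a later and more expensive analysis only needs to fetch that one segment by its start index.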
Pages: 1513-1522
Page count: 10
Related papers
50 in total
  • [1] Clustering Heterogeneous Data Using Clustering by Compression
    Cernian, Alexandra
    Carstoiu, Dorin
    PROCEEDINGS OF THE 13TH WSEAS INTERNATIONAL CONFERENCE ON COMPUTERS, 2009, : 133 - +
  • [2] An Effective Clustering Approach with Data Aggregation Using Multiple Mobile Sinks for Heterogeneous WSN
    A. Muthu Krishnan
    P. Ganesh Kumar
    Wireless Personal Communications, 2016, 90 : 423 - 434
  • [3] An Effective Clustering Approach with Data Aggregation Using Multiple Mobile Sinks for Heterogeneous WSN
    Krishnan, A. Muthu
    Kumar, P. Ganesh
    WIRELESS PERSONAL COMMUNICATIONS, 2016, 90 (02) : 423 - 434
  • [4] Cardiovascular abnormality detection method using cardiac sound characteristic waveform with data clustering technique
    Choi, Samjin
    Jiang, Zhongwei
    Kim, Il-Hwan
    Park, Chan-Won
    2007 INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS, VOLS 1-6, 2007, : 579 - +
  • [5] Evidence Identification in Heterogeneous Data Using Clustering
    Mohammed, Hussam
    Clarke, Nathan
    Li, Fudong
    13TH INTERNATIONAL CONFERENCE ON AVAILABILITY, RELIABILITY AND SECURITY (ARES 2018), 2019,
  • [6] A model for clustering data from heterogeneous dissimilarities
    Santi, Everton
    Aloise, Daniel
    Blanchard, Simon J.
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2016, 253 (03) : 659 - 672
  • [7] Scaling clustering algorithms for massive data sets using data streams
    Nittel, S
    Leung, KT
    Braverman, A
    20TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 2004, : 830 - 830
  • [8] Detection of Anomalies in Daily Activities Using Data from Smart Meters
    Hernandez, Alvaro
    Nieto, Ruben
    de Diego-Oton, Laura
    Carmen Perez-Rubio, Maria
    Villadangos-Carrizo, Jose M.
    Pizarro, Daniel
    Urena, Jesus
    SENSORS, 2024, 24 (02)
  • [9] ERROR CONTROL FOR THE DETECTION OF RARE AND WEAK SIGNATURES IN MASSIVE DATA
    Meillier, Celine
    Chatelain, Florent
    Michel, Olivier
    Ayasso, Hacheme
    2015 23RD EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2015, : 1974 - 1978
  • [10] Clustering Heterogeneous Web Data Using Clustering by Compression. Cluster Validity
    Cernian, Alexandra
    Carstoiu, Dorin
    Olteanu, Adriana
    PROCEEDINGS OF THE 10TH INTERNATIONAL SYMPOSIUM ON SYMBOLIC AND NUMERIC ALGORITHMS FOR SCIENTIFIC COMPUTING, 2009, : 123 - 126