Data Discovery and Anomaly Detection Using Atypicality for Real-Valued Data

Cited by: 5
Authors
Sabeti, Elyas [1]
Host-Madsen, Anders [2, 3]
Affiliations
[1] Univ Michigan, Dept Computat Med & Bioinformat, NCRC 10-A108,2800 Plymouth Rd, Ann Arbor, MI 48109 USA
[2] Univ Hawaii Manoa, Dept Elect Engn, Honolulu, HI 96822 USA
[3] Shenzhen Res Inst Big Data, Shenzhen 518172, Peoples R China
Keywords
atypicality; minimum description length; big data; codelength; TRANSIENT; PERFORMANCE; COMPRESSION;
DOI
10.3390/e21030219
Chinese Library Classification number
O4 [Physics];
Discipline classification code
0702;
Abstract
The aim of using atypicality is to extract small, rare, unusual and interesting pieces out of big data. This complements statistics about typical data to give insight into data. In order to find such "interesting" parts of data, universal approaches are required, since it is not known in advance what we are looking for. We therefore base the atypicality criterion on codelength. In a prior paper we developed the methodology for discrete-valued data, and the current paper extends this to real-valued data. This is done by using minimum description length (MDL). We develop the information-theoretic methodology for a number of "universal" signal processing models, and finally apply them to recorded hydrophone data and heart rate variability (HRV) signal.
Pages: 25
Related papers
50 in total
  • [1] Data Discovery and Anomaly Detection Using Atypicality: Theory
    Host-Madsen, Anders
    Sabeti, Elyas
    Walton, Chad
    [J]. IEEE TRANSACTIONS ON INFORMATION THEORY, 2019, 65 (09) : 5302 - 5322
  • [2] Anomaly Detection Using Real-Valued Negative Selection
    Fabio A. González
    Dipankar Dasgupta
    [J]. Genetic Programming and Evolvable Machines, 2003, 4 (4) : 383 - 403
  • [3] Anomaly detection using real-valued negative selection
    Zhang, FB
    Yang, YT
    Wang, SW
    [J]. ISTM/2005: 6TH INTERNATIONAL SYMPOSIUM ON TEST AND MEASUREMENT, VOLS 1-9, CONFERENCE PROCEEDINGS, 2005, : 949 - 952
  • [4] Universal Data Discovery Using Atypicality
    Host-Madsen, Anders
    Sabeti, Elyas
    Walton, Chad
    Lim, Su Jun
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2016, : 3474 - 3483
  • [5] Optimization of Real-Valued Self Set for Anomaly Detection Using Gaussian Distribution
    Xi, Liang
    Zhang, Fengbin
    Wang, Dawei
    [J]. ARTIFICIAL INTELLIGENCE AND COMPUTATIONAL INTELLIGENCE, PROCEEDINGS, 2009, 5855 : 112 - 120
  • [6] Outlier detection for incomplete real-valued data based on inner boundary
    Zhao, Zhengwei
    Yang, Genteng
    Li, Zhaowen
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 44 (02) : 3023 - 3041
  • [7] Atypical Information Theory for Real-Valued Data
    Host-Madsen, Anders
    Sabeti, Elyas
    [J]. 2015 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2015, : 666 - 670
  • [8] Subspace fitting approaches for frequency estimation using real-valued data
    Mahata, K
    [J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2005, 53 (08) : 3099 - 3110
  • [9] QUANTITATIVE STABILITY ANALYSIS USING REAL-VALUED FREQUENCY RESPONSE DATA
    Schmid, Martin
    Blumenthal, Ralf S.
    Schulze, Moritz
    Polifke, Wolfgang
    Sattelmayer, Thomas
    [J]. PROCEEDINGS OF THE ASME TURBO EXPO: TURBINE TECHNICAL CONFERENCE AND EXPOSITION, 2013, VOL 1B, 2013,
  • [10] Quantitative Stability Analysis Using Real-Valued Frequency Response Data
    Schmid, Martin
    Blumenthal, Ralf S.
    Schulze, Moritz
    Polifke, Wolfgang
    Sattelmayer, Thomas
    [J]. JOURNAL OF ENGINEERING FOR GAS TURBINES AND POWER-TRANSACTIONS OF THE ASME, 2013, 135 (12):