Relative Patterns Discovery toward Big Data Analytics

Cited by: 0
Authors
Pai, Hao-Ting [1]
Wu, Fan [2]
Hsueh, Pei-Yun S. [3]
Lin, Grace [1]
Chan, Ya-Hui [1]
Affiliations
[1] III, DATA, Taipei, Taiwan
[2] Natl Chung Cheng Univ, Dept Informat Management, Minhsiung, Taiwan
[3] IBM Thomas J Watson Res Ctr, Hawthorne, NY USA
Keywords
Association analysis; big data; data mining; outlier detection
DOI
10.1109/ICEBE.2015.74
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Recently, enterprises and governments have invested aggressively in big data analytics because it reflects popular opinion drawn from millions of people. Despite the new opportunities it brings, big data poses challenges such as an extremely large number of observations (e.g., millions of transactions), high dimensionality (e.g., thousands of items), and the need for immediate response. At this scale, conventional association analysis struggles to extract pattern information. Specifically, the computational complexity of frequent itemset mining grows exponentially with the number of items, and the problem has been proven to be NP-complete. Although many studies use a pattern-pruning strategy to reduce the complexity, pruning can distort the shape of the data and yield inaccurate results. In this paper, we introduce relative patterns discovery (RPD), which explores the patterns shared by each pair of observations. To show that RPD is a pragmatic solution for big data analytics, we design a scalable outlier detection method (SOD) based on the concept of RPD. In particular, SOD can score anomalies without enumerating all the relative patterns. Empirical investigations on various real-world datasets demonstrate that SOD performs well even with a large number of observations and high dimensionality.
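The two quantitative ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' RPD or SOD algorithm, whose details are not given in this record. It assumes that a "relative pattern" of two observations is any non-empty itemset contained in both, which is only one reading of the abstract; all function names are hypothetical. The sketch shows why the candidate itemset space grows exponentially with the number of items, and how the number of itemsets shared by two observations can be counted from the size of their intersection without enumerating them.

```python
from itertools import combinations


def count_candidate_itemsets(num_items: int) -> int:
    """Number of non-empty itemsets over num_items distinct items (2^n - 1)."""
    return 2 ** num_items - 1


def shared_items(a, b) -> frozenset:
    """Items that occur in both observations (hypothetical helper)."""
    return frozenset(a) & frozenset(b)


def count_shared_patterns(a, b) -> int:
    """Count the non-empty itemsets contained in both observations.

    Assumption: a shared pattern is any non-empty subset of the common items,
    so the count follows from the intersection size alone.
    """
    k = len(shared_items(a, b))
    return 2 ** k - 1


def enumerate_shared_patterns(a, b) -> list:
    """Explicit enumeration, shown only to check the closed-form count."""
    common = sorted(shared_items(a, b))
    return [
        frozenset(combo)
        for size in range(1, len(common) + 1)
        for combo in combinations(common, size)
    ]


if __name__ == "__main__":
    t1 = {"milk", "bread", "eggs"}
    t2 = {"bread", "eggs", "tea"}
    print(count_candidate_itemsets(10))
    print(sorted(shared_items(t1, t2)))
    print(count_shared_patterns(t1, t2))
```

Counting from the intersection size takes time linear in the length of the two observations, whereas explicit enumeration is exponential in the number of common items.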
Pages: 401-407
Page count: 7