BOND: Benchmarking Unsupervised Outlier Node Detection on Static Attributed Graphs

Cited by: 0
Authors
Liu, Kay [1]
Dou, Yingtong [1, 8]
Zhao, Yue [2]
Ding, Xueying [2]
Hu, Xiyang [2]
Zhang, Ruitong [3]
Ding, Kaize [4]
Chen, Canyu [5]
Peng, Hao
Shu, Kai [5]
Sun, Lichao [6]
Li, Jundong [7]
Chen, George H. [2]
Jia, Zhihao [2]
Yu, Philip S. [1]
Affiliations
[1] Univ Illinois, Chicago, IL 60680 USA
[2] Carnegie Mellon Univ, Pittsburgh, PA USA
[3] Beihang Univ, Beijing, Peoples R China
[4] Arizona State Univ, Tempe, AZ USA
[5] IIT, Chicago, IL USA
[6] Lehigh Univ, Bethlehem, PA USA
[7] Univ Virginia, Charlottesville, VA USA
[8] Visa Res, Foster City, CA USA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Detecting which nodes in graphs are outliers is a relatively new machine learning task with numerous applications. Despite the proliferation of algorithms developed in recent years for this task, there has been no standard comprehensive setting for performance evaluation. Consequently, it has been difficult to understand which methods work well, and when, under a broad range of settings. To bridge this gap, we present BOND, to the best of our knowledge the first comprehensive benchmark for unsupervised outlier node detection on static attributed graphs, with the following highlights. (1) We benchmark the outlier detection performance of 14 methods ranging from classical matrix factorization to the latest graph neural networks. (2) Using nine real datasets, our benchmark assesses how the different detection methods respond to two major types of synthetic outliers and separately to "organic" (real non-synthetic) outliers. (3) Using an existing random graph generation technique, we produce a family of synthetically generated datasets of different graph sizes that enable us to compare the running time and memory usage of the different outlier detection algorithms. Based on our experimental results, we discuss the pros and cons of existing graph outlier detection algorithms, and we highlight opportunities for future research. Importantly, our code is freely available and meant to be easily extendable: https://github.com/pygod-team/pygod/tree/main/benchmark
Pages: 15