BF-BigGraph: An efficient subgraph isomorphism approach using machine learning for big graph databases

Authors
Yazici, Adnan [1, 2]
Taskomaz, Ezgi [2]
Affiliations
[1] Nazarbayev Univ, Sch Engn & Digital Sci, Dept Comp Sci, Astana, Kazakhstan
[2] Middle East Tech Univ, Dept Comp Engn, Ankara, Turkiye
Keywords
Graph-based NoSQL databases; Machine learning; Subgraph isomorphism; Query optimization; Algorithm
DOI
10.1016/j.is.2024.102401
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Graph databases are flexible NoSQL databases used to store and query large, complex data efficiently. One of the hardest problems in graph databases is subgraph isomorphism, which involves finding a matching pattern in a given graph. Subgraph isomorphism algorithms generally struggle to process complex queries efficiently, owing to a lack of pruning methods and to the matching order they use. In this study, we present a new subgraph isomorphism approach based on the best-first search design strategy and name it BF-BigGraph. Our approach includes a machine learning technique that efficiently finds the best matching order for various complex queries. The parameters we use as heuristics to improve the performance of complex queries on graph-based NoSQL databases are database volatility, database size, query type, and query size. We use the Random Forest machine learning method to narrow the candidate nodes at a higher level of the search and thereby reduce the search space for efficient querying and retrieval. We compared BF-BigGraph with state-of-the-art approaches, namely BB-Graph, Neo4j's Cypher, DualIso, GraphQL, TurboIso, and VF3, on publicly available undirected graph databases (WorldCup, Pokec, and Youtube) and on a big directed graph database from a real demographic application (a population database) with approximately 70 million nodes. For different types of complex queries on all these databases, our approach performs significantly better in computation time and required memory than the competing approaches in the literature.
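To make the role of the matching order concrete, here is a minimal sketch of labeled subgraph matching on a small undirected graph. It is not the BF-BigGraph algorithm: the function name `find_matches`, the graph encoding, and the "fewest feasible candidates first" rule are all assumptions chosen for illustration, with that simple rule standing in for the learned matching order the abstract describes.

```python
from typing import Dict, Hashable, List, Set

Graph = Dict[Hashable, Set[Hashable]]   # undirected adjacency sets
Labels = Dict[Hashable, str]            # node -> label


def find_matches(query: Graph, q_labels: Labels,
                 data: Graph, d_labels: Labels) -> List[dict]:
    """Return all injective mappings of query nodes to data nodes that
    preserve labels and query edges (non-induced subgraph matching)."""
    # Static pruning: a data node is a candidate only if its label matches
    # and its degree is at least the degree of the query node.
    cand = {
        u: {v for v in data
            if d_labels[v] == q_labels[u] and len(data[v]) >= len(query[u])}
        for u in query
    }
    results: List[dict] = []

    def extend(mapping: dict) -> None:
        if len(mapping) == len(query):
            results.append(dict(mapping))
            return
        used = set(mapping.values())
        # Candidates that are still consistent with the partial mapping:
        # unused, and adjacent to the images of already matched neighbours.
        feasible = {
            u: [v for v in cand[u]
                if v not in used
                and all(mapping[w] in data[v] for w in query[u] if w in mapping)]
            for u in query if u not in mapping
        }
        # Hypothetical ordering heuristic: match the query node with the
        # fewest feasible candidates next (ties broken by node name).
        u = min(feasible, key=lambda n: (len(feasible[n]), str(n)))
        for v in sorted(feasible[u]):
            mapping[u] = v
            extend(mapping)
            del mapping[u]

    extend({})
    return results


# Toy example: query edge a(A)-b(B) against the path 1(A)-2(B)-3(A)-4(B).
query = {"a": {"b"}, "b": {"a"}}
q_labels = {"a": "A", "b": "B"}
data = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
d_labels = {1: "A", 2: "B", 3: "A", 4: "B"}
matches = find_matches(query, q_labels, data, d_labels)
```

Choosing the most constrained query node first tends to make dead ends show up early in the search, which is the general motivation for tuning the matching order; how BF-BigGraph actually derives its order is described in the paper itself.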
Pages: 15