A MapReduce-Based Approach for Mining Embedded Patterns from Large Tree Data

Authors
Zhao, Wen [1]
Wu, Xiaoying [1]
Affiliations
[1] Wuhan Univ, Comp Sch, Wuhan, Hubei, Peoples R China
Source
Keywords
Tree pattern; MapReduce; Holistic twig-join algorithm
DOI
10.1007/978-3-319-96893-3_34
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Finding tree patterns hidden in large datasets is an important research area with many practical applications. Previous work, however, has focused almost exclusively on extracting patterns from a set of small trees on a centralized machine, and the problem of mining embedded patterns from large data trees has been neglected. This problem matters for many modern applications, particularly with the explosion of big data. In this paper, we propose a novel MapReduce approach to mining embedded patterns from a single large tree; it handles cases where either the tree itself or the intermediate mining results at low frequency thresholds cannot fit in the memory of any individual compute node. We also devise a set of optimizations to minimize inter-node communication. Experimental evaluation shows that our algorithm scales well to trees with over ten million vertices.
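As a rough illustration of the idea in the abstract, the sketch below counts two-node embedded patterns (ancestor-descendant label pairs) in a toy tree using a map step and a reduce step. It is not the paper's algorithm: the tree, the function names, and the use of a raw occurrence count as the frequency measure are all assumptions made for this example, and the whole tree is kept in memory for ancestor lookups, which is exactly the situation the paper aims to avoid.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical toy tree: node id -> (label, parent id or None).
TREE = {
    0: ("a", None),
    1: ("b", 0),
    2: ("c", 1),
    3: ("b", 0),
    4: ("c", 3),
    5: ("d", 0),
}


def ancestors(node_id, tree):
    """Yield the ids of all proper ancestors of node_id, nearest first."""
    parent = tree[node_id][1]
    while parent is not None:
        yield parent
        parent = tree[parent][1]


def map_phase(node_ids, tree):
    """Emit ((ancestor_label, descendant_label), 1) for each ancestor-descendant pair."""
    for nid in node_ids:
        label = tree[nid][0]
        for anc in ancestors(nid, tree):
            yield (tree[anc][0], label), 1


def reduce_phase(pairs):
    """Sum the emitted counts per label pair."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)


def mine_pairs(tree, partitions, min_count):
    """Run the map step on each partition, reduce, and keep frequent pairs."""
    mapped = chain.from_iterable(map_phase(p, tree) for p in partitions)
    counts = reduce_phase(mapped)
    return {k: v for k, v in counts.items() if v >= min_count}


# Two hypothetical partitions of the node set, as if handled by two mappers.
result = mine_pairs(TREE, [[0, 1, 2], [3, 4, 5]], min_count=2)
```

In this toy run the pair ("a", "c") is retained even though no "c" node is a direct child of the "a" node; that is what distinguishes an embedded pattern from one built only on parent-child edges.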
Pages: 455-462
Number of pages: 8