Detecting irrelevant subtrees to improve probabilistic learning from tree-structured data

Cited by: 0
Authors
Habrard, A [1]
Bernard, M [1]
Sebban, M [1]
Affiliations
[1] Univ St Etienne, EURISE, F-42023 St Etienne 2, France
Keywords
data reduction; tree-structured data; noisy data; stochastic tree automata
DOI
Not available
Chinese Library Classification
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
With the rapid growth in the amount of available structured data (such as XML documents), many algorithms have emerged for dealing with tree-structured data. In this article, we present a probabilistic approach that aims to prune noisy or irrelevant subtrees a priori from a set of trees. The originality of this approach, compared with classic data reduction techniques, is that only part of a tree (i.e., a subtree) can be deleted, rather than the whole tree. Our method is based on confidence intervals, computed over a partition of subtrees according to a given probability distribution. We propose an original approach to assessing these intervals on tree-structured data, and we show experimentally its benefit in the presence of noise.
Pages: 103-130
Page count: 28
Related papers
50 in total
  • [1] Learning Tree-Structured Data in the Model Space
    Dong, Ya-dong
    Lv, Sheng-fei
    [J]. 2016 INTERNATIONAL CONFERENCE ON INFORMATION SYSTEM AND ARTIFICIAL INTELLIGENCE (ISAI 2016), 2016, : 258 - 266
  • [2] Complete Discovery of Weighted Frequent Subtrees in Tree-Structured Datasets
    AliMohammadzadeh, Rahman
    Zarnani, Ashkan
    Rahgozar, Masoud
    Chehreghani, Mostafa H.
    [J]. INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2006, 6 (8A): : 188 - 196
  • [3] Efficient mining of closed induced ordered subtrees in tree-structured databases
    Ozaki, Tomonobu
    Ohkawa, Takenao
    [J]. ICDM 2006: SIXTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, WORKSHOPS, 2006, : 279 - +
  • [4] Clustering of Tree-structured Data
    Lu, Na
    Wu, Yidan
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON INFORMATION AND AUTOMATION, 2015, : 1210 - 1215
  • [5] Tree2Vector: Learning a Vectorial Representation for Tree-Structured Data
    Zhang, Haijun
    Wang, Shuang
    Xu, Xiaofei
    Chow, Tommy W. S.
    Wu, Q. M. Jonathan
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (11) : 5304 - 5318
  • [6] eTREE: Learning Tree-structured Embeddings
    Almutairi, Faisal M.
    Wang, Yunlong
    Wang, Dong
    Zhao, Emily
    Sidiropoulos, Nicholas D.
    [J]. THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 6609 - 6617
  • [7] Evolution of Multiple Tree Structured Patterns from Tree-Structured Data Using Clustering
    Nagamine, Masatoshi
    Miyahara, Tetsuhiro
    Kuboyama, Tetsuji
    Ueda, Hiroaki
    Takahashi, Kenichi
    [J]. AI 2008: ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2008, 5360 : 500 - +
  • [8] Anonymizing Collections of Tree-Structured Data
    Gkountouna, Olga
    Terrovitis, Manolis
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2015, 27 (08) : 2034 - 2048
  • [9] Substructure search with tree-structured data
    Ozawa, K
    Yasuda, T
    Fujita, S
    [J]. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1997, 37 (04): : 688 - 695
  • [10] Tree-structured Clustering for Continuous Data
    Huh, Myung-Hoe
    Yang, Kyung-Sook
    [J]. KOREAN JOURNAL OF APPLIED STATISTICS, 2005, 18 (03) : 661 - 671