Detecting irrelevant subtrees to improve probabilistic learning from tree-structured data

Cited by: 0
Authors
Habrard, A [1]
Bernard, M [1]
Sebban, M [1]
Affiliation
[1] Univ St Etienne, EURISE, F-42023 St Etienne 2, France
Keywords
data reduction; tree-structured data; noisy data; stochastic tree automata
DOI
Not available
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
With the large increase in the amount of available structured data (such as XML documents), many algorithms have emerged for dealing with tree-structured data. In this article, we present a probabilistic approach that aims to prune noisy or irrelevant subtrees from a set of trees a priori. The originality of this approach, compared with classic data reduction techniques, is that only a part of a tree (i.e., a subtree) can be deleted, rather than the whole tree. Our method is based on confidence intervals computed on a partition of subtrees according to a given probability distribution. We propose an original approach to assess these intervals on tree-structured data, and we experimentally show its usefulness in the presence of noise.
Pages: 103-130
Page count: 28