A polynomial time matching algorithm of structured ordered tree patterns for data mining from semistructured data

Cited by: 0
Authors
Suzuki, Y [1]
Inomae, K
Shoudai, T
Miyahara, T
Uchida, T
Affiliations
[1] Kyushu Univ, Dept Informat, Kasuga, Fukuoka 8168580, Japan
[2] Hiroshima City Univ, Fac Informat Sci, Hiroshima 7313194, Japan
Source
INDUCTIVE LOGIC PROGRAMMING | 2003 / Vol. 2583
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Tree structured data such as HTML/XML files are represented by rooted trees with ordered children and edge labels. Knowledge representations for tree structured data are important for discovering the interesting features that such data have. In this paper, as a representation of structural features, we propose a structured ordered tree pattern, called a term tree, which is a rooted tree pattern consisting of ordered children and structured variables. A variable in a term tree can be replaced by an arbitrary tree. Deciding whether given tree structured data have a particular structural feature is a core problem in data mining from large tree structured data. We consider the problem of deciding whether a term tree t matches a tree T, that is, whether T is obtained from t by substituting trees for the variables in t. This problem is called the membership problem for t and T. Given a term tree t and a tree T, we present an O(nN) time algorithm for solving the membership problem for t and T, where n and N are the numbers of vertices in t and T, respectively. We also report experiments in which our matching algorithm is applied to a collection of real Web documents.
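To make the membership problem concrete, here is a minimal Python sketch of a much simplified variant. It is not the O(nN) algorithm of the paper. It assumes that labels sit on vertices rather than edges, and that a variable is a pattern leaf standing for exactly one whole subtree, whereas the paper's structured variables are more general. The names `Node` and `matches` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    # Assumption: the label is stored on the vertex; None marks a variable
    # leaf, which is only meaningful inside a pattern.
    label: Optional[str]
    children: List["Node"] = field(default_factory=list)


def matches(pattern: Node, tree: Node) -> bool:
    """Return True if `tree` is obtained from `pattern` by replacing each
    variable leaf with some subtree (simplified model, see the lead-in)."""
    if pattern.label is None:
        # A variable accepts any subtree.
        return True
    if pattern.label != tree.label:
        return False
    if len(pattern.children) != len(tree.children):
        # In this simplified model a variable stands for exactly one child,
        # so the child counts must agree.
        return False
    # Children are ordered, so they are compared position by position.
    return all(matches(p, c) for p, c in zip(pattern.children, tree.children))


# Usage: the pattern html(head, x) with variable x.
pattern = Node("html", [Node("head"), Node(None)])
document = Node("html", [Node("head"), Node("body", [Node("p")])])
print(matches(pattern, document))
```

Under these restrictions each pattern vertex is visited at most once, so the check is linear in the size of the pattern. The harder case treated in the paper arises because a substituted tree may contribute several children at the vertex where the variable is attached.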
Pages: 270-284
Page count: 15
Related papers
50 in total
  • [1] A polynomial time matching algorithm of ordered tree patterns having height-constrained variables
    Aikou, K
    Suzuki, Y
    Shoudai, T
    Uchida, T
    Miyahara, T
    [J]. COMBINATORIAL PATTERN MATCHING, PROCEEDINGS, 2005, 3537 : 346 - 357
  • [2] Mining is-part-of association patterns from semistructured data
    Wang, K
    Liu, HQ
    [J]. KNOWLEDGE MANAGEMENT & INTELLIGENT ENTERPRISES, 2001, : 189 - 204
  • [3] Evolution of characteristic tree structured patterns from semistructured documents
    Inata, Katsushi
    Miyahara, Tetsuhiro
    Ueda, Hiroaki
    Takahashi, Kenichi
    [J]. AI 2006: ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2006, 4304 : 1201 - +
  • [4] Polynomial time matching algorithms for tree-like structured patterns in Knowledge Discovery
    Miyahara, T
    Shoudai, T
    Uchida, T
    Takahashi, K
    Ueda, H
    [J]. KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS: CURRENT ISSUES AND NEW APPLICATIONS, 2000, 1805 : 5 - 16
  • [5] Extraction of tag tree patterns with contractible variables from irregular semistructured data
    Miyahara, T
    Suzuki, Y
    Shoudai, T
    Uchida, T
    Hirokawa, S
    Takahashi, K
    Ueda, H
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, 2003, 2637 : 430 - 436
  • [6] Polynomial Time Inductive Inference of Languages of Ordered Term Tree Patterns with Height-Constrained Variables from Positive Data
    Shoudai, Takayoshi
    Aikoh, Kazuhide
    Suzuki, Yusuke
    Matsumoto, Satoshi
    Miyahara, Tetsuhiro
    Uchida, Tomoyuki
    [J]. IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2017, E100A (03) : 785 - 802
  • [7] Ordered term tree languages which are polynomial time inductively inferable from positive data
    Suzuki, Y
    Shoudai, T
    Uchida, T
    Miyahara, T
    [J]. THEORETICAL COMPUTER SCIENCE, 2006, 350 (01) : 63 - 90
  • [8] Ordered term tree languages which are polynomial time inductively inferable from positive data
    Suzuki, Y
    Shoudai, T
    Uchida, T
    Miyahara, T
    [J]. ALGORITHMIC LEARNING THEORY, PROCEEDINGS, 2002, 2533 : 188 - 202
  • [9] An Efficient Pattern Matching Algorithm for Ordered Term Tree Patterns
    Suzuki, Yusuke
    Shoudai, Takayoshi
    Uchida, Tomoyuki
    Miyahara, Tetsuhiro
    [J]. IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2015, E98A (06) : 1197 - 1211
  • [10] On subtyping of tree-structured data: A polynomial approach
    Bry, F
    Drabent, W
    Maluszynski, J
    [J]. PRINCIPLES AND PRACTICE OF SEMANTIC WEB REASONING, PROCEEDINGS, 2004, 3208 : 1 - 18