Stochastic grammatical inference of text database structure

Cited by: 14
Authors
Young-Lai, M [1]
Tompa, FW [1]
Affiliations
[1] Univ Waterloo, Dept Comp Sci, Waterloo, ON N2L 3G1, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
stochastic grammatical inference; text database structure
DOI
10.1023/A:1007653929870
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
For a document collection in which structural elements are identified with markup, it is often necessary to construct a grammar retrospectively that constrains element nesting and ordering. This has been addressed by others as an application of grammatical inference. We describe an approach based on stochastic grammatical inference which scales more naturally to large data sets and produces models with richer semantics. We adopt an algorithm that produces stochastic finite automata and describe modifications that enable better interactive control of results. Our experimental evaluation uses four document collections with varying structure.
Pages: 111-137
Page count: 27
Source
Machine Learning, 2000, 40: 111-137