Learning Restricted Deterministic Regular Expressions with Counting

Cited by: 0
Authors
Wang, Xiaofan [1, 2]
Chen, Haiming [1]
Affiliations
[1] Chinese Acad Sci, Inst Software, State Key Lab Comp Sci, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Schema inference; Regular expressions; Counting; Descriptive generalization
DOI
10.1007/978-3-030-34223-4_7
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Regular expressions are widely used in various fields, and learning regular expressions from sequence data remains an active topic. Since many XML documents are not accompanied by a schema, or by a valid schema, learning regular expressions from XML documents is an essential task. In this paper, we propose a restricted subclass of single-occurrence regular expressions with counting (RCsores) and give a learning algorithm for RCsores. First, we learn a single-occurrence regular expression (SORE). Then, we construct an equivalent countable finite automaton (CFA). Next, the CFA runs on the given finite sample to obtain an updated CFA, which contains the counting operators occurring in an RCsore. Finally, we transform the updated CFA into an RCsore. Moreover, our algorithm ensures that the result is a minimal generalization (such a generalization is called descriptive) of the given finite sample.
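To make the idea of counting operators concrete, the following is a minimal sketch in Python. It is not the CFA construction described in the abstract; it only illustrates, under simplifying assumptions, how lower and upper repetition bounds for a symbol could be read off a finite sample by measuring maximal runs of consecutive identical symbols. The function name `run_length_bounds` and the sample words are hypothetical.

```python
from itertools import groupby


def run_length_bounds(sample):
    """Return {symbol: (min_run, max_run)} over all maximal runs in the sample.

    Illustrative assumption: each word is a string of single-character
    symbols, and counting only applies to runs of one repeated symbol.
    """
    bounds = {}
    for word in sample:
        # groupby splits the word into maximal runs of equal symbols.
        for symbol, run in groupby(word):
            n = len(list(run))
            lo, hi = bounds.get(symbol, (n, n))
            bounds[symbol] = (min(lo, n), max(hi, n))
    return bounds


# Hypothetical sample, not taken from the paper.
sample = ["aab", "aaab", "aabbb"]
print(run_length_bounds(sample))  # {'a': (2, 3), 'b': (1, 3)}
```

For this sample, the bounds suggest a counting expression in which `a` repeats between 2 and 3 times and `b` between 1 and 3 times. That expression accepts every word in the sample while staying tighter than an unbounded repetition such as `a+b+`.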
Pages: 98-114
Number of pages: 17