Evolutionary Learning of Syntax Patterns for Genic Interaction Extraction

Cited by: 4
Authors
Bartoli, Alberto [1]
De Lorenzo, Andrea [1]
Medvet, Eric [1]
Tarlao, Fabiano [1]
Virgolin, Marco [1]
Affiliations
[1] Univ Trieste, DIA, I-34127 Trieste, Italy
Keywords
Regular Expressions; Genetic Programming; Programming by Example; Machine Learning
DOI
10.1145/2739480.2754706
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
There is increasing interest in techniques for automatic relation extraction from unstructured text. The biomedical domain, in particular, may greatly benefit from such techniques because of the huge and ever-increasing number of scientific publications describing observed phenomena of potential clinical interest. In this paper, we consider the problem of automatically identifying sentences that contain interactions between genes and proteins, based solely on a dictionary of genes and proteins and a small set of sample sentences in natural language. We propose an evolutionary technique for learning a classifier that is capable of detecting the desired sentences within scientific publications with high accuracy. The key feature of our proposal, which is internally based on Genetic Programming, is the construction of a model of the relevant syntax patterns in terms of standard part-of-speech annotations. The model consists of a set of regular expressions that are learned automatically despite the large alphabet size involved. We assess our approach on two realistic datasets and obtain 74% accuracy, a value high enough to be of practical interest and in line with significant baseline methods.
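The kind of model described in the abstract, a set of regular expressions over part-of-speech annotations, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the dictionary entries, the `GENE` placeholder tag, the Penn Treebank style tags, and the single hand-written pattern are hypothetical. In the paper the patterns are learned by Genetic Programming, which this sketch does not attempt to reproduce.

```python
import re

# Hypothetical dictionary of gene/protein names (illustrative only).
GENE_DICTIONARY = {"sigK", "gerE", "spoIIID"}

# Hypothetical hand-written pattern over tag sequences: a gene mention,
# later a verb, later another gene mention. The paper learns such
# patterns automatically; this one is written by hand for illustration.
PATTERNS = [
    re.compile(r"\bGENE (?:\S+ )*?VB[DZNP]? (?:\S+ )*?GENE\b"),
]


def to_tag_sequence(tagged_sentence):
    """Map (word, POS tag) pairs to a space-separated tag string.

    Words found in the dictionary become the placeholder GENE;
    all other words are represented by their POS tag.
    """
    return " ".join(
        "GENE" if word in GENE_DICTIONARY else tag
        for word, tag in tagged_sentence
    )


def contains_interaction(tagged_sentence):
    """Return True if any pattern matches the sentence's tag sequence."""
    sequence = to_tag_sequence(tagged_sentence)
    return any(pattern.search(sequence) for pattern in PATTERNS)


# Assumed pre-tagged input sentences (tags supplied by hand here).
positive = [("gerE", "NN"), ("strongly", "RB"), ("inhibits", "VBZ"),
            ("the", "DT"), ("expression", "NN"), ("of", "IN"), ("sigK", "NN")]
negative = [("The", "DT"), ("sigK", "NN"), ("gene", "NN"),
            ("is", "VBZ"), ("essential", "JJ")]

print(contains_interaction(positive))
print(contains_interaction(negative))
```

Working on tag sequences rather than raw words keeps the patterns independent of vocabulary, at the cost of a large symbol alphabet, which is the difficulty the abstract mentions.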
Pages: 1183-1190
Number of pages: 8
Related papers
50 in total
  • [41] Extraction of Genic Interactions with the Recursive Logical Theory of an Ontology
    Manine, Alain-Pierre
    Alphonse, Erick
    Bessieres, Philippe
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, 2010, 6008 : 549 - +
  • [42] A research program in neuroimaging for an evolutionary theory of syntax
    Tettamanti, Marco
    LANGUAGE AND COGNITION, 2013, 5 (2-3) : 157 - 166
  • [43] SYNTAX DIRECTED GRAPHICAL INTERACTION
    OLSEN, DR
    DEMPSEY, EP
    SIGPLAN NOTICES, 1983, 18 (06): : 112 - 117
  • [44] Projection and Minimalistic Syntax in Interaction
    Auer, Peter
    DISCOURSE PROCESSES, 2009, 46 (2-3) : 180 - 205
  • [45] Syntax-Based Collocation Extraction
    Pecina, Pavel
    COMPUTATIONAL LINGUISTICS, 2011, 37 (03) : 631 - 633
  • [46] Syntax-Based Collocation Extraction
    Villavicencio, Aline
    NATURAL LANGUAGE ENGINEERING, 2012, 18 : 575 - 579
  • [47] Syntax-Based Collocation Extraction
    Tutin, Agnes
    TRAITEMENT AUTOMATIQUE DES LANGUES, 2011, 52 (03): : 288 - 292
  • [48] Syntax-Based Collocation Extraction
    Williams, Geoffrey
    INTERNATIONAL JOURNAL OF LEXICOGRAPHY, 2013, 26 (01) : 90 - 94
  • [49] Genetic & Evolutionary Biometrics: Feature Extraction from a Machine Learning Perspective
    Shelton, Joseph
    Alford, Aniesha
    Small, Lasanio
    Leflore, Derrick
    Williams, Jared
    Adams, Joshua
    Dozier, Gerry
    Bryant, Kelvin
    Abegaz, Tamirat
    Ricanek, Karl
    2012 PROCEEDINGS OF IEEE SOUTHEASTCON, 2012,
  • [50] Evolutionary discrimination of mammalian conserved non-genic sequences (CNGs)
    Dermitzakis, ET
    Reymond, A
    Scamuffa, N
    Ucla, C
    Kirkness, E
    Rossier, C
    Antonarakis, SE
    SCIENCE, 2003, 302 (5647) : 1033 - 1035