Corpus-based semantic role approach in information retrieval

Cited by: 24
Authors
Moreda, Paloma [1]
Navarro, Borja [1]
Palomar, Manuel [1]
Affiliation
[1] Univ Alicante, Dept Software & Comp Syst, Nat Language Proc & Informat Syst Grp, E-03080 Alicante, Spain
Keywords
semantic roles; information retrieval systems; corpus-based methods; feature selection procedure; word sense disambiguation; shallow parsing; PoS tag; lemma
DOI
10.1016/j.datak.2006.06.010
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents SemRol, a method for determining the semantic roles of the constituents of a sentence. SemRol is a corpus-based approach that uses two different statistical models: conditional Maximum Entropy (ME) probability models and TiMBL, a memory-based learning program. It consists of three phases that use features built from words, lemmas, PoS tags and shallow parsing information. The method introduces a new phase into the Semantic Role Labeling task, which has usually been approached as a two-phase procedure consisting of argument recognition and argument labeling. In our view, the sense of each verb in the sentence must be disambiguated first, because the set of roles to consider depends on the sense of the verb. For the argument labeling phase, a tuning procedure is presented that identifies one of the best feature sets for the task. With this set, which differs between TiMBL and ME, precisions of 76.71% for TiMBL and 70.55% for ME are obtained. Furthermore, the semantic role information provided by SemRol could be used to extend Information Retrieval or Question Answering systems. We propose using this semantic information as an extension of an Information Retrieval system in order to reduce the number of documents or passages retrieved by the system. (C) 2006 Elsevier B.V. All rights reserved.
Pages: 467-483
Number of pages: 17