Contextual text representation for unsupervised knowledge discovery in texts

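Authors
Perrin, P [1]
Petry, F [1]
Affiliations
[1] Tulane Univ, Dept Comp Sci, New Orleans, LA 70118 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract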
This paper studies the role of lexical contextual relations for the problem of unsupervised knowledge discovery in full texts. Narrative texts have inherent structure dictated by language usage in generating them. We suggest that the relative distance of terms within a text gives sufficient information about its structure and its relevant content. Furthermore, this structure can be used to discover implicit knowledge embedded in the text, therefore serving as a good candidate to represent effectively the text content for knowledge elicitation tasks. We qualitatively demonstrate that a useful text structure and content can be systematically extracted by collocational lexical analysis without the need to encode any supplemental sources of knowledge. We present an algorithm that systematically extracts the most relevant facts in the texts and labels them by their overall theme, dictated by local contextual information. It exploits domain independent lexical frequencies and mutual information measures to find the relevant contextual units in the texts. We report results from experiments in a real-world textual database of psychiatric evaluation reports.
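The abstract names lexical frequencies and mutual information as the signals used to find relevant contextual units. As a rough illustration only, the sketch below scores term pairs that co-occur within a fixed token window by pointwise mutual information (PMI). It is not the authors' algorithm: the function name `windowed_pmi`, the `window` and `min_count` parameters, the whitespace tokenization, and the sample sentence are all assumptions made for this example.

```python
import math
from collections import Counter


def windowed_pmi(tokens, window=3, min_count=1):
    """Score unordered term pairs by PMI over co-occurrences within `window` tokens.

    Hypothetical helper for illustration; not taken from the paper.
    """
    term_counts = Counter(tokens)
    pair_counts = Counter()

    # Count each pair of distinct terms that appear at most `window` tokens apart.
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + 1 + window]:
            if left != right:
                pair_counts[tuple(sorted((left, right)))] += 1

    n_terms = sum(term_counts.values())
    n_pairs = sum(pair_counts.values())

    scores = {}
    for (a, b), count in pair_counts.items():
        if count < min_count:
            continue  # skip pairs seen too rarely to be trusted
        p_pair = count / n_pairs
        p_a = term_counts[a] / n_terms
        p_b = term_counts[b] / n_terms
        # PMI compares the observed pair probability with the independence baseline.
        scores[(a, b)] = math.log2(p_pair / (p_a * p_b))
    return scores


# Invented sample text, tokenized by whitespace (an assumption for this sketch).
tokens = "patient reports severe anxiety patient reports mild anxiety".split()
scores = windowed_pmi(tokens, window=1)

# List pairs from the strongest association to the weakest.
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 3))
```

On real reports, raising `min_count` would likely matter, since PMI is known to overrate pairs built from rare terms.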
Pages: 246-257
Page count: 12