Parsa: An open information extraction system for Persian

Cited by: 3
Authors
Rahat, Mahmoud [1]
Talebpour, Alireza [1]
Affiliations
[1] Shahid Beheshti Univ, Fac Comp Sci & Engn, Tehran, Iran
DOI
10.1093/llc/fqy003
Chinese Library Classification (CLC) number
C [Social Sciences, General]
Discipline classification code
03; 0303
Abstract
This article presents Parsa, an open information extraction (OIE) system for Persian. Compared with the advanced approaches available for English, OIE has only just begun to develop for other languages. Existing systems use information about the grammar and syntactic structures of the target language to achieve domain independence, a key goal of OIE. To model these complex structures better, Parsa introduces a novel set of tree-structured patterns. The patterns also let Parsa specify POS tags and lexical constraints that reduce incorrect matches. Each Tree Pattern is placed inside a Package according to its type and priority. The Packages help Parsa alleviate some challenges of processing Persian, such as the null-subject problem and uninformative extractions. To keep the extraction process simple and coherent, we separate the matching template from the extraction template. An efficient algorithm for matching patterns inside the dependency parse of a sentence is also presented. Our experiments show that Parsa outperforms state-of-the-art systems for Persian and is highly comparable with existing approaches for English.
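The abstract's main ideas (tree patterns with POS and lexical constraints, packages ordered by priority, and a matching template kept separate from the extraction template) can be illustrated with a toy sketch. This is not Parsa's actual algorithm or data model: the classes `Node` and `PatternNode`, the functions `match` and `extract`, the slot names, and the English example sentence are all hypothetical, chosen only to show the general technique of matching a small tree pattern against a dependency parse.

```python
from __future__ import annotations
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class Node:
    """A token in a dependency parse (hypothetical minimal representation)."""
    word: str
    pos: str
    deprel: str  # dependency relation to the parent
    children: list[Node] = field(default_factory=list)


@dataclass
class PatternNode:
    """One node of a tree pattern; a None field means 'no constraint'."""
    deprel: Optional[str] = None
    pos: Optional[str] = None
    lexical: Optional[set[str]] = None  # allowed word forms, if any
    slot: Optional[str] = None          # name referenced by the extraction template
    children: list[PatternNode] = field(default_factory=list)


def match(pattern: PatternNode, node: Node,
          bindings: Optional[dict] = None) -> Optional[dict]:
    """Return slot bindings if the pattern matches at this node, else None."""
    bindings = dict(bindings or {})  # copy so failed branches leave no trace
    if pattern.deprel is not None and node.deprel != pattern.deprel:
        return None
    if pattern.pos is not None and node.pos != pattern.pos:
        return None
    if pattern.lexical is not None and node.word not in pattern.lexical:
        return None
    if pattern.slot is not None:
        bindings[pattern.slot] = node.word
    return _match_children(pattern.children, node.children, frozenset(), bindings)


def _match_children(pchildren, nchildren, used, bindings):
    """Assign each pattern child to a distinct parse child, with backtracking."""
    if not pchildren:
        return bindings
    first, rest = pchildren[0], pchildren[1:]
    for i, child in enumerate(nchildren):
        if i in used:
            continue
        b = match(first, child, bindings)
        if b is not None:
            result = _match_children(rest, nchildren, used | {i}, b)
            if result is not None:
                return result
    return None


def walk(node: Node) -> Iterator[Node]:
    """Yield every node of the parse tree, parent before children."""
    yield node
    for child in node.children:
        yield from walk(child)


def extract(root: Node, package) -> Optional[tuple]:
    """Try (matching pattern, extraction template) pairs in priority order."""
    for pattern, template in package:
        for node in walk(root):
            b = match(pattern, node)
            if b is not None:
                return tuple(b[slot] for slot in template)
    return None


# Toy parse of "Parsa extracts relations" (illustrative, not Persian data).
tree = Node("extracts", "VERB", "root", [
    Node("Parsa", "PROPN", "nsubj"),
    Node("relations", "NOUN", "obj"),
])

# Matching template: a verb with a subject child and an object child.
svo = PatternNode(pos="VERB", slot="rel", children=[
    PatternNode(deprel="nsubj", slot="arg1"),
    PatternNode(deprel="obj", slot="arg2"),
])

# Extraction template: the order in which bound slots form the output tuple.
package = [(svo, ("arg1", "rel", "arg2"))]

print(extract(tree, package))  # ('Parsa', 'extracts', 'relations')
```

Keeping the extraction template as a separate tuple of slot names means the same matching pattern could be reused with a different output order without changing the matcher.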
Pages: 874-893
Number of pages: 20
Related papers
50 in total
  • [1] RePersian: An Efficient Open Information Extraction Tool in Persian
    Saheb-Nassagh, Raana
    Asgari, Majid
    Minaei-Bidgoli, Behrouz
    [J]. 2020 6TH INTERNATIONAL CONFERENCE ON WEB RESEARCH (ICWR), 2020, : 93 - 99
  • [2] A new representation of open information extraction in Persian language
    Nematollahi, Mohammad Mahdi
    Marouzi, Omid Reza
    [J]. INTERNATIONAL JOURNAL OF NONLINEAR ANALYSIS AND APPLICATIONS, 2019, 10 (02): : 189 - 196
  • [3] A recursive algorithm for open information extraction from Persian texts
    Rahat, Mahmoud
    Talebpour, Alireza
    Monemian, Seyedamin
    [J]. INTERNATIONAL JOURNAL OF COMPUTER APPLICATIONS IN TECHNOLOGY, 2018, 57 (03) : 193 - 206
  • [4] Open information extraction as an intermediate semantic structure for Persian text summarization
    Rahat, Mahmoud
    Talebpour, Alireza
    [J]. INTERNATIONAL JOURNAL ON DIGITAL LIBRARIES, 2018, 19 (04) : 339 - 352
  • [5] APRCOIE: An open information extraction system for Chinese
    Liao, Yan
    Hua, Jialin
    Luo, Liangqing
    Ping, Weiying
    Lu, Xuewen
    Zhong, Yuansheng
    [J]. SOFTWAREX, 2024, 26
  • [6] An Open Information Extraction For Question Answering System
    Thenmozhi, D.
    Kumar, G. Ravi
    [J]. 2018 2ND INTERNATIONAL CONFERENCE ON COMPUTER, COMMUNICATION, AND SIGNAL PROCESSING (ICCCSP): SPECIAL FOCUS ON TECHNOLOGY AND INNOVATION FOR SMART ENVIRONMENT, 2018, : 82 - 86
  • [7] PERSEPOLIS - ARCHAEOLOGY OF PARSA, SEAT OF PERSIAN KINGS - WILBER,DN
    HARPER, PO
    [J]. ARCHAEOLOGY, 1971, 24 (03) : 291 - 292
  • [8] Phrase-based Clause Extraction for Open Information Extraction System
    Romadhony, Ade
    Widyantoro, Dwi H.
    Purwarianti, Ayu
    [J]. 2015 INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER SCIENCE AND INFORMATION SYSTEMS (ICACSIS), 2015, : 155 - 162
  • [9] An Open Relation Extraction System for Web Text Information
    Li, Huagang
    Liu, Bo
    [J]. APPLIED SCIENCES-BASEL, 2022, 12 (11):
  • [10] InferPortOIE: A Portuguese Open Information Extraction system with inferences
    Lima Sena, Cleiton Fernando
    Claro, Daniela Barreiro
    [J]. NATURAL LANGUAGE ENGINEERING, 2019, 25 (02) : 287 - 306