Open information extraction as an intermediate semantic structure for Persian text summarization

Cited by: 3
Authors
Rahat, Mahmoud [1]
Talebpour, Alireza [1]
Affiliations
[1] Shahid Beheshti Univ, Fac Comp Sci & Engn, Tehran, Iran
Keywords
Text summarization; Extractive summary; Open information extraction; Persian (Farsi) text processing
DOI
10.1007/s00799-018-0244-z
Chinese Library Classification number
G25 [Library science, librarianship]; G35 [Information science, information work]
Subject classification code
1205; 120501
Abstract
Semantic applications typically exploit structures such as dependency parse trees, phrase chunking, semantic role labeling, or open information extraction (Open IE). In this paper, we introduce a novel application of Open IE as an intermediate layer for text summarization. Text summarization is an important method for providing relevant information in large digital libraries. Open IE refers to the process of extracting machine-understandable structural propositions from text. We use these propositions as building blocks to shorten sentences and generate a summary of the text. The proposed system offers a new form of summarization that can break the structure of a sentence and extract its most significant sub-sentential elements. Other advantages include the ability to identify and eliminate less important parts of a sentence (such as adverbs, adjectives, appositions, or dependent clauses) as well as duplicated pieces of sentences, which in turn frees space for more sentences in the summary and improves its coverage and coherence. The proposed system is localized for the Persian language; however, it can be adapted to other languages. Experiments on the standard Pasokh data set with a standard comparison tool showed promising results for the proposed approach. We used summaries produced by the system in a real-world application in the virtual library of Shahid Beheshti University and received positive feedback from users.
Pages: 339-352
Number of pages: 14
Related papers
50 items in total
  • [41] A global and local information extraction model incorporating selection mechanism for abstractive text summarization
    Yuanyuan Li
    Yuan Huang
    Weijian Huang
    Wei Wang
    [J]. Multimedia Tools and Applications, 2024, 83 : 4859 - 4886
  • [42] Long Text Summarization and Key Information Extraction in a Multi-Task Learning Framework
    Lu, Ming
    Chen, Rongfa
    [J]. Applied Mathematics and Nonlinear Sciences, 2024, 9 (01)
  • [43] Abstractive Text Summarization Based on Semantic Alignment Network
    Wu, Shixin
    Huang, Degen
    Li, Jiuyi
    [J]. Beijing Daxue Xuebao (Ziran Kexue Ban)/Acta Scientiarum Naturalium Universitatis Pekinensis, 2021, 57 (01): 1 - 6
  • [44] Semantic similarity and text summarization based novelty detection
    Kumar, Sushil
    Bhatia, Komal Kumar
    [J]. SN APPLIED SCIENCES, 2020, 2 (03):
  • [45] Automatic text summarization using latent semantic analysis
    I. V. Mashechkin
    M. I. Petrovskiy
    D. S. Popov
    D. V. Tsarev
    [J]. Programming and Computer Software, 2011, 37 : 299 - 305
  • [46] Automatic Text Summarization Using Latent Semantic Analysis
    Mashechkin, I. V.
    Petrovskiy, M. I.
    Popov, D. S.
    Tsarev, D. V.
    [J]. PROGRAMMING AND COMPUTER SOFTWARE, 2011, 37 (06) : 299 - 305
  • [47] Text summarization evaluation using semantic probability distributions
    Le, Anh
    Wu, Fred
    Vu, Lan
    Le, Thanh
    [J]. 2023 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE, CSCI 2023, 2023, : 207 - 212
  • [48] A Novel Framework for Semantic Oriented Abstractive Text Summarization
    Moratanch, N.
    Chitrakala, S.
    [J]. JOURNAL OF WEB ENGINEERING, 2018, 17 (08): 675 - 716
  • [49] Syntactic and Semantic-driven Learning for Open Information Extraction
    Tang, Jialong
    Lu, Yaojie
    Lin, Hongyu
    Han, Xianpei
    Sun, Le
    Xiao, Xinyan
    Wu, Hua
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 782 - 792
  • [50] KANNADA TEXT SUMMARIZATION USING LATENT SEMANTIC ANALYSIS
    Geetha, J. K.
    Deepamala, N.
    [J]. 2015 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2015, : 1508 - 1512