A Novel Chinese Text Summarization Approach Using Sentence Extraction Based on Kernel Words Recognition

Cited by: 3
Authors
Yang, Weijie [1]
Dai, Ruwei [1]
Cui, Xia [1]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Key Lab Complex Syst & Intelligence Sci, Beijing 100864, Peoples R China
DOI
10.1109/FSKD.2008.20
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The continuing growth of the World Wide Web and of online text collections makes a large volume of information available to users. Automatic text summarization helps users understand documents quickly. This paper proposes an automated technique for Chinese document summarization based on kernel word recognition and discourse segment extraction. The method consists of five steps. First, the input articles are annotated by lexical analysis. Second, all focused named entities are recognized using a machine learning method. Third, the input articles are divided into several discourse segments, the kernel words of these segments are extracted by rule-based main verb recognition, and all relations among entities are extracted. Fourth, all important sentence candidates are ranked according to a set of rules, and redundant sentences are removed based on kernel word information. Finally, the most important sentences are extracted to compose the summary according to the expected compression ratio, and these sentences are output using a special document as a reference. A series of experiments is performed on two Chinese document collections. The results show that the proposed technique outperforms the reference systems.
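The fourth and fifth steps above (ranking, redundancy removal, and selection by compression ratio) can be illustrated with a minimal sketch. The abstract does not give the ranking rules or the redundancy criterion, so everything here is an assumption for illustration: the `Candidate` record, the precomputed `score` and `kernel_words` fields, the use of Jaccard overlap, and the `max_overlap` threshold are all hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    index: int               # position of the sentence in the source document
    text: str
    score: float             # hypothetical rule-based importance score
    kernel_words: frozenset  # hypothetical kernel words (e.g. main verbs)


def jaccard(a, b):
    """Overlap between two kernel-word sets, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def summarize(candidates, compression_ratio=0.3, max_overlap=0.5):
    """Greedy extraction: rank, skip redundant sentences, restore order."""
    if not candidates:
        return []
    # Number of sentences to keep; always keep at least one.
    budget = max(1, math.floor(len(candidates) * compression_ratio))
    # Highest score first; ties broken by document position.
    ranked = sorted(candidates, key=lambda c: (-c.score, c.index))
    chosen = []
    for cand in ranked:
        if len(chosen) >= budget:
            break
        # Assumed redundancy test: too much kernel-word overlap
        # with a sentence that has already been selected.
        redundant = any(
            jaccard(cand.kernel_words, kept.kernel_words) > max_overlap
            for kept in chosen
        )
        if not redundant:
            chosen.append(cand)
    # Emit the selected sentences in their original document order.
    return [c.text for c in sorted(chosen, key=lambda c: c.index)]


doc = [
    Candidate(0, "The company released a new product today.", 0.9,
              frozenset({"release"})),
    Candidate(1, "A new product was released by the company.", 0.8,
              frozenset({"release"})),
    Candidate(2, "Analysts forecast that sales will grow.", 0.7,
              frozenset({"forecast", "grow"})),
    Candidate(3, "The launch event was held in Beijing.", 0.4,
              frozenset({"hold"})),
]
summary = summarize(doc, compression_ratio=0.5)
```

In this toy input the second sentence shares its only kernel word with the first, so it is skipped as redundant and the third sentence fills the remaining slot. A real system would obtain the scores and kernel words from the earlier lexical analysis, entity recognition, and main verb recognition steps.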
Pages: 134-139
Page count: 6