Text feature weighting for summarization of documents in bahasa Indonesia using genetic algorithm

Cited by: 1
Authors
Aristoteles [1]
Herdiyeni, Yeni [2]
Ridha, Ahmad [2]
Adisantoso, Julio [2]
Affiliations
[1] Department of Computer Science, University of Lampung, Bandar Lampung, 35145, Indonesia
[2] Department of Computer Science, Bogor Agricultural University, Bogor, 16680, Indonesia
Keywords
Semantics - Text processing;
DOI
Not available
Abstract
This paper aims to perform text feature weighting for summarization of documents in bahasa Indonesia using a genetic algorithm. There are eleven text features, i.e., sentence position (f1), positive keywords in sentence (f2), negative keywords in sentence (f3), sentence centrality (f4), sentence resemblance to the title (f5), sentence inclusion of named entities (f6), sentence inclusion of numerical data (f7), sentence relative length (f8), bushy path of the node (f9), summation of similarities for each node (f10), and latent semantic feature (f11). We first investigate the effect of the first ten sentence features on the summarization task, then add the latent semantic feature to increase accuracy. All feature score functions are used to train a genetic algorithm model to obtain a suitable combination of feature weights. Text summarization is evaluated using the F-measure, which is directly related to the compression rate. The results show that adding f11 increases the F-measure by 3.26% and 1.55% for compression ratios of 10% and 30%, respectively, but decreases it by 0.58% for a compression ratio of 20%. Analysis of the text feature weights shows that using only f2, f4, f5, and f11 can deliver performance similar to using all eleven features. © 2012 International Journal of Computer Science Issues.
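The weighting scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the genetic operators, so tournament selection, uniform crossover, Gaussian mutation, and two-individual elitism are assumptions, and every function and parameter name (`f_measure`, `summarize`, `fitness`, `evolve_weights`, `compression`, and so on) is hypothetical. Each sentence is assumed to arrive as a row of precomputed feature scores (f1 to f11 in the paper), and the reference summary as a set of sentence indices.

```python
import random


def f_measure(selected, reference):
    """Harmonic mean of precision and recall between two sets of sentence indices."""
    selected, reference = set(selected), set(reference)
    overlap = len(selected & reference)
    if overlap == 0:
        return 0.0
    precision = overlap / len(selected)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def summarize(features, weights, compression):
    """Score each sentence as a weighted sum of its features and keep the top fraction."""
    k = max(1, round(len(features) * compression))
    scores = [sum(w * f for w, f in zip(weights, row)) for row in features]
    ranked = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def fitness(weights, documents, compression):
    """Mean F-measure over (feature_rows, reference_indices) training pairs."""
    total = sum(
        f_measure(summarize(rows, weights, compression), reference)
        for rows, reference in documents
    )
    return total / len(documents)


def evolve_weights(documents, n_features, compression=0.3, pop_size=20,
                   generations=30, mutation_rate=0.1, seed=0):
    """Search for feature weights in [0, 1] with a simple genetic algorithm.

    Operators are assumptions, not taken from the paper: tournament selection
    (size 3), uniform crossover, Gaussian mutation, and elitism of size 2.
    """
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_features)] for _ in range(pop_size)]

    def score(w):
        return fitness(w, documents, compression)

    for _ in range(generations):
        population.sort(key=score, reverse=True)
        next_gen = population[:2]  # elitism: carry the two best unchanged
        while len(next_gen) < pop_size:
            parent_a = max(rng.sample(population, 3), key=score)
            parent_b = max(rng.sample(population, 3), key=score)
            child = [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]
            child = [
                min(1.0, max(0.0, g + rng.gauss(0, 0.2))) if rng.random() < mutation_rate else g
                for g in child
            ]
            next_gen.append(child)
        population = next_gen
    return max(population, key=score)
```

Calling `evolve_weights(documents, n_features=11, compression=0.1)` on a list of `(feature_rows, reference_indices)` pairs would return one weight per feature. Because the elite individuals are carried forward unchanged, the best training F-measure never decreases from one generation to the next under these assumptions.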
Pages: 1 / 6