Text feature weighting for summarization of documents in bahasa Indonesia using genetic algorithm

Cited by: 1
Authors
Aristoteles [1]
Herdiyeni, Yeni [2]
Ridha, Ahmad [2]
Adisantoso, Julio [2]
Affiliations
[1] Department of Computer Science, University of Lampung, Bandar Lampung, 35145, Indonesia
[2] Department of Computer Science, Bogor Agricultural University, Bogor, 16680, Indonesia
Source
International Journal of Computer Science Issues | 2012 / Vol. 9 / Issue 03
Keywords
Semantics - Text processing;
DOI
Not available
Abstract
This paper performs text feature weighting for summarization of documents in bahasa Indonesia using a genetic algorithm. There are eleven text features, i.e., sentence position (f1), positive keywords in sentence (f2), negative keywords in sentence (f3), sentence centrality (f4), sentence resemblance to the title (f5), sentence inclusion of named entities (f6), sentence inclusion of numerical data (f7), sentence relative length (f8), bushy path of the node (f9), summation of similarities for each node (f10), and latent semantic feature (f11). We first investigate the effect of the first ten sentence features on the summarization task, then add the latent semantic feature to increase accuracy. All feature score functions are used to train a genetic algorithm model to obtain a suitable combination of feature weights. Text summarization is evaluated using the F-measure, which is directly related to the compression rate. The results show that adding f11 increases the F-measure by 3.26% and 1.55% for compression ratios of 10% and 30%, respectively, but decreases it by 0.58% for a compression ratio of 20%. Analysis of the text feature weights shows that using only f2, f4, f5, and f11 delivers performance similar to using all eleven features. © 2012 International Journal of Computer Science Issues.
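The abstract describes the pipeline only at a high level, so the following is a minimal sketch rather than the authors' implementation. It assumes each sentence is scored as a weighted sum of its feature scores, that the summary keeps the top-scoring sentences up to the compression rate, and that the genetic algorithm's fitness is the F-measure against a reference summary. The function names, the population size, truncation selection, one-point crossover, and the mutation rate are all hypothetical choices made for illustration; the record does not state which operators the paper used.

```python
import random

def score_sentences(features, weights):
    """Weighted sum of per-sentence feature scores (one row per sentence)."""
    return [sum(w * f for w, f in zip(weights, row)) for row in features]

def select_summary(features, weights, compression_rate):
    """Indices of the top-scoring sentences, returned in document order."""
    n_keep = max(1, round(len(features) * compression_rate))
    scores = score_sentences(features, weights)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])

def f_measure(selected, reference):
    """Harmonic mean of precision and recall over sentence indices."""
    overlap = len(set(selected) & set(reference))
    if overlap == 0:
        return 0.0
    precision = overlap / len(selected)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)

def evolve_weights(features, reference, compression_rate,
                   pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    """Toy genetic algorithm over weight vectors; needs at least 2 features.

    All operator choices here are assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    n_features = len(features[0])

    def fitness(weights):
        summary = select_summary(features, weights, compression_rate)
        return f_measure(summary, reference)

    population = [[rng.random() for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]      # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)     # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: resample a gene with probability mutation_rate.
            child = [rng.random() if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

if __name__ == "__main__":
    # Made-up feature scores for 10 sentences and 3 features (not paper data).
    data_rng = random.Random(1)
    toy_features = [[data_rng.random() for _ in range(3)] for _ in range(10)]
    toy_reference = [0, 4, 7]                      # hypothetical gold summary
    best = evolve_weights(toy_features, toy_reference, compression_rate=0.3)
    print(best, f_measure(select_summary(toy_features, best, 0.3), toy_reference))
```

Because the parents survive unchanged into the next generation, the best F-measure in the population never decreases between generations. Inspecting the learned weights is also how one would look for a small subset of dominant features, in the spirit of the abstract's observation about f2, f4, f5, and f11.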
Pages: 1 / 6
Related papers
50 in total
  • [41] USING GENETIC ALGORITHMS WITH LEXICAL CHAINS FOR AUTOMATIC TEXT SUMMARIZATION
    Berker, Mine
    Guengoer, Tunga
    ICAART: PROCEEDINGS OF THE 4TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 1, 2012, : 595 - 600
  • [42] Single Document Extractive Text Summarization Using Genetic Algorithms
    Chatterjee, Niladri
    Mittal, Amol
    Goyal, Shubham
    2012 THIRD INTERNATIONAL CONFERENCE ON EMERGING APPLICATIONS OF INFORMATION TECHNOLOGY (EAIT), 2012, : 19 - 23
  • [43] Using Clustering and a Modified Classification algorithm for automatic text summarization
    Aries, Abdelkrime
    Oufaida, Houda
    Nouali, Omar
    DOCUMENT RECOGNITION AND RETRIEVAL XX, 2013, 8658
  • [44] Extractive Arabic Text Summarization Using Modified PageRank Algorithm
    Elbarougy, Reda
    Behery, Gamal
    El Khatib, Akram
    EGYPTIAN INFORMATICS JOURNAL, 2020, 21 (02) : 73 - 81
  • [45] A Text Summarization Hybrid Approach Using CNN and the Firefly Algorithm
    Prathap G.
    Rathinasabapathy R.
    SN Computer Science, 5 (1)
  • [46] Generic and Update Multi-Document Text Summarization based on Genetic Algorithm
    Neri-Mendoza, Veronica
    Ledeneva, Yulia
    Arnulfo Garcia-Hernandez, Rene
    Hernandez-Castaneda, Angel
    COMPUTACION Y SISTEMAS, 2023, 27 (01): : 269 - 279
  • [47] An Automatic Text Summarization using Text Features and Singular Value Decomposition for Popular Articles in Indonesia Language
    Gunawan, Fergyanto E.
    Juandi, Adrian Victor
    Soewito, Benfano
    2015 INTERNATIONAL SEMINAR ON INTELLIGENT TECHNOLOGY AND ITS APPLICATIONS (ISITIA), 2015, : 27 - 32
  • [48] A Heuristic Feature Selection Approach for Text Categorization by Using Chaos Optimization and Genetic Algorithm
    Chen, Hao
    Jiang, Wen
    Li, Canbing
    Li, Rui
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2013, 2013
  • [49] Indonesian News Articles Summarization Using Genetic Algorithm
    Khotimah, Nurul
    Girsang, Abba Suganda
    ENGINEERING LETTERS, 2022, 30 (01) : 152 - 160
  • [50] Feature Weighting Improvement of Web Text Categorization Based on Particle Swarm Optimization Algorithm
    Lu, Yonghe
    Peng, Yanhong
    JOURNAL OF COMPUTERS, 2015, 10 (04) : 260 - 267