Text feature weighting for summarization of documents in bahasa Indonesia using genetic algorithm

Cited by: 1
Authors
Aristoteles [1]
Herdiyeni, Yeni [2]
Ridha, Ahmad [2]
Adisantoso, Julio [2]
Affiliations
[1] Department of Computer Science, University of Lampung, Bandar Lampung, 35145, Indonesia
[2] Department of Computer Science, Bogor Agricultural University, Bogor, 16680, Indonesia
Source
International Journal of Computer Science Issues | 2012 / Vol. 9 / Issue 03
Keywords
Semantics; Text processing
DOI
Not available
Abstract
This paper performs text feature weighting for summarization of documents in bahasa Indonesia using a genetic algorithm. There are eleven text features, i.e., sentence position (f1), positive keywords in sentence (f2), negative keywords in sentence (f3), sentence centrality (f4), sentence resemblance to the title (f5), sentence inclusion of named entity (f6), sentence inclusion of numerical data (f7), sentence relative length (f8), bushy path of the node (f9), summation of similarities for each node (f10), and latent semantic feature (f11). We investigate the effect of the first ten sentence features on the summarization task. We then add the latent semantic feature to increase accuracy. All feature score functions are used to train a genetic algorithm model to obtain a suitable combination of feature weights. Text summarization is evaluated using the F-measure, which is directly related to the compression rate. The results showed that adding f11 increases the F-measure by 3.26% and 1.55% for compression ratios of 10% and 30%, respectively. On the other hand, it decreases the F-measure by 0.58% for a compression ratio of 20%. Analysis of the text feature weights showed that using only f2, f4, f5, and f11 delivers performance similar to using all eleven features. © 2012 International Journal of Computer Science Issues.
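The weighting scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how feature scores are combined or which genetic operators are used, so the weighted-sum sentence score, the use of the F-measure as the fitness function, truncation selection, one-point crossover, uniform mutation, and all function and parameter names (`score_sentences`, `summarize`, `f_measure`, `evolve_weights`, `pop_size`, `mutation_rate`) are assumptions made for illustration. Each sentence is assumed to arrive as a row of precomputed feature scores in [0, 1].

```python
import random

def score_sentences(features, weights):
    """Weighted sum of per-sentence feature scores (assumed combination rule)."""
    return [sum(w * f for w, f in zip(weights, row)) for row in features]

def summarize(features, weights, compression_rate):
    """Return indices of the top-scoring sentences, in document order."""
    n_keep = max(1, round(len(features) * compression_rate))
    scores = score_sentences(features, weights)
    ranked = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])

def f_measure(selected, reference):
    """Harmonic mean of precision and recall over sentence indices."""
    overlap = len(set(selected) & set(reference))
    if overlap == 0:
        return 0.0
    precision = overlap / len(selected)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)

def evolve_weights(features, reference, compression_rate,
                   pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    """Search for feature weights with a simple GA (hypothetical operators)."""
    rng = random.Random(seed)
    n_features = len(features[0])  # must be at least 2 for one-point crossover

    def fitness(weights):
        return f_measure(summarize(features, weights, compression_rate), reference)

    population = [[rng.random() for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Uniform mutation: occasionally resample a single weight.
            child = [rng.random() if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

# Toy data: 5 sentences x 3 features; sentences 0 and 2 form the reference summary.
toy_features = [
    [0.9, 0.8, 0.7],
    [0.1, 0.2, 0.1],
    [0.8, 0.9, 0.9],
    [0.2, 0.1, 0.3],
    [0.3, 0.2, 0.2],
]
toy_reference = [0, 2]
best = evolve_weights(toy_features, toy_reference, compression_rate=0.4)
print(best, f_measure(summarize(toy_features, best, 0.4), toy_reference))
```

In a setting like the one the abstract describes, each row would hold the eleven feature scores f1 to f11, and weights that stay close to zero after the search would suggest features that contribute little, which is the kind of analysis behind the reported f2, f4, f5, f11 subset.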
Pages: 1 / 6