Determining Features of News Headline in Malay News Document

Cited by: 2
Authors
Noah, Shahrul Azman Mohd [1]
Ali, Nazlena Mohamad [1]
Hasan, Mohd Sabri [1]
Affiliation
[1] Univ Kebangsaan Malaysia, Fak Teknol & Sains Maklumat, Bangi, Malaysia
Keywords
headline; natural language processing; Malay news; text summarization; Malay corpus
DOI
10.17576/gema-2018-1802-11
Chinese Library Classification number
H [Language, Linguistics];
Discipline classification code
05;
Abstract
Headline summarization is an automated text summarization technique that can reduce information overload in retrieval systems and ease the user's cognitive burden when searching for and selecting relevant documents in large collections. This study discusses the process of determining Malay-language features in news-genre documents. The methodology starts with an analysis of a corpus of Malay news documents. The corpus contains 140 core news items selected from two mainstream news databases in Malaysia, Berita Harian and Utusan Malaysia. The selection criteria were: core news category, length of 50 to 250 words, publication year from 2007 to 2012, and one of four genres (economy, crime, education, and sports). Three linguistic experts in Malay manually produced a headline summary for each news document. The experts had to comply with three conditions: extractive summarization, the select-word-in-order word selection technique, and morphological changes to words. The experimental results identify three characteristics: first, the first two sentences are the important sentences; second, the sentence that contains a potential acronym definition is chosen as the most important sentence; and third, the ideal headline summary is six words long. Taking these features into account allows a headline summary to be generated automatically in a way that resembles the process followed by humans.
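The three characteristics reported in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the regular expression used to spot a potential acronym definition (an uppercase token in parentheses), the naive sentence splitter, and the sample text are all assumptions made for illustration. Taking the first six words of the chosen sentence is only a crude stand-in for the select-word-in-order technique, which may skip words while preserving their order.

```python
import re

# Assumption: a "potential acronym definition" looks like "(ABC)",
# i.e. two or more uppercase letters in parentheses.
ACRONYM_DEF = re.compile(r"\(([A-Z]{2,})\)")


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def pick_sentence(text):
    """Choose among the first two sentences (feature 1), preferring one
    that contains a potential acronym definition (feature 2)."""
    lead = split_sentences(text)[:2]
    for sentence in lead:
        if ACRONYM_DEF.search(sentence):
            return sentence
    return lead[0] if lead else ""


def headline(text, max_words=6):
    """Keep at most six words, in their original order (feature 3)."""
    words = pick_sentence(text).rstrip(".!?").split()
    return " ".join(words[:max_words])


# Made-up sample text, not taken from the study's corpus.
sample = (
    "Hujan lebat melanda ibu kota semalam. "
    "Jabatan Meteorologi Malaysia (JMM) mengeluarkan amaran banjir kilat. "
    "Penduduk diminta berwaspada."
)
print(headline(sample))
```

In this sketch the second sentence wins because it carries the parenthesised acronym. A fuller system would replace the six-word truncation with a real word selection step and handle the morphological changes the experts were allowed to make.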
Pages: 154-167
Page count: 14