Determining Features of News Headline in Malay News Document

Cited by: 2
Authors
Noah, Shahrul Azman Mohd [1]
Ali, Nazlena Mohamad [1]
Hasan, Mohd Sabri [1]
Affiliation
[1] Univ Kebangsaan Malaysia, Fak Teknol & Sains Maklumat, Bangi, Malaysia
Keywords
headline; natural language processing; Malay news; text summarization; Malay corpus
DOI
10.17576/gema-2018-1802-11
Chinese Library Classification (CLC) number
H [Language, Writing]
Discipline classification code
05
Abstract
Headline summarization is an automated text summarization technique that can reduce information overload in retrieval systems and ease the user's cognitive burden when searching for and selecting relevant documents in large collections. This study discusses the process of determining the Malay-language features of headlines in news-genre documents. The methodology starts with an analysis of a corpus of Malay news documents. The corpus contains 140 core news items selected from two mainstream news databases in Malaysia, Berita Harian and Utusan Malaysia. The selection criteria were: core news category, length of 50 to 250 words, publication year from 2007 to 2012, and one of four genres (economy, crime, education and sports). Three Malay linguistic experts manually produced a headline summary for each news document. The experts had to comply with three conditions: extractive summarization, the select-word-in-order word selection technique, and morphological changes to words. The experimental results identify three characteristics: first, the first two sentences are the important sentences; second, the sentence that contains a potential acronym definition is chosen as the most important sentence; and third, the ideal headline summary is six words long. Taking these features into account allows a headline summary to be generated automatically, much as a human would produce it.
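The three characteristics reported in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The sentence splitter, the acronym pattern (an uppercase token in parentheses) and the rule of keeping the first six words in order are all assumptions made here for illustration, and the function names are hypothetical.

```python
import re

# Ideal headline size reported in the abstract.
MAX_HEADLINE_WORDS = 6

# Assumption: a "potential acronym definition" looks like an uppercase
# token in parentheses, e.g. "Bank Negara Malaysia (BNM)".
ACRONYM_DEF = re.compile(r"\(([A-Z]{2,})\)")


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def pick_key_sentence(text):
    """Consider only the first two sentences; prefer one with an acronym definition."""
    candidates = split_sentences(text)[:2]
    if not candidates:
        return ""
    for sentence in candidates:
        if ACRONYM_DEF.search(sentence):
            return sentence
    return candidates[0]


def draft_headline(text, max_words=MAX_HEADLINE_WORDS):
    """Keep words in their original order, up to max_words.

    Assumption: the first max_words words are kept. The abstract does not
    say how the experts chose which words to retain.
    """
    sentence = pick_key_sentence(text)
    # Alphabetic tokens, keeping hyphenated forms such as "kanak-kanak" whole.
    words = re.findall(r"[^\W\d_]+(?:-[^\W\d_]+)*", sentence)
    return " ".join(words[:max_words])
```

Calling `draft_headline(article_text)` returns a draft of at most six words taken in order from the selected sentence. A fuller system would need a Malay-aware sentence splitter and a real word-scoring step in place of the placeholder rule used here.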
Pages: 154-167
Number of pages: 14