Information Extraction: Evaluating Named Entity Recognition from Classical Malay Documents

Authors
Sazali, Siti Syakirah [1]
Rahman, Nurazzah Abdul [1]
Abu Bakar, Zainab [2]
Affiliations
[1] Univ Teknol MARA, Fac Comp & Math Sci, Shah Alam, Selangor, Malaysia
[2] Al Madinah Int Univ, Fac Comp & Informat Technol, Shah Alam, Selangor, Malaysia
Keywords
component; bahasa melayu; information extraction; malay language; named entity recognition; natural language processing; nouns; nouns extraction
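DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812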
Abstract
Natural Language Processing (NLP) is an important field of research in Computer Science. NLP is the process of analyzing texts based on a set of theories and technologies, and recent studies have focused more on Information Extraction (IE). Information Extraction comprises several steps, commonly known as tasks: named entity recognition, relation detection and classification, temporal and event processing, and template filling. Recent research on the Malay language has mainly focused on newspaper articles; since this study works with classical documents, there is a need to identify which existing method extracts nouns best. This paper studies the extraction of nouns from classical Malay documents. The results show that the experiment using Noun Extraction using Morphological Rules (Verb, Adjective and Noun Affixes) has a 77.61% chance of identifying a noun that can contribute to the existing Malay noun list. As there is no complete Malay noun list or dictionary that can be used as a guide, the extracted results still need to be judged by language experts.
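The idea of extracting nouns with morphological rules over verb, adjective and noun affixes can be illustrated with a minimal Python sketch. This is not the authors' rule set, which this record does not reproduce: the affix lists, the minimum stem length, and the function names `classify` and `extract_noun_candidates` are all assumptions made for illustration, and adjective affixes are left out for brevity.

```python
import re

# Hypothetical, heavily simplified affix lists (assumption, not from the paper).
# Longer prefixes are listed first so they are tried before shorter ones.
NOUN_PREFIXES = ("peng", "peny", "pem", "pen", "per", "pe", "ke", "juru")
NOUN_SUFFIXES = ("an",)
VERB_PREFIXES = ("meng", "meny", "mem", "men", "me", "ber", "di", "ter")
VERB_SUFFIXES = ("kan",)

MIN_STEM = 3  # assumed minimum number of letters left after removing an affix


def _has_prefix(word, prefixes):
    """True if the word starts with one of the prefixes and a long enough stem remains."""
    return any(word.startswith(p) and len(word) - len(p) >= MIN_STEM for p in prefixes)


def _has_suffix(word, suffixes):
    """True if the word ends with one of the suffixes and a long enough stem remains."""
    return any(word.endswith(s) and len(word) - len(s) >= MIN_STEM for s in suffixes)


def classify(word):
    """Label a token 'verb', 'noun' or 'unknown' from its affixes alone."""
    w = word.lower()
    # Verb affixes are checked first so that verb forms are filtered out
    # before the noun rules are applied.
    if _has_prefix(w, VERB_PREFIXES) or _has_suffix(w, VERB_SUFFIXES):
        return "verb"
    if _has_prefix(w, NOUN_PREFIXES) or _has_suffix(w, NOUN_SUFFIXES):
        return "noun"
    return "unknown"


def extract_noun_candidates(text):
    """Return unique tokens labelled 'noun', in order of first appearance."""
    seen = []
    for token in re.findall(r"[a-z]+", text.lower()):
        if classify(token) == "noun" and token not in seen:
            seen.append(token)
    return seen


if __name__ == "__main__":
    sample = "Maka raja itu pun berlayar membawa makanan ke kerajaan"
    print(extract_noun_candidates(sample))
```

Rules this crude both miss bare-root nouns (for example `raja`) and over-generate on words that merely look affixed, which is consistent with the abstract's point that extracted candidates still need to be judged by language experts.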
Pages: 48-53
Number of pages: 6