Information Extraction: Evaluating Named Entity Recognition from Classical Malay Documents

Cited by: 0
Authors
Sazali, Siti Syakirah [1]
Rahman, Nurazzah Abdul [1]
Abu Bakar, Zainab [2]
Affiliations
[1] Univ Teknol MARA, Fac Comp & Math Sci, Shah Alam, Selangor, Malaysia
[2] Al Madinah Int Univ, Fac Comp & Informat Technol, Shah Alam, Selangor, Malaysia
Keywords
component; bahasa melayu; information extraction; malay language; named entity recognition; natural language processing; nouns; nouns extraction
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Natural Language Processing (NLP) is an important field of research in Computer Science. NLP is the process of analyzing texts based on a set of theories and technologies, and recent studies have focused more on Information Extraction (IE). Information Extraction consists of several steps, commonly known as tasks: named entity recognition, relation detection and classification, temporal and event processing, and template filling. Recent research on the Malay language has mainly focused on newspaper articles; since this research experiments on classical documents, there is a need to identify the best way to extract nouns using existing methods. This paper proposes research on extracting nouns from classical Malay documents. The results show that the experiment using Noun Extraction using Morphological Rules (Verb, Adjective and Noun Affixes) has a 77.61% chance of identifying a noun that can contribute to the existing Malay noun list. As there is no complete Malay noun list or dictionary that can be used as a guide, the extracted results still need to be judged by language experts.
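The idea of affix-based noun extraction mentioned in the abstract can be illustrated with a minimal sketch. The affix patterns, function names, and sample sentence below are assumptions chosen for illustration; they are not the rule set, code, or data used in the paper, and real Malay morphology has many more cases and exceptions.

```python
import re

# Illustrative affix patterns only (an assumption, NOT the paper's rules).
# Noun-forming circumfixes such as ke-...-an, pe(N)-...-an, per-...-an,
# and the noun-forming prefix juru-.
NOUN_PATTERNS = [
    re.compile(r"^(pe|pem|pen|peng|per|ke)\w{3,}an$"),
    re.compile(r"^juru\w{3,}$"),
]
# Common verb-forming prefixes (meN-, ber-, di-, ter-) with optional -kan / -i.
VERB_PATTERNS = [
    re.compile(r"^(me|mem|men|meng|ber|di|ter)\w{3,}(kan|i)?$"),
]


def classify(token):
    """Label a token 'noun', 'verb', or 'unknown' from its affixes alone."""
    word = token.lower()
    if any(p.match(word) for p in NOUN_PATTERNS):
        return "noun"
    if any(p.match(word) for p in VERB_PATTERNS):
        return "verb"
    return "unknown"


def extract_noun_candidates(text):
    """Return unique noun candidates in order of first appearance."""
    candidates = []
    for token in re.findall(r"[A-Za-z]+", text):
        word = token.lower()
        if classify(word) == "noun" and word not in candidates:
            candidates.append(word)
    return candidates


# Hypothetical sample sentence, not taken from the paper's corpus.
sample = "Raja itu memerintah kerajaan dengan keadilan dan pemerintahan yang baik"
print(extract_noun_candidates(sample))
# → ['kerajaan', 'keadilan', 'pemerintahan']
```

Rules of this kind over-generate: for example, "kemudian" (an adverb) would match the ke-...-an pattern, which is one reason extracted candidates still need to be checked by language experts.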
Pages: 48-53
Number of pages: 6