Using Annotation Projection for Semantic Role Labeling of Low-Resourced Language: Sinhala

Authors
Gunasekara, Sandun [1]
Chathura, Dulanjaya [1]
Jeewantha, Chamoda [1]
Dias, Gihan [1]
Affiliations
[1] University of Moratuwa, Department of Computer Science and Engineering, Moratuwa, Sri Lanka
Keywords
SRL; Semantics; Semantic Role Labeling; Sinhala; Annotation; Projection; Labeller; Roles;
DOI
10.1109/ialp51396.2020.9310468
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present SinSRL, the first semantic role labeller (SRL) for Sinhala, an Indo-European language spoken mainly in Sri Lanka. SinSRL takes parallel text in English (or any other language for which a suitable SRL exists) and Sinhala, and outputs semantically annotated Sinhala text. We have enhanced existing tools to address several issues related to the target language; these enhancements should also be useful for labeling other Indic languages. In addition, we have manually labeled a small Sinhala-English parallel dataset with semantic roles. The accuracy of our system is similar to that of manually labeled data. Our implementation can be used to generate an SRL dataset, which may in turn be used to train a direct semantic role labeller. SinSRL may be easily modified to annotate other low-resource languages for which parallel corpora are available.
Pages: 98-103
Page count: 6