A classification-based approach to the identification of Multiword Expressions (MWEs) in Magahi Applying SVM

Cited by: 4
Authors
Kumar, Shivek [1]
Behera, Pitambar [1]
Jha, Girish Nath [1]
Affiliations
[1] Jawaharlal Nehru Univ, Ctr Linguist, New Delhi, India
Source
KNOWLEDGE-BASED AND INTELLIGENT INFORMATION & ENGINEERING SYSTEMS | 2017 / Vol. 112
Keywords
Multiword expressions; SVM; Magahi; Indo-Aryan languages; less resourced languages;
DOI
10.1016/j.procs.2017.08.059
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multiword expressions (MWEs) are crucial for any Natural Language Processing task because they occur frequently in every natural language. In addition, they "display a continuum of compositionality". Although they are highly frequent in informal spoken corpora, they are less frequent in formal textual corpora. Multiword expressions in Magahi can provide a unique platform and a gateway for research into other less-resourced Indian languages in general and dialectal varieties of Hindi in particular. This is the first research project of its kind undertaken in Magahi. In this study, we apply a Support Vector Machine (SVM) classifier to the automatic identification and classification of multiword expressions. For this purpose, we use a POS-annotated corpus of approximately 75k word tokens, of which 11k tokens are multiword expressions. The raw data used in this study were crawled and sanitized by an Indian-languages crawler known as IC Crawler and semi-automatically annotated with the ILCI annotation tool. The tagset followed for annotation comprises nine annotation labels adapted from Singh et al. The Magahi multiword extractor achieves a combined overall precision accuracy of 81.57%. (C) 2017 The Authors. Published by Elsevier B.V.
Pages: 594-603
Number of pages: 10
Related papers
50 items in total
  • [21] AN SVM BASED CLASSIFICATION APPROACH TO SPEECH SEPARATION
    Han, Kun
    Wang, DeLiang
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 4632 - 4635
  • [22] Classification-Based Approach for Hybridizing Statistical and Rule-Based Machine Translation
    Park, Eun-Jin
    Kwon, Oh-Woog
    Kim, Kangil
    Kim, Young-Kil
    ETRI JOURNAL, 2015, 37 (03) : 541 - 550
  • [23] Hierarchical Classification-Based Region Growing (HCBRG): A Collaborative Approach for Object Segmentation and Classification
    Sellaouti, Aymen
    Hamouda, Atef
    Deruyver, Aline
    Wemmert, Cedric
    IMAGE ANALYSIS AND RECOGNITION, PT I, 2012, 7324 : 51 - 60
  • [24] Review of Methods for EEG Signal Classification and Development of New Fuzzy Classification-Based Approach
    Rabcan, Jan
    Levashenko, Vitaly
    Zaitseva, Elena
    Kvassay, Miroslav
    IEEE ACCESS, 2020, 8 : 189720 - 189734
  • [25] Rice Identification and Classification System Based on SVM Algorithm
    Xu, Jiaxin
    Wang, Xuetian
    Gao, Hongmin
    Zhai, Ziming
    Li, Runchao
    2022 INTERNATIONAL CONFERENCE ON MICROWAVE AND MILLIMETER WAVE TECHNOLOGY (ICMMT), 2022,
  • [26] Applying RBF neural networks to cancer classification based on gene expressions
    Chu, Feng
    Wang, Lipo
    2006 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORK PROCEEDINGS, VOLS 1-10, 2006, : 1930 - +
  • [27] Detecting profile injection attacks in collaborative filtering: A classification-based approach
    Williams, Chad A.
    Mobasher, Bamshad
    Burke, Robin
    Bhaumik, Runa
    ADVANCES IN WEB MINING AND WEB USAGE ANALYSIS, 2007, 4811 : 167 - +
  • [28] An Ensemble Classification-Based Approach Applied to Retinal Blood Vessel Segmentation
    Fraz, Muhammad Moazam
    Remagnino, Paolo
    Hoppe, Andreas
    Uyyanonvara, Bunyarit
    Rudnicka, Alicja R.
    Owen, Christopher G.
    Barman, Sarah A.
    IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, 2012, 59 (09) : 2538 - 2548
  • [29] A classification-based approach to semi-supervised clustering with pairwise constraints
    Smieja, Marek
    Struski, Lukasz
    Figueiredo, Mario A. T.
    NEURAL NETWORKS, 2020, 127 : 193 - 203
  • [30] An ensemble classification-based approach to detect attack level of SQL injections
    Kasim, Omer
    JOURNAL OF INFORMATION SECURITY AND APPLICATIONS, 2021, 59