Classification of Closely Related Sub-dialects of Arabic Using Support-Vector Machines

Cited by: 0
Authors
Wray, Samantha [1]
Affiliations
[1] New York Univ Abu Dhabi, Abu Dhabi, U Arab Emirates
Funding
National Science Foundation (USA);
Keywords
text classification; validation of language resources; language identification;
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Colloquial dialects of Arabic can be roughly categorized into five groups based on relatedness and geographic location (Egyptian, North African/Maghrebi, Gulf, Iraqi, and Levantine), but given that all dialects use much of the same writing system and share overlapping features and vocabulary, dialect identification and text classification are no trivial tasks. Furthermore, text classification by dialect is often performed at a coarse-grained level into these five groups or a subset thereof, and there is little work on sub-dialectal classification. The current study uses an n-gram based SVM to classify at a fine-grained sub-dialectal level, and compares it to methods used in dialect classification such as vocabulary pruning of items shared across dialects. A test case of the Levantine dialect is presented here, and results of 65% accuracy on a four-way classification experiment into sub-dialects of Levantine (Jordanian, Lebanese, Palestinian and Syrian) are presented and discussed. This paper also examines the possibility of leveraging existing mixed-dialectal resources to determine their sub-dialectal makeup by automatic classification.
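The abstract mentions n-gram features and vocabulary pruning of items shared across dialects. The sketch below illustrates those two ideas only. It is not the paper's implementation: the abstract does not say whether the n-grams are over characters or words, what n is, or how pruning is defined, so character trigrams and "drop anything present in every sub-dialect" are assumptions. The helper names `char_ngrams` and `prune_shared` are hypothetical, and the corpus strings are meaningless placeholders rather than real dialect data.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return a Counter of overlapping character n-grams in `text`.

    The text is padded with one space on each side so that word-initial
    and word-final n-grams are captured (an assumption, not from the paper).
    """
    padded = f" {text.strip()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def prune_shared(features_by_label):
    """Remove every n-gram that occurs in all labels' vocabularies.

    Returns the pruned per-label Counters and the set of removed n-grams.
    """
    vocabularies = [set(counts) for counts in features_by_label.values()]
    shared = set.intersection(*vocabularies)
    pruned = {
        label: Counter({g: k for g, k in counts.items() if g not in shared})
        for label, counts in features_by_label.items()
    }
    return pruned, shared


# Placeholder strings standing in for text from each sub-dialect.
toy_corpus = {
    "Jordanian": "abc xyz",
    "Lebanese": "abc xyw",
    "Palestinian": "abc qrs",
    "Syrian": "abc tuv",
}

features = {label: char_ngrams(text) for label, text in toy_corpus.items()}
pruned, shared = prune_shared(features)

print(sorted(shared))
print(sorted(pruned["Jordanian"]))
```

In a full pipeline, the pruned counts would be turned into feature vectors and passed to an SVM classifier. Note that this pruning rule keeps n-grams shared by only some of the sub-dialects; a stricter rule is equally plausible given what the abstract states.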
Pages: 3671-3674
Number of pages: 4