Classification of Closely Related Sub-dialects of Arabic Using Support-Vector Machines

Cited by: 0
Authors
Wray, Samantha [1]
Affiliations
[1] New York Univ Abu Dhabi, Abu Dhabi, U Arab Emirates
Funding
U.S. National Science Foundation;
Keywords
text classification; validation of language resources; language identification;
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Colloquial dialects of Arabic can be roughly categorized into five groups based on relatedness and geographic location (Egyptian, North African/Maghrebi, Gulf, Iraqi, and Levantine), but given that all dialects utilize much of the same writing system and share overlapping features and vocabulary, dialect identification and text classification are no trivial tasks. Furthermore, text classification by dialect is often performed at a coarse-grained level into these five groups or a subset thereof, and there is little work on sub-dialectal classification. The current study uses an n-gram based SVM to classify at a fine-grained sub-dialectal level, and compares it to methods used in dialect classification such as vocabulary pruning of shared items across dialects. Levantine is presented as a test case: a four-way classification experiment into its sub-dialects (Jordanian, Lebanese, Palestinian, and Syrian) reaches 65% accuracy, and these results are discussed. This paper also examines the possibility of leveraging existing mixed-dialectal resources to determine their sub-dialectal makeup by automatic classification.
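The two feature-preparation ideas named in the abstract, n-gram features and pruning of vocabulary shared across dialects, can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes character n-grams (the abstract does not say whether the n-grams are character- or word-level) and assumes that "pruning" means dropping word types found in every sub-dialect. The function names (`char_ngrams`, `shared_vocabulary`, `prune`) are hypothetical, and the toy documents are invented placeholder tokens, not real dialect data.

```python
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """Count character n-grams of length n_min..n_max in one text."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def shared_vocabulary(docs_by_label):
    """Return the word types that occur in every label's documents."""
    vocabularies = [
        {word for doc in docs for word in doc.split()}
        for docs in docs_by_label.values()
    ]
    return set.intersection(*vocabularies)


def prune(text, shared):
    """Drop every word that belongs to the shared vocabulary."""
    return " ".join(word for word in text.split() if word not in shared)


# Invented placeholder tokens standing in for sub-dialect text (assumption).
toy_corpus = {
    "Jordanian": ["aaa bbb jjj"],
    "Lebanese": ["aaa ccc lll"],
    "Palestinian": ["aaa bbb ppp"],
    "Syrian": ["aaa ccc sss"],
}

shared = shared_vocabulary(toy_corpus)      # words seen in all four labels
pruned = prune("aaa bbb jjj", shared)       # shared words removed
features = char_ngrams(pruned, n_min=2, n_max=2)  # bigram counts only
```

In a full pipeline, the resulting n-gram counts would be turned into fixed-length vectors and passed to an SVM implementation of the reader's choice; that step is left out here because the abstract does not specify the kernel, weighting, or n-gram range used.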
Pages: 3671-3674
Page count: 4