Using Bag-of-words to Distinguish Similar Languages: How Efficient are They?

Cited by: 0
Author
Zampieri, Marcos [1]
Affiliation
[1] Univ Saarland, D-66123 Saarbrucken, Germany
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a number of experiments applying machine learning algorithms and bag-of-words representations to the task of automatic language identification. The paper focuses on the identification of language varieties, which is a known weakness of general-purpose language identification methods. This question has been addressed by a number of studies in recent years, most of them relying on character n-gram language models. In this paper, I experiment with simple bag-of-words models and compare the results with previously proposed n-gram-based approaches. To perform these classification experiments, three algorithms were used: Multinomial Naive Bayes (MNB), Support Vector Machines (SVM) and the J48 classifier.
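As a rough illustration of the kind of classifier the abstract names, the following is a minimal sketch of Multinomial Naive Bayes over bag-of-words counts, written with the Python standard library only. It is not the paper's implementation: the `BagOfWordsNB` class, the whitespace tokenizer, the add-one smoothing, the choice of Portuguese varieties, and the four toy training sentences are all assumptions made for this example, and the SVM and J48 classifiers are not shown.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase and split on whitespace; a real system would
    # likely use a proper tokenizer.
    return text.lower().split()


class BagOfWordsNB:
    """Hypothetical Multinomial Naive Bayes over word counts (add-one smoothing)."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter(labels)        # label -> number of documents
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        # Vocabulary is the union of all words seen in training.
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, float("-inf")
        for label, counts in self.word_counts.items():
            # Log prior of the class.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for word in tokenize(text):
                if word in self.vocab:  # words unseen in training are ignored
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy sentences for two language varieties (illustrative only).
train_texts = [
    "o trem chegou tarde",
    "perdi o celular no trem",
    "o comboio chegou tarde",
    "perdi o autocarro e o comboio",
]
train_labels = ["pt-BR", "pt-BR", "pt-PT", "pt-PT"]

model = BagOfWordsNB().fit(train_texts, train_labels)
prediction_br = model.predict("o trem chegou")
prediction_pt = model.predict("o comboio chegou")
print(prediction_br, prediction_pt)
```

The sketch shows why whole words can separate closely related varieties: words shared by both classes (such as "o" and "chegou") contribute similar likelihoods, while variety-specific vocabulary (such as "trem" versus "comboio") drives the decision.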
Pages: 37 - 41
Page count: 5
Related papers
50 in total
  • [31] Brain MRI classification using Discrete Wavelet Transform and Bag-of-words
    Ayadi, Wadhah
    Elhamzi, Wajdi
    Charfi, Imen
    Ouni, Bouraoui
    Atri, Mohamed
    2018 INTERNATIONAL CONFERENCE ON ADVANCED SYSTEMS AND ELECTRICAL TECHNOLOGIES (IC_ASET), 2017, : 45 - 49
  • [32] A Bag-of-Words Model for Cellular Image Segmentation
    Cheng, Li
    Ye, Ning
    Yu, Weimiao
    Cheah, Andre
    ADVANCES IN BIO-IMAGING: FROM PHYSICS TO SIGNAL UNDERSTANDING ISSUES: STATE OF THE ART AND CHALLEGES, 2012, 120 : 209 - +
  • [33] Bag-of-Words Baselines for Semantic Code Search
    Zhang, Xinyu
    Xin, Ji
    Yates, Andrew
    Lin, Jimmy
    NLP4PROG 2021: THE 1ST WORKSHOP ON NATURAL LANGUAGE PROCESSING FOR PROGRAMMING (NLP4PROG 2021), 2021, : 88 - 94
  • [34] Fuzzy Bag-of-Words Model for Document Representation
    Zhao, Rui
    Mao, Kezhi
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2018, 26 (02) : 794 - 804
  • [35] Vehicle Logo Recognition Based on Bag-of-Words
    Yu, Shuyuan
    Zheng, Shibao
    Yang, Hua
    Liang, Longfei
    2013 10TH IEEE INTERNATIONAL CONFERENCE ON ADVANCED VIDEO AND SIGNAL BASED SURVEILLANCE (AVSS 2013), 2013, : 353 - 358
  • [36] Detection and Classification of Diabetic Retinopathy Anomalies Using Bag-of-Words Model
    Mukti, Fanji Ari
    Eswaran, Chikannan
    Hashim, Noranniza
    JOURNAL OF MEDICAL IMAGING AND HEALTH INFORMATICS, 2015, 5 (05) : 1009 - 1019
  • [37] Texture Classification Using Scale Invariant Feature Transform and Bag-of-Words
    Budak, Umit
    Sengur, Abdulkadir
    2015 23RD SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2015, : 152 - 155
  • [38] Laser-based Segment Classification Using a Mixture of Bag-of-Words
    Behley, Jens
    Steinhage, Volker
    Cremers, Armin B.
    2013 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2013, : 4195 - 4200
  • [39] Learning Bag-of-Words Models Using Sparse Partial Least Squares
    Liu, Jingneng
    Zeng, Guihua
    FOUNDATIONS OF INTELLIGENT SYSTEMS (ISKE 2011), 2011, 122 : 445 - 455
  • [40] Forearm Electromyogram-Based Biometrics Using Bag-of-Words Classifiers
    Pavel, Irina
    Ciocoiu, Iulian B.
    32ND EUROPEAN SIGNAL PROCESSING CONFERENCE, EUSIPCO 2024, 2024, : 716 - 720