Using Bag-of-words to Distinguish Similar Languages: How Efficient are They?

Author
Zampieri, Marcos [1]
Affiliation
[1] Univ Saarland, D-66123 Saarbrucken, Germany
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a number of experiments that apply machine learning algorithms and bag-of-words features to the task of automatic language identification. The paper focuses on the identification of language varieties, which is a known weakness of general-purpose language identification methods. This question has been addressed by a number of studies in recent years, most of them relying on character n-gram language models. In this paper, I experiment with simple bag-of-words features and compare the results with previously proposed n-gram-based approaches. Three algorithms were used in these classification experiments: Multinomial Naive Bayes (MNB), Support Vector Machines (SVM) and the J48 classifier.
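As a rough illustration of the bag-of-words setup the abstract describes, the sketch below trains a Multinomial Naive Bayes classifier on word counts. It is not the paper's implementation: the whitespace tokenizer, the add-one smoothing, the `en-GB`/`en-US` labels, and the toy sentences are all assumptions chosen for illustration, and the record does not state which language varieties or corpora were used.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, minimal tokenizer)."""
    return text.lower().split()


class MultinomialNB:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter(labels)        # label -> number of documents
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        self.total_docs = len(labels)
        return self

    def log_scores(self, text):
        # Words never seen in training carry no evidence, so they are skipped.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        scores = {}
        for label, counts in self.word_counts.items():
            total = sum(counts.values())
            # Log prior of the class plus log likelihood of each known word.
            score = math.log(self.doc_counts[label] / self.total_docs)
            for t in tokens:
                score += math.log((counts[t] + 1) / (total + len(self.vocab)))
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical toy data: two varieties that share most of their vocabulary.
TRAIN = [
    ("the colour of the theatre programme", "en-GB"),
    ("my favourite flavour of yoghurt", "en-GB"),
    ("the color of the theater program", "en-US"),
    ("my favorite flavor of yogurt", "en-US"),
]

model = MultinomialNB().fit([t for t, _ in TRAIN], [l for _, l in TRAIN])
print(model.predict("what is your favourite colour"))
```

The shared words ("the", "of", "my") contribute equally to both classes, so the decision rests on the few variety-specific word forms. This is the intuition behind using whole words as features for closely related varieties; a character n-gram model would instead score overlapping substrings of those same words.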
Pages: 37-41
Page count: 5