Using Bag-of-words to Distinguish Similar Languages: How Efficient are They?

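Authors
Zampieri, Marcos [1]
Affiliation
[1] Univ Saarland, D-66123 Saarbrucken, Germany
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract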
This paper presents a number of experiments that apply machine learning algorithms and bag-of-words features to the task of automatic language identification. The paper focuses on the identification of language varieties, which is a known weakness of general-purpose language identification methods. This question has been addressed by a number of studies in recent years, most of them relying on character n-gram language models. In this paper, I experiment with simple bag-of-words features and compare the results with previously proposed n-gram-based approaches. Three algorithms were used in these classification experiments: Multinomial Naive Bayes (MNB), Support Vector Machines (SVM) and the J48 classifier.
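As a rough illustration of the approach the abstract describes, the following is a minimal sketch of a bag-of-words Multinomial Naive Bayes classifier in plain Python. It is not the author's implementation: the whitespace tokenizer, the add-one smoothing, the class name `BagOfWordsNB`, the labels `pt-BR` and `pt-PT`, and the toy training sentences are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace: a deliberately simple bag-of-words."""
    return text.lower().split()


class BagOfWordsNB:
    """Multinomial Naive Bayes over word counts with add-one smoothing (hypothetical helper)."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter(labels)        # label -> number of documents
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        # Vocabulary is the union of all words seen in training.
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            # Log prior of the class.
            score = math.log(self.doc_counts[label] / total_docs)
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                # Smoothed log likelihood of the word given the class.
                score += math.log(
                    (counts[word] + 1) / (total_words + len(self.vocab))
                )
            scores[label] = score
        return max(scores, key=scores.get)


# Toy, made-up training sentences for two varieties of one language.
train_texts = [
    "o ônibus chegou ao ponto e peguei meu celular",
    "vou tomar café da manhã no trem",
    "o autocarro chegou à paragem e peguei no meu telemóvel",
    "vou tomar o pequeno-almoço no comboio",
]
train_labels = ["pt-BR", "pt-BR", "pt-PT", "pt-PT"]

model = BagOfWordsNB().fit(train_texts, train_labels)
print(model.predict("perdi o ônibus e o celular"))
print(model.predict("o comboio e o autocarro"))
```

The sketch treats each document as an unordered multiset of whole words, which is the contrast the abstract draws with character n-gram language models. Closely related varieties share most of their vocabulary, so the decision rests on a small number of variety-specific words.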
Pages: 37-41
Number of pages: 5