Ontology-Based Naive Bayes Short Text Classification Method for a Small Dataset

Authors
Sangounpao, Ketkaew [1]
Muenchaisri, Pornsiri [1]
Affiliation
[1] Chulalongkorn University, Faculty of Engineering, Department of Computer Engineering, Bangkok, Thailand
Keywords
requirements engineering; ontology; accounting domain knowledge; short text classification; small dataset; multi-classification; traditional classification
DOI
10.1109/snpd.2019.8935711
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Content of fewer than two hundred words, such as a comment or a review statement, is known as short text. Short text classification is useful for automatically categorizing sentences into predefined groups. Several traditional short text classification methods use a bag-of-words representation with k-nearest neighbors (k-NN), Naive Bayes, maximum entropy, support vector machines (SVMs), or an algorithm based on statistics and rules. Deep learning methods outperform other methods on short text classification when the dataset is of normal size. Some studies classify requirements into functional and non-functional requirements. However, no research addresses multi-class classification of functional requirements with a small dataset, particularly in the accounting field. This paper presents an approach for classifying short text from a small dataset into multiple categories of functional requirements in the accounting domain. The proposed approach uses an ontology to construct the bag-of-words and uses Naive Bayes to classify the small dataset. The experiment is conducted on four hundred data items with 5-fold and 10-fold cross-validation. The results show that the method classifies more than 80% of the items correctly. Additionally, the ontology-based Naive Bayes method is compared with other methods.
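The idea of building the bag-of-words from an ontology and classifying with Naive Bayes can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy ontology, the training sentences, the category labels, and the names `ONTOLOGY`, `to_concepts`, and `OntologyNaiveBayes` are all assumptions made for illustration, and the classifier is a plain multinomial Naive Bayes with add-one smoothing.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy ontology (not from the paper): concept -> surface terms.
ONTOLOGY = {
    "invoice": {"invoice", "invoices", "bill", "billing"},
    "payment": {"payment", "payments", "pay", "paid"},
    "ledger": {"ledger", "journal", "posting", "post"},
    "report": {"report", "reports", "statement", "summary"},
}
TERM_TO_CONCEPT = {t: c for c, terms in ONTOLOGY.items() for t in terms}


def to_concepts(text):
    """Keep only tokens found in the ontology and map them to concepts."""
    tokens = text.lower().replace(".", " ").replace(",", " ").split()
    return [TERM_TO_CONCEPT[t] for t in tokens if t in TERM_TO_CONCEPT]


class OntologyNaiveBayes:
    """Multinomial Naive Bayes over ontology concepts, add-one smoothing."""

    def fit(self, texts, labels):
        self.vocab = sorted(ONTOLOGY)
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.feature_counts[label].update(to_concepts(text))
        return self

    def predict(self, text):
        concepts = to_concepts(text)
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            # log prior + sum of smoothed log likelihoods
            score = math.log(self.class_counts[label] / total)
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for concept in concepts:
                count = self.feature_counts[label][concept]
                score += math.log((count + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical functional requirements and categories, for illustration only.
train_texts = [
    "The system shall create an invoice for each order.",
    "Users can print the billing document.",
    "The system shall record each payment received.",
    "Customers pay by bank transfer.",
    "The system shall post entries to the general ledger.",
    "The monthly report shows a summary of accounts.",
]
train_labels = [
    "invoicing", "invoicing", "payments", "payments", "ledger", "reporting",
]

model = OntologyNaiveBayes().fit(train_texts, train_labels)
prediction = model.predict("The clerk prints an invoice")
print(prediction)
```

Restricting the vocabulary to ontology concepts keeps the feature space small, which is one plausible reason such an approach could suit a small dataset; the sketch makes no claim about the accuracy figures reported in the abstract.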
Pages: 53-58
Number of pages: 6