Question classification based on Bloom's taxonomy cognitive domain using modified TF-IDF and word2vec

Cited by: 44
Authors
Mohammed, Manal [1 ,2 ]
Omar, Nazlia [1 ]
Affiliations
[1] Univ Kebangsaan Malaysia, Fac Informat Sci & Technol, CAIT, Bangi, Selangor, Malaysia
[2] Hadhramout Univ, Fac Adm Sci, Management Informat Syst Dept, Al Mukalla, Yemen
Source
PLOS ONE | 2020 / Vol. 15 / Issue 03
DOI
10.1371/journal.pone.0230442
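Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code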
07 ; 0710 ; 09 ;
Abstract
The assessment of examination questions is crucial in educational institutions, since examinations are one of the most common methods of evaluating students' achievement in a specific course. There is therefore a need to construct balanced, high-quality exams that cover different cognitive levels. To this end, many lecturers rely on the cognitive domain of Bloom's taxonomy, a popular framework for assessing students' intellectual abilities and skills. Several works have proposed automatic classification of questions in accordance with Bloom's taxonomy, but most of them classify questions from a single specific domain. As a result, techniques for classifying questions drawn from multiple domains are lacking. The aim of this paper is to present a model that classifies exam questions from several domains based on Bloom's taxonomy. The proposed method classifies questions automatically by extracting two features, TFPOS-IDF and word2vec. The first feature computes term frequency-inverse document frequency based on part of speech, in order to assign a suitable weight to the essential words in a question. The second feature, pre-trained word2vec, is used to boost the classification process. The combination of these features is then fed into three different classifiers, K-Nearest Neighbour, Logistic Regression, and Support Vector Machine, in order to classify the questions. The experiments used two datasets: the first contained 141 questions and the second contained 600 questions. On the first dataset, the three classifiers achieved average weighted F1-measures of 71.1%, 82.3% and 83.7%, respectively. On the second dataset, they achieved 85.4%, 89.4% and 89.7%, respectively. These findings show that the proposed method is effective at classifying questions from multiple domains based on Bloom's taxonomy.
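The part-of-speech-weighted TF-IDF idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the formula, so the weight table `POS_WEIGHTS`, the function name `pos_weighted_tfidf`, the tag names, and the plain `log(N / df)` form of IDF are all assumptions chosen for illustration. The input is assumed to be already tokenized and POS-tagged; word2vec features and the classifiers are left out.

```python
import math
from collections import Counter

# Hypothetical POS weights. The abstract only says the weighting is based on
# part of speech; these values are illustrative assumptions, not the paper's.
POS_WEIGHTS = {"VERB": 3.0, "NOUN": 2.0, "ADJ": 2.0}
DEFAULT_WEIGHT = 1.0


def pos_weighted_tfidf(tagged_docs):
    """Return one {term: score} dict per document.

    tagged_docs: list of documents, each a list of (token, pos_tag) pairs.
    """
    n_docs = len(tagged_docs)

    # Document frequency: number of documents that contain each term.
    df = Counter()
    for doc in tagged_docs:
        df.update({tok for tok, _ in doc})

    vectors = []
    for doc in tagged_docs:
        # Each occurrence contributes the weight of its POS tag
        # instead of a plain count of 1.
        weighted = Counter()
        for tok, pos in doc:
            weighted[tok] += POS_WEIGHTS.get(pos, DEFAULT_WEIGHT)
        total = sum(weighted.values())

        vec = {}
        for tok, w in weighted.items():
            tf = w / total                      # POS-weighted term frequency
            idf = math.log(n_docs / df[tok])    # unsmoothed IDF (assumption)
            vec[tok] = tf * idf
        vectors.append(vec)
    return vectors


# Two toy questions with hand-assigned tags (hypothetical data).
questions = [
    [("define", "VERB"), ("the", "DET"), ("term", "NOUN"), ("algorithm", "NOUN")],
    [("compare", "VERB"), ("the", "DET"), ("two", "NUM"), ("algorithms", "NOUN")],
]
vectors = pos_weighted_tfidf(questions)
```

In this toy setup the verb "define" receives a higher score than the noun "term" in the first question, while "the", which occurs in both questions, scores zero. A full pipeline would presumably combine such scores with pre-trained word2vec vectors before classification, as the abstract describes.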
Pages: 21
Related papers
50 in total
  • [21] Research on Chinese Text Classification Based on Word2vec
    Yang, Zhi-Tong
    Zheng, Jun
    [J]. 2016 2ND IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2016, : 1166 - 1170
  • [22] Chinese Sentiment Classification Using Extended Word2Vec
    张胜
    张鑫
    程佳军
    王晖
    [J]. Journal of Donghua University(English Edition), 2016, 33 (05) : 823 - 826
  • [23] Microblogging Short Text Classification based on Word2Vec
    Zhang, Yonghui
    Liu, Jingang
    [J]. PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON ELECTRONIC, MECHANICAL, INFORMATION AND MANAGEMENT SOCIETY (EMIM), 2016, 40 : 395 - 401
  • [24] Short Text Classification Based on Wikipedia and Word2vec
    Liu Wensen
    Cao Zewen
    Wang Jun
    Wang Xiaoyi
    [J]. 2016 2ND IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2016, : 1195 - 1200
  • [25] KEYWORD EXTRACTION BASED ON WORD SYNONYMS USING WORD2VEC
    Ogul, Iskender Ulgen
    Ozcan, Caner
    Hakdagli, Ozlem
    [J]. 2019 27TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2019,
  • [26] Text Classification Based on Word2vec and Convolutional Neural Network
    Li, Lin
    Xiao, Linlong
    Jin, Wenzhen
    Zhu, Hong
    Yang, Guocai
    [J]. NEURAL INFORMATION PROCESSING (ICONIP 2018), PT V, 2018, 11305 : 450 - 460
  • [27] Multi-co-training for document classification using various document representations: TF-IDF, LDA, and Doc2Vec
    Kim, Donghwa
    Seo, Deokseong
    Cho, Suhyoun
    Kang, Pilsung
    [J]. INFORMATION SCIENCES, 2019, 477 : 15 - 29
  • [28] Turkish Document Classification Based on Word2Vec and SVM Classifier
    Sahin, Gurkan
    [J]. 2017 25TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2017,
  • [29] Text Classification Research Based on Improved Word2vec and CNN
    Gao, Mengyuan
    Li, Tinghui
    Huang, Peifang
    [J]. SERVICE-ORIENTED COMPUTING, ICSOC 2018, 2019, 11434 : 126 - 135
  • [30] Diet Health Text Classification Based on word2vec and LSTM
    Zhao, Ming
    Du, Huifang
    Dong, Cuicui
    Chen, Changsong
    [J]. Nongye Jixie Xuebao/Transactions of the Chinese Society for Agricultural Machinery, 2017, 48 (10) : 202 - 208