Question classification based on Bloom's taxonomy cognitive domain using modified TF-IDF and word2vec

Cited by: 45
Authors
Mohammed, Manal [1, 2]
Omar, Nazlia [1]
Affiliations
[1] Univ Kebangsaan Malaysia, Fac Informat Sci & Technol, CAIT, Bangi, Selangor, Malaysia
[2] Hadhramout Univ, Fac Adm Sci, Management Informat Syst Dept, Al Mukalla, Yemen
Source
PLOS ONE | 2020 / Vol. 15 / Issue 03
DOI
10.1371/journal.pone.0230442
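Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code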
07 ; 0710 ; 09 ;
Abstract
The assessment of examination questions is crucial in educational institutes, since examinations are one of the most common methods of evaluating students' achievement in a specific course. There is therefore a need to construct balanced, high-quality exams that cover different cognitive levels. Many lecturers rely on the cognitive domain of Bloom's taxonomy, a popular framework developed for assessing students' intellectual abilities and skills. Several works have been proposed to automatically classify questions in accordance with Bloom's taxonomy, but most of them classify questions from one specific domain. As a result, there is a lack of techniques for classifying questions that belong to multiple domains. The aim of this paper is to present a model that classifies exam questions from several areas based on Bloom's taxonomy. This study proposes a method for classifying questions automatically by extracting two features, TFPOS-IDF and word2vec. The first feature calculates the term frequency-inverse document frequency based on part of speech, in order to assign a suitable weight to the essential words in the question. The second feature, pre-trained word2vec, is used to boost the classification process. The combination of these features is then fed into three different classifiers, K-Nearest Neighbour, Logistic Regression, and Support Vector Machine, in order to classify the questions. The experiments used two datasets: the first contained 141 questions and the second contained 600 questions. On the first dataset, the three classifiers achieved average weighted F1-measures of 71.1%, 82.3% and 83.7%, respectively. On the second dataset, they achieved average weighted F1-measures of 85.4%, 89.4% and 89.7%, respectively. The findings of this study show that the proposed method is effective in classifying questions from multiple domains based on Bloom's taxonomy.
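The idea behind the first feature, a TF-IDF weight adjusted by part of speech, can be illustrated with a minimal sketch. This is not the authors' formula, which the abstract does not give. The weights in `POS_WEIGHTS`, the hand-tagged lookup `TOY_POS` (standing in for a real part-of-speech tagger), and the function names are all assumptions made for illustration. The word2vec feature and the classifiers are left out.

```python
import math
from collections import Counter

# Hypothetical POS weights (assumption): verbs are weighted highest because
# the action verb often signals the cognitive level of a question.
POS_WEIGHTS = {"VERB": 3.0, "NOUN": 2.0, "OTHER": 1.0}

# Toy hand-tagged lookup standing in for a real POS tagger (assumption).
TOY_POS = {
    "define": "VERB", "explain": "VERB", "design": "VERB",
    "recursion": "NOUN", "stack": "NOUN", "algorithm": "NOUN",
}

def tokenize(question):
    """Lowercase, drop simple punctuation, and split on whitespace."""
    return question.lower().replace("?", "").replace(".", "").split()

def tfpos_idf(questions):
    """Return one {term: weight} dict per question.

    Term counts are scaled by a POS weight before normalisation, then
    multiplied by a plain log(N / df) inverse document frequency.
    """
    docs = [tokenize(q) for q in questions]
    n_docs = len(docs)
    # Document frequency: number of questions containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        weighted = {
            term: count * POS_WEIGHTS[TOY_POS.get(term, "OTHER")]
            for term, count in counts.items()
        }
        total = sum(weighted.values())
        vectors.append({
            term: (w / total) * math.log(n_docs / df[term])
            for term, w in weighted.items()
        })
    return vectors

questions = [
    "Define recursion.",
    "Explain the stack.",
    "Design an algorithm for the stack.",
]
vecs = tfpos_idf(questions)
for question, vec in zip(questions, vecs):
    print(question, vec)
```

In this sketch the verb in each question receives a larger weight than a noun with the same document frequency. In a pipeline like the one the abstract describes, such a vector would be combined with word2vec embeddings before being passed to a classifier.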
Pages: 21