Evaluating Embeddings from Pre-Trained Language Models and Knowledge Graphs for Educational Content Recommendation

Cited by: 5
Authors
Li, Xiu [1]
Henriksson, Aron [1]
Duneld, Martin [1]
Nouri, Jalal [1]
Wu, Yongchao [1]
Affiliations
[1] Stockholm Univ, Dept Comp & Syst Sci, NOD Huset, Borgarfjordsgatan 12, S-16455 Stockholm, Sweden
Keywords
educational content recommendation; AI-enhanced learning; pre-trained language models; ensemble embeddings; knowledge graph embeddings; text similarity; textual semantic search; natural language processing
DOI
10.3390/fi16010012
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Educational content recommendation is a cornerstone of AI-enhanced learning. In particular, to facilitate navigating the diverse learning resources available on learning platforms, methods are needed for automatically linking learning materials, e.g., in order to recommend textbook content based on exercises. Such methods are typically based on semantic textual similarity (STS) and the use of embeddings for text representation. However, it remains unclear what types of embeddings should be used for this task. In this study, we carry out an extensive empirical evaluation of embeddings derived from three different types of models: (i) static embeddings trained using a concept-based knowledge graph, (ii) contextual embeddings from a pre-trained language model, and (iii) contextual embeddings from a large language model (LLM). In addition to evaluating the models individually, various ensembles are explored based on different strategies for combining two models in an early vs. late fusion fashion. The evaluation is carried out using digital textbooks in Swedish for three different subjects and two types of exercises. The results show that using contextual embeddings from an LLM leads to superior performance compared to the other models, and that there is no significant improvement when combining these with static embeddings trained using a knowledge graph. When using embeddings derived from a smaller language model, however, it helps to combine them with knowledge graph embeddings. The performance of the best-performing model is high for both types of exercises, resulting in a mean Recall@3 of 0.96 and 0.95 and a mean MRR of 0.87 and 0.86 for quizzes and study questions, respectively, demonstrating the feasibility of using STS based on text embeddings for educational content recommendation. The ability to link digital learning materials in an unsupervised manner, relying only on readily available pre-trained models, facilitates the development of AI-enhanced learning.
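The abstract names the ingredients of the evaluation (embedding-based similarity ranking, early vs. late fusion of two models, Recall@3 and MRR) without defining them. The following is a minimal sketch of how these pieces are commonly computed, not the authors' implementation. All function names are hypothetical, and the specific fusion choices (concatenating the raw vectors for early fusion, a weighted average of cosine scores for late fusion) are assumptions; the abstract does not say which combination strategies were used.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so that dot products equal cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def cosine_scores(queries, candidates):
    """Return a (num_queries, num_candidates) matrix of cosine similarities."""
    return l2_normalize(queries) @ l2_normalize(candidates).T


def early_fusion_scores(q_a, c_a, q_b, c_b):
    """Assumed early fusion: concatenate the two models' vectors, then score once."""
    q = np.hstack([q_a, q_b])
    c = np.hstack([c_a, c_b])
    return cosine_scores(q, c)


def late_fusion_scores(q_a, c_a, q_b, c_b, weight=0.5):
    """Assumed late fusion: score with each model separately, then average the scores."""
    return weight * cosine_scores(q_a, c_a) + (1.0 - weight) * cosine_scores(q_b, c_b)


def recall_at_k(scores, gold, k=3):
    """Fraction of queries whose gold candidate index appears among the top-k scores."""
    top_k = np.argsort(-scores, axis=1)[:, :k]
    hits = [gold[i] in top_k[i] for i in range(len(gold))]
    return float(np.mean(hits))


def mean_reciprocal_rank(scores, gold):
    """Mean of 1 / rank of the gold candidate, with ranks starting at 1."""
    order = np.argsort(-scores, axis=1)
    ranks = [int(np.where(order[i] == gold[i])[0][0]) + 1 for i in range(len(gold))]
    return float(np.mean([1.0 / r for r in ranks]))
```

Here each query row would be an exercise embedding and each candidate row a textbook section embedding, with `gold[i]` giving the index of the section that exercise `i` should link to. This assumes a single correct section per exercise, which the abstract does not confirm.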
Pages: 21
Related papers
50 items in total
  • [41] Distilling Word Meaning in Context from Pre-trained Language Models
    Arase, Yuki
    Kajiwara, Tomoyuki
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 534 - 546
  • [42] Pre-trained language models in medicine: A survey
    Luo, Xudong
    Deng, Zhiqi
    Yang, Binxia
    Luo, Michael Y.
    [J]. ARTIFICIAL INTELLIGENCE IN MEDICINE, 2024, 154
  • [43] Probing for Hyperbole in Pre-Trained Language Models
    Schneidermann, Nina Skovgaard
    Hershcovich, Daniel
    Pedersen, Bolette Sandford
    [J]. PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-SRW 2023, VOL 4, 2023, : 200 - 211
  • [44] ReAugKD: Retrieval-Augmented Knowledge Distillation For Pre-trained Language Models
    Zhang, Jianyi
    Muhamed, Aashiq
    Anantharaman, Aditya
    Wang, Guoyin
    Chen, Changyou
    Zhong, Kai
    Cui, Qingjun
    Xu, Yi
    Zeng, Belinda
    Chilimbi, Trishul
    Chen, Yiran
    [J]. 61ST CONFERENCE OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 2, 2023, : 1128 - 1136
  • [45] SimKGC: Simple Contrastive Knowledge Graph Completion with Pre-trained Language Models
    Wang, Liang
    Zhao, Wei
    Wei, Zhuoyu
    Liu, Jingming
    [J]. PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 4281 - 4294
  • [46] Assisted Process Knowledge Graph Building Using Pre-trained Language Models
    Bellan, Patrizio
    Dragoni, Mauro
    Ghidini, Chiara
    [J]. AIXIA 2022 - ADVANCES IN ARTIFICIAL INTELLIGENCE, 2023, 13796 : 60 - 74
  • [47] Dual-View Whitening on Pre-trained Text Embeddings for Sequential Recommendation
    Zhang, Lingzi
    Zhou, Xin
    Zeng, Zhiwei
    Shen, Zhiqi
    [J]. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 8, 2024, : 9332 - 9340
  • [48] Improving Pre-trained Vision-and-Language Embeddings for Phrase Grounding
    Dou, Zi-Yi
    Peng, Nanyun
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 6362 - 6371
  • [49] A Study of Pre-trained Language Models in Natural Language Processing
    Duan, Jiajia
    Zhao, Hui
    Zhou, Qian
    Qiu, Meikang
    Liu, Meiqin
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON SMART CLOUD (SMARTCLOUD 2020), 2020, : 116 - 121
  • [50] Optimizing and Evaluating Pre-Trained Large Language Models for Alzheimer's Disease Detection
    Casu, Filippo
    Grosso, Enrico
    Lagorio, Andrea
    Trunfio, Giuseppe A.
    [J]. 2024 32ND EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND NETWORK-BASED PROCESSING, PDP 2024, 2024, : 277 - 284