Evaluating Embeddings from Pre-Trained Language Models and Knowledge Graphs for Educational Content Recommendation

Cited by: 5

Authors:

Li, Xiu [1]

Henriksson, Aron [1]

Duneld, Martin [1]

Nouri, Jalal [1]

Wu, Yongchao [1]

Affiliation:

[1] Stockholm Univ, Dept Comp & Syst Sci, NOD Huset, Borgarfjordsgatan 12, S-16455 Stockholm, Sweden

Source:

FUTURE INTERNET | 2024 / Vol. 16 / Issue 01

Keywords:

educational content recommendation; AI-enhanced learning; pre-trained language models; ensemble embeddings; knowledge graph embeddings; text similarity; textual semantic search; natural language processing;

DOI:

10.3390/fi16010012

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Educational content recommendation is a cornerstone of AI-enhanced learning. In particular, to facilitate navigating the diverse learning resources available on learning platforms, methods are needed for automatically linking learning materials, e.g., in order to recommend textbook content based on exercises. Such methods are typically based on semantic textual similarity (STS) and the use of embeddings for text representation. However, it remains unclear what types of embeddings should be used for this task. In this study, we carry out an extensive empirical evaluation of embeddings derived from three different types of models: (i) static embeddings trained using a concept-based knowledge graph, (ii) contextual embeddings from a pre-trained language model, and (iii) contextual embeddings from a large language model (LLM). In addition to evaluating the models individually, various ensembles are explored based on different strategies for combining two models in an early vs. late fusion fashion. The evaluation is carried out using digital textbooks in Swedish for three different subjects and two types of exercises. The results show that using contextual embeddings from an LLM leads to superior performance compared to the other models, and that there is no significant improvement when combining these with static embeddings trained using a knowledge graph. When using embeddings derived from a smaller language model, however, it helps to combine them with knowledge graph embeddings. The performance of the best-performing model is high for both types of exercises, resulting in a mean Recall@3 of 0.96 and 0.95 and a mean MRR of 0.87 and 0.86 for quizzes and study questions, respectively, demonstrating the feasibility of using STS based on text embeddings for educational content recommendation. The ability to link digital learning materials in an unsupervised manner, relying only on readily available pre-trained models, facilitates the development of AI-enhanced learning.
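
The evaluation setup described in the abstract (ranking textbook sections for each exercise by embedding similarity, then scoring with Recall@k and MRR) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes one gold section per exercise, uses cosine similarity as the STS measure, and shows only one simple variant of each fusion style: concatenating two embeddings per item (early fusion) and taking a weighted average of two score matrices (late fusion). All function names and the weight parameter are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def score_matrix(queries, docs):
    """scores[i][j] = similarity between exercise i and textbook section j."""
    return [[cosine(q, d) for d in docs] for q in queries]


def early_fusion(emb_a, emb_b):
    """Assumed early-fusion variant: concatenate the two embeddings of each item."""
    return [list(a) + list(b) for a, b in zip(emb_a, emb_b)]


def late_fusion(scores_a, scores_b, weight=0.5):
    """Assumed late-fusion variant: weighted average of two score matrices."""
    return [
        [weight * x + (1 - weight) * y for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(scores_a, scores_b)
    ]


def rank_of_gold(row, gold_idx):
    """1-based rank of the gold section when sections are sorted by descending score."""
    order = sorted(range(len(row)), key=lambda j: -row[j])
    return order.index(gold_idx) + 1


def recall_at_k(scores, gold, k=3):
    """Fraction of exercises whose gold section appears among the top k results."""
    hits = sum(rank_of_gold(row, g) <= k for row, g in zip(scores, gold))
    return hits / len(gold)


def mean_reciprocal_rank(scores, gold):
    """Mean of 1/rank of the gold section over all exercises."""
    return sum(1.0 / rank_of_gold(row, g) for row, g in zip(scores, gold)) / len(gold)
```

With real data, the exercise and section vectors would come from whichever embedding models are being compared; the metric functions only need the resulting score matrix and the index of the correct section for each exercise.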

Pages: 21
