Evaluating Embeddings from Pre-Trained Language Models and Knowledge Graphs for Educational Content Recommendation

Cited by: 5
Authors
Li, Xiu [1 ]
Henriksson, Aron [1 ]
Duneld, Martin [1 ]
Nouri, Jalal [1 ]
Wu, Yongchao [1 ]
Affiliations
[1] Stockholm Univ, Dept Comp & Syst Sci, NOD Huset, Borgarfjordsgatan 12, S-16455 Stockholm, Sweden
Keywords
educational content recommendation; AI-enhanced learning; pre-trained language models; ensemble embeddings; knowledge graph embeddings; text similarity; textual semantic search; natural language processing;
DOI
10.3390/fi16010012
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Educational content recommendation is a cornerstone of AI-enhanced learning. In particular, to facilitate navigating the diverse learning resources available on learning platforms, methods are needed for automatically linking learning materials, e.g., in order to recommend textbook content based on exercises. Such methods are typically based on semantic textual similarity (STS) and the use of embeddings for text representation. However, it remains unclear what types of embeddings should be used for this task. In this study, we carry out an extensive empirical evaluation of embeddings derived from three different types of models: (i) static embeddings trained using a concept-based knowledge graph, (ii) contextual embeddings from a pre-trained language model, and (iii) contextual embeddings from a large language model (LLM). In addition to evaluating the models individually, various ensembles are explored based on different strategies for combining two models in an early vs. late fusion fashion. The evaluation is carried out using digital textbooks in Swedish for three different subjects and two types of exercises. The results show that using contextual embeddings from an LLM leads to superior performance compared to the other models, and that there is no significant improvement when combining these with static embeddings trained using a knowledge graph. When using embeddings derived from a smaller language model, however, it helps to combine them with knowledge graph embeddings. The performance of the best-performing model is high for both types of exercises, resulting in a mean Recall@3 of 0.96 and 0.95 and a mean MRR of 0.87 and 0.86 for quizzes and study questions, respectively, demonstrating the feasibility of using STS based on text embeddings for educational content recommendation. The ability to link digital learning materials in an unsupervised manner, relying only on readily available pre-trained models, facilitates the development of AI-enhanced learning.
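To make the recommendation setup concrete, the sketch below is a minimal, illustrative implementation of the STS-based linking described in the abstract: it embeds exercises and textbook sections, ranks sections for each exercise by cosine similarity, combines the scores of two embedding models with a simple weighted late fusion, and evaluates the ranking with Recall@3 and MRR. The sentence-transformers model names, the Swedish toy texts, the gold labels, and the 0.5 fusion weight are illustrative assumptions; they are not the models, corpus, or ensemble strategies evaluated in the paper.

```python
# Minimal sketch: STS-based linking of exercises to textbook sections via text
# embeddings, with a simple late-fusion ensemble and Recall@k / MRR evaluation.
# Model names and data below are illustrative placeholders, not the paper's setup.
import numpy as np
from sentence_transformers import SentenceTransformer

def embed(texts, model):
    """Encode texts into L2-normalised embedding vectors."""
    vecs = model.encode(texts, convert_to_numpy=True)
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

def similarity_matrix(exercises, sections, model):
    """Cosine similarity between every exercise and every textbook section."""
    return embed(exercises, model) @ embed(sections, model).T

def late_fusion(sim_a, sim_b, weight=0.5):
    """Late fusion: mix the similarity scores of two embedding models."""
    return weight * sim_a + (1.0 - weight) * sim_b

def recall_at_k(sim, gold, k=3):
    """Fraction of exercises whose gold section appears in the top-k ranking."""
    topk = np.argsort(-sim, axis=1)[:, :k]
    return float(np.mean([gold[i] in topk[i] for i in range(len(gold))]))

def mrr(sim, gold):
    """Mean reciprocal rank of the gold section across exercises."""
    ranking = np.argsort(-sim, axis=1)
    ranks = [int(np.where(ranking[i] == gold[i])[0][0]) + 1 for i in range(len(gold))]
    return float(np.mean([1.0 / r for r in ranks]))

if __name__ == "__main__":
    # Hypothetical multilingual models standing in for the larger and smaller
    # pre-trained language models compared in the paper.
    model_a = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
    model_b = SentenceTransformer("distiluse-base-multilingual-cased-v2")

    exercises = ["Vad är fotosyntes?", "Beskriv vattnets kretslopp."]
    sections = ["Fotosyntesen omvandlar ljusenergi till kemisk energi ...",
                "Vattnets kretslopp beskriver hur vatten cirkulerar ...",
                "Cellandningen frigör energi ur glukos ..."]
    gold = [0, 1]  # index of the correct section for each exercise

    sim = late_fusion(similarity_matrix(exercises, sections, model_a),
                      similarity_matrix(exercises, sections, model_b))
    print("Recall@3:", recall_at_k(sim, gold, k=3))
    print("MRR:", mrr(sim, gold))
```

An early-fusion variant would instead concatenate the normalised vectors from the two models before computing a single similarity matrix, rather than mixing the two score matrices afterwards.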
Pages: 21
Related Papers
50 records in total
  • [1] On the Sentence Embeddings from Pre-trained Language Models. Li, Bohan; Zhou, Hao; He, Junxian; Wang, Mingxuan; Yang, Yiming; Li, Lei. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020: 9119-9130.
  • [2] CollRec: Pre-Trained Language Models and Knowledge Graphs Collaborate to Enhance Conversational Recommendation System. Liu, Shuang; Ao, Zhizhuo; Chen, Peng; Kolmanic, Simon. IEEE Access, 2024, 12: 104663-104675.
  • [3] Distilling Relation Embeddings from Pre-trained Language Models. Ushio, Asahi; Camacho-Collados, Jose; Schockaert, Steven. 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), 2021: 9044-9062.
  • [4] Integrating Knowledge Graph Embeddings and Pre-trained Language Models in Hypercomplex Spaces. Nayyeri, Mojtaba; Wang, Zihao; Akter, Mst. Mahfuja; Alam, Mirza Mohtashim; Rony, Md Rashad Al Hasan; Lehmann, Jens; Staab, Steffen. Semantic Web, ISWC 2023, Part I, 2023, 14265: 388-407.
  • [5] Evaluating Commonsense in Pre-Trained Language Models. Zhou, Xuhui; Zhang, Yue; Cui, Leyang; Huang, Dandan. Thirty-Fourth AAAI Conference on Artificial Intelligence, the Thirty-Second Innovative Applications of Artificial Intelligence Conference and the Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, 2020, 34: 9733-9740.
  • [6] Empowering News Recommendation with Pre-trained Language Models. Wu, Chuhan; Wu, Fangzhao; Qi, Tao; Huang, Yongfeng. SIGIR '21 - Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021: 1652-1656.
  • [7] Knowledge Inheritance for Pre-trained Language Models. Qin, Yujia; Lin, Yankai; Yi, Jing; Zhang, Jiajie; Han, Xu; Zhang, Zhengyan; Su, Yusheng; Liu, Zhiyuan; Li, Peng; Sun, Maosong; Zhou, Jie. NAACL 2022: The 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022: 3921-3937.
  • [8] Probing Simile Knowledge from Pre-trained Language Models. Chen, Weijie; Chang, Yongzhu; Zhang, Rongsheng; Pu, Jiashu; Chen, Guandan; Zhang, Le; Xi, Yadong; Chen, Yijiang; Su, Chang. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Vol. 1: Long Papers, 2022: 5875-5887.
  • [9] Evaluating the Summarization Comprehension of Pre-Trained Language Models. Chernyshev, D. I.; Dobrov, B. V. Lobachevskii Journal of Mathematics, 2023, 44 (08): 3028-3039.
  • [10] Evaluating and Inducing Personality in Pre-trained Language Models. Jiang, Guangyuan; Xu, Manjie; Zhu, Song-Chun; Han, Wenjuan; Zhang, Chi; Zhu, Yixin. Advances in Neural Information Processing Systems 36 (NeurIPS 2023), 2023.