Scratch-DKG: A Framework for Constructing Scratch Domain Knowledge Graph

Cited by: 5
Authors
Qi, Peng [1]
Sun, Yan [1]
Luo, Hong [1]
Guizani, Mohsen [2]
Affiliations
[1] Beijing Univ Posts & Telecommun, Dept Comp Sci, Beijing 100876, Peoples R China
[2] Qatar Univ, Dept Comp Sci & Engn, Doha, Qatar
Funding
National Natural Science Foundation of China
Keywords
Data mining; Feature extraction; Labeling; Programming profession; Tools; Visualization; Scratch; Knowledge Graph; DeepDive; Secondary Labeling Algorithm; S-TextRank; programming knowledge points; EXTRACTION;
DOI
10.1109/TETC.2020.2996710
Chinese Library Classification
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
With the rapid development of programming platforms such as Scratch, making use of the tremendous amount of data they produce has become a major challenge for researchers. The growing data is not only huge but also heterogeneous and diverse, so existing tools cannot effectively extract valuable information from it. In this article, considering the particular features of Scratch data, we propose an effective framework for constructing a Scratch Domain Knowledge Graph (Scratch-DKG). Our framework includes four modules, designed to process semi-structured data, user profile data, project data, and programming knowledge points, respectively. For webpages, we design a template-based wrapper method to extract triples from the semi-structured data. For user profile data, we improve DeepDive, a useful information extraction tool that suffers from wrong labeling, with the proposed Secondary Labeling Algorithm to extract knowledge triples. For project data, we propose an advanced keyword extraction method (S-TextRank) to extract keyword triples. For programming knowledge points, we develop a frequent contiguous block combination mining algorithm to extract the potential domain information of Scratch. Finally, extensive experiments are carried out to evaluate the performance of the proposed methods. The experimental results show that, compared with other competing methods, our proposal extracts more correct and more comprehensive Scratch triples.
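The abstract gives no details of the block-combination mining step, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes each project can be flattened into an ordered list of block names and counts the contiguous runs that appear in at least `min_support` projects. The function name, parameters, and block names are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def frequent_contiguous_combinations(projects, min_support=2, min_len=2, max_len=3):
    """Return contiguous block runs found in at least `min_support` projects.

    `projects` is a list of block-name sequences (an assumed input format).
    Support is counted per project, so a run repeated inside one project
    still contributes only once.
    """
    support = Counter()
    for blocks in projects:
        seen = set()  # runs already counted for this project
        for n in range(min_len, max_len + 1):
            for i in range(len(blocks) - n + 1):
                seen.add(tuple(blocks[i:i + n]))
        support.update(seen)
    return {combo: count for combo, count in support.items() if count >= min_support}


# Hypothetical block sequences, not taken from the paper's dataset.
projects = [
    ["when_flag_clicked", "forever", "move_steps", "if_on_edge_bounce"],
    ["when_flag_clicked", "forever", "move_steps", "turn_right"],
    ["when_key_pressed", "move_steps", "if_on_edge_bounce"],
]
frequent = frequent_contiguous_combinations(projects)
for combo, count in sorted(frequent.items()):
    print(count, " -> ".join(combo))
```

Counting support per project rather than per occurrence keeps one long, repetitive project from dominating the result; a real system would likely also need to handle nested blocks, which this flat sketch ignores.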
Pages: 170-185
Page count: 16