Automatic Taxonomy Classification by Pretrained Language Model

Cited: 0
Authors
Kuwana, Ayato [1 ]
Oba, Atsushi [1 ]
Sawai, Ranto [1 ]
Paik, Incheon [1 ]
Affiliation
[1] Univ Aizu, Grad Dept Comp Sci & Informat Syst, Aizuwakamatsu, Fukushima 9658580, Japan
Keywords
ontology; automation; natural language processing (NLP); pretrained model
DOI
10.3390/electronics10212656
Chinese Library Classification
TP [automation technology, computer technology]
Discipline Code
0812
Abstract
In recent years, automatic ontology generation has received significant attention in information science as a means of systemizing vast amounts of online data. In our initial attempt at ontology generation with a neural network, we proposed a recurrent neural network (RNN)-based method. Since then, however, advances in natural language processing (NLP) have made it possible to update that architecture: in particular, transfer learning from language models pretrained on large unlabeled corpora has yielded breakthroughs across NLP. Inspired by these achievements, we propose a novel workflow for ontology generation comprising two-stage learning. Our results showed that our best method improved accuracy by over 12.5%. As an application example, we applied our model to the Stanford Question Answering Dataset (SQuAD) to demonstrate ontology generation in a real-world setting. The results showed that our model can generate a good ontology, with some exceptions, indicating future research directions for improving its quality.
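The record contains no code. As a rough illustration of the approach the abstract describes (fine-tuning a pretrained language model to classify taxonomic relations between concept pairs), the following is a minimal Python sketch using the Hugging Face transformers API. The checkpoint name, relation label set, and toy concept pairs are illustrative assumptions, not the authors' implementation.

# Minimal sketch (not the authors' code): fine-tune a pretrained encoder to
# classify the taxonomic relation between a pair of concepts. The checkpoint,
# label inventory, and training pairs below are illustrative assumptions.
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["is-a", "part-of", "no-relation"]  # assumed relation inventory

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS)
)

# Toy (parent candidate, child candidate, label index) pairs.
pairs = [
    ("animal", "dog", 0),
    ("car", "wheel", 1),
    ("banana", "democracy", 2),
]

optimizer = AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # a few passes over the toy data
    for parent, child, label in pairs:
        enc = tokenizer(parent, child, return_tensors="pt")  # sentence-pair input
        out = model(**enc, labels=torch.tensor([label]))
        out.loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# Inference on an unseen concept pair.
model.eval()
with torch.no_grad():
    enc = tokenizer("vehicle", "bicycle", return_tensors="pt")
    pred = model(**enc).logits.argmax(dim=-1).item()
print(LABELS[pred])

Encoding the two concepts as a single sentence-pair input lets the model attend across both terms, which is one plausible way transfer learning from a pretrained transformer could replace the earlier RNN-based method mentioned in the abstract.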
Pages: 16
Related Papers
50 records in total
  • [21] INTEGRATING PRETRAINED LANGUAGE MODEL FOR DIALOGUE POLICY EVALUATION
    Wang, Hongru
    Wang, Huimin
    Wang, Zezhong
    Wong, Kam-Fai
2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022: 6692-6696
  • [22] BatteryBERT: A Pretrained Language Model for Battery Database Enhancement
    Huang, Shu
    Cole, Jacqueline M.
JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2022, 62 (24): 6365-6377
  • [23] A Survey on Model Compression and Acceleration for Pretrained Language Models
    Xu, Canwen
    McAuley, Julian
THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 9, 2023: 10566-10575
  • [24] A Novel Pretrained General-purpose Vision Language Model for the Vietnamese Language
    Dinh Anh Vu
    Quang Nhat Minh Pham
    Giang Son Tran
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2024, 23 (05)
  • [25] Automatic subgenre classification in an electronic dance music taxonomy
    Caparrini, Antonio
    Arroyo, Javier
    Perez-Molina, Laura
    Sanchez-Hernandez, Jaime
JOURNAL OF NEW MUSIC RESEARCH, 2020, 49 (03): 269-284
  • [26] ChestXRayBERT: A Pretrained Language Model for Chest Radiology Report Summarization
    Cai, Xiaoyan
    Liu, Sen
    Han, Junwei
    Yang, Libin
    Liu, Zhenguo
    Liu, Tianming
IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25: 845-855
  • [27] Predicting Immune Escape with Pretrained Protein Language Model Embeddings
    Swanson, Kyle
    Chang, Howard
    Zou, James
    MACHINE LEARNING IN COMPUTATIONAL BIOLOGY, VOL 200, 2022, 200
  • [28] The Classification of Short Scientific Texts Using Pretrained BERT Model
    Danilov, Gleb
    Ishankulov, Timur
    Kotik, Konstantin
    Orlov, Yuriy
    Shifrin, Mikhail
    Potapov, Alexander
PUBLIC HEALTH AND INFORMATICS, PROCEEDINGS OF MIE 2021, 2021, 281: 83-87
  • [29] Automatic Music Genre Classification Using a Hierarchical Clustering and a Language Model Approach
    Langlois, Thibault
    Marques, Goncalo
2009 FIRST INTERNATIONAL CONFERENCE ON ADVANCES IN MULTIMEDIA, 2009: 188+
  • [30] Intelligent Classification and Automatic Annotation of Violations based on Neural Network Language Model
    Yu, Yaoquan
    Guo, Yuefeng
    Zhang, Zhiyuan
    Li, Mengshi
    Ji, Tianyao
    Tang, Wenhu
    Wu, Qinghua
2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020