Automatic Taxonomy Classification by Pretrained Language Model

Cited by: 0
Authors
Kuwana, Ayato [1]
Oba, Atsushi [1]
Sawai, Ranto [1]
Paik, Incheon [1]
Affiliations
[1] Univ Aizu, Grad Dept Comp Sci & Informat Syst, Fukushima 9658580, Japan
Keywords
ontology; automation; natural language processing (NLP); pretrained model
DOI
10.3390/electronics10212656
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In recent years, automatic ontology generation has received significant attention in information science as a means of systemizing vast amounts of online data. As our initial attempt at ontology generation with a neural network, we proposed a recurrent neural network-based method. However, recent developments in natural language processing (NLP) make it possible to update that architecture. In particular, transfer learning of language models trained on a large, unlabeled corpus has yielded a breakthrough in NLP. Inspired by these achievements, we propose a novel workflow for ontology generation comprising two-stage learning. Our results showed that our best method improved accuracy by over 12.5%. As an application example, we applied our model to the Stanford Question Answering Dataset to demonstrate ontology generation in a real-world setting. The results showed that our model can generate a good ontology, with some exceptions in the real-world setting, indicating future research directions for improving quality.
Pages: 16
Related papers
50 items in total
  • [31] Bloom's Learning Outcomes' Automatic Classification Using LSTM and Pretrained Word Embeddings
    Shaikh, Sarang
    Daudpotta, Sher Muhammad
    Imran, Ali Shariq
    IEEE ACCESS, 2021, 9 (09): 117887 - 117909
  • [32] A Survey of Pretrained Language Models
    Sun, Kaili
    Luo, Xudong
    Luo, Michael Y.
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT II, 2022, 13369: 442 - 456
  • [33] New Automatic Taxonomy Generation Algorithm for the Audio Genre Classification
    Choi, Tacksung
    Moon, Sunkook
    Park, Youngcheol
    Youn, Daehee
    Lee, Seokpil
    JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, 2008, 27 (03): 111 - 118
  • [34] An Unsupervised Clinical Acronym Disambiguation Method Based on Pretrained Language Model
    Wei, Siwen
    Yuan, Chi
    Li, Zixuan
    Wang, Huaiyu
    HEALTH INFORMATION PROCESSING, CHIP 2023, 2023, 1993: 270 - 284
  • [35] Taxonomy and classification of automatic monitoring of program security vulnerability exploitations
    Shahriar, Hossain
    Zulkernine, Mohammad
    JOURNAL OF SYSTEMS AND SOFTWARE, 2011, 84 (02): 250 - 269
  • [36] Reusing a Pretrained Language Model on Languages with Limited Corpora for Unsupervised NMT
    Chronopoulou, Alexandra
    Stojanovski, Dario
    Fraser, Alexander
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020: 2703 - 2711
  • [37] Automatic classification and taxonomy generation for semi-structured data
    Nunes, Bernardo Pereira
    Lopes, Giseli Rabello
    Casanova, Marco Antonio
    CIT/IUCC/DASC/PICOM 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY - UBIQUITOUS COMPUTING AND COMMUNICATIONS - DEPENDABLE, AUTONOMIC AND SECURE COMPUTING - PERVASIVE INTELLIGENCE AND COMPUTING, 2015: 207 - 214
  • [38] On the Effectiveness of Adapter-based Tuning for Pretrained Language Model Adaptation
    He, Ruidan
    Liu, Linlin
    Ye, Hai
    Tan, Qingyu
    Ding, Bosheng
    Cheng, Liying
    Low, Jia-Wei
    Bing, Lidong
    Si, Luo
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1 (ACL-IJCNLP 2021), 2021: 2208 - 2222
  • [39] Automatic Indonesia's Questions Classification Based On Bloom's Taxonomy Using Natural Language Processing A Preliminary Study
    Kusuma, Selvia Ferdiana
    Siahaan, Daniel
    Yuhana, Umi Laili
    2015 INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY SYSTEMS AND INNOVATION (ICITSI), 2015
  • [40] LEVERAGING ACOUSTIC AND LINGUISTIC EMBEDDINGS FROM PRETRAINED SPEECH AND LANGUAGE MODELS FOR INTENT CLASSIFICATION
    Sharma, Bidisha
    Madhavi, Maulik
    Li, Haizhou
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021: 7498 - 7502