Ensembling Transformers for Cross-domain Automatic Term Extraction

Cited by: 3
Authors
Tran, Hanh Thi Hong [1, 2, 3]
Martinc, Matej [1 ]
Pelicon, Andraz [1 ]
Doucet, Antoine [3 ]
Pollak, Senja [2 ]
Affiliations
[1] Jozef Stefan Int Postgrad Sch, Jamova Cesta 39, Ljubljana 1000, Slovenia
[2] Jozef Stefan Inst, Jamova Cesta 39, Ljubljana 1000, Slovenia
[3] Univ La Rochelle, 23 Av Albert Einstein, La Rochelle, France
Keywords
Automatic term extraction; ATE; Low resource; ACTER; RSDO5; Monolingual; Cross-domain
DOI
10.1007/978-3-031-21756-2_7
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Automatic term extraction plays an essential role in domain language understanding and in several downstream natural language processing tasks. In this paper, we present a comparative study of the predictive power of Transformer-based pretrained language models for term extraction in a multilingual, cross-domain setting. Besides evaluating the ability of monolingual models to extract single-word and multiword terms, we also experiment with ensembles of monolingual and multilingual models, taking the intersection or the union of the term sets output by the different language models. Our experiments were conducted on the ACTER corpus, which covers four specialized domains (Corruption, Wind energy, Equitation, and Heart failure) and three languages (English, French, and Dutch), and on the RSDO5 Slovenian corpus, which covers four additional domains (Biomechanics, Chemistry, Veterinary, and Linguistics). The results show that monolingual models outperform the state-of-the-art approaches from related work that rely on multilingual models, for all languages except Dutch and French when the term extraction task excludes named-entity terms. Furthermore, by combining the outputs of the two best-performing models, we achieve significant improvements.
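The ensembling step described in the abstract (intersection or union of the term sets produced by different models) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the lowercasing normalization, and the toy term lists are assumptions made for illustration only, and the scoring is a plain set-based precision, recall, and F1 against a gold term list.

```python
from typing import Iterable, Set, Tuple


def normalize(terms: Iterable[str]) -> Set[str]:
    """Lowercase and strip terms so that outputs of different models are comparable.

    Assumption: case-insensitive matching; the paper's exact normalization is not given here.
    """
    return {t.strip().lower() for t in terms if t.strip()}


def ensemble(terms_a: Iterable[str], terms_b: Iterable[str], mode: str = "union") -> Set[str]:
    """Combine two models' term outputs by set union or set intersection."""
    a, b = normalize(terms_a), normalize(terms_b)
    if mode == "union":
        return a | b
    if mode == "intersection":
        return a & b
    raise ValueError(f"unknown mode: {mode!r}")


def precision_recall_f1(predicted: Set[str], gold: Set[str]) -> Tuple[float, float, float]:
    """Set-based precision, recall, and F1 of predicted terms against gold terms."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical outputs of a monolingual and a multilingual model (toy data, not from the paper).
monolingual_terms = ["heart failure", "ejection fraction", "patient"]
multilingual_terms = ["Heart failure", "beta blocker"]
gold_terms = normalize(["heart failure", "ejection fraction", "beta blocker"])

union_terms = ensemble(monolingual_terms, multilingual_terms, mode="union")
intersection_terms = ensemble(monolingual_terms, multilingual_terms, mode="intersection")

union_scores = precision_recall_f1(union_terms, gold_terms)
intersection_scores = precision_recall_f1(intersection_terms, gold_terms)
```

On this toy data the union favours recall, since every gold term is found at the cost of one spurious term, while the intersection favours precision. This is the usual trade-off between the two combination strategies and says nothing about the results reported in the paper.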
Pages: 90-100
Number of pages: 11