Augmenting pre-trained language models with audio feature embedding for argumentation mining in political debates

Cited by: 0

Authors:

Mestre, Rafael [1]

Middleton, Stuart E. [1]

Ryan, Matt [1]

Gheasi, Masood [1]

Norman, Timothy J. [1]

Zhu, Jiatong [1]

Affiliations:

[1] Univ Southampton, Southampton, Hants, England

Source:

17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023 | 2023

Funding:

UK Natural Environment Research Council (NERC); UK Research and Innovation (UKRI); UK Economic and Social Research Council (ESRC)

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

The integration of multimodality in natural language processing (NLP) tasks seeks to exploit the complementary information contained in two or more modalities, such as text, audio and video. This paper investigates the integration of often under-researched audio features with text, using the task of argumentation mining (AM) as a case study. We take a previously reported dataset and present an audio-enhanced version (the Multimodal USElecDeb60To16 dataset). We report the performance of two text models based on BERT and GloVe embeddings, one audio model (based on CNN and Bi-LSTM) and multimodal combinations, on a dataset of 28,850 utterances. The results show that multimodal models do not outperform text-based models when using the full dataset. However, we show that audio features add value in fully supervised scenarios with limited data. We find that when data is scarce (e.g. with 10% of the original dataset) multimodal models yield improved performance, whereas text models based on BERT considerably decrease performance. Finally, we conduct a study with artificially generated voices and an ablation study to investigate the importance of different audio features in the audio models.


Pages: 274-288

Number of pages: 15

