A large-scale dataset for Korean document-level relation extraction from encyclopedia texts

Cited by: 0

Authors:

Son, Suhyune [1]

Lim, Jungwoo [1]

Koo, Seonmin [1]

Kim, Jinsung [1]

Kim, Younghoon [2]

Lim, Youngsik [2]

Hyun, Dongseok [2]

Lim, Heuiseok [1]

Affiliations:

[1] Korea Univ, Comp Sci & Engn, 1 5-ka, Anam Dong, Seoul 02841, South Korea

[2] NAVER, 5 Jeongjail ro, Buljeong ro, Seongnam 13561, South Korea

Source:

Applied Intelligence | 2024 / Vol. 54 / Issue 17-18

Funding:

National Research Foundation of Singapore;

Keywords:

Natural Language Processing; Information Extraction; Document-level Relation Extraction; Korean Relation Extraction; ENTITY;

DOI:

10.1007/s10489-024-05605-9

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Document-level relation extraction (RE) aims to predict the relational facts between two given entities in a document. Unlike document-level RE in English, which has been widely studied, Korean document-level RE research is still in its earliest stage because no dataset exists. To accelerate this research, we present the TREK (Toward Document-Level Relation Extraction in Korean) dataset, constructed from Korean encyclopedia documents written by domain experts. We provide detailed statistical analyses of our large-scale dataset, and human evaluation results support the quality of TREK. We also introduce a document-level RE model that considers named entity types while accounting for the properties of the Korean language. In the experiments, we demonstrate that our proposed model outperforms the baselines, and we conduct a qualitative analysis.

Citation

Pages: 8681-8701

Page count: 21

