Vietnamese Span-based Constituency Parsing with BERT Embedding

Cited by: 0
Authors
Phan, Thi-Phuong-Uyen [1]
Huynh, Ngoc-Thanh-Tung [1]
Truong, Hung-Thinh [1]
Affiliations
[1] Univ Sci VNU HCMC, Fac Informat Technol, Ho Chi Minh City, Vietnam
Keywords
constituency parsing; span-based parsing; contextualized word representation
DOI
10.1109/kse.2019.8919467
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The syntactic structure of sentences obtained from Constituency Parsing is fundamental information for many Natural Language Processing tasks. However, owing to the lack of available resources and the complex linguistic features of Vietnamese, Constituency Parsing for this language has not received enough research attention. To the best of our knowledge, the study presented in this paper is one of the first to explore this task in Vietnamese. In this work, we present a Span-based approach that represents spans using contextualized pre-trained embeddings to obtain optimal parse trees for Vietnamese sentences. The experiments indicate that our system achieves promising results on the VLSP Vietnamese Treebank dataset, significantly outperforming existing methods. The results support the view that encoding context information into word representations is effective in improving parsing performance for Vietnamese. Consequently, this idea can be generalized to other tasks, such as Dependency Parsing, and to other low-resource languages.
Pages: 293-299
Number of pages: 7