Towards unsupervised keyphrase extraction via an autoregressive approach

Cited by: 1
Authors
Li, Tuohang [1]
Hu, Liang [1]
Li, Hongtu [1]
Sun, Chengyu [1]
Li, Shuai [1]
Chi, Ling [1]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, 2699 Qianjin St, Changchun 130012, Jilin, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Keyphrase extraction; Autoregressive structure; Optimizer; Unsupervised model; Coverage decay optimizer;
DOI
10.1016/j.knosys.2023.110664
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Keyphrase extraction is a technique used to capture the core information of documents and is an upstream task for advanced information retrieval systems, particularly in the academic realm. Current unsupervised methods are primarily built on a score-and-rank framework and are consistently unable to acquire mutual information between extracted keyphrases; this is especially true of graph-based models. Utilizing the autoregressive structure that is typically used in sequence-to-sequence text generation models, we propose a plug-and-play optimizer named C-Decay that can be integrated into any graph-based unsupervised keyphrase extraction model for a stable performance boost, and that mitigates the bias of certain semantically or lexically dominant tokens by directly optimizing the original score distribution output by graph-based models. The architecture of C-Decay includes the keyphrase pool, the gain vector, and the decay factor: the keyphrase pool is designed to realize an autoregressive structure, and the gain vector and the decay factor serve as the optimization operators. Herein, we examine three graph-based models integrated with C-Decay, with experiments conducted on four datasets: KDD, SemEval, Nguyen, and Krapivin. Moreover, we show that C-Decay can improve accuracy and F-measure by an average of approximately 50% and 20%, respectively. © 2023 Elsevier B.V. All rights reserved.
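The abstract does not state C-Decay's update rule, so the following is only an illustrative sketch of the general idea it describes: selecting keyphrases one at a time, with each choice conditioned on a pool of earlier choices. The function name `rerank_with_decay`, the token-overlap penalty, the `decay` value, and the sample scores are all assumptions made for illustration; the gain vector is not modeled here, and none of this is the authors' formulation.

```python
def rerank_with_decay(scores, k, decay=0.5):
    """Greedy autoregressive selection (hypothetical helper, not the paper's C-Decay).

    scores: dict mapping candidate phrase -> base score from some graph-based model.
    k:      number of keyphrases to select.
    decay:  assumed multiplicative penalty applied once per already-covered token.
    """
    remaining = dict(scores)  # candidates not yet selected
    pool = []                 # keyphrases selected so far (the "keyphrase pool")
    covered = set()           # tokens that appear in the pool

    def adjusted(phrase):
        # Penalize a candidate for every token the pool already covers.
        overlap = sum(tok in covered for tok in phrase.split())
        return remaining[phrase] * (decay ** overlap)

    while remaining and len(pool) < k:
        best = max(remaining, key=adjusted)
        pool.append(best)
        covered.update(best.split())
        del remaining[best]
    return pool


# Made-up scores, standing in for the output of a graph-based ranker.
scores = {
    "keyphrase extraction": 0.9,
    "unsupervised keyphrase extraction": 0.85,
    "graph model": 0.6,
    "extraction": 0.5,
}
top = rerank_with_decay(scores, k=2)
```

With these sample scores, plain score-and-rank would return two heavily overlapping phrases, whereas the sketch picks "keyphrase extraction" first and then prefers "graph model" because the overlapping candidates have been down-weighted. Setting `decay=1.0` disables the penalty and reduces the procedure to ordinary ranking.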
Pages: 10