An Unsupervised Approach for Keyphrase Extraction Using Within-Collection Resources

Cited by: 0
Authors
Li, Teng-Fei [1]
Hu, Liang [1]
Chu, Jian-Feng [1]
Li, Hong-Tu [1]
Chi, Ling [1]
Affiliation
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun 130000, Jilin, Peoples R China
Keywords
Phrase extraction; graph-based ranking; topic-based clustering; within-collection resource; NLP;
DOI
10.1109/ACCESS.2019.2938213
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The rapidly growing number of scholarly documents makes it hard to select and read suitable ones. Keyphrases can be regarded as the gist of a document, so researchers can use keyphrase queries to select the documents they want. However, many scholarly documents have no keyphrases tagged by their authors or by other researchers, and automatic keyphrase extraction can help researchers obtain keyphrases quickly. This paper proposes an unsupervised approach to keyphrase extraction that combines graph-based ranking and topic-based clustering under the assumption that only within-collection resources are used. We use graph-based ranking to describe the relevance between two words and topic-based clustering to embed semantic information into words. We assume that each word has its own meaning and that each meaning can be considered a topic, although we know nothing about these meanings in advance. Topic-based clustering assigns the "correct meaning" to the "correct word". In addition, by taking the relevance among phrases into consideration and using only within-collection resources, we can apply graph-based ranking in our approach. The edges of a graph built over phrases describe the hidden relevance between two phrases, and the edge weights measure the strength of the connection between them. After adding a position feature, our approach consists of an enhanced graph-based ranking and a topic-based clustering. Experiments are run on four datasets: KDD, WWW, GSN and ACM. The results indicate that our approach performs better than the state-of-the-art methods.
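The abstract does not give the ranking formula, so the following is only a minimal sketch of what a graph-based ranking with a position feature could look like, not the authors' method. It assumes a word co-occurrence graph and a personalised PageRank whose restart distribution favours words that appear early in the document. The function names, the `window` and `damping` parameters, and the 1/position weighting are all illustrative assumptions.

```python
from collections import defaultdict


def build_graph(tokens, window=1):
    """Undirected co-occurrence graph (assumed construction).

    The weight of an edge counts how often two distinct words occur
    at most `window` tokens apart.
    """
    weights = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            v = tokens[j]
            if v != w:
                weights[w][v] += 1.0
                weights[v][w] += 1.0
    return weights


def position_bias(tokens):
    """Hypothetical position feature: sum of 1/position per word, normalised."""
    raw = defaultdict(float)
    for pos, w in enumerate(tokens, start=1):
        raw[w] += 1.0 / pos
    total = sum(raw.values())
    return {w: s / total for w, s in raw.items()}


def rank_words(tokens, damping=0.85, iterations=50, window=1):
    """Position-biased PageRank over the co-occurrence graph."""
    graph = build_graph(tokens, window)
    bias = position_bias(tokens)
    scores = {w: 1.0 / len(bias) for w in bias}
    for _ in range(iterations):
        new_scores = {}
        for w in bias:
            # Each neighbour v passes on its score in proportion to edge weight.
            incoming = sum(
                scores[v] * graph[v][w] / sum(graph[v].values())
                for v in graph[w]
            )
            new_scores[w] = (1 - damping) * bias[w] + damping * incoming
        scores = new_scores
    return scores


tokens = ["keyphrase", "extraction", "graph", "ranking", "keyphrase", "clustering"]
scores = rank_words(tokens)
for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {score:.3f}")
```

In a fuller pipeline, the word scores would be aggregated into scores for candidate phrases, and the topic-based clustering step described in the abstract would supply the semantic information that this sketch leaves out.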
Pages: 126088-126097
Number of pages: 10