Unsupervised Topic-Oriented Keyphrase Extraction and Its Application to Croatian

Authors
Saratlija, Josip [1]
Snajder, Jan [1]
Basic, Bojana Dalbelo [1]
Affiliations
[1] Univ Zagreb, Fac Elect Engn & Comp, Zagreb 41000, Croatia
Keywords
Information extraction; keyphrase extraction; unsupervised learning; k-means; Croatian language
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Labeling documents with keyphrases is a tedious and expensive task. Most approaches to automatic keyphrase extraction rely on supervised learning and require manually labeled training data. In this paper we propose a fully unsupervised keyphrase extraction method that differs from the usual generic keyphrase extractors in the way keyphrases are formed. Our method first builds topically related word clusters, from which document keywords are selected, and then expands the selected keywords into syntactically valid keyphrases. We evaluate our approach on a Croatian document collection annotated by eight human experts, taking into account the high subjectivity of the keyphrase extraction task. The proposed method reaches up to F1 = 44.5%, which is below the performance of human annotators but comparable to that of a supervised approach.
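To make the reported F1 figure concrete, the sketch below shows one common way to score extracted keyphrases against a single annotator's keyphrases. The abstract does not state the matching criterion or how scores are combined across the eight annotators, so exact matching after case and whitespace normalization is an assumption here. The function name `keyphrase_f1` and the sample data are hypothetical.

```python
def keyphrase_f1(extracted, gold):
    """Return (precision, recall, F1) for two collections of keyphrases.

    Assumption: a keyphrase counts as correct only on an exact match
    after lowercasing and collapsing whitespace. The paper may use a
    different matching criterion.
    """
    ext = {" ".join(p.lower().split()) for p in extracted}
    ref = {" ".join(p.lower().split()) for p in gold}
    if not ext or not ref:
        return 0.0, 0.0, 0.0
    hits = len(ext & ref)
    precision = hits / len(ext)
    recall = hits / len(ref)
    if hits == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example data, not taken from the paper.
extracted = ["keyphrase extraction", "k-means", "word clusters", "Croatian language"]
gold = ["keyphrase extraction", "unsupervised learning", "k-means"]
p, r, f1 = keyphrase_f1(extracted, gold)
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f}")  # P=0.500 R=0.667 F1=0.571
```

With several annotators, one option is to compute this score against each annotator separately and average the results. That averaging scheme is also an assumption, not something the abstract specifies.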
Pages: 340-347
Number of pages: 8