Diverse feature set based Keyphrase extraction and indexing techniques

Cited by: 0
Authors
Saurabh Sharma
Vishal Gupta
Mamta Juneja
Affiliations
[1] University Institute of Engineering & Technology
[2] Panjab University
Keywords
Keyphrase extraction; Word embedding; Keyphrase indexing; External knowledge; Free indexing; Natural language processing
DOI
Not available
Abstract
The internet has changed the way people communicate, producing a vast amount of text in electronic form: e-mail, technical and scientific reports, tweets, physician notes and military field reports. Providing keyphrases for these extensive text collections lets users grasp the essence of lengthy content quickly and locate information efficiently. When designing a keyword extraction and indexing system, it is essential to pick distinguishing properties, called features. In this article, we propose several unsupervised keyword extraction approaches that are independent of the structure, size and domain of the documents. The proposed methods rely on a novel, cognitively inspired set of standard, phrase, word embedding and external knowledge source features. Results for individual and selected features are reported through experiments on four datasets: SemEval, KDD, Inspec and DUC. The selected features (obtained by feature selection) and the word embedding based features are the best feature sets for keyword extraction and indexing across all of these datasets. In particular, the proposed distributed word vectors with additional knowledge improve results significantly over individual features, over combined features after feature selection, and over the state of the art. Having developed these keyphrase extraction methods, we also applied them to a document classification task.
Pages: 4111-4142
Page count: 31
Related papers
50 in total
  • [31] Information and Rough Set Theory Based Feature Selection Techniques
    Cervante, Liam
    Gao, Xiaoying
    [J]. ACTIVE MEDIA TECHNOLOGY, AMT 2013, 2013, 8210 : 166 - 176
  • [32] Automatic Keyphrase Extraction based on NLP and Statistical Methods
    Dostal, Martin
    Jezek, Karel
    [J]. DATESO 2011: DATABASES, TEXTS, SPECIFICATIONS, OBJECTS, 2011, 706 : 140 - 145
  • [33] WikiRank: Improving Keyphrase Extraction Based on Background Knowledge
    Yu, Yang
    Ng, Vincent
    [J]. PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018, : 3723 - 3727
  • [34] TopicLPRank: a keyphrase extraction method based on improved TopicRank
    Shengbin Liao
    Zongkai Yang
    Qingzhou Liao
    Zhangxiong Zheng
    [J]. The Journal of Supercomputing, 2023, 79 : 9073 - 9092
  • [35] Audio indexing using feature warping and fusion techniques
    Sénac, C
    Ambikairajah, E
    [J]. 2004 IEEE 6TH WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING, 2004, : 359 - 362
  • [36] A Graph-based Approach of Automatic Keyphrase Extraction
    Yan Ying
    Tan Qingping
    Xie Qinzheng
    Zeng Ping
    Li Panpan
    [J]. ADVANCES IN INFORMATION AND COMMUNICATION TECHNOLOGY, 2017, 107 : 248 - 255
  • [37] An Image-based Recommender System Based on Feature Extraction Techniques
    Kurt, Zuhal
    Ozkan, Kemal
    [J]. 2017 INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ENGINEERING (UBMK), 2017, : 769 - 774
  • [38] First Order Statistics Based Feature Selection: A Diverse and Powerful Family of Feature Selection Techniques
    Khoshgoftaar, Taghi
    Dittman, David
    Wald, Randall
    Fazelpour, Alireza
    [J]. 2012 11TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2012), VOL 2, 2012, : 151 - 157
  • [39] Recognition Techniques and Feature Extraction for Space Targets Based on RCS
    Zhao Anjun
    Niu Wei
    Guo Lei
    [J]. PROCEEDINGS OF THE 27TH CHINESE CONTROL CONFERENCE, VOL 4, 2008, : 426 - +
  • [40] PCA Indexing based Feature Learning and Feature Selection
    Ibrahim, Marwa Farouk Ibrahim
    Al-Jumaily, Adel Ali
    [J]. 2016 8TH CAIRO INTERNATIONAL BIOMEDICAL ENGINEERING CONFERENCE (CIBEC), 2016, : 68 - 71