Dynamic Hybrid Clustering of Bioinformatics by Incorporating Text Mining and Citation Analysis

Cited by: 0
Authors
Janssens, Frizo [1]
Glanzel, Wolfgang [2]
De Moor, Bart [1]
Affiliations
[1] Katholieke Univ Leuven, Elect Engn ESAT, Kasteelpk Arenberg 10, B-3001 Leuven, Belgium
[2] Katholieke Univ Leuven, Steunpunt O&O Indicatoren, B-3000 Louvain, Belgium
Keywords
Fisher's inverse chi-square method; cluster chains
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To unravel the concept structure and dynamics of the bioinformatics field, we analyze a set of 7401 publications from the Web of Science and MEDLINE databases, publication years 1981-2004. For delineating this complex, interdisciplinary field, a novel bibliometric retrieval strategy is used. Given that the performance of unsupervised clustering and classification of scientific publications is significantly improved by deeply merging textual contents with the structure of the citation graph, we proceed with a hybrid clustering method based on Fisher's inverse chi-square. The optimal number of clusters is determined by a compound semiautomatic strategy comprising a combination of distance-based and stability-based methods. We also investigate the relationship between number of Latent Semantic Indexing factors, number of clusters, and clustering performance. The HITS and PageRank algorithms are used to determine representative publications in each cluster. Next, we develop a methodology for dynamic hybrid clustering of evolving bibliographic data sets. The same clustering methodology is applied to consecutive periods defined by time windows on the set, and in a subsequent phase chains are formed by matching and tracking clusters through time. Term networks for the eleven resulting cluster chains present the cognitive structure of the field. Finally, we provide a view on how much attention the bioinformatics community has devoted to the different subfields through time.
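The abstract names Fisher's inverse chi-square as the basis of the hybrid method but does not spell out the computation. As a minimal sketch of the general technique only (not the authors' implementation), the block below combines k independent p-values into the statistic X = -2 * sum(ln p_i), which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis. The function name `fisher_combine` and the idea of feeding it one text-based and one citation-based p-value are assumptions made for illustration; how the paper derives and weights its p-values is not stated in this record.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's inverse chi-square method.

    Hypothetical helper for illustration; returns (statistic, combined_p).
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    # half_stat is X / 2, where X = -2 * sum(ln p_i).
    half_stat = -sum(math.log(p) for p in p_values)

    # For an even number of degrees of freedom (2k), the chi-square
    # survival function has the closed form
    #   exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term

    return 2.0 * half_stat, math.exp(-half_stat) * total


if __name__ == "__main__":
    # Assumed inputs: one p-value from a text-based comparison and one
    # from a citation-based comparison of the same pair of documents.
    p_text, p_citation = 0.04, 0.20
    statistic, combined = fisher_combine([p_text, p_citation])
    print(f"X = {statistic:.4f}, combined p = {combined:.4f}")
```

The closed-form survival function avoids a dependency on a statistics library; the derivation assumes the input p-values are independent, which may hold only approximately for text and citation evidence about the same documents.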
Pages: 360 / +
Page count: 2
Related papers
50 in total
  • [1] Hybrid Clustering by Integrating Text and Citation based Graphs in Journal Database Analysis
    Liu, Xinhai
    Yu, Shi
    Moreau, Yves
    Janssens, Frizo
    De Moor, Bart
    Glaenzel, Wolfgang
    [J]. 2009 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2009), 2009, : 521 - +
  • [2] Hybrid clustering by integrating text and citation based graphs in journal database analysis
    Dept. of Electrical Engineering, K.U. Leuven, ESAT-SCD, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
    Unknown
    [J]. ICDM Workshops - IEEE Int. Conf. Data Min., (521-526):
  • [3] Text Mining and Clustering Analysis
    Raskar, Shobha S.
    Thakore, D. M.
    [J]. INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2011, 11 (06): : 203 - 207
  • [4] Text Mining for Translational Bioinformatics
    Dai, Hong-Jie
    Wei, Chih-Hsuan
    Kao, Hung-Yu
    Liu, Rey-Long
    Tsai, Richard Tzong-Han
    Lu, Zhiyong
    [J]. BIOMED RESEARCH INTERNATIONAL, 2015, 2015
  • [5] Text Mining in Bioinformatics: Research and Application
    Qi, Yanliang
    [J]. INTERNATIONAL JOURNAL OF INFORMATION RETRIEVAL RESEARCH, 2013, 3 (02) : 30 - 39
  • [6] Clustering analysis of vulnerability information based on text mining
    School of Information Science and Technology, Northwest University, Xi'an 710069, China
    [J]. Dongnan Daxue Xuebao, 5 (845-850):
  • [7] A self-organising hybrid model for dynamic text clustering
    Hung, C
    Wermter, S
    [J]. RESEARCH AND DEVELOPMENT IN INTELLIGENT SYSTEMS XX, 2004, : 141 - 154
  • [8] Mining Text Enriched Heterogeneous Citation Networks
    Kralj, Jan
    Valmarska, Anita
    Robnik-Sikonja, Marko
    Lavrac, Nada
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PART I, 2015, 9077 : 672 - 683
  • [9] Text Mining for Bioinformatics: State of the Art Review
    Qi, Yanliang
    Zhang, Yang
    Song, Min
    [J]. 2009 2ND IEEE INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND INFORMATION TECHNOLOGY, VOL 1, 2009, : 398 - +
  • [10] Chapter 16: Text Mining for Translational Bioinformatics
    Cohen, K. Bretonnel
    Hunter, Lawrence E.
    [J]. PLOS COMPUTATIONAL BIOLOGY, 2013, 9 (04)