Dynamic Hybrid Clustering of Bioinformatics by Incorporating Text Mining and Citation Analysis

Cited by: 0
Authors
Janssens, Frizo [1]
Glanzel, Wolfgang [2]
De Moor, Bart [1]
Affiliations
[1] Katholieke Univ Leuven, Elect Engn ESAT, Kasteelpk Arenberg 10, B-3001 Leuven, Belgium
[2] Katholieke Univ Leuven, Steunpunt O&O Indicatoren, B-3000 Louvain, Belgium
Keywords
Fisher's inverse chi-square method; cluster chains
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To unravel the concept structure and dynamics of the bioinformatics field, we analyze a set of 7401 publications from the Web of Science and MEDLINE databases, publication years 1981-2004. For delineating this complex, interdisciplinary field, a novel bibliometric retrieval strategy is used. Given that the performance of unsupervised clustering and classification of scientific publications is significantly improved by deeply merging textual contents with the structure of the citation graph, we proceed with a hybrid clustering method based on Fisher's inverse chi-square. The optimal number of clusters is determined by a compound semiautomatic strategy comprising a combination of distance-based and stability-based methods. We also investigate the relationship between number of Latent Semantic Indexing factors, number of clusters, and clustering performance. The HITS and PageRank algorithms are used to determine representative publications in each cluster. Next, we develop a methodology for dynamic hybrid clustering of evolving bibliographic data sets. The same clustering methodology is applied to consecutive periods defined by time windows on the set, and in a subsequent phase chains are formed by matching and tracking clusters through time. Term networks for the eleven resulting cluster chains present the cognitive structure of the field. Finally, we provide a view on how much attention the bioinformatics community has devoted to the different subfields through time.
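As a rough illustration of the Fisher's inverse chi-square step named in the abstract, the sketch below combines two p-values, one assumed to come from a text-based distance and one from a citation-based distance, into a single integrated value. The abstract does not say how the p-values are derived from the distances or whether the two sources are weighted, so the function name `fisher_combine`, its parameters, and the equal weighting are assumptions made only for this example.

```python
import math


def fisher_combine(p_text: float, p_link: float) -> float:
    """Combine two p-values with Fisher's inverse chi-square method.

    p_text and p_link are hypothetical inputs: p-values assumed to have been
    computed elsewhere for a document pair from textual and citation evidence.
    """
    if not (0.0 < p_text <= 1.0 and 0.0 < p_link <= 1.0):
        raise ValueError("p-values must lie in (0, 1]")
    # Fisher's statistic: -2 * sum(ln p_i). Under the null hypothesis it follows
    # a chi-square distribution with 2k degrees of freedom (k = 2 here, so 4).
    stat = -2.0 * (math.log(p_text) + math.log(p_link))
    # For 4 degrees of freedom the chi-square survival function has the
    # closed form exp(-x/2) * (1 + x/2), so no statistics library is needed.
    return math.exp(-stat / 2.0) * (1.0 + stat / 2.0)


if __name__ == "__main__":
    # Two moderately small p-values reinforce each other when combined.
    print(fisher_combine(0.05, 0.05))
```

A smaller combined value indicates that the two sources jointly give stronger evidence of similarity, so under these assumptions it could serve as an integrated distance for a standard clustering algorithm.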
Pages: 360 / +
Page count: 2
Related papers
50 in total
  • [21] Trend Analysis of Machine Learning - A Text Mining And Document Clustering Methodology
    Yang, Jiann-Min
    Wu, Wen-Chin
    Liao, Wei-Cheng
    Yin, Chi-Yen
    2009 INTERNATIONAL CONFERENCE ON NEW TRENDS IN INFORMATION AND SERVICE SCIENCE (NISS 2009), VOLS 1 AND 2, 2009, : 481 - 486
  • [22] Open Agile text mining for bioinformatics: the PubAnnotation ecosystem
    Kim, Jin-Dong
    Wang, Yue
    Fujiwara, Toyofumi
    Okuda, Shujiro
    Callahan, Tiffany J.
    Cohen, K. Bretonnel
    BIOINFORMATICS, 2019, 35 (21) : 4372 - 4380
  • [23] A Grid Infrastructure for Mixed Bioinformatics Data and Text Mining
    Ghanem, Moustafa
    Chortaras, Alexandros
    Guo, Yike
    Rowe, Anthony
    Ratcliffe, Jon
    3RD ACS/IEEE INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS, 2005, 2005,
  • [24] INTEGRATING TEXT MINING AND CITATION ANALYSIS IN THE DECISION-MAKING PROCESS FOR LIBRARY COLLECTIONS
    Illescas, L.
    Sucozhanay, D.
    Siguenza-Guzman, L.
    12TH INTERNATIONAL TECHNOLOGY, EDUCATION AND DEVELOPMENT CONFERENCE (INTED), 2018, : 7450 - 7456
  • [25] Monitoring and forecasting the development trends of nanogenerator technology using citation analysis and text mining
    Li, Xin
    Fan, Mingjie
    Zhou, Yuan
    Fu, Jing
    Yuan, Fei
    Huang, Lucheng
    NANO ENERGY, 2020, 71 (71)
  • [26] Identification of Key Genes and Molecular Pathways in Keratoconus: Integrating Text Mining and Bioinformatics Analysis
    Hu, Di
    Lin, Zenan
    Jiang, Junhong
    Li, Pan
    Zhang, Zhehuan
    Yang, Chenhao
    BIOMED RESEARCH INTERNATIONAL, 2022, 2022
  • [27] Clustering legal artifacts using text mining
    Lachana, Zoi
    Loutsaris, Michalis Avgerinos
    Alexopoulos, Charalampos
    Charalabidis, Yannis
    14TH INTERNATIONAL CONFERENCE ON THEORY AND PRACTICE OF ELECTRONIC GOVERNANCE (ICEGOV 2021), 2021, : 65 - 70
  • [28] Weighted Hybrid Clustering by Combining Text Mining and Bibliometrics on a Large-Scale Journal Database
    Liu, Xinhai
    Yu, Shi
    Janssens, Frizo
    Glanzel, Wolfgang
    Moreau, Yves
    De Moor, Bart
    JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2010, 61 (06) : 1105 - 1119
  • [29] Text Mining with Hybrid Biclustering Algorithms
    Orzechowski, Patryk
    Boryczko, Krzysztof
    ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, (ICAISC 2016), PT II, 2016, 9693 : 102 - 113
  • [30] Rough Text assisting text mining: Focus on document clustering validity
    Arco, Leticia
    Bello, Rafael
    Caballero, Yaile
    Falcon, Rafael
    GRANULAR COMPUTING: AT THE JUNCTION OF ROUGH SETS AND FUZZY SETS, 2008, 224 : 229 - +