The Text-mining based PubChem Bioassay neighboring analysis

Cited by: 21
Authors
Han, Lianyi [1]
Suzek, Tugba O. [1]
Wang, Yanli [1]
Bryant, Steve H. [1]
Affiliation
[1] US Natl Lib Med, Natl Ctr Biotechnol Informat, Bethesda, MD 20894 USA
Source
BMC Bioinformatics | 2010 / Volume 11
Keywords
PROTEIN-PROTEIN INTERACTIONS; BIOMEDICAL LITERATURE; GENE-EXPRESSION; INFORMATION; NETWORK; NAMES;
DOI
10.1186/1471-2105-11-549
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: In recent years, the number of high-throughput screening (HTS) assays deposited in PubChem has grown quickly. As a result, the volume of both structured information (e.g., molecular structures and bioactivities) and unstructured information (such as descriptions of bioassay experiments) has been increasing exponentially. It has consequently become more demanding and challenging to assemble the bioactivity data efficiently by mining this large body of information to identify and interpret the relationships among the diverse bioassay experiments. In this work, we propose a text-mining based approach for bioassay neighboring analysis that uses the unstructured text descriptions contained in the PubChem BioAssay database.
Results: The neighboring analysis is carried out by evaluating the cosine score of each bioassay pair and the fraction of overlap among the human-curated neighbors. Our results from the cosine score distribution analysis and the assay neighbor clustering analysis on all PubChem bioassays suggest that strong correlations among the bioassays can be identified from their conceptual relevance. A comparison with existing assay neighboring methods suggests that the text-mining based approach provides meaningful linkages among the PubChem bioassays and complements the existing methods by identifying additional relationships among the bioassay entries.
Conclusions: The text-mining based bioassay neighboring analysis is efficient for correlating bioassays and for studying different aspects of a biological process, tasks that are otherwise difficult to achieve with existing neighboring procedures because of the lack of specific annotations and structured information. We suggest that the text-mining based bioassay neighboring analysis can be used as a standalone tool or as a complement to the PubChem bioassay neighboring process, enabling efficient integration of assay results and the generation of hypotheses for discovering the bioactivities of the tested reagents.
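The pairwise cosine scoring mentioned in the Results can be illustrated with a minimal sketch. The abstract does not say how the assay descriptions are tokenized or weighted, so the TF-IDF weighting, the 0.5 threshold, the function names, and the three toy descriptions with their identifiers are all assumptions made for illustration; they are not the authors' pipeline and not real PubChem records.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    token_lists = [tokenize(doc) for doc in docs]
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        # Smoothed IDF (an assumed weighting) stays positive for every term.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine score between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def neighbor_pairs(descriptions, threshold):
    """Score every pair of descriptions and keep pairs at or above threshold."""
    ids = sorted(descriptions)
    vectors = dict(zip(ids, tfidf_vectors([descriptions[i] for i in ids])))
    pairs = []
    for a, b in combinations(ids, 2):
        score = cosine(vectors[a], vectors[b])
        if score >= threshold:
            pairs.append((a, b, score))
    return pairs


# Hypothetical descriptions; the identifiers are placeholders, not PubChem AIDs.
descriptions = {
    "assay_A": "luciferase reporter assay for inhibitors of kinase signaling",
    "assay_B": "luciferase reporter assay for activators of kinase signaling",
    "assay_C": "cell viability counter screen in yeast",
}
pairs = neighbor_pairs(descriptions, threshold=0.5)
```

In this toy data only the two reporter assays share vocabulary, so they are the only pair that passes the threshold. Scoring all pairs is quadratic in the number of assays, so a larger collection would likely need sparse matrix operations or an inverted index.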
Pages: 9