Experiments in text-based mining and analysis of biological information from MEDLINE on functionally-related genes

Cited by: 1
Authors
Moon, N [1]
Singh, R [1]
Affiliations
[1] San Francisco State Univ, Dept Comp Sci, San Francisco, CA 94132 USA
Keywords
EXPRESSION;
DOI
10.1109/ICSENG.2005.41
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Technological advancements such as microarrays have enabled biologists to generate unprecedented quantities of data about biological entities. This has led to the development of a large number of algorithms for processing and analyzing biological data. Challenges remain, however; for instance, genes that function cooperatively need not have similar expression patterns. This suggests the use of non-numerical sources of information to explore the underlying biology. We experimentally study various factors that are inherent in algorithmic methodologies for text analysis. The proposed method accesses MEDLINE dynamically to account for the latest research, and the available literature corresponding to the genes is analyzed to develop lists of keywords. Natural language processing (NLP) techniques such as stop-word filtering and stemming are then applied to the lists, and keyword frequencies are weighted using the term frequency-inverse document frequency (TFIDF) scheme. The results are input to a hierarchical clustering algorithm to derive groupings of genes by functionality. The process is repeated using z-score weighting and latent semantic analysis (LSA) to determine which yields the most accurate clustering. The study examines the importance of these steps and their influence on the overall efficacy of the system. We believe that this analysis will be invaluable to the development and fine-tuning of text mining methodologies for biological literature.
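The pipeline outlined in the abstract (stop-word filtering, stemming, TFIDF weighting, hierarchical clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the gene names and texts in `DOCS` are hypothetical toy data standing in for MEDLINE literature, `stem` is a crude suffix stripper rather than a real stemmer, and the choice of cosine distance with average linkage is an assumption, since the abstract does not name a distance or linkage. The z-score and LSA variants are not shown.

```python
import math
from collections import Counter

# Hypothetical toy texts per gene; real input would be retrieved from MEDLINE.
DOCS = {
    "geneA": "kinase signaling regulates the cell cycle",
    "geneB": "cell cycle kinase signaling in the nucleus",
    "geneC": "membrane transport of glucose",
    "geneD": "glucose transport across the membrane",
}
# Assumed, deliberately tiny stop-word list.
STOP_WORDS = {"the", "of", "in", "across", "a", "an", "and"}


def stem(word):
    # Crude suffix stripping; a stand-in for a real stemmer (assumption).
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    # Lowercase, drop stop words, then stem what remains.
    return [stem(w) for w in text.lower().split() if w not in STOP_WORDS]


def tfidf_vectors(docs):
    # Term frequency times log inverse document frequency, per gene.
    counts = {gene: Counter(tokenize(text)) for gene, text in docs.items()}
    vocab = sorted({w for c in counts.values() for w in c})
    n = len(docs)
    df = {w: sum(1 for c in counts.values() if w in c) for w in vocab}
    return {
        gene: [c[w] * math.log(n / df[w]) for w in vocab]
        for gene, c in counts.items()
    }


def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def cluster(vectors, k):
    # Agglomerative clustering with average linkage, stopped at k clusters.
    clusters = [[gene] for gene in vectors]

    def linkage(a, b):
        total = sum(cosine_distance(vectors[x], vectors[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


groups = cluster(tfidf_vectors(DOCS), k=2)
print(groups)  # [['geneA', 'geneB'], ['geneC', 'geneD']]
```

On this toy input the two genes described with cell-cycle vocabulary end up in one group and the two described with transport vocabulary in the other. Stopping at a fixed `k` is a simplification; a full hierarchical clustering would keep the whole merge tree.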
Pages: 326-331
Number of pages: 6