SpaCCC: Large Language Model-Based Cell-Cell Communication Inference for Spatially Resolved Transcriptomic Data

Authors
Ji, Boya [1]
Wang, Xiaoqi [2]
Qiao, Debin [3, 4]
Xu, Liwen [1]
Peng, Shaoliang [1]
Affiliations
[1] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410082, Peoples R China
[2] Northwestern Polytech Univ, Sch Comp Sci, Xian 710000, Peoples R China
[3] Zhengzhou Univ, Sch Comp & Artificial Intelligence, Zhengzhou 450001, Peoples R China
[4] Zhengzhou Univ, Natl Supercomp Ctr Zhengzhou, Zhengzhou 450001, Peoples R China
Source
BIG DATA MINING AND ANALYTICS | 2024 / Vol. 7 / Issue 04
Funding
National Natural Science Foundation of China;
Keywords
Accuracy; Large language models; Transcriptomics; Data visualization; Receivers; Spatial databases; Biology; Reliability; Spatial resolution; Signal resolution; Large Language Models (LLM); spatial transcriptome data; Cell-Cell Communications (CCCs); functional gene interaction networks; unified latent space;
DOI
10.26599/BDMA.2024.9020056
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Drawing parallels between linguistic constructs and cellular biology, Large Language Models (LLMs) have achieved success in diverse downstream applications for single-cell data analysis. However, methods that use LLMs to infer Ligand-Receptor (LR)-mediated cell-cell communications from spatially resolved transcriptomic data are still lacking. Here, we propose SpaCCC to facilitate the inference of spatially resolved cell-cell communications. SpaCCC relies on our fine-tuned single-cell LLM and a functional gene interaction network to embed ligand and receptor genes into a unified latent space. LR pairs that lie significantly closer together in the latent space are taken to be more likely to interact. Molecular diffusion and permutation test strategies are then employed, respectively, to calculate the communication strength and to filter out communications with low specificity. Benchmarked on real single-cell spatial transcriptomic datasets, SpaCCC shows superior performance over other methods. SpaCCC also infers known LR pairs concealed by existing aggregative methods and identifies communication patterns for specific cell types and their signaling pathways. Furthermore, SpaCCC provides various cell-cell communication visualizations at both single-cell and cell-type resolution. In summary, SpaCCC provides a sophisticated and practical tool that allows researchers to decipher spatially resolved cell-cell communications and the related communication patterns and signaling pathways from spatial transcriptome data. SpaCCC is free and publicly available at https://github.com/jiboyalab/SpaCCC.
Pages: 1129-1147
Number of pages: 19
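
The abstract mentions a permutation test for filtering out communications with low specificity. The sketch below illustrates the general idea of a label-permutation test for one ligand-receptor pair between a sender and a receiver cell type. It is not the SpaCCC implementation: the function name `permutation_pvalue`, the score (mean ligand expression in sender cells times mean receptor expression in receiver cells, used here in place of the molecular diffusion strength), and the toy data are all assumptions made for illustration.

```python
import numpy as np


def permutation_pvalue(ligand, receptor, labels, sender, receiver,
                       n_perm=1000, seed=0):
    """One-sided permutation p-value for a sender -> receiver LR score.

    Assumed score (hypothetical, not SpaCCC's): mean ligand expression in
    sender cells multiplied by mean receptor expression in receiver cells.
    """
    rng = np.random.default_rng(seed)
    ligand = np.asarray(ligand, dtype=float)
    receptor = np.asarray(receptor, dtype=float)
    labels = np.asarray(labels)

    def score(lab):
        return ligand[lab == sender].mean() * receptor[lab == receiver].mean()

    observed = score(labels)
    # Null distribution: shuffle cell-type labels, keeping group sizes fixed.
    null = np.array([score(rng.permutation(labels)) for _ in range(n_perm)])
    # Add-one correction keeps the p-value strictly positive.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value


# Toy data: type "A" cells express the ligand, type "B" cells the receptor.
toy_rng = np.random.default_rng(42)
labels = np.array(["A"] * 50 + ["B"] * 50)
ligand = np.where(labels == "A", 5.0, 0.5) + toy_rng.normal(0, 0.1, 100)
receptor = np.where(labels == "B", 4.0, 0.5) + toy_rng.normal(0, 0.1, 100)

observed, p_value = permutation_pvalue(
    ligand, receptor, labels, sender="A", receiver="B", n_perm=200
)
print(f"observed score = {observed:.2f}, p-value = {p_value:.4f}")
```

A small p-value indicates that the observed score is unusually high compared with scores obtained under shuffled cell-type labels, which is the sense in which a communication is called specific in this sketch.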