A data-driven approach for constructing mutation categories for mutational signature analysis

Authors
Gilad, Gal [1]
Leiserson, Mark D. M. [2, 3]
Sharan, Roded [2, 3]
Affiliations
[1] Tel Aviv Univ, Sch Comp Sci, Tel Aviv, Israel
[2] Univ Maryland, Dept Comp Sci, College Pk, MD 20742 USA
[3] Univ Maryland, Ctr Bioinformat & Computat Biol, College Pk, MD 20742 USA
Keywords
Diagnosis
DOI
10.1371/journal.pcbi.1009542
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Mutational processes shape the genomes of cancer patients, and understanding them has important applications in diagnosis and treatment. Current modeling of mutational processes, which identifies their characteristic signatures, views each base substitution in a limited context of a single flanking base on each side. This context definition gives rise to 96 categories of mutations that have become the standard in the field, even though wider contexts have been shown to be informative in specific cases. Here we propose a data-driven approach for constructing a mutation categorization for mutational signature analysis. Our approach is based on the assumption that tumor cells exposed to similar mutational processes show similar expression levels of the DNA damage repair genes involved in these processes. We attempt to find a categorization that maximizes the agreement between mutation and gene expression data, and show that it outperforms the standard categorization over multiple quality measures. Moreover, we show that the categorization we identify generalizes to unseen data from different cancer types, suggesting that mutation context patterns extend beyond the immediate flanking bases.
Author summary
Cancer is a group of genetic diseases that occur as a result of an accumulation of somatic mutations in genes that regulate cellular growth and differentiation. These mutations arise from mutagenic processes such as exposure to environmental mutagens and defective DNA damage repair pathways. Each of these processes results in a characteristic pattern of mutations, referred to as a mutational signature. These signatures reveal the mutagenic mechanisms that have influenced the development of a specific tumor, and thus provide new insights into its causes and potential treatments. Originally, a mutational signature was defined using 96 mutation categories that take into account only the mutated base and its immediate flanking bases.
Here, we aim to challenge this arbitrary categorization, which is widely used in mutational signature analysis. We have developed a novel framework for constructing mutation categories, based on the assumption that the activities of DNA damage repair genes are correlated with the mutational processes that are active in a given tumor. We show that this approach identifies an alternative mutation categorization that outperforms the standard categorization with respect to multiple metrics. This categorization includes categories that account for bases beyond the immediate flanking bases, suggesting that mutational signatures should be studied in broader sequence contexts.
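As a small illustration of where the figure of 96 standard categories comes from, the sketch below enumerates base substitutions together with their flanking context. It assumes the common convention of writing each substitution with the pyrimidine (C or T) as the reference base, which gives six substitution classes; the function name `mutation_categories` and the `flank` parameter are hypothetical and are not taken from the paper. It does not reproduce the data-driven categorization the authors propose; it only shows how the number of categories grows when the context is widened.

```python
from itertools import product

BASES = "ACGT"

# Six substitution classes, written with the pyrimidine (C or T) as the
# reference base (an assumed, commonly used convention).
SUBSTITUTIONS = [
    ("C", "A"), ("C", "G"), ("C", "T"),
    ("T", "A"), ("T", "C"), ("T", "G"),
]


def mutation_categories(flank=1):
    """Enumerate categories using `flank` context bases on each side."""
    categories = []
    for ref, alt in SUBSTITUTIONS:
        for left in product(BASES, repeat=flank):
            for right in product(BASES, repeat=flank):
                label = f"{''.join(left)}[{ref}>{alt}]{''.join(right)}"
                categories.append(label)
    return categories


standard = mutation_categories(flank=1)  # one flanking base per side
wider = mutation_categories(flank=2)     # two flanking bases per side

print(len(standard))  # 96   = 6 * 4 * 4
print(len(wider))     # 1536 = 6 * 4**4
print(standard[0])    # A[C>A]A
```

With one flanking base per side there are 6 x 4 x 4 = 96 categories; with two per side the count rises to 1536, which suggests why grouping wider contexts into a manageable set of categories calls for some selection principle.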
Pages: 15