Abstract
Abrupt and irreversible changes (or tipping points) are decisive in the progression of biological processes. To detect tipping points from gene expression profiles, however, the longitudinal data required by classic approaches are usually unavailable. When systems have discrete phenotypes with characteristic transcriptomes, we can adopt tipping-point models to sample ensembles in order. Here, we developed a robust framework, BioTIP, to address two computational impediments: the accurate detection of tipping points and the identification of non-chaotic critical transition signals (CTSs). In the first step, we refined correlation-based tipping-point models by a shrinkage estimation of large-scale correlation matrices. In the second step, we advanced non-random CTS identification through new gene selection, network graph-based clustering, and rigorous evaluation. To validate these approaches, we applied BioTIP to disease and normal developmental systems, covering bulk-cell and single-cell transcriptomes. We identified temporal features of gene-regulatory-network dynamics for phenotypically defined tipping points, which can be exploited to infer the role of key transcription factors. BioTIP can be used to further explore critical transitions in discrete systems.

We designed the BioTIP workflow in five steps (Fig 1). In this workflow, two steps (Steps 2 and 5) calculate random scores from randomly selected genes. In Step 2, the distribution of random scores is used to predict the potential tipping point. The rationale is that random genes can, in some cases, capture the “symmetry-breaking destabilization” at tipping points (Mojtahedi, 2016). In Step 5, the random scores are used to validate the significance of the predicted critical transition signals (CTSs). A CTS measures the loss of resilience of previous states and the gain of instability during transitions (Scheffer, 2012).
Because a CTS can be either “regulated” or “chaotic”, detecting non-random CTSs is important and has proven successful in studies of development and disease (Chen, 2012; Sarkar, 2019). The same R function (Table 1) can be used to calculate the random scores in both steps; we demonstrate the application of Step 5 in the following two sections.
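As an illustration only, the random-score evaluation described for Step 5 can be sketched as an empirical test against randomly selected gene sets. The sketch is written in Python, not as the R function listed in Table 1, and every name in it (`mean_abs_correlation`, `random_gene_scores`, `empirical_p_value`) is hypothetical. The mean absolute pairwise Pearson correlation serves as a stand-in score and is not claimed to be the BioTIP scoring function; the simulated expression matrix and the add-one correction in the p-value are likewise assumptions made for the example.

```python
import numpy as np


def mean_abs_correlation(expr):
    """Stand-in score (assumption): mean |Pearson r| over all gene pairs.

    expr is a genes x samples array with no constant rows.
    """
    corr = np.corrcoef(expr)                  # gene-by-gene correlation matrix
    upper = np.triu_indices_from(corr, k=1)   # each gene pair counted once
    return float(np.mean(np.abs(corr[upper])))


def random_gene_scores(expr, set_size, n_draws, rng, score=mean_abs_correlation):
    """Score n_draws random gene sets of the same size as the candidate CTS."""
    n_genes = expr.shape[0]
    scores = np.empty(n_draws)
    for i in range(n_draws):
        idx = rng.choice(n_genes, size=set_size, replace=False)
        scores[i] = score(expr[idx])
    return scores


def empirical_p_value(observed, random_scores):
    """Fraction of random scores at least as large as the observed score.

    The add-one correction keeps the estimate above zero.
    """
    n_extreme = int(np.sum(random_scores >= observed))
    return (1 + n_extreme) / (1 + len(random_scores))


# Simulated data (assumption): 200 genes x 12 samples from one state,
# in which the first 20 genes fluctuate together.
rng = np.random.default_rng(0)
n_genes, n_samples, cts_size = 200, 12, 20
expr = rng.normal(size=(n_genes, n_samples))
shared = rng.normal(size=n_samples)
expr[:cts_size] = shared + 0.3 * rng.normal(size=(cts_size, n_samples))

candidate = np.arange(cts_size)               # hypothetical candidate CTS
observed = mean_abs_correlation(expr[candidate])
random_scores = random_gene_scores(expr, cts_size, n_draws=500, rng=rng)
p_value = empirical_p_value(observed, random_scores)
print(f"observed score = {observed:.3f}, empirical p = {p_value:.4f}")
```

Matching the random gene sets to the size of the candidate CTS keeps the two score distributions comparable, since a correlation-based score averaged over few genes is noisier than one averaged over many. A small empirical p-value then indicates that the candidate is unlikely to be a chance ("chaotic") signal under the chosen score.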