INSPEcT 1.28.0
The life cycle of RNAs is composed of three main steps: transcription, processing of the premature RNA (\(P\)), and degradation of the mature RNA (\(M\)). The kinetic rates governing these steps (\(k_{1-3}\) for synthesis, processing and degradation, respectively) define the dynamics of each transcript, and their role in transcriptional regulation is often underestimated. A complete understanding of how these rates affect premature and mature RNA requires mathematical and/or computational skills to solve the corresponding system of differential equations:
\[\begin{equation}\label{eq:modelsystem} \left\{ \begin{array}{l l} \dot{P}=k_1 - k_2 \, \cdot \, P \\ \dot{M}=k_2 \, \cdot \, P - k_3 \, \cdot \, M \end{array} \right. \end{equation}\]
This system of differential equations is used by INSPEcT to estimate the rates of the RNA life cycle when transcriptomic data and (possibly) newly synthesized RNA data are available. INSPEcT aims at assessing the dynamics of each gene by modeling the temporal behavior of the RNA kinetic rates with either constant or variable functions.
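To make the behavior of the system concrete, the following is a minimal sketch in base R that integrates the two equations for constant rates. It is not how INSPEcT solves the system: the function names (`rna_rhs`, `simulate_rna`), the fixed-step Runge-Kutta integrator and the rate values are all illustrative assumptions. Setting both derivatives to zero gives the steady state \(P = k_1/k_2\) and \(M = k_1/k_3\), which the simulation should approach.

```r
# Right-hand side of the two-equation model: returns c(dP/dt, dM/dt).
rna_rhs <- function(state, k1, k2, k3) {
  P <- state[1]
  M <- state[2]
  c(k1 - k2 * P, k2 * P - k3 * M)
}

# Fixed-step fourth-order Runge-Kutta integrator (base R only).
# All argument names and defaults are hypothetical, chosen for illustration.
simulate_rna <- function(k1, k2, k3, P0 = 0, M0 = 0, t_end = 50, dt = 0.01) {
  times <- seq(0, t_end, by = dt)
  out <- matrix(NA_real_, nrow = length(times), ncol = 2,
                dimnames = list(NULL, c("P", "M")))
  state <- c(P0, M0)
  out[1, ] <- state
  for (i in seq_along(times)[-1]) {
    s1 <- rna_rhs(state, k1, k2, k3)
    s2 <- rna_rhs(state + dt / 2 * s1, k1, k2, k3)
    s3 <- rna_rhs(state + dt / 2 * s2, k1, k2, k3)
    s4 <- rna_rhs(state + dt * s3, k1, k2, k3)
    state <- state + dt / 6 * (s1 + 2 * s2 + 2 * s3 + s4)
    out[i, ] <- state
  }
  data.frame(time = times, out)
}

# Constant synthesis, processing and degradation rates (arbitrary units).
sim <- simulate_rna(k1 = 10, k2 = 5, k3 = 1)
final <- sim[nrow(sim), ]

# Compare the end of the simulation with the analytical steady state.
print(final)
print(c(P_steady = 10 / 5, M_steady = 10 / 1))
```

With these values, premature RNA settles quickly because processing is fast, while mature RNA approaches its plateau on the slower time scale set by the degradation rate.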
In order to visualize and interact with the output of the INSPEcT modeling procedure, and to facilitate the understanding of the impact of RNA kinetic rates on the dynamics of premature and mature RNA, we developed a Graphical User Interface (GUI). Specifically, the GUI allows users to:
Importantly, we developed two wrapper functions (inspectFromBAM and inspectFromPCR, see the INSPEcT vignette for more details), which streamline the generation of novel INSPEcT datasets that can be loaded into the GUI.
The GUI is distributed within the INSPEcT package and is started with the following commands:
library(INSPEcT)
runINSPEcTGUI()
The GUI is divided into 4 sections (Fig. 1):
At startup, the software loads a predefined INSPEcT object, which contains 10 genes and can be used to explore the software functionalities. This object can be replaced by any INSPEcT dataset previously saved in the “rds” format (“Choose INSPEcT file”, Fig. 2). Genes that are part of the INSPEcT object are divided according to their regulation class. The class is encoded by a string that concatenates the letters of the regulated step(s) of the RNA life cycle (‘s’ for synthesis, ‘p’ for processing and ‘d’ for degradation). For example, “p” represents a gene regulated only in its processing rate, and “sd” a gene regulated in its synthesis and degradation rates. When no rate is identified as regulated, the corresponding class is named “no-reg”. Once a regulation class is selected via “Select class”, a specific gene can be chosen from the list that appears in “Select gene” (Fig. 2). Experimental profiles can be smoothed to reduce the noise associated with this kind of data (“Smooth experimental data” in “Select input”, Fig. 2). Nonetheless, raw experimental data are selected by default (“Raw experimental data” in “Select input”, Fig. 2). The “User defined” mode in “Select input” is covered in section 4.
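The naming scheme of the regulation classes can be sketched as a small helper. The function `regulation_class` and its arguments are hypothetical and are not part of the INSPEcT API; the sketch only reproduces the encoding described above.

```r
# Build the regulation-class label from three logical flags.
# Hypothetical helper for illustration; not an INSPEcT function.
regulation_class <- function(synthesis = FALSE, processing = FALSE,
                             degradation = FALSE) {
  regulated <- c("s", "p", "d")[c(synthesis, processing, degradation)]
  if (length(regulated) == 0) {
    "no-reg"                          # no step is regulated
  } else {
    paste(regulated, collapse = "")   # concatenate letters in s, p, d order
  }
}

print(regulation_class(processing = TRUE))
print(regulation_class(synthesis = TRUE, degradation = TRUE))
print(regulation_class())
```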
For the selected gene, the experimental quantifications of the premature and mature RNA levels (estimated from RNA-seq data) are plotted together with their standard deviations (Fig. 3). If nascent RNA has been profiled, the rate of synthesis is also considered part of the experimental data (since it derives directly from nascent RNA profiling), and it is plotted with its standard deviation. Otherwise, the rate of synthesis is inferred from total RNA-seq data and lacks the standard deviation. The results of the INSPEcT modeling are plotted with continuous lines within the synthesis, pre-RNA, processing, mature RNA and degradation panels (Fig. 3), and can be downloaded in PDF (image) or TSV (tabular) formats. Below the plot panel, the visualization options allow users to:
The outcome of the minimization underlying the modeling is reported in section 3 of the GUI (Fig. 4). In particular, the p-value associated with the goodness-of-fit statistic and the Akaike information criterion (AIC) indicate the ability of a model to explain the experimental observations. Both metrics are penalized for the complexity of the model, meaning that they measure a trade-off between its performance and its simplicity, and they can be used to compare models with different complexity. The complexity of a model depends on the functional forms that describe the kinetic rates of the RNA life cycle and equals their number of parameters: 1 for a constant rate, 4 for a sigmoid, and 6 for an impulse. In practice, when two models explain the data comparably well, the simpler one is preferred, which is reflected in a lower goodness-of-fit p-value and a lower AIC. Additionally, the goodness-of-fit p-value is used to assess whether the model under consideration adequately explains the experimental data (e.g. p<0.05). Finally, the minimization status is reported, i.e. whether or not the minimization converged to a local minimum. Supplementary iterations can be run to identify a better minimum, using either the Nelder-Mead method (NM, used by INSPEcT) or the quasi-Newton BFGS method (button “Optimization - Run”).
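The quantities discussed above can be illustrated with a minimal sketch that relies only on base R (`optim` and `pchisq`). The data are hypothetical, and the scoring formulas are assumptions rather than INSPEcT's exact implementation: for Gaussian errors with known standard deviations, the AIC equals \(2k\) plus the chi-squared statistic up to an additive constant, and the goodness-of-fit p-value is taken here as the lower-tail chi-squared probability, so that small values indicate good agreement, in line with the convention used in the text.

```r
# Hypothetical toy time course (not taken from an INSPEcT dataset).
time_pts <- c(0, 2, 4, 6, 8, 10, 12, 14, 16)
obs      <- c(1.05, 0.95, 1.08, 1.20, 2.05, 2.70, 3.00, 2.95, 3.04)
obs_sd   <- rep(0.1, length(obs))

# Sigmoid with four parameters: start level, end level, transition time, slope.
sigmoid <- function(x, par) {
  par[1] + (par[2] - par[1]) / (1 + exp(-par[4] * (x - par[3])))
}

# Chi-squared cost: squared residuals weighted by the standard deviations.
chisq_sigmoid <- function(par) {
  sum(((obs - sigmoid(time_pts, par)) / obs_sd)^2)
}

# First pass with Nelder-Mead, then refinement with BFGS from that solution.
start    <- c(obs[1], obs[length(obs)], median(time_pts), 1)
fit_nm   <- optim(start, chisq_sigmoid, method = "Nelder-Mead")
fit_bfgs <- optim(fit_nm$par, chisq_sigmoid, method = "BFGS")

# Constant model: one parameter; with equal weights the mean minimizes the cost.
chisq_const <- sum(((obs - mean(obs)) / obs_sd)^2)

# Assumed scoring: AIC = 2 * (number of parameters) + chi-squared.
aic_sigmoid <- 2 * 4 + fit_bfgs$value
aic_const   <- 2 * 1 + chisq_const

# Assumed goodness-of-fit p-value: lower-tail chi-squared probability with
# (observations - parameters) degrees of freedom.
pval_sigmoid <- pchisq(fit_bfgs$value, df = length(obs) - 4)

# A convergence code of 0 means optim() reports successful completion.
print(c(nm = fit_nm$convergence, bfgs = fit_bfgs$convergence))
print(c(aic_sigmoid = aic_sigmoid, aic_const = aic_const, p = pval_sigmoid))
```

On data with a clear transition such as these, the constant model is heavily penalized by its residuals, so the sigmoid obtains the lower AIC despite its three additional parameters.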
Parameters of the modeling can be directly modified “by hand”. For each kinetic rate of the RNA life cycle, the parameters describing the selected functional form are provided in the right part of the GUI (Fig. 5). Constant rates are described by a single parameter, which corresponds to the value of the rate throughout the time course. Variable rates, instead, can be described by either sigmoid or impulse functions. Sigmoids are S-shaped functions described by four parameters: starting level, final level, time of transition between starting and final levels, and slope of the response. Impulse functions allow more complex behaviors through two additional parameters that describe the time and level of a second transition, which can encode bell-shaped responses. The range of the sliders for starting and final levels can be set for each rate (“set min” and “set max”), giving full flexibility in setting the rate levels. At startup, these ranges are set to cover the range of all parameters of the example dataset. Each time a new dataset is loaded, the ranges are updated accordingly.
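The two variable functional forms can be sketched as follows. The parameterization (a logistic rise for the sigmoid, and a product of a rising and a falling logistic for the impulse) is a commonly used form and is an assumption here; the function and argument names are hypothetical and may differ from those used internally by INSPEcT.

```r
# Sigmoid: 4 parameters (start level h0, final level h1,
# transition time t1, slope b).
sigmoid_rate <- function(x, h0, h1, t1, b) {
  h0 + (h1 - h0) / (1 + exp(-b * (x - t1)))
}

# Impulse: 6 parameters (start level h0, intermediate level h1, final level h2,
# first and second transition times t1 and t2, slope b). Requires h1 != 0.
impulse_rate <- function(x, h0, h1, h2, t1, t2, b) {
  rise <- h0 + (h1 - h0) / (1 + exp(-b * (x - t1)))  # h0 -> h1 around t1
  fall <- h2 + (h1 - h2) / (1 + exp( b * (x - t2)))  # h1 -> h2 around t2
  rise * fall / h1
}

# Evaluate both forms on an illustrative time grid (arbitrary units).
times <- seq(0, 16, by = 1)
sig_values <- sigmoid_rate(times, h0 = 1, h1 = 3, t1 = 5, b = 2)
imp_values <- impulse_rate(times, h0 = 1, h1 = 3, h2 = 1, t1 = 2, t2 = 8, b = 5)

print(round(sig_values, 3))
print(round(imp_values, 3))
```

When the starting and final levels of the impulse coincide, as in this example, the function rises to the intermediate level and returns to baseline, which gives the bell-shaped response mentioned above.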