Identity Factors
Cell Identity Factors (CIF)
Num. CIF
Length of the Cell Identity Factors (CIF) vector, which can be considered as a latent representation of a cell's state and identity. The Gene Identity Vector (GIV) for each gene will also be the same length.
diff-CIF Proportion
A CIF vector is divided into diff-CIF and non-diff-CIF. diff-CIF encodes cell type information, while non-diff-CIF encodes inherent cell heterogeneity (i.e. it is randomly sampled). The larger the diff-CIF proportion, the more prominent the cell types (clusters or trajectories) will be in the result.
non-diff-CIF Distribution
μ σ
The values in non-diff-CIF will be sampled from the Gaussian distribution N(μ,σ).
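As a rough illustration of how Num. CIF, diff-CIF Proportion, and the non-diff-CIF distribution fit together, the base R sketch below builds one cell's CIF vector. It is not scMultiSim's internal code: the variable names, the rounding rule for the split, the reading of σ as a standard deviation, and the NA placeholders standing in for diff-CIF are all assumptions.

```r
set.seed(1)

num_cif   <- 50    # Num. CIF
diff_frac <- 0.2   # diff-CIF Proportion
mu        <- 1     # non-diff-CIF mean
sigma     <- 0.1   # non-diff-CIF sd (assumed: sigma is a standard deviation)

# Assumption: the split is rounded to the nearest integer.
n_diff    <- round(num_cif * diff_frac)
n_nondiff <- num_cif - n_diff

# non-diff-CIF: random values drawn from the Gaussian.
nondiff_cif <- rnorm(n_nondiff, mean = mu, sd = sigma)

# diff-CIF: placeholder only. In the simulator these entries carry
# cell type information derived from the differentiation tree.
diff_cif <- rep(NA_real_, n_diff)

cif <- c(nondiff_cif, diff_cif)
length(cif)
```

With these values, 10 of the 50 entries are reserved for diff-CIF and the remaining 40 are random.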
Gene Identity Vectors (GIV) Distribution
μ σ p
The values in GIV will be sampled from this distribution: With prob. p, the value is zero; with prob. 1-p, the value is sampled from a Gaussian distribution N(μ,σ).
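The mixture described above can be sketched in base R as follows. The function name `sample_sparse_gaussian` is hypothetical, and σ is assumed to be a standard deviation.

```r
# Draw n values: zero with probability p, otherwise N(mu, sigma).
sample_sparse_gaussian <- function(n, mu, sigma, p) {
  is_zero <- runif(n) < p                    # TRUE with probability p
  values  <- rnorm(n, mean = mu, sd = sigma) # Gaussian draw for every entry
  values[is_zero] <- 0                       # overwrite the selected entries
  values
}

set.seed(42)
giv <- sample_sparse_gaussian(10000, mu = 0, sigma = 1, p = 0.7)
mean(giv == 0)   # close to p
```

A larger p gives a sparser vector, so each gene responds to fewer CIF entries.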
Cell Population
Num. Cells
Population Type
For a continuous population, the cells will be sampled along the cell differentiation tree.
Cell Differentiation Tree
Specify the differentiation relationship between cell types. scMultiSim provides three pre-defined trees: Phyla1, Phyla3, and Phyla5, or you may input the expression of an R phylo tree object.
Preview
Use an alternative impulse model to sample along the tree
For a discrete population, the cells will be sampled from the leaves of the cell differentiation tree. This GUI will generate a corresponding tree according to the number of clusters you specified.
Num. Clusters
Smallest Cluster Size
Let scMultiSim decide the cluster sizes, except that one cluster is kept small with this many cells to represent a rare cell type.
Cluster Size
Overrides "Smallest Cluster Size". Specify the size of each cluster manually, e.g. "100,100,200,100" for 500 cells and 4 clusters. The number of clusters should match the provided tree, and the total number of cells should match the specified number of cells.
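The two consistency rules above (one size per cluster, sizes summing to the number of cells) can be checked before running the simulation. The helper below is a hypothetical sketch in base R, not part of scMultiSim.

```r
# Parse a string such as "100,100,200,100" and check it against
# the requested number of cells and clusters.
parse_cluster_sizes <- function(text, num_cells, num_clusters) {
  parts <- trimws(strsplit(text, ",", fixed = TRUE)[[1]])
  sizes <- suppressWarnings(as.integer(parts))
  if (anyNA(sizes) || any(sizes <= 0)) {
    stop("cluster sizes must be positive integers")
  }
  if (length(sizes) != num_clusters) {
    stop("number of sizes does not match the number of clusters")
  }
  if (sum(sizes) != num_cells) {
    stop("cluster sizes do not sum to the number of cells")
  }
  sizes
}

sizes <- parse_cluster_sizes("100,100,200,100",
                             num_cells = 500, num_clusters = 4)
```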
Genes
GRN
GRN Data
GRN Dataframe
Preview
Specify an existing R dataframe here. The dataframe should have three columns: target, regulator, and effect. target and regulator can be an integer representing the gene ID (recommended), or a string representing the gene name.
Num. Genes
Total number of genes. Should be larger than the number of genes in the GRN. Additional genes are named sequentially (e.g. "gene_1") and will be randomly sampled.
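For illustration, a small GRN dataframe in the three-column layout, together with a check that Num. Genes exceeds the number of genes the GRN mentions. The gene IDs and effect values are made up, and the variable names are assumptions.

```r
# A toy GRN: gene 1 regulates genes 2 and 3, gene 2 regulates gene 4.
grn <- data.frame(
  target    = c(2, 3, 4),
  regulator = c(1, 1, 2),
  effect    = c(1.5, 0.8, 2.0)
)

# Every gene that appears as a target or a regulator.
grn_genes <- unique(c(grn$target, grn$regulator))

num_genes <- 10   # Num. Genes
stopifnot(num_genes > length(grn_genes))

# Genes beyond those in the GRN are added by the simulator.
n_extra <- num_genes - length(grn_genes)
n_extra
```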
Num. Genes
Gene Regulatory Network (GRN) is disabled.

Chromatin Regions
Region Identity Vectors (RIV) Distribution
μ σ p
The values in RIV will be sampled from this distribution: With prob. p, the value is zero; with prob. 1-p, the value is sampled from a Gaussian distribution N(μ,σ).
Gene-Region Relationship
Region Distribution
Proportions of genes associated with 0, 1, and 2 consecutive chromatin regions, respectively. The three values should sum to 1.
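One way to read this setting is as a categorical distribution over the number of regions per gene. The sketch below assumes that reading; the example proportions and variable names are illustrative only.

```r
region_distr <- c(0.1, 0.5, 0.4)   # proportions for 0, 1, and 2 regions

# The proportions must sum to 1 (compared with a floating point tolerance).
stopifnot(isTRUE(all.equal(sum(region_distr), 1)))

set.seed(7)
num_genes <- 1000
regions_per_gene <- sample(0:2, size = num_genes, replace = TRUE,
                           prob = region_distr)

# Observed proportions, which should be close to region_distr.
table(regions_per_gene) / num_genes
```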
ATAC Data
ATAC Effect
How strongly chromatin accessibility is associated with gene expression (0-1).
scATAC-seq Data Distribution
ATAC Prob. Zero
The proportion of zeros we see in the ATAC-seq data.
Custom Density
General Settings
Random Seed
Speed Optimization (Experimental)
Parameter Adjustment
Scale `s`
Scale the `s` kinetic parameter to increase or decrease the overall gene expression. When using a discrete population, you can give one scale per cluster as a comma-separated list, e.g. "1,2,1,1".
Bimodality
Adjust the bimodality of gene expression, thus controlling intrinsic variation (0-1).
Kinetic Model
RNA Velocity Simulation
β
d
β is the splicing rate of each gene in the kinetic model, while d is the degradation rate.
Num. Cycles
Cycle Length
The number of cell cycles and the length of each cycle used when running the kinetic model.
RNA counts will be sampled from the Beta-Poisson distribution.
Intrinsic Noise
The weight assigned to the random sample from the Beta-Poisson distribution. When smaller than 1, the RNA counts will also contain the Beta-Poisson mean value (less noisy) with weight (1 - intrinsic.noise).
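The weighting described above can be sketched as follows. The Beta-Poisson parametrisation used here (a Beta(k_on, k_off) draw scaled by `s` as the Poisson rate) is the standard one and is an assumption about the simulator; all parameter values are illustrative.

```r
set.seed(3)

k_on  <- 2      # assumed kinetic parameters, for illustration only
k_off <- 5
s     <- 100
intrinsic_noise <- 0.5
n_cells <- 5000

# Random Beta-Poisson sample: Poisson rate is s times a Beta draw.
p     <- rbeta(n_cells, k_on, k_off)
noisy <- rpois(n_cells, lambda = s * p)

# Mean of the Beta-Poisson distribution.
bp_mean <- s * k_on / (k_on + k_off)

# Weighted combination: weight intrinsic_noise on the random sample,
# weight (1 - intrinsic_noise) on the mean.
counts <- intrinsic_noise * noisy + (1 - intrinsic_noise) * bp_mean

c(var_full_noise = var(noisy), var_weighted = var(counts))
```

The mean is unchanged by the weighting, while the variance shrinks by a factor of intrinsic_noise squared.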
Layout Preview
Layout
Grid Size
The grid width and height of the spatial layout. Should be large enough to fit all cells.
Step Size
If using a continuous population, use this step size to further divide the cell types on the tree. For example, if the tree has a single branch 1 -> 2 of length 1 and the step size is 0.34, there will be three cell types in total: 1_2_1, 1_2_2, 1_2_3.
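The example above is consistent with dividing the branch length by the step size and rounding up. The helper below reproduces it under that assumption; `split_branch` is a hypothetical name, and the simulator's actual rule may differ at the boundaries.

```r
# Names of the cell types obtained by cutting one branch into steps.
split_branch <- function(from, to, branch_length, step_size) {
  n_steps <- ceiling(branch_length / step_size)  # assumed rounding rule
  paste(from, to, seq_len(n_steps), sep = "_")
}

split_branch(1, 2, branch_length = 1, step_size = 0.34)
# "1_2_1" "1_2_2" "1_2_3"
```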
Spatial Layout
Layout Type
Same Type Prob.
When placing a new cell, the probability of placing it next to a cell of the same type.
Island Cell Types
The cell types in the islands. Should be a comma-separated list.
Cell-Cell Interaction
Max Num. Neighbors
Max number of neighbors a cell will interact with.
CCI Dataframe
Specify an existing R dataframe here. The dataframe should have three columns: target, regulator, and effect. target and regulator are integers identifying the genes, and effect is the strength of the interaction. For example:
data.frame(
    target    = c(101, 102),
    regulator = c(103, 104),
    effect    = c(5.2, 5.9)
)
                                    
Simulates two conditions by modifying some CIF entries using the mod.cif.giv option. It will output two option sets, params1 and params2.
Num. CIF to Change
How many CIF entries to change for each cell. The CIF entries will be randomly selected.
Noise Strength
How much noise to add to the selected CIF entries. The noise will be sampled from a Gaussian distribution.
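The effect of these two settings can be illustrated on a plain matrix of CIF values. This sketch does not call scMultiSim or its mod.cif.giv option; it only mimics the described behaviour. It assumes the entries are chosen independently for each cell and that Noise Strength is the standard deviation of zero-mean Gaussian noise.

```r
set.seed(11)

n_cells  <- 5
num_cif  <- 20
n_change <- 4     # Num. CIF to Change
noise_sd <- 0.5   # Noise Strength (assumed to be a standard deviation)

# Toy CIF matrix for condition 1: one row per cell.
cif_cond1 <- matrix(rnorm(n_cells * num_cif), nrow = n_cells)

# Condition 2: perturb n_change randomly chosen entries in each cell.
cif_cond2 <- cif_cond1
for (i in seq_len(n_cells)) {
  idx <- sample(num_cif, n_change)
  cif_cond2[i, idx] <- cif_cond1[i, idx] + rnorm(n_change, 0, noise_sd)
}

# Number of entries that differ between the conditions, per cell.
changed_per_cell <- rowSums(cif_cond2 != cif_cond1)
changed_per_cell
```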
Simulates spatial domains by defining
Num. CIF to Change
How many CIF entries to change for each cell. The CIF entries will be randomly selected.
Noise Strength
How much noise to add to the selected CIF entries. The noise will be sampled from a Gaussian distribution.