---
title: "An introduction to ELAplus: Energy Landscape Analysis"
author: "Sotaro Takano, Kenta Suzuki, Hiroaki Fujita"
output: 
  rmarkdown::html_vignette:
    toc: true
    number_sections: true
vignette: >
  %\VignetteIndexEntry{An introduction to ELAplus: Energy Landscape Analysis(eval=FALSE)}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options: 
  markdown: 
    wrap: 72
---

```{=html}
<style>
body {
text-align: justify}
</style>
```

```{r setup, include=FALSE}
library(knitr)
opts_chunk$set(
  fig.align = "center",
  out.extra = 'style="max-width:100%; overflow-x:auto; white-space: nowrap;"',
  tidy = FALSE
)

knitr::opts_chunk$set(
  collapse = TRUE,
  comment  = "#>",
  message  = FALSE,
  warning  = FALSE
)
```

# Overview

ELAplus provides tools for Energy Landscape Analysis (ELA), a framework for
analyzing dynamical systems represented as weighted networks. The
package is designed primarily for the analysis of ecological community
composition data, but the framework is applicable to other multistable
systems.

The typical workflow consists of:

**- Fitting community composition data to a (possibly extended) pairwise
maximum entropy model**

**- Constructing an energy landscape from the fitted model**

**- Identifying stable states and transition structures**

**- Visualizing the resulting landscape and interactions**

The theoretical background is described in [Suzuki et al.
(2021)](https://esajournals.onlinelibrary.wiley.com/doi/abs/10.1002/ecm.1469).

# Example: Basic Energy Landscape Analysis

This vignette demonstrates a standard ELA workflow using test dataset.

## Input dataset

-   `baseabtable`: **samples** - **species** abundance table (data.frame
    or matrix).
    -   Rows correspond to sampling units (sites, time points, subjects,
        etc.)
    -   Columns correspond to species (or OTUs/ASVs/features/taxa).
-   `basemetadata` (optional): a **samples** - **factors** metadata
    table (data.frame or matrix).
    -   Rows correspond to sampling units (same as those in
        `baseabtable`)
    -   Columns correspond to environmental parameters (e.g.,
        temperature, pH, etc.).

```{r}
# load package
library(ELAplus)
# load test data set
# species abundance table
data("baseabtable")
head(baseabtable)

# metadata (environmental factors)
data("basemetadata")
head(basemetadata)
```

## (Optional) Data generation using a Gibbs sampling (heat-bath algorithm)

The simulated community composition data are also available.

In this example, the data comprising `24` species and `512` samples were
generated through `100` iterations of Gibbs sampling (heat-bath
algorithm).

Both implicit/explicit environmental factors (`h.act` and `g.act`) are
considered. The number of explicit environmental factors is specified by
`ne` (`2` factors in this example).

`j.act` is the interaction matrix.

```{r}
# load package
library(ELAplus)
# Generate random parameters
hb_params <- hb.paramgen(24, ne = 2) # the number of environmental factors
h.act <- hb_params[[1]]
j.act <- hb_params[[2]]
g.act <- hb_params[[3]]

# Simulate community composition data
rand_data <- HeatBath(100, 512, h.act, j.act, g = g.act)
rand_mat   <- rand_data[[1]]
rand_enmat <- rand_data[[2]]

```

## Formatting the dataset

ELAplus expects the community data in a standardized representation.

`Formatting()` converts raw input tables into aligned occurrence
(binary) matrix and (optionally) environmental factor matrix.

### Inputs

-   `baseabtable`: **samples** - **species** abundance table (data.frame
    or matrix).

-   `basemetadata` (optional): a **samples** - **factors** metadata
    table (data.frame or matrix).

    -   This is used when you want to include environmental parameters
        (e.g., temperature, pH, treatment group).
    -   If you do not have environmental data, pass NULL (or an empty
        object).

```{r}
formatted <- Formatting(baseabtable = baseabtable, basemetadata = basemetadata)
```

### Key preprocessing options

#### Normalization (normalize)

-   If `normalize = 1`, the abundances within each sample are normalized
    so that each row sums to 1. This is useful when:
    -   total sequencing depth differs strongly between samples etc...
-   If `normalize = 0`, the original scale is preserved.

#### Occurrence binarization (the “0/1 threshold”)

-   ELA operates on binary presence/absence states (each species is
    present or absent).
-   `Formatting()` creates the occurrence matrix by applying the first
    element of `parameters`:
-   `parameters[1]` (“0/1 threshold”): a numeric cutoff used to define
    presence.
    -   Example: `parameters[1] = 0.05` means “abundance \> 0.05 is
        present; otherwise absent”.

#### Species filtering by prevalence (min/max occurrence thresholds)

-   Species that are too rare or almost always present typically add
    little information or inflate the results in construction of energy
    landscape.

-   `Formatting()` therefore filters species based on prevalence across
    samples using:

-   `parameters[2]` (“min occurrence threshold”)

-   `parameters[3]` (“max occurrence threshold”)

-   These are interpreted as proportions of samples in which the species
    is present **after binarization**.

-   Example:

    -   `parameters = c(0.01, 0.01, 0.99)` means:
    -   present(1) if abundance \> 0.01,
    -   keep species whose presence frequency is \> 0.01 and \< 0.99.

```{r, eval=FALSE}
formatted <- Formatting(baseabtable = baseabtable, basemetadata = basemetadata,
                        normalize = 1, parameters = c(0.01, 0.01, 0.99))
#> Processed 256 samples.
#> Relative abundance threshold = 0.01
#> Occurrence threshold (lower) = 0.01 
#> Occurrence threshold (upper) = 0.99
#> Selected  16  out of  16 species.
ocvecs <- formatted[[1]]  # community composition binary matrix
abvecs <- formatted[[2]]  # community abundance matrix (not binarized)
envecs <- formatted[[3]]  # environmental variables (optional)
```

```{r, echo=FALSE, out.width="80%"}
knitr::include_graphics("figures/ParameterMatrix.png")
```

## Parameter fitting

### Extended maximum pairwise entropy model

The energy $E(\sigma^{(k)},\varepsilon)$ of each community composition
$\sigma^{(k)}$ is assigned by imposing the potential structure to the
state space by introducing the extended pairwise maximum entropy model.
Briefly, the energy is determined by three parameters. For details,
please see [Appendix].

```{r, echo=FALSE, out.width="95%"}
knitr::include_graphics("figures/EPMEM_abstract2.png")
```

-   `h`: Implicit (unobserved) environmental parameter for each species

-   `J`: Species-species interaction matrix

-   `g`: Explicit environmental parameters for each species

### Parameter fitting algorithm

Fitting the parameters of the pairwise maximum entropy model is
performed using a stochastic approximation algorithm.

We approximate the intractable model expectation in the likelihood
gradient with a persistent (warm-started) Gibbs/heat-bath chain advanced
by one sweep per parameter iteration, corresponding to persistent
contrastive divergence with one sweep (PCD-1), also known as stochastic
maximum likelihood (SML).

It repeatedly simulates data under the current parameters and then
updates `h`, `J`, and `g` to reduce the discrepancies between observed
vs simulated metrics.

Two observed co-occurrence statistics are used for the fitting.

1.  Species–species co-occurrence: $y = ocvecs^{T}ocvecs$

2.  Environment–species co-occurrence: $y^{env} = envecs^{T}ocvecs$

Then, the algorithm repeatedly updates the parameters by comparing

-   observed vs simulated species–species co-occurrence
    $y_{diff} = y_{observed} - y_{simulated}$
-   observed vs simulated environment–species co-occurrence
    $y^{env}_{diff} = y^{env}_{observed} - y^{env}_{simulated}$

Each parameter independently updated by those metrics:

$$
\Delta{h} \propto diag(y_{diff})
$$ $$
\Delta{J} \propto y_{diff}
$$

$$
\Delta{g} \propto y^{env}_{diff}
$$

## Hyperparameter search for fitting

For simplicity, the usage of ELA without explicit environmental
parameters (`envecs`) is first explained.

Fitting the pairwise maximum entropy model with `runSA()` depends on
several optimization configurations, and the best settings can vary
across datasets (number of species, sample size, sparsity).

ELAplus provides a dedicated grid-search routine to choose hyperparameters
empirically.

`Findbp()` performs a grid search over three key hyperparameters:

\-`lmd` : **a sparsity penalty** to encourage sparse matrix. Larger
values typically lead to fewer nonzero parameters.

\-`we` : **AdamW weight decay**, a form of L2 regularization applied
during optimization.

\-`maxlrs` : **maximum learning rate** in fitting algorithm (SML).


```{r, eval=FALSE}
bpresult <- Findbp(ocvecs, rep = 2, threads = 2,
                   cvmode="10fold",nrepeat = 10, we=c(0.001),totalit=4000,
                   lmd=c(1e-05,1e-03,0.01),maxlrs = c(0.005),
                   tol=0.03, runadamW=TRUE,sparse=TRUE,fastfitting=TRUE)
#Start hyper-parameter search:
#There are 300 tasks
#Finished 300 tasks
#SA: elapsed time 15.08 sec

bp <- bpresult[[1]]  # typically best bpresult
bpm <- as.numeric(unlist(strsplit(names(bp)[1], split = "_")))
allresults <- bpresult[[2]]
```

### What Findbp does

**1. Splits samples into training and test subsets**

-   The data are split using cross-validation based on cvmode. 
-   For "10fold" or "5fold", samples are randomly assigned into k folds (repeated `nrepeat` times), and each fold is used as the test set while the remaining folds are used for training.
-   For "loo" (leave-one-out), each sample is used once as the test set while the rest form the training set.

**2. Fits the model on the training subset for each grid point**

-   For each combination of candidate hyperparameters, the function
    calls `runSA()`.

**3. Evaluates predictive performance on held-out samples**

-   Each fitted model is scored by its predictive error on the test
    subset (lower is better).
-   The best-performing settings are returned.
-   If `fastfitting=TRUE`, the function returns candidates with the
    lowest iterations within `tol` of the best observed error.

### Fitting results can be visualized by `plotSAtest`.

```{r plotSAtest, eval=FALSE}
plotSAtest(allresults)
# "id" corresponds to hyperparameter: lmd_we_maxlr
```

```{r, echo=FALSE, out.width="95%"}
knitr::include_graphics("figures/plotSAtest.png")
```

### Fitting the extended maximum pairwise entropy model

The best parameters found by `Findbp()` are used for real analyses by
`runSA()`.

```{r, eval=FALSE}
sa <- runSA(
  ocvecs,
  enmat   = NULL,
  rep     = 32,
  getall  = FALSE,
  lambda  = bpm[[1]],
  we      = bpm[[2]],
  maxlr = bpm[[3]],
  totalit = bpm[[4]]
)
#> Start parameter fitting:
#> .SA: elapsed time 0.13 sec
```

## Constructing the energy landscape

After fitting the (extended) pairwise maximum entropy model with
`runSA()`, the next step is to construct energy landscape over community
compositions using the fitted parameters.

In ELAplus, this is performed by `ELA()`, which identifies:

**1. Stable states** Here, the base of the basin (with minimal energy)
is regarded as a stable state, which is distinct from the equilibrium
point of the dynamical system or the attractor. For details, see [Suzuki
et al.
(2021)](https://esajournals.onlinelibrary.wiley.com/doi/abs/10.1002/ecm.1469),

**2. Tipping points (energy barriers connecting pairs of basins/stable
states).** A connecting pair of basins have basin boundary and the point
with the lowest energy barrier is regarded as tipping point here.

The fitted model defines an energy value for any community composition
(typically a binary presence/absence vector of community members).

```{r, eval=FALSE}
ela <- ELA(
  sa,
  env            = NULL,
  threads        = 1,
  SS.itr         = 20000,
  FindingTip.itr = 10000,
  reporting      = TRUE
)
#> Start ELA:
#> 7 stable states were found.
#> Checking 21 tipping points.
#> converting...
#> ELA: elapsed time 2.19 sec
```

## ELA output

After stable states and tipping points are computed, `ELA()` summarizes
and standardizes the outputs. This conversion step produces a compact
representation of the energy landscape structure:

-   [[1]] Stable state IDs (64-base index encoded from community
    composition array by `bin2id()`)
-   [[2]] Stable state energies
-   [[3]] Tipping point ID (for stable-state pairs)
-   [[4]] Tipping energies (corresponding barrier energies)

The output format is compatible to downstream steps such as pruning
(ELPruning) and visualization (e.g., disconnectivity graphs).

```{r, eval=FALSE}
ela[[1]] # stable state IDs
#> [1] "09x" "01t" "EWB" "EY8" "1uV" "1mL" "17s"
ela[[2]] # stable state energies
#> [1] -11.099418 -10.297931  -9.544050  -8.961634  -8.005040  -6.440385  -6.269140
```

### What ELA does

1.  Identifying stable states by repeated steepest descent
    (`SSestimate`)

-   The number of repetitions is controlled by `SS.itr` (default 20000).
-   More iterations increase the chance of discovering all major stable
    states, especially when the number of species is large, landscape
    has many stable states, etc...

2.  Finding tipping points between stable states

-   Once a set of stable states is obtained, `ELA()` evaluates
    transitions between them by searching for tipping points between all
    pairs of stable states.
-   The number of iterations is controlled by `FindingTip.itr`. As with
    `SS.itr`, increasing `FindingTip.itr` improves robustness at the
    cost of runtime. This step can be computationally heavy because it
    is performed across stable-state pairs.

## Pruning stable states

The energy landscape constructed by `ELA()` first identified stable
states as much as possible. Some of these stable states correspond to
very shallow basins.

Such stable states are often sensitive to noise and does not largely
explain dominant multistability of the system.

`ELPruning()` addresses this issue by iteratively pruning shallow basins
and merging them into deeper, neighboring basins.

```{r, eval=FALSE}
elap <- ELPruning(ela, th=0.1, type="relative")
#> Start pruning:
#> *...
#> ELPruning: elapsed time 0.05 sec
```

### Output and interpretation

`ELPruning()` returns a list with two elements:

[[1]] Pruned ELA object (same format as the output of `ELA()`):

-   [[1]] updated stable state IDs,
-   [[2]] updated stable state energies,
-   [[3]] updated tipping point IDs,
-   [[4]] updated tipping energies.

```{r, eval=FALSE}
elap[[1]][[1]] # updated stable state IDs
#> [1] "09x" "EWB" "EY8" "1uV"
elap[[1]][[2]] # updated stable state energies
#> [1] -11.099418  -9.544050  -8.961634  -8.005040
```

[[2]] Stable-state mapping table

-   ss.before.pruning: stable state IDs from the original (unpruned)
    landscape
-   ss.after.pruning: corresponding stable state IDs after pruning

```{r, eval=FALSE}
elap[[2]]
#>  ss.before.pruning ss.after.pruning
#> 1               09x              09x
#> 2               01t              09x
#> 3               EWB              EWB
#> 4               EY8              EY8
#> 5               1uV              1uV
#> 6               1mL              09x
#> 7               17s              09x
```

### Basin depth and pruning criterion

For each stable state, a basin depth is defined as the energy difference
between the energy of the stable state itself and the lowest tipping
point energy connecting that stable state to any other stable state
($\Delta E$).

The basin depth measures how “robust” a stable state is against
transitions: deeper basins correspond to more robust, while shallow
basins correspond to more unstable states.

In `ELPruning()`, there are two methods `relative` or `bootstrap`.

```{r, echo=FALSE, out.width="95%"}
knitr::include_graphics("figures/deltaE_scheme.png")
```

#### Relative depth pruning

If `type="relative"` is set, `th` (relative depth threshold) specifies
the minimum acceptable energy barrier relative to the deepest basin
barrier. $$
\Delta E_{\mathrm{min}} < th * \Delta E_{\mathrm{max}}
$$ For example, `th = 0.2` means that basins shallower than 20% of the
deepest basin ($\Delta E_{\mathrm{max}}$) are pruned. The results of
pruning is usually sensitive to this value.

```{r, eval=FALSE}
elap <- ELPruning(ela, th=0.2, type="relative")
#> Start pruning:
#> *...
#> ELPruning: elapsed time 0.05 sec
elap[[1]][[1]] # updated stable state IDs
#> [1] "09x" "EWB" "1uV"
```

#### bootstrap samples pruning

Alternatively, users can perform pruning of stable states by calculating
vulnerabilities in bootstrap samples in each stable state.

This method is explained in the next section:[Pruning with bootstrap
resampling].

## Visualization

### Disconnectivity graph

A disconnectivity graph is a compact visualization of the global
structure of an energy landscape.

-   It summarizes how stable states are separated by energy barriers
    (tipping points).
-   The vertical axis represents energy
-   Branches illustrate how stable states are connected and separated by
    tipping points.

```{r showDG, eval=FALSE}
# Before pruning
showDG(ela, ocvecs, "example")
```

```{r, echo=FALSE, out.width="95%"}
knitr::include_graphics("figures/DGexample.png")
```

```{r showDG_pruned, eval=FALSE}
# After pruning
showDG(elap[[1]], ocvecs, "example")
```

```{r, echo=FALSE, out.width="95%"}
knitr::include_graphics("figures/DGexample_pruned.png")
```

#### How to interpret the graph

-   Lower energies correspond to more stable configurations.
-   Branching structure indicates which basins are “close” in terms of
    energy barriers.
-   Barrier height as a transition difficulty proxy.

#### Optional checking bootstrap resampling

The reliability of stable states can be also examined by the bootstrap
resampling of training data sets (the occurrence matrix). The function,
`bootstrap_SA()`, performs parameter fitting with bootstrap resampling
of the data sets.

`bootstrap_ELA()` uses the results of `bootstrap_SA()` (i.e., estimated
parameters: h, J) for computing the variability of energy gaps between
tipping points and stable states arising from uncertainty due to finite
sample size.

If the estimated stable state is statistically reliable, the estimated
$\Delta E$ is expected to be consistently smaller than 0 across
bootstrap resamples of the training data.

Therefore, the probability of bootstrap samples with $\Delta E < 0$
reflects the statistical significance of the estimated stable state.
Here, this quantity is referred to as a "bootstrap-approximated p-value"
and is used to assess statistical reliability.

As an example, the distribution of $\Delta E$ between "01t" and "09/" is
shown as a histogram. More than 25% of bootstrap samples yield
$\Delta E > 0$ (i.e., p-val \> 0.25), and this case is regarded as
statistically not reliable.

```{r, echo=FALSE, out.width="95%"}
knitr::include_graphics("figures/deltaE_hist.png")
```

If `bootstrap_ene` is specified in `showDG()`, the function displays an
approximate p-value in each stable state. This option is helpful when
you want to see uncertainty in estimated energies in ELA.

```{r showDG_pvalue, eval=FALSE}
bs_params <- bootstrap_SA(ocvecs, rep = 16, threads=8,　bootstrap.it = 1000, 
                          lambda  = bpm[[1]],we = bpm[[2]],
                          maxlr = bpm[[3]], totalit = bpm[[4]])

bootstrap <- bootstrap_ELA(bs_params,ocvecs,ela=ela,threads = 1)
ela_rep <- bootstrap[[1]]
bootstrap_ene <- bootstrap[[2]]
showDG(ela_rep,ocvecs,bootstrap_ene = bootstrap_ene, min_pval = 0.01, 
       scale_labels = c(0.05,0.1,0.2,0.5))
```

```{r, echo=FALSE, out.width="95%"}
knitr::include_graphics("figures/DGexample_pval.png")
```

#### Pruning with bootstrap resampling

The pruning stable states is also possible using the bootstrap results.

The `ELPruning()` with `type="bootstrap"`used the distribution of
$\Delta E$ in bootstrap samples and evaluated whether the predicted
stable states are statistically reliable.

If the p-value is above threshold (e.g., 0.05), the function prunes such
unreliable and shallow basins (stable states) and merges them into
deeper, neighboring basins. The pruned graph is follows.

```{r showDG_IQR_pruned, eval=FALSE}
elap_rep <- ELPruning(ela, type = "bootstrap",bootstrap_ene=bootstrap_ene,lower_qr = 0.25)
showDG(ela_rep,ocvecs,bootstrap_ene = bootstrap_ene, min_pval = 0.01, 
       scale_labels = c(0.05,0.1,0.2,0.5))
```

```{r, echo=FALSE, out.width="95%"}
knitr::include_graphics("figures/DGexample_pval_pruned.png")
```

If `getGraohObj=TRUE`, `showDG()` function returns a data frame for
plotting a disconnectivity graph. In the following example,
disconnectivity graph is drawn using `ggplot()`. Tipping points and
stable states are shown as triangles and circles, respectively, and
colored by energies.

```{r showDG_manual, eval=FALSE}
grobj_to_plot <- showDG(ela_rep,ocvecs,bootstrap_ene = bootstrap_ene,getGraphObj = TRUE)
grobj_point <- grobj_to_plot[grobj_to_plot$point_str != "",]
g <- ggplot(
  grobj_to_plot,
  aes(x=.data$nodes2xposi, y=.data$energy, label=.data$point_str)) + 
  xlab("") + geom_text(hjust=0.75, vjust=2, aes(fontface=2)) + geom_path() +
  geom_point(
    data = grobj_point,
    mapping = aes(
      x = .data$nodes2xposi,
      y = .data$energy,
      color = .data$energy,
      shape = .data$type,
    ),size = 4
  ) +
  scale_color_viridis_c(option = "plasma",alpha=0.8,
                        direction=-1,na.value = "grey") +
  coord_cartesian(xlim=c(0.75,7.5), ylim=c(-5.8,-11.5))
plot(g)
```

```{r, echo=FALSE, out.width="95%"}
knitr::include_graphics("figures/DG_manual.png")
```

### Species interaction graph

`showIntrGraph()` visualizes the pairwise interaction structure inferred
by the fitted model in a 2D “interaction space”.

It extracts estimated parameters from the resulting interaction matrix
(jest).

-   Each species is plotted as a point at its PCoA coordinates.
-   Links with value \> `th` are classified as positive, while links
    with value \< `-th` are classified as negative.

The plot draws these two link types separately and colors them
differently (positive = red, negative = blue).

```{r ShowIntrGraph, eval=FALSE}
showIntrGraph(
  elap[[1]],
  sa,
  th = 0.01,
  annot_adj = c(0.75, 1.00)
)
```

```{r, echo=FALSE, out.width="95%"}
knitr::include_graphics("figures/IntrGraph.png")
```

# Environmental gradient analysis

The stability structure of a community is not fixed, but changes
continuously along an environmental gradient (e.g., temperature,
nutrient availability, disturbance intensity).

`GradELA()` is designed to capture this dependence by estimating energy
landscapes repeatedly across a controlled range of an environmental
factor and summarizing how stable states and transition barriers change.

When environmental variables (`envecs`) are included, ELAplus supports
**Gradient Energy Landscape Analysis** via `GradELA()`. The parameter
fitting process is same as that for normal ELA except for specifying
`enmat`.

```{r, eval=FALSE}
bpresult <- Findbp(ocvecs, enmat = envecs, rep = 2, threads = 2,
                   cvmode="10fold",nrepeat = 10, we=c(0.001),totalit=4000,
                   lmd=c(1e-05,1e-03,0.01),maxlrs = c(0.005),
                   tol=0.03, runadamW=TRUE,sparse=TRUE,fastfitting=TRUE)
bp <- bpresult[[1]]
bpm <- as.numeric(unlist(strsplit(names(bp)[1], split = "_")))

sa <- runSA(
  ocvecs,
  enmat   = envecs,
  rep     = 16,
  getall  = FALSE,
  lambda = bpm[[1]],
  we = bpm[[2]],
  maxlr = bpm[[3]],
  totalit = bpm[[4]]
)

```

## Concept of GradELA

`GradELA()` proceeds in three conceptual stages:

1.  Specification of an environmental gradient

-   One environmental factor `eid` is varied over a specified `range`
    and `steps`.
    -   `eid` is environmental factor for changing (one column in the
        environmental matrix `enmat`).
    -   `range` is the minimum and maximum values of the environmental
        factor.
    -   `steps` is how many discrete points are evaluated.

2.  Energy landscape construction at each step

-   For each environmental value along the gradient,`ELA()` and
    `ELPruning()` are performed.
-   When varying one environmental factor, all remaining factors must be
    assigned fixed values.
-   This is controlled by the `env` (it corresponds to a row in `enmat`)
    as follows:
    -   If `env` is provided, it is used as the reference vector for all
        environmental factors except `eid`.
    -   If `env` is not provided, the mean vector of `enmat` is used
        automatically.

```{r, eval=FALSE}
# This process takes few minutes with a small number of threads, and here pre-computed results are shown.
gela <- GradELA(
  sa      = sa,
  eid     = "factor.1",
  enmat   = envecs,
  env     = NULL,
  range   = c(-1, 1),
  steps   = 32,
  th = 0.05,
  threads = 8, 
  pruning_type = "relative", 
  bs_params = bs_params,
  ocmat = ocvecs)
```

Users can use bootstrap samples in ELPruning process by
`pruning_type="bootstrap"`. Other options are equivalent to
`ELPruning()`.

```{r, eval=FALSE}
# This process takes few minutes with a small number of threads, and here pre-computed results are shown.
bs_params <- bootstrap_SA(ocvecs, enmat = envecs,rep = 16, threads=8,bootstrap.it = 1000, 
                          lambda  = bpm[[1]],we = bpm[[2]],
                          maxlr = bpm[[3]], totalit = bpm[[4]])

gela <- GradELA(
  sa      = sa,
  eid     = "factor.1",
  enmat   = envecs,
  env     = NULL,
  range   = c(-1, 1),
  steps   = 32,
  lower_qr = 0.25,
  threads = 8, 
  pruning_type = "bootstrap", 
  bs_params = bs_params,
  ocmat = ocvecs)
```

## Output

`GradELA()` returns a list with three elements:

**1. Pruned ELA results for each step**

-   A list of energy landscape objects (same format as `ELA()` output,
    after pruning).

**2. Environmental value sequence**

-   The numeric vector of environmental values used for the target
    factor (`eid`).

**3. Reference environmental vector**

-   The full environmental vector used as a baseline for non-varied
    factors.

```{r, eval=FALSE}
gela[[1]][[1]][[1]][[1]] # stable state IDs at 1st environmental parameter
# [1] "0Ht" "EWB" "1uV"
gela[[1]][[20]][[1]][[1]] # stable state IDs at 20th environmental parameter
# [1] "09x" "EWB"
```

## Visualization of gradient ELA results

The resulting object can then be plotted using `showSSD()`, which
visualize the regime shift: how the number, identity, and energy of
stable states change along the environmental gradient.

-   The plot is drawn by overlaying multiple line traces, one per stable
    state.
    -   x-axis corresponds to the change of environmental parameters and
    -   y-axis corresponds to the energy of stable states
-   Changes in community state energy across the gradient reflect how
    resilient they are under changing conditions.
-   A plot discontinuity indicates a stable state that doesn't emerge
    under these conditions (or is pruned).

```{r showSSD, eval=FALSE, fig.width=8, fig.height=6, message=FALSE}
# GradELA result (ELPruning with type="relative").
showSSD(gela)
```

```{r, echo=FALSE, out.width="95%"}
knitr::include_graphics("figures/SSD.png")
```

```{r showSSD_bootstrap, eval=FALSE, fig.width=8, fig.height=6, message=FALSE}
# GradELA result (ELPruning with type="bootstrap").
showSSD(gela)
```

```{r, echo=FALSE, out.width="95%"}
knitr::include_graphics("figures/SSD_pruned.png")
```

If users want to modify the figure configuration of the diagram,
`showSSD_ggplot()` function returns a data frame for plotting. Then
users can the configuration manually using `ggplot()`.

```{r showSSD_ggplot, eval=FALSE, fig.width=8, fig.height=6, message=FALSE}
ssd_df <- showSSD_ggplot(gela,plot = FALSE,getSSDobj = TRUE)
g <- ggplot(ssd_df, aes(x = env, y = Energy, color = SS_ID)) +
            geom_point(shape = 19) +
            geom_line(aes(group = SS_ID))

```

```{r, echo=FALSE, out.width="95%"}
knitr::include_graphics("figures/SSD_ggplot.png")
```

## 3D plot of gradient ELA results

`showGELA3D()` visualizes a smoothed energy landscape surface produced
by `GELAObj()` and overlays stable-state trajectories.

-   **Z axis (Energy):** model-predicted energy evaluated under the
    fitted parameters at each environmental parameter value.
-   **Y axis (y):** the environmental gradient value used in
    `GradELA()`.
-   **X axis (x):** stable states. Positions of stable states are
    determined by their nearest-neighbor relationships (in Hamming
    distance).

```{r GELA3D-demo, eval=FALSE, fig.width=8, fig.height=6, message=FALSE}
# This process takes few minutes, and here pre-computed result is drawn. 
gelaobj <- GELAObj(gela, sa=sa,threads=2)
showGELA3D(gelaobj,theta = 30,phi = 30)
```

```{r, echo=FALSE, out.width="95%"}
knitr::include_graphics("figures/GELA3D.png")
```

# Appendix

**Overview of extended maximum pairwise entropy model**

The model determines the probability of the occurrence of community
composition $\sigma^{(k)}$ in an environmental condition
$\varepsilon =  (\varepsilon_{1}, \varepsilon_{2}, ..., \varepsilon_{M})$,
environmental factors such as pH or temperature.

$$
E(\sigma^{(k)},\varepsilon)  = - \sum_{i=1}^{S} h_i \sigma_i^{(k)}
- \sum_{j=1}^{S} \sum_{i=1}^{M}
g_{ij} \varepsilon_i^{(k)} \sigma_j^{(k)}
- \sum_{i=1}^{S} \sum_{j=1,i \neq j}^{S} J_{ij} \sigma_i^{(k)} \sigma_j^{(k)}
$$ where:

-   $S$ is the number of species,\
-   $M$ is the number of environmental variables,\
-   $h_i$ is the implicit environmental variable for species $i$,\
-   $g_{ij}$ is the explicit effect of environmental variable $i$ on
    species $j$,\
-   $J_{ij}$ is the interaction strength between species $i$ and $j$,\
-   $\sigma_i^{(k)}$ is the state (occurrence) of species $i$ at sample
    $k$,\
-   $\varepsilon_i^{(k)}$ is the value of environmental variable $i$ at
    sample $k$.\
-   $J_{ij}$ is the interaction strength between species $i$ and $j$,
