---
title: "Getting started with anovapowersim"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting started with anovapowersim}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 4
)
```

`anovapowersim` simulates power for balanced factorial ANOVA
designs. You specify the factors and their levels, the term of interest, and a
target partial eta squared. `anovapowersim` then generates default term-specific
cell means, simulates datasets, refits the ANOVA with `stats::aov()`, and
estimates power.

```{r setup, message=FALSE}
library(anovapowersim)
```

```{r load-precomputed-results, include=FALSE}
vignette_results_path <- system.file(
  "extdata",
  "anovapowersim-vignette-results.rds",
  package = "anovapowersim"
)
if (!nzchar(vignette_results_path)) {
  vignette_results_path <- file.path(
    "..",
    "inst",
    "extdata",
    "anovapowersim-vignette-results.rds"
  )
}
vignette_results <- readRDS(vignette_results_path)
```
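
To make the simulate, refit, and estimate steps concrete, here is a minimal base R sketch for a one-way between-subjects design. It is an illustration only, not the package's implementation: the cell means, standard deviation, and sample size are hypothetical values chosen for the example, and `anovapowersim` derives its own cell means from the target partial eta squared.

```{r sketch-sim-loop}
set.seed(1)

n_per_cell <- 20                    # hypothetical sample size per group
means <- c(a = 0, b = 0.5, c = 1)   # hypothetical cell means (not package defaults)
n_sims <- 200                       # kept small so the sketch runs quickly
alpha <- 0.05

# Simulate one dataset per iteration, refit the ANOVA, and keep the p-value
p_values <- replicate(n_sims, {
  dat <- data.frame(
    group = factor(rep(names(means), each = n_per_cell)),
    y = rnorm(
      length(means) * n_per_cell,
      mean = rep(means, each = n_per_cell),
      sd = 1
    )
  )
  fit <- stats::aov(y ~ group, data = dat)
  summary(fit)[[1]][["Pr(>F)"]][1]  # p-value for the group effect
})

# Estimated power is the proportion of significant simulations
power_sim <- mean(p_values < alpha)
power_sim
```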

## Search for the required sample size

The easiest way to find the required sample size is `power_n()`, which
searches for the sample size needed to reach the requested `power`.

This example is a 2 x 4 mixed design with one between-subjects factor
(`cond`, 2 levels) and one within-subjects factor (`stim`, 4 levels).

We are interested in the `cond:stim` interaction and want 80% power to detect a partial eta squared of 0.14. `power_n()` searches for the required sample size per between-subject cell, so `n = 13` gives a total `N = 26`.

```{r adaptive-code, eval=FALSE}
power_n(
  between = c(cond = 2), # cond has 2 levels
  within = c(stim = 4), # stim has 4 levels
  term = "cond:stim",
  target_pes = 0.14,
  alpha = 0.05,
  power = 0.80,
  n_sims = 1000, # use 5000+ for a more precise estimate
  seed = 123 # for reproducibility
)
```

```{r adaptive-output, echo=FALSE}
vignette_results$adaptive
```

Note: this example uses 1000 simulations to keep it quick. The package default is 10000 simulations, which gives more precise estimates.
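
To get a feel for how much precision extra simulations buy, you can compute the Monte Carlo standard error of a simulated power estimate, which is the standard error of a binomial proportion. This is a generic sketch: `mc_se()` is a hypothetical helper defined here, not a function from `anovapowersim`.

```{r sketch-mc-se}
# Standard error of a proportion estimated from n_sims independent simulations
mc_se <- function(power, n_sims) {
  sqrt(power * (1 - power) / n_sims)
}

# Approximate standard error around 80% power for different simulation counts
mc_se(0.80, c(1000, 5000, 10000))
```

The standard error shrinks with the square root of `n_sims`, so going from 1000 to 10000 simulations cuts it by a factor of about 3.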

The output table uses compact column names: `n_per_cell` is the sample size per
between-subject cell, `total_n` is the full sample size, `num_df` and `den_df`
are the ANOVA degrees of freedom, `ncp` is the noncentrality parameter,
`power_calc` is the noncentral F power calculation, and `power_sim` is the
simulation estimate.
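
As a rough illustration of how a noncentral F power calculation combines `num_df`, `den_df`, and `ncp`, the sketch below uses `stats::qf()` and `stats::pf()`. The helper `ncf_power()` and the input values are hypothetical, chosen for illustration only. They are not package output, and the package may compute `power_calc` differently in detail.

```{r sketch-ncf-power}
# Power of an F test: probability that a noncentral F variate exceeds
# the critical value of the central F distribution at level alpha
ncf_power <- function(num_df, den_df, ncp, alpha = 0.05) {
  f_crit <- stats::qf(1 - alpha, num_df, den_df)
  stats::pf(f_crit, num_df, den_df, ncp = ncp, lower.tail = FALSE)
}

# Illustrative inputs only
ncf_power(num_df = 3, den_df = 72, ncp = 12)
```

With `ncp = 0` the result reduces to `alpha`, which is a quick sanity check on the calculation.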

### Adding factors and levels

You can add factors and levels as needed, and specify any term of interest. For example, if we want to add a second between-subjects factor (`age`) with 3 levels and are interested in the three-way interaction, we can do:

```{r complex, eval=FALSE}
power_n(
  between = c(cond = 2, age = 3), # cond has 2 levels, age has 3 levels
  within = c(stim = 4), # stim has 4 levels
  term = "cond:stim:age",
  target_pes = 0.14,
  alpha = 0.05,
  power = 0.80,
  n_sims = 1000, # use 5000+ for a more precise estimate
  seed = 123 # for reproducibility
)
```
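
Because the sample size is expressed per between-subject cell, the total sample size grows with the number of between-subject cells. The arithmetic can be sketched as follows, where `n_per_cell` is a hypothetical value and not the result of the search above:

```{r sketch-total-n}
between <- c(cond = 2, age = 3)

n_between_cells <- prod(between)  # 2 x 3 = 6 between-subject cells
n_per_cell <- 10                  # hypothetical value, for illustration only

# Within-subject factors add measurements, not participants
total_n <- n_per_cell * n_between_cells
total_n
```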

## Simulate a power curve

You might want to see how power changes with sample size. `power_curve()` simulates power at each sample size you list in `n_range`. The result is a tidy data frame that you can plot with `plot_power_curve()`.

```{r curve-fixed-code, eval=FALSE}
pc <- power_curve(
  between = c(cond = 2),
  within = c(stim = 2),
  term = "cond:stim",
  target_pes = 0.14,
  n_range = c(16, 20, 23, 28), # n per between-subject cell
  n_sims = 1000,
  seed = 123
)
pc
```

```{r curve-fixed-output, echo=FALSE}
pc <- vignette_results$curve
pc
```

```{r plot-fixed, echo=FALSE}
plot_power_curve(
  pc,
  power_lines = c(.80, .90) # adds horizontal lines at 80% and 90% power
)
```

## Advanced options

### Run simulations in parallel

For larger simulation runs, set `parallel = TRUE`. If you do not set `cores`,
`anovapowersim` uses one fewer than the number of available cores and prints a
message with the chosen count. Set `cores` explicitly when you want a fixed
number of cores.
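
If you want to see how many cores your machine reports before choosing `cores`, base R's `parallel::detectCores()` gives a quick answer. This is only a sketch of the "one fewer than available" rule described above. The package may count available cores differently.

```{r sketch-cores}
available <- parallel::detectCores()
if (is.na(available)) available <- 1L  # detectCores() can return NA

# One fewer than the reported count, but never fewer than one
suggested_cores <- max(1L, available - 1L)
suggested_cores
```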

```{r parallel, eval=FALSE}
power_curve(
  between = c(cond = 2),
  within = c(stim = 2),
  term = "cond:stim",
  target_pes = 0.14,
  n_range = c(16, 20, 23, 28),
  n_sims = 5000,
  parallel = TRUE,
  cores = 4,
  seed = 123
)
```

### Match the G\*Power convention

By default, `anovapowersim` calibrates the simulated cell means so the empirical
reference dataset has the requested partial eta squared under the fitted
`stats::aov()` model. This corresponds to the fitted ANOVA denominator-df
noncentrality convention.

Set `gpower = TRUE` when you want the G\*Power-style convention
`lambda = total_n * f^2`, which G\*Power uses with the "as in Cohen (1988)"
option for within-subjects designs.

```{r gpower-adaptive-code, eval=FALSE}
power_n(
  between = c(cond = 2),
  within = c(stim = 4),
  term = "cond:stim",
  target_pes = 0.14,
  alpha = 0.05,
  power = 0.80,
  n_sims = 1000,
  seed = 123,
  gpower = TRUE
)
```

```{r gpower-adaptive-output, echo=FALSE}
vignette_results$gpower_adaptive
```
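
The G\*Power-style noncentrality parameter can also be worked out by hand. The sketch below converts the target partial eta squared to Cohen's f squared with the standard relation `f^2 = pes / (1 - pes)` and then applies `lambda = total_n * f^2`. The `total_n` value is hypothetical and chosen for illustration. It is not the result of the search above.

```{r sketch-gpower-lambda}
target_pes <- 0.14

# Cohen's f squared from partial eta squared
f2 <- target_pes / (1 - target_pes)

total_n <- 40  # hypothetical total sample size, for illustration only

# G*Power-style noncentrality parameter
lambda <- total_n * f2

c(f = sqrt(f2), f2 = f2, lambda = lambda)
```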
