---
title: "Displaying nestedLogit Models as LaTeX Equations"
author: "Michael Friendly"
date: "`r Sys.Date()`"
package: nestedLogit
output:
  rmarkdown::html_vignette:
    toc: true
    toc_depth: 2
    number_sections: true
    fig_caption: yes
bibliography: ["references.bib", "packages.bib"]
csl: apa.csl
vignette: >
  %\VignetteIndexEntry{Displaying nestedLogit Models as LaTeX Equations}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  message  = FALSE,
  warning  = FALSE,
  comment  = "#>"
)

.opts <- options(digits = 4)

# packages to be cited in packages.bib
.to.cite <- c("nestedLogit", "equatiomatic", "carData")
```

## Overview

The `equatiomatic` package [@R-equatiomatic] provides a general mechanism for
converting fitted statistical models into LaTeX equations, via the function
`extract_eq()`.  For any model class that has a `broom::tidy()` method,
`extract_eq()` can generate both the symbolic form of the model equation and
a version with fitted coefficient values substituted in.

The `nestedLogit` package [@R-nestedLogit] supports `extract_eq()` for objects
of class `"nestedLogit"` through a dedicated S3 method, `extract_eq.nestedLogit`.
Because a nested logit model is represented internally as a collection of binary
logit sub-models — one for each dichotomy — `extract_eq()` generates a separate
equation for each dichotomy and returns them as a named list.

To use these features, load both packages:

```{r setup}
library(nestedLogit)
library(equatiomatic)
```

## Fitting a nested logit model

We use the `Womenlf` data [@R-carData], which records the labor-force
participation of married women and is included in the `carData` package.
The three-category response `partic` (not working, part-time, full-time) is
decomposed into two nested binary dichotomies:

- **work**: not working vs. working (part-time or full-time)
- **full**: part-time vs. full-time, *among those who work*
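
Because the second dichotomy is conditional on the first, the two binary models
together determine the probabilities of all three response categories. The
following is a minimal sketch in base R. The logit values `eta.work` and
`eta.full` are made up for illustration and are not estimated from the data.

```{r sketch-category-probs}
# Hypothetical logit (log-odds) values for one observation; illustration only
eta.work <- 0.5    # working vs. not working
eta.full <- -1.0   # full-time vs. part-time, among those who work

p.work <- plogis(eta.work)   # P(working)
p.full <- plogis(eta.full)   # P(full-time | working)

# Combine the marginal and conditional probabilities into category probabilities
probs <- c(not.work = 1 - p.work,
           parttime = p.work * (1 - p.full),
           fulltime = p.work * p.full)
probs
sum(probs)   # the three probabilities sum to 1, up to rounding error
```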

```{r model}
data(Womenlf, package = "carData")

comparisons <- logits(
  work = dichotomy("not.work", working = c("parttime", "fulltime")),
  full = dichotomy("parttime", "fulltime")
)

wlf.nested <- nestedLogit(partic ~ hincome + children,
                           dichotomies = comparisons,
                           data = Womenlf)
```

## Symbolic equations

Calling `extract_eq()` on a `"nestedLogit"` object with `submodel` set to the
name of a dichotomy returns the equation for that dichotomy in symbolic
(Greek-letter) form, labeled by its name.  The equation renders automatically
in R Markdown and Quarto documents.

### Work dichotomy (not working vs. working)

```{r eqn-work}
extract_eq(wlf.nested, submodel = "work")
```

### Full-time dichotomy (part-time vs. full-time)

```{r eqn-full}
extract_eq(wlf.nested, submodel = "full")
```

### Logit notation

Version 0.4.6 of `equatiomatic` adds a `logit_notation` argument that
simplifies the left-hand side of these equations to `logit [P()]` rather than
`log [P() / (1 - P())]`.

```{r eqn-full-logit}
extract_eq(wlf.nested, submodel = "full", logit_notation = TRUE)
```
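
The two notations describe the same quantity: the logit of a probability is its
log-odds. As a quick numerical check in base R (the probabilities in `p` are
arbitrary example values):

```{r sketch-logit-def}
p <- c(0.1, 0.5, 0.9)   # arbitrary example probabilities

log(p / (1 - p))        # log-odds written out in full
qlogis(p)               # base R's logit function

# The two expressions agree
all.equal(qlogis(p), log(p / (1 - p)))
```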


## Using `extract_eq()` options

`extract_eq()` accepts a wide variety of options that control how the equations
are rendered in LaTeX. These include:

* `use_coefs`: Use the model estimates in the equations instead of symbols, a convenient way to display a fitted model.
* Options for coloring symbols in the equations: `greek_colors`, `var_colors`, `subscript_colors` and others.
* `ital_vars`: Logical, defaults to `FALSE`. If `TRUE`, the variable names are not wrapped in the `\operatorname{}` command, so they appear in italics rather than in Roman text.
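
The first two kinds of option are illustrated in the subsections that follow.
As a brief sketch of `ital_vars`, the call below requests the work equation
with italic variable names. It assumes that the `"nestedLogit"` method passes
`ital_vars` on to `equatiomatic` unchanged, as the list above suggests.

```{r sketch-ital-vars}
# Assumption: ital_vars is forwarded to equatiomatic by the nestedLogit method
extract_eq(wlf.nested, ital_vars = TRUE, submodel = "work")
```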


### Equations with fitted coefficients

Passing `use_coefs = TRUE` substitutes the fitted coefficient values into the
equations.

```{r eqn-coef-work}
extract_eq(wlf.nested, use_coefs = TRUE, submodel = "work")
```

```{r eqn-coef-full}
extract_eq(wlf.nested, use_coefs = TRUE, submodel = "full")
```
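
The substituted values can be cross-checked against the coefficients of the
corresponding binary sub-model. This sketch extracts the `"work"` sub-model with
`models()` and rounds its coefficients to two decimals for easier comparison.
The choice of two decimals is arbitrary and need not match the precision shown
in the equation.

```{r sketch-coef-check}
# Coefficients of the "work" sub-model, rounded for comparison with the equation
round(coef(models(wlf.nested, "work")), 2)
```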

### Coloring symbols

The `greek_colors` and `var_colors` arguments control the color of the Greek
coefficient symbols and the variable names, respectively.  This can help
distinguish the structural parameters from the predictors when displaying
equations in presentations or documents.

```{r eqn-colors}
extract_eq(wlf.nested,
           greek_colors = "blue",
           submodel = "work")
```

The color arguments accept any R color name or hex code, and can be a vector
to color each symbol differently. `var_colors` must be a named vector whose
names are the variable names.

```{r eqn-colors-vec}
extract_eq(wlf.nested,
           greek_colors = c("black", "blue", "blue"),
           var_colors   = c(hincome = "red", children = "darkgreen"),
           submodel     = "work")
```

### Equations for individual sub-models

The individual binary logit sub-models (objects of class `"glm"`) can also be
passed directly to `extract_eq()`.  This can be useful when you want to work
with a single dichotomy in isolation.

```{r submodel}
mod.work <- models(wlf.nested, "work")
extract_eq(mod.work)
```

Note that the response is rendered as `..y`, the internal variable name used
when fitting the sub-model, rather than as a meaningful label. The
`extract_eq()` method for `"nestedLogit"` objects avoids this problem.
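
To see where this label comes from, it can help to inspect the extracted
sub-model directly. The sketch below uses only base R accessors on the
`mod.work` object created above.

```{r sketch-submodel-formula}
class(mod.work)     # class of the extracted sub-model
formula(mod.work)   # model formula, showing the internal response variable
```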

## Using the raw LaTeX

Each equation returned by `extract_eq()` is an object of class `"equation"` (from
`equatiomatic`), which is a character string containing the LaTeX source.
This renders automatically in R Markdown and Quarto documents.  To access
the raw LaTeX — for example to paste it into a paper or to render it with
another tool such as `katex` — use `as.character()`:

```{r raw-latex}
cat(as.character(extract_eq(wlf.nested, submodel = "work")), "\n\n")

cat(as.character(extract_eq(wlf.nested, submodel = "full")), "\n")
```
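
The raw LaTeX can also be saved to a file for use outside of R. The following
sketch writes the work equation to a temporary file. The file name
`eq-work.tex` and the use of `tempdir()` are arbitrary choices for
illustration.

```{r sketch-write-tex}
# Extract the LaTeX source for the work equation as plain character data
eq.work <- as.character(extract_eq(wlf.nested, submodel = "work"))

# Write it to a temporary file; the location and file name are arbitrary
tex.file <- file.path(tempdir(), "eq-work.tex")
writeLines(eq.work, tex.file)

# Confirm that the file was created
file.exists(tex.file)
```

The resulting file could then be pulled into a LaTeX document with `\input{}`,
placed inside a display-math environment if the string does not carry its own
math delimiters.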



```{r write-bib, echo=FALSE}
pkgs <- unique(c(.to.cite, .packages()))
knitr::write_bib(pkgs, file = here::here("vignettes", "packages.bib"))
```

## Alligator food choice: `gators` data

The `gators` data (built into `nestedLogit`) records the primary food choice of
alligators — Other, Fish, or Invertebrates — as a function of body length.
The three-category response is decomposed into two dichotomies using `logits()`:

- **other**: {Other} vs. {Fish, Invertebrates}
- **fish_inv**: {Fish} vs. {Invertebrates}, *among those not eating Other*

```{r gators-model}
data(gators)
gators.dichots <- logits(
  other    = dichotomy("Other", c("Fish", "Invertebrates")),
  fish_inv = dichotomy("Fish", "Invertebrates")
)
gators.dichots

gators.nested <- nestedLogit(food ~ length,
                             dichotomies = gators.dichots,
                             data = gators)
```

Note that the dichotomy name `fish_inv` contains an underscore. Because `_` is
the subscript operator in LaTeX, `extract_eq()` replaces it with `.` in the
displayed equation (the `submodel` argument still uses the original R name).

```{r gators-eqns}
extract_eq(gators.nested, submodel = "other")
extract_eq(gators.nested, submodel = "fish_inv")
```
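
The underscore replacement noted above amounts to a simple string substitution.
The following base R sketch illustrates the idea. It is not the package's
internal code, and the object names are hypothetical.

```{r sketch-underscore}
# Hypothetical illustration of replacing "_" with "." in a dichotomy name
dichot.name  <- "fish_inv"
dichot.label <- gsub("_", ".", dichot.name, fixed = TRUE)
dichot.label
```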

With fitted coefficients:

```{r gators-eqns-coef}
extract_eq(gators.nested, use_coefs = TRUE, submodel = "other")
extract_eq(gators.nested, use_coefs = TRUE, submodel = "fish_inv")
```

## References

```{r, include=FALSE}
options(.opts)
```
