Type: Package
Title: Reconstructing Antibody Dynamics to Estimate the Risk of Influenza Virus Infection
Version: 1.1.5
Maintainer: Tim Tsang <timkltsang@gmail.com>
Description: A Bayesian framework for inferring influenza infection status from serial antibody measurements. Jointly estimates season-specific infection probabilities, antibody boosting and waning after infection, and baseline hemagglutination inhibition (HAI) titer distributions via Markov chain Monte Carlo (MCMC). Supports multi-season analysis and subgroup comparisons via a group_by interface. See Tsang et al. (2022) <doi:10.1038/s41467-022-29310-8> for methodological details.
License: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
Encoding: UTF-8
LazyData: TRUE
URL: https://github.com/timktsang/seroreconstruct
BugReports: https://github.com/timktsang/seroreconstruct/issues
Depends: R (≥ 3.5.0)
Imports: Rcpp (≥ 1.0.9), RcppParallel
LinkingTo: Rcpp, RcppArmadillo, RcppParallel
Suggests: testthat (≥ 3.0.0), knitr, rmarkdown
Config/testthat/edition: 3
VignetteBuilder: knitr
SystemRequirements: TBB
RoxygenNote: 7.3.3
NeedsCompilation: yes
Packaged: 2026-03-30 14:58:37 UTC; timtsang
Author: Tim Tsang [aut, cre]
Repository: CRAN
Date/Publication: 2026-04-03 08:50:03 UTC

seroreconstruct: Reconstructing Antibody Dynamics to Estimate the Risk of Influenza Virus Infection

Description

A Bayesian framework for inferring influenza infection status from serial antibody measurements. Jointly estimates season-specific infection probabilities, antibody boosting and waning after infection, and baseline HAI titer distributions via MCMC. Supports multi-season analysis and subgroup comparisons via a group_by interface. See Tsang et al. (2022) doi:10.1038/s41467-022-29310-8 for methodological details.

Author(s)

Maintainer: Tim Tsang <timkltsang@gmail.com> (ORCID)

See Also

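Useful links:

  • https://github.com/timktsang/seroreconstruct

  • Report bugs at https://github.com/timktsang/seroreconstruct/issues
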

Compute effective sample size from MCMC chain

Description

Uses the initial positive sequence estimator (Geyer 1992).

Usage

.effective_sample_size(x)

Arguments

x

Numeric vector of MCMC samples.

Value

Effective sample size (numeric scalar).
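
The initial positive sequence idea can be sketched in base R as follows. This is an illustrative reimplementation, not the package's internal code: the name ess_geyer and the handling of constant or degenerate chains are assumptions made for this example.

```r
# Illustrative sketch only: ess_geyer() is a hypothetical name, not a package function.
ess_geyer <- function(x) {
  n <- length(x)
  # Assumption: a constant or very short chain is reported as ESS = n.
  if (n < 4 || stats::var(x) == 0) return(n)

  # Sample autocorrelations at lags 0, 1, ..., n - 1.
  rho <- as.numeric(stats::acf(x, lag.max = n - 1, plot = FALSE)$acf)

  # Sums of adjacent pairs: Gamma_m = rho_{2m} + rho_{2m+1}, m = 0, 1, ...
  n_pairs <- floor(length(rho) / 2)
  gamma_pairs <- rho[seq(1, 2 * n_pairs, by = 2)] + rho[seq(2, 2 * n_pairs, by = 2)]

  # Initial positive sequence: keep pairs until the first non-positive sum.
  first_nonpos <- which(gamma_pairs <= 0)[1]
  m <- if (is.na(first_nonpos)) n_pairs else first_nonpos - 1

  # Integrated autocorrelation time: tau = -1 + 2 * sum(Gamma_m).
  tau <- -1 + 2 * sum(gamma_pairs[seq_len(m)])
  if (tau <= 0) return(n)  # assumption: fall back to n for a degenerate estimate
  n / tau
}

set.seed(1)
iid_chain <- rnorm(2000)
ar_chain  <- as.numeric(stats::filter(rnorm(2000), filter = 0.9, method = "recursive"))
ess_geyer(iid_chain)  # close to the chain length
ess_geyer(ar_chain)   # much smaller, reflecting strong autocorrelation
```

A strongly autocorrelated chain yields a much smaller value than an independent one of the same length, which is the behaviour an effective sample size is meant to capture.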


Run MCMC for a single fit

Description

Core fitting function used by sero_reconstruct for both standard 3-age-group fits and single-group sub-fits.

Usage

.fit_single(inputdata, inputILI, n_iteration, burnin, thinning, n_groups = 3L)

Arguments

inputdata

Data frame of individual-level data.

inputILI

Data frame or matrix of influenza activity.

n_iteration

Number of MCMC iterations.

burnin

Burn-in iterations.

thinning

Thinning interval.

n_groups

Number of effective age groups (3 for standard, 1 for single-group).

Value

A seroreconstruct_fit object.


Get human-readable parameter names from a fit object

Description

Get human-readable parameter names from a fit object

Usage

.get_param_names(fit)

Arguments

fit

A seroreconstruct_fit or derived object.

Value

Character vector of parameter names.


Compute MCMC summary statistics

Description

Compute MCMC summary statistics

Usage

.mcmc_summary(mcmc_matrix)

Arguments

mcmc_matrix

Matrix of posterior samples (rows = iterations, cols = parameters).

Value

A matrix with ncol(mcmc_matrix) rows and 4 columns: mean, 2.5% quantile, 97.5% quantile, and acceptance rate.
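
A summary of this shape could be computed roughly as below. The function name mcmc_summary_sketch is hypothetical, and the acceptance rate is approximated here as the fraction of consecutive iterations in which the sampled value changed; that definition is an assumption, and the package may track acceptances differently.

```r
# Illustrative sketch only; not the package's internal implementation.
mcmc_summary_sketch <- function(mcmc_matrix) {
  stopifnot(is.matrix(mcmc_matrix), nrow(mcmc_matrix) >= 2)
  # One row per parameter, four columns per the documented return shape.
  t(apply(mcmc_matrix, 2, function(chain) {
    c(mean   = mean(chain),
      q2.5   = unname(stats::quantile(chain, 0.025)),
      q97.5  = unname(stats::quantile(chain, 0.975)),
      # ASSUMPTION: a move counts as accepted when the value differs
      # from the previous iteration.
      accept = mean(diff(chain) != 0))
  }))
}

toy_chain <- cbind(a = c(1, 1, 2, 2, 3), b = c(0, 0, 0, 0, 0))
mcmc_summary_sketch(toy_chain)
```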


Constructor for seroreconstruct_fit objects

Description

Constructor for seroreconstruct_fit objects

Usage

.new_seroreconstruct_fit(
  output,
  n_groups,
  n_individuals,
  n_posterior_samples,
  n_seasons = 1L,
  runtime_secs
)

Constructor for seroreconstruct_joint objects

Description

Constructor for seroreconstruct_joint objects

Usage

.new_seroreconstruct_joint(fit, group_labels, group_sizes, shared, n_groups)

Constructor for seroreconstruct_multi objects

Description

Constructor for seroreconstruct_multi objects

Usage

.new_seroreconstruct_multi(fits, group_labels, group_sizes)

Plot MCMC trace plots

Description

Plot MCMC trace plots

Usage

.plot_traces(mcmc_matrix, nrow, ncol)

Arguments

mcmc_matrix

Matrix of posterior samples.

nrow

Number of rows in the plot layout.

ncol

Number of columns in the plot layout.


Prepare input data for MCMC or simulation

Description

Reorders columns, adds padding columns, adjusts times for boosting delay, and converts to matrix format expected by the C++ backend.

Usage

.prepare_inputs(inputdata, inputILI)

Arguments

inputdata

Data frame with 9 required columns.

inputILI

Data frame or matrix of influenza activity.

Value

A list with prepared inputdata and inputILI matrices.


Validate inputs for sero_reconstruct

Description

Validate inputs for sero_reconstruct

Usage

.validate_inputs(
  inputdata,
  inputILI,
  n_iteration,
  burnin,
  thinning,
  group_by = NULL
)

Arguments

inputdata

Data frame of individual-level data.

inputILI

Data frame or matrix of influenza activity.

n_iteration

Number of MCMC iterations.

burnin

Burn-in iterations.

thinning

Thinning interval.

group_by

Optional formula; when non-NULL, age_group value checks are skipped.

Value

The (possibly column-renamed) inputdata.


Validate simulation parameters

Description

Validate simulation parameters

Usage

.validate_simulation_params(para1, para2, n_seasons = 1L, n_groups = 3L)

Arguments

para1

Numeric vector of active model parameters (length 6 + (n_groups + 1) * n_seasons, i.e. 6 + 4 * n_seasons for the default n_groups = 3).

para2

Numeric vector of baseline HAI titer distribution (length 20 * n_seasons).

n_seasons

Number of seasons. Default 1.

n_groups

Number of groups for infection risk parameters. Default 3.


Subset a seroreconstruct_multi object

Description

Access an individual group fit by name or index.

Usage

## S3 method for class 'seroreconstruct_multi'
x[[i, ...]]

Arguments

x

A seroreconstruct_multi object.

i

Group name (character) or index (integer).

...

Additional arguments (ignored).

Value

A seroreconstruct_fit object for the requested group.


Example of flu activity data

Description

This is an example of the flu activity data used by sero_reconstruct(). This data frame illustrates the format of the flu activity data.

Usage

data(flu_activity)

Format

A data frame with 1 variable, in which each row represents a date. The rows should correspond to the dates used in the input data:

h1.activity

This is the influenza activity from surveillance data. It can be on a relative scale, as the model includes a scale parameter to estimate infection probability.

See Also

Other example_data: para1, para2


Example of input data

Description

This is an example of the input data used by sero_reconstruct(). This data frame illustrates the format of the input data.

Usage

data(inputdata)

Format

A data frame with 9 variables, where each row represents an individual:

age_group

0: children, 1: adults, 2: older adults

start_time

start of follow-up

end_time

end of follow-up

time1

date of first serum collection

time2

date of second serum collection

time3

date of third serum collection

HAI_titer_1

HAI titer for first serum collection

HAI_titer_2

HAI titer for second serum collection

HAI_titer_3

HAI titer for third serum collection
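
As a rough illustration of this layout, the sketch below builds a toy data frame with the nine documented columns and runs a basic format check. All values are made up. The integer day indices, the integer titer levels, and the helper name check_input_format are assumptions for this example; the third titer column is named HAI_titer_3 as in the sero_reconstruct() argument description.

```r
# Toy input data with the nine documented columns (all values are made up).
# ASSUMPTION: times are integer day indices and titers are integer levels.
toy_input <- data.frame(
  age_group   = c(0L, 1L, 2L),        # 0: children, 1: adults, 2: older adults
  start_time  = c(1L, 1L, 1L),
  end_time    = c(300L, 300L, 300L),
  time1       = c(10L, 12L, 15L),
  time2       = c(150L, 155L, 160L),
  time3       = c(290L, 292L, 295L),
  HAI_titer_1 = c(0L, 2L, 1L),
  HAI_titer_2 = c(4L, 2L, 1L),
  HAI_titer_3 = c(3L, 2L, 5L)
)

# Hypothetical helper: check column names and age_group coding before fitting.
check_input_format <- function(d) {
  expected <- c("age_group", "start_time", "end_time", "time1", "time2", "time3",
                "HAI_titer_1", "HAI_titer_2", "HAI_titer_3")
  missing_cols <- setdiff(expected, names(d))
  if (length(missing_cols) > 0) {
    stop("missing columns: ", paste(missing_cols, collapse = ", "))
  }
  if (!all(d$age_group %in% 0:2)) stop("age_group must be coded 0, 1, or 2")
  invisible(TRUE)
}

check_input_format(toy_input)
```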


Extract the model estimates from the fitted MCMC

Description

output_model_estimate is deprecated; use summary() instead.

Usage

output_model_estimate(fitted_MCMC, period)

Arguments

fitted_MCMC

A seroreconstruct_fit object, or a list returned by an older version of sero_reconstruct().

period

A vector indicating the start and the end of a season to compute the infection probabilities. If empty, the start and end of the season are inferred from the data.

Value

A data frame of model estimates (invisibly).

Examples

## Not run: 
a1 <- sero_reconstruct(inputdata, flu_activity,
                        n_iteration = 2000, burnin = 1000, thinning = 1)
fitted_result <- output_model_estimate(a1)  # deprecated, use summary(a1)

## End(Not run)

Example of parameter vector for the main model

Description

This is an example of the parameter vector for the main model used by simulate_data(). This numeric vector illustrates the format of the parameter vector for the main model.

Usage

data(para1)

Format

A numeric vector with 10 elements (for a single-season model, S = 1). The general length is 6 + 4*S where S is the number of seasons.

Elements 1–6 (shared)

1) random measurement error, 2) 2-fold error, 3) boosting for children (log2), 4) waning for children (log2), 5) boosting for adults (log2), 6) waning for adults (log2).

Elements 7–9 (per-season)

infection risk scale parameters for children, adults, and older adults (3 per season).

Element 10 (per-season)

log risk ratio of 2-fold increase in baseline HAI titer (1 per season).
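
The index layout can be written out as a small helper. This is a sketch for orientation only: para1_layout is a hypothetical name, and the general formula with G groups is taken from the simulate_data() documentation, which gives the total length as 6 + (G + 1)*S.

```r
# Hypothetical helper describing where each block of para1 sits.
para1_layout <- function(n_seasons = 1L, n_groups = 3L) {
  n_risk <- n_groups * n_seasons
  list(
    shared     = 1:6,                              # error, boosting, waning
    risk_scale = 6 + seq_len(n_risk),              # per-season, per-group risk scale
    hai_effect = 6 + n_risk + seq_len(n_seasons),  # per-season log risk ratio
    length     = 6 + (n_groups + 1) * n_seasons
  )
}

para1_layout()$risk_scale   # 7 8 9
para1_layout()$hai_effect   # 10
para1_layout(7)$length      # 34
```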

See Also

Other example_data: flu_activity, para2


Example of parameter vector for the baseline HAI titer for the main model

Description

This is an example of the parameter vector for the baseline HAI titer distribution used by simulate_data(). This numeric vector illustrates the format of the parameter vector for the baseline HAI titer.

Usage

data(para2)

Format

A numeric vector with 20 elements (for a single-season model, S = 1). The general length is 20 * S where S is the number of seasons.

Elements 1–10

probabilities that the baseline HAI titer level is 0, 1, ..., 9 for children

Elements 11–20

probabilities that the baseline HAI titer level is 0, 1, ..., 9 for adults

For multi-season models, this pattern repeats for each season.
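
A vector with this layout could be assembled as in the sketch below. The helper make_para2 is hypothetical. It assumes that each block of 10 probabilities should sum to 1, and for simplicity it reuses the same distribution in every season, which a real analysis need not do.

```r
# Hypothetical helper: build a para2-style vector from two sets of 10 weights.
make_para2 <- function(child_weights, adult_weights, n_seasons = 1L) {
  stopifnot(length(child_weights) == 10, length(adult_weights) == 10,
            all(child_weights >= 0), all(adult_weights >= 0))
  # ASSUMPTION: each block is a probability distribution over titer levels 0-9.
  one_season <- c(child_weights / sum(child_weights),
                  adult_weights / sum(adult_weights))
  # ASSUMPTION: the same distribution is repeated for every season.
  rep(one_season, times = n_seasons)
}

p2 <- make_para2(child_weights = c(5, 3, 2, 1, 1, 1, 1, 1, 1, 1),
                 adult_weights = rep(1, 10))
length(p2)     # 20
sum(p2[1:10])  # 1
```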

See Also

Other example_data: flu_activity, para1


Plot posterior boosting distributions

Description

Draws violin plots of the posterior fold-rise in antibody titer after infection, one violin per boosting parameter group. Matches the style of Figure 1C in Tsang et al. (2022).

Usage

plot_boosting(fit, cols = NULL, main = NULL, show_legend = TRUE, ...)

Arguments

fit

A seroreconstruct_fit or seroreconstruct_joint object. Only single-season fits are currently supported.

cols

Optional character vector of colors, one per group.

main

Optional plot title.

show_legend

Logical; whether to draw a legend. Default TRUE.

...

Additional graphical parameters passed to plot().

Value

Invisible NULL. Called for its side effect of producing a plot.

Examples


fit <- sero_reconstruct(inputdata, flu_activity,
                        n_iteration = 2000, burnin = 1000, thinning = 1)
plot_boosting(fit)


MCMC diagnostic plots

Description

Produces trace plots and posterior density plots for each model parameter. Trace plots show the MCMC chain with the posterior mean (red dashed line). Density plots show the marginal posterior with 95% credible interval bounds (blue dashed lines).

Usage

plot_diagnostics(fit, params = NULL)

Arguments

fit

A seroreconstruct_fit, seroreconstruct_joint, or seroreconstruct_multi object.

params

Optional character vector of parameter names to plot. If NULL (default), all parameters are plotted. Use table_parameters(fit)$Parameter to see available names.

Value

Invisible NULL. Called for its side effect of producing plots.

Examples


fit <- sero_reconstruct(inputdata, flu_activity,
                        n_iteration = 2000, burnin = 1000, thinning = 1)
# Plot selected parameters (use params = NULL for all)
plot_diagnostics(fit, params = c("random_error", "twofold_error"))


Plot infection probabilities (forest plot)

Description

Forest plot showing posterior infection probabilities with 95% credible intervals. Supports single fits, multi-group fits, and combining results from multiple fits with section headers.

Usage

plot_infection_prob(
  fits,
  labels = NULL,
  main = NULL,
  file = NULL,
  width = 8,
  height = NULL,
  xlim = NULL,
  cex = 0.85,
  ...
)

Arguments

fits

A seroreconstruct_fit, seroreconstruct_joint, seroreconstruct_multi object, or a named list of fit objects. When a named list is provided, names are used as section headers.

labels

Optional character vector of custom labels for the strata within each fit. For a named list of fits, use a list of character vectors.

main

Optional plot title.

file

Optional file path for PDF output. Default: NULL (current device).

width

PDF width in inches. Default: 8.

height

PDF height in inches. Default: auto-calculated.

xlim

Numeric vector of length 2 for the x-axis range (probability scale, e.g. c(0, 0.5)). Default: auto-determined.

cex

Character expansion factor. Default: 0.85.

...

Additional graphical parameters passed to plot().

Value

Invisible data frame of the plotted estimates (Stratum, Probability, Lower, Upper).

Examples


fit <- sero_reconstruct(inputdata, flu_activity,
                        n_iteration = 2000, burnin = 1000, thinning = 1)
plot_infection_prob(fit)


Plot antibody trajectory with model fit

Description

For a single individual, draws posterior-sampled antibody trajectories overlaid on observed HAI titers. Red lines show trajectories where infection occurred; blue lines show trajectories without infection. Matches the visualization style of Figure 1B in Tsang et al. (2022).

Usage

plot_trajectory(
  fit,
  id = 1,
  subjects = NULL,
  n_samples = 100,
  main = NULL,
  col_infected = NULL,
  col_uninfected = NULL,
  show_legend = TRUE,
  ...
)

Arguments

fit

A seroreconstruct_fit or seroreconstruct_joint object.

id

Row index (integer) or subject identifier to plot. Numeric values in the valid row range (1 to N) are treated as 1-based row indices. Numeric values outside that range are looked up in subject_ids or subjects if available. Non-numeric values are always looked up by subject identifier.

subjects

Optional vector of subject identifiers aligned with fit rows. Use this when subject_ids was not provided at fitting time. Example: subjects = inputdata$subject_id.

n_samples

Number of posterior samples to draw. Default 100.

main

Optional plot title. If NULL, a default title with the individual index and posterior infection probability is generated.

col_infected

Color for infected trajectories. Default semi-transparent red.

col_uninfected

Color for uninfected trajectories. Default semi-transparent blue.

show_legend

Logical; whether to draw a legend. Default TRUE.

...

Additional graphical parameters passed to plot().

Value

Invisible NULL. Called for its side effect of producing a plot.
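
The lookup rule described for id can be sketched as follows. This is an illustration of the documented behaviour, not the package's code; resolve_row is a hypothetical name.

```r
# Hypothetical sketch of the documented id resolution rule.
resolve_row <- function(id, n_rows, subject_ids = NULL) {
  # Numeric values inside 1..n_rows are taken as 1-based row indices.
  if (is.numeric(id) && id >= 1 && id <= n_rows) return(as.integer(id))
  # Everything else is looked up among the subject identifiers.
  if (is.null(subject_ids)) {
    stop("no subject identifiers available to look up id = ", id)
  }
  row <- match(as.character(id), as.character(subject_ids))
  if (is.na(row)) stop("subject identifier not found: ", id)
  row
}

resolve_row(2, n_rows = 5)                                  # row index 2
resolve_row(101, n_rows = 3, subject_ids = c(101, 205, 309)) # looked up: row 1
resolve_row("B", n_rows = 3, subject_ids = c("A", "B", "C")) # looked up: row 2
```

One consequence of this rule is that numeric identifiers falling inside 1 to N are read as row indices, so small numeric subject IDs are safer to pass as character strings.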

Examples


fit <- sero_reconstruct(inputdata, flu_activity,
                        n_iteration = 2000, burnin = 1000, thinning = 1)
plot_trajectory(fit, id = 1)


Plot posterior waning curves

Description

Shows the fraction of peak antibody remaining over time since infection, with posterior median and 95% credible band for each waning parameter group. Matches the style of Figure 1D in Tsang et al. (2022).

Usage

plot_waning(fit, days = 400, cols = NULL, main = NULL, show_legend = TRUE, ...)

Arguments

fit

A seroreconstruct_fit or seroreconstruct_joint object. Only single-season fits are currently supported.

days

Maximum number of days to plot on the x-axis. Default 400.

cols

Optional character vector of colors, one per group.

main

Optional plot title.

show_legend

Logical; whether to draw a legend. Default TRUE.

...

Additional graphical parameters passed to plot().

Value

Invisible NULL. Called for its side effect of producing a plot.

Examples


fit <- sero_reconstruct(inputdata, flu_activity,
                        n_iteration = 2000, burnin = 1000, thinning = 1)
plot_waning(fit)


Print method for seroreconstruct_fit

Description

Print method for seroreconstruct_fit

Usage

## S3 method for class 'seroreconstruct_fit'
print(x, ...)

Arguments

x

A seroreconstruct_fit object.

...

Additional arguments (ignored).

Value

The input object x, invisibly.


Print method for seroreconstruct_joint

Description

Print method for seroreconstruct_joint

Usage

## S3 method for class 'seroreconstruct_joint'
print(x, ...)

Arguments

x

A seroreconstruct_joint object.

...

Additional arguments (ignored).

Value

The input object x, invisibly.


Print method for seroreconstruct_multi

Description

Print method for seroreconstruct_multi

Usage

## S3 method for class 'seroreconstruct_multi'
print(x, ...)

Arguments

x

A seroreconstruct_multi object.

...

Additional arguments (ignored).

Value

The input object x, invisibly.


Print method for summary.seroreconstruct_fit

Description

Print method for summary.seroreconstruct_fit

Usage

## S3 method for class 'summary.seroreconstruct_fit'
print(x, digits = 2, ...)

Arguments

x

A summary.seroreconstruct_fit object.

digits

Number of decimal places for rounding. Default 2.

...

Additional arguments (ignored).

Value

The input object x, invisibly.


Print method for summary.seroreconstruct_joint

Description

Print method for summary.seroreconstruct_joint

Usage

## S3 method for class 'summary.seroreconstruct_joint'
print(x, digits = 2, ...)

Arguments

x

A summary.seroreconstruct_joint object.

digits

Number of decimal places for rounding. Default 2.

...

Additional arguments (ignored).

Value

The input object x, invisibly.


Print method for summary.seroreconstruct_multi

Description

Print method for summary.seroreconstruct_multi

Usage

## S3 method for class 'summary.seroreconstruct_multi'
print(x, digits = 2, ...)

Arguments

x

A summary.seroreconstruct_multi object.

digits

Number of decimal places for rounding. Default 2.

...

Additional arguments (ignored).

Value

The input object x, invisibly.


Run the MCMC for the Bayesian model

Description

The main function of the package. Runs the MCMC for the Bayesian model to infer individual antibody dynamics and model parameters such as infection probability, boosting, waning, and measurement error.

Usage

sero_reconstruct(
  inputdata,
  inputILI,
  n_iteration = 2000,
  burnin = 1000,
  thinning = 1,
  group_by = NULL,
  shared = NULL,
  subject_ids = NULL
)

Arguments

inputdata

A data frame of individual-level data for running the MCMC, in the same format as the example dataset inputdata included in the package. It contains: 1) age_group (0: children, 1: adults, 2: older adults), 2) start_time: start of follow-up, 3) end_time: end of follow-up, 4) time1: date of first serum collection, 5) time2: date of second serum collection, 6) time3: date of third serum collection, 7) HAI_titer_1: HAI titer for first serum collection, 8) HAI_titer_2: HAI titer for second serum collection, 9) HAI_titer_3: HAI titer for third serum collection.

inputILI

The influenza activity data used in the inference (see flu_activity for an example). The rows should correspond to the dates used in inputdata.

n_iteration

The number of iterations of the MCMC.

burnin

The number of burn-in iterations of the MCMC.

thinning

The thinning interval of the MCMC.

group_by

Optional formula specifying grouping variables (e.g., ~age_group). When provided, independent MCMCs are fit for each combination of the grouping variables. The formula uses interaction semantics: ~age + vac means all age-by-vac combinations. Returns a seroreconstruct_multi object.

shared

Optional character vector specifying which parameters to share across groups when group_by is also provided. Measurement error parameters are always shared (they are a lab assay property, identical across groups). Valid values: "error" (measurement error only, the default when shared is non-NULL), "boosting_waning" (also share antibody boosting and waning across groups). When specified, a single joint MCMC is run with all groups pooled together, sharing the specified parameters while estimating group-specific infection probabilities. Returns a seroreconstruct_joint object.

subject_ids

Optional vector (character, numeric, or factor) of subject identifiers, one per row of inputdata. When provided, stored in the fit object and used by plot_trajectory() to look up individuals by ID rather than row index. Example: subject_ids = inputdata$household_id.

Details

Multi-season support: If inputdata contains an optional integer column named season (0-indexed, contiguous from 0 to n_seasons - 1), the model fits season-specific infection risk and HAI protection parameters. When no season column is present, all individuals are assigned to a single season (n_seasons = 1) and behavior is identical to previous versions. Validated with simulation recovery studies up to 7 seasons.
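
The coding requirement for the season column can be checked before fitting. The sketch below is a hypothetical pre-check (check_season is not a package function) and also shows one way to recode arbitrary season labels to the 0-indexed, contiguous form.

```r
# Hypothetical pre-check for the optional season column.
check_season <- function(season) {
  stopifnot(is.numeric(season), !anyNA(season), all(season == round(season)))
  n_seasons <- max(season) + 1
  # Must start at 0 and use every value up to n_seasons - 1.
  if (min(season) != 0 || !setequal(unique(season), 0:(n_seasons - 1))) {
    stop("season must be 0-indexed and contiguous from 0 to n_seasons - 1")
  }
  as.integer(n_seasons)
}

# Recode arbitrary labels (here, calendar years) to 0, 1, 2, ...
years  <- c(2015, 2016, 2016, 2018)
season <- as.integer(factor(years)) - 1L
season                # 0 1 1 2
check_season(season)  # 3
```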

Shared parameters: When shared is provided together with group_by, a single joint MCMC chain is run with all individuals pooled. Measurement error parameters (and, with shared = "boosting_waning", the boosting and waning parameters) are shared across groups and informed by all data, while infection risk and HAI protection parameters remain group-specific. This is more statistically efficient than independent chains when groups share biological or measurement properties.

Single-group design: When using group_by without shared, independent MCMCs are fit for each group. To compare children vs adults, fit each group separately using group_by = ~age_group.
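
The interaction semantics of a group_by formula can be illustrated with base R alone. The sketch below only shows how a formula such as ~age + vac maps rows to strata; the column names age and vac are made up, and the package's own grouping code and label format may differ.

```r
# Toy data with two hypothetical grouping variables.
d <- data.frame(
  age = c("child", "adult", "child", "adult"),
  vac = c(0, 0, 1, 1)
)

group_formula <- ~ age + vac
group_vars <- all.vars(group_formula)  # "age" "vac"

# Every observed age-by-vac combination becomes one stratum.
strata <- interaction(d[group_vars], sep = ":", drop = TRUE)
rows_by_group <- split(seq_len(nrow(d)), strata)

nlevels(strata)  # 4
rows_by_group
```

With independent fits, each stratum is estimated from its own rows only, so small strata produce wide credible intervals.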

Current limitation: summary() is not yet implemented for fits with n_seasons > 1. Multi-season posterior samples are accessible directly from the fit object (e.g., fit$posterior_model_parameter).

Value

A seroreconstruct_fit object (when group_by is NULL), a seroreconstruct_multi object (when group_by is provided without shared), or a seroreconstruct_joint object (when both group_by and shared are provided). Use summary() to extract model estimates.

Examples


a1 <- sero_reconstruct(inputdata, flu_activity,
                        n_iteration = 2000, burnin = 1000, thinning = 1)
summary(a1)
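
The interplay of n_iteration, burnin, and thinning can be sketched under the conventional definitions, where the first burnin iterations are discarded and every thinning-th iteration after that is kept. This is an assumption about the usual MCMC convention; the package's exact indexing may differ, and retained_iterations is a hypothetical helper.

```r
# ASSUMPTION: conventional burn-in and thinning; not taken from the package source.
retained_iterations <- function(n_iteration, burnin, thinning) {
  stopifnot(burnin >= 0, burnin < n_iteration, thinning >= 1)
  seq(from = burnin + 1, to = n_iteration, by = thinning)
}

length(retained_iterations(2000, 1000, 1))   # 1000
length(retained_iterations(2000, 1000, 10))  # 100
```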


Simulation of the dataset of the Bayesian model

Description

Simulates a dataset from the Bayesian model, for validation or other purposes.

Usage

simulate_data(inputdata, inputILI, para1, para2, n_groups = 3L)

Arguments

inputdata

A data frame in the same format as the data used for running the MCMC.

inputILI

The influenza activity data used in the inference (see flu_activity for an example). The rows should correspond to the dates used in inputdata.

para1

Numeric vector of active model parameters. Length depends on the number of seasons S (determined by the season column in inputdata, default S = 1):

  • Elements 1–6 (shared): 1) random measurement error, 2) 2-fold error, 3) boosting for children (log2), 4) waning for children (log2), 5) boosting for adults (log2), 6) waning for adults (log2).

  • Elements 7 to 6 + 3*S (per-season, shown for the default of 3 groups): infection risk scale parameters for children, adults, and older adults, repeated for each season.

  • Elements 6 + 3*S + 1 to 6 + 4*S (per-season, shown for the default of 3 groups): log risk ratio of 2-fold increase in baseline HAI titer, one per season.

Total length: 6 + (G + 1)*S where G is n_groups (e.g., 10 for G=3 S=1, 34 for G=3 S=7). See para1 for an example with G = 3, S = 1.

para2

Numeric vector for baseline HAI titer distributions. Length 20 * S: for each season, 10 probabilities for children (HAI titer levels 0–9) followed by 10 probabilities for adults. See para2 for an example with S = 1.

n_groups

Number of groups for infection risk parameters (default 3 for the standard 3-age-group model).

Value

A simulated dataset generated from the input parameter vectors, in the same format as the input data.

Examples

simulated <- simulate_data(inputdata, flu_activity, para1, para2)

Summary method for seroreconstruct_fit

Description

Computes estimates of infection probabilities, boosting, waning, and measurement error from a fitted MCMC object.

Usage

## S3 method for class 'seroreconstruct_fit'
summary(object, period, ...)

Arguments

object

A seroreconstruct_fit object.

period

Optional numeric vector of length 2 specifying the start and end of a season to compute infection probabilities. If omitted, the full follow-up period is used.

...

Additional arguments (ignored).

Value

A summary.seroreconstruct_fit object with element $table.


Summary method for seroreconstruct_joint

Description

Computes shared parameter estimates and per-group infection probabilities from a joint fit with shared parameters.

Usage

## S3 method for class 'seroreconstruct_joint'
summary(object, period, ...)

Arguments

object

A seroreconstruct_joint object.

period

Optional numeric vector of length 2 specifying the start and end of a season to compute infection probabilities.

...

Additional arguments (ignored).

Value

A summary.seroreconstruct_joint object with element $table.


Summary method for seroreconstruct_multi

Description

Computes estimates for each group and combines into a single table.

Usage

## S3 method for class 'seroreconstruct_multi'
summary(object, period, ...)

Arguments

object

A seroreconstruct_multi object.

period

Optional numeric vector of length 2 specifying the start and end of a season to compute infection probabilities.

...

Additional arguments (ignored).

Value

A summary.seroreconstruct_multi object with element $table.


Per-individual infection estimates

Description

Summarizes posterior infection status, timing, and baseline titer for each individual in the dataset.

Usage

table_infections(fit)

Arguments

fit

A seroreconstruct_fit or seroreconstruct_joint object.

Value

A data frame with one row per individual and columns: Individual (row index), Infection_prob (posterior mean probability of infection), Infection_time_mean (mean infection time among infected samples), Baseline_titer_mean (mean imputed baseline HAI titer).

Examples


fit <- sero_reconstruct(inputdata, flu_activity,
                        n_iteration = 2000, burnin = 1000, thinning = 1)
head(table_infections(fit))


Summary table of model parameters with credible intervals

Description

Extracts posterior summaries (mean, median, credible intervals) for all active model parameters.

Usage

table_parameters(fit, probs = c(0.025, 0.975))

Arguments

fit

A seroreconstruct_fit, seroreconstruct_joint, or seroreconstruct_multi object.

probs

Numeric vector of length 2 giving the lower and upper quantile probabilities for the credible interval. Default c(0.025, 0.975) for a 95% interval.

Value

A data frame with columns: Parameter, Mean, Median, Lower, Upper.

Examples


fit <- sero_reconstruct(inputdata, flu_activity,
                        n_iteration = 2000, burnin = 1000, thinning = 1)
table_parameters(fit)