Transportability and Policy Learning

This vignette demonstrates two advanced applications of the causaldef framework: 1. Transportability: Generalizing experimental results to a new target population. 2. Policy Learning Bounds: Quantifying the limits of decision-making under confounding.

We utilize classical datasets (Lalonde NSW and Right Heart Catheterization) to illustrate these concepts.

1. Transportability: Lalonde’s Job Training

A common challenge in causal inference is external validity: Can we apply the results of a Randomized Controlled Trial (RCT) to a diferent target population?

We use the Lalonde dataset to simulate a transportability problem. * Source Population (\(S=1\)): The NSW experimental participants (typically disjoint from the general population). * Target Population (\(S=0\)): The CPS comparison group (representative of the broader population).

data("nsw_benchmark")

# Define Source: Experimental Sample
source_data <- subset(nsw_benchmark, sample_id %in% c("nsw_treated", "nsw_control"))

# Define Target: CPS Control Group (Broader population)
target_data <- subset(nsw_benchmark, sample_id == "cps_control")

# Covariates available for transport
transport_vars <- c("age", "education", "black", "hispanic", "married", "nodegree", "re74", "re75")

# Comparison of demographics
print(summary(source_data[, c("age", "education", "re74")]))
#>       age          education         re74        
#>  Min.   :17.00   Min.   : 3.0   Min.   :    0.0  
#>  1st Qu.:20.00   1st Qu.: 9.0   1st Qu.:    0.0  
#>  Median :24.00   Median :10.0   Median :    0.0  
#>  Mean   :25.37   Mean   :10.2   Mean   : 2102.3  
#>  3rd Qu.:28.00   3rd Qu.:11.0   3rd Qu.:  824.4  
#>  Max.   :55.00   Max.   :16.0   Max.   :39570.7
print(summary(target_data[, c("age", "education", "re74")]))
#>       age          education          re74      
#>  Min.   :16.00   Min.   : 0.00   Min.   :    0  
#>  1st Qu.:24.00   1st Qu.:11.00   1st Qu.: 4403  
#>  Median :31.00   Median :12.00   Median :15124  
#>  Mean   :33.23   Mean   :12.03   Mean   :14017  
#>  3rd Qu.:42.00   3rd Qu.:13.00   3rd Qu.:23584  
#>  Max.   :55.00   Max.   :18.00   Max.   :25862

The target population (CPS) is significantly wealthier (re74 mean is much higher) and slightly older. We want to know: What would be the effect of job training if applied to the CPS population?

Transport Deficiency

We calculate the Transport Deficiency \(\delta(E_S, E_T)\). This measures how much information is lost due to the distributional shift between Source and Target.

# Create causal specification for the SOURCE
source_spec <- causal_spec(
  data = source_data,
  treatment = "treat",
  outcome = "re78",
  covariates = transport_vars
)
#> ✔ Created causal specification: n=445, 8 covariate(s)

# Compute Transport Deficiency
trans_def <- transport_deficiency(
  source_spec,
  target_data = target_data,
  transport_vars = transport_vars,
  method = "iptw",
  n_boot = 50 # Low for vignette speed
)
#> ✖ Transport proxy: 0.995 (ESS: 13.3%)
#> ℹ Severe distribution shift; transport may be unreliable

print(trans_def)
#> 
#> ── Transport Diagnostic Analysis ───────────────────────────────────────────────
#> Method: "iptw"
#> Source n: 445 | Target n: 15992
#> 
#> ── Transport Diagnostic ──
#> 
#>   transport proxy:       0.9954
#>   Standard error:        0.0250
#>   95% CI:               [1.1735, 1.2776]
#>   Effective sample size: 59.0 (13.3% of source)
#>   Note: this is a heuristic transport-risk proxy, not an exact transport deficiency.
#> ── Covariate Shift ──
#>            variable shift_metric severity
#> age             age   0.84595761   severe
#> education education   0.76555133   severe
#> black         black   2.36239417   severe
#> hispanic   hispanic   0.05755963      low
#> married     married   1.30665468   severe
#> nodegree   nodegree   1.11659926   severe
#> re74           re74   1.53592792   severe
#> re75           re75   1.77276154   severe
#> ── Effect Estimates ──
#>   ATE (source):     1676.3426
#>   ATE (transport):  778.8984
plot(trans_def, type = "shift")

Interpretation: * Covariate Shift: The plot shows which variables differ most (likely re74 and re75). * Transported ATE: The estimated effect in the target population. * Deficiency: A low delta implies we can reliably transport the result. A high delta warns that the populations are too distinct (lack of overlap or extreme weights).

2. Policy Learning Bounds: RHC

In Policy Learning, we seek an optimal treatment rule \(\pi(X)\) to maximize utility. However, with observational data, our estimate of a policy’s value is biased by confounding.

We use the Right Heart Catheterization (RHC) dataset to evaluate a risk-based policy. * Decision: Treat with RHC? * Outcome: 30-day Mortality (lower is better). * Policy: “Treat only high-risk patients” (e.g., APACHE score > 50).

data("rhc")

# Preprocessing
if (is.factor(rhc$swang1)) rhc$treat <- as.numeric(rhc$swang1) - 1 else rhc$treat <- rhc$swang1
if (is.factor(rhc$dth30)) rhc$outcome <- as.numeric(rhc$dth30) - 1 else rhc$outcome <- rhc$dth30

# Variables for adjustment
covariates <- c("age", "sex", "race", "aps1", "cat1") 

spec_rhc <- causal_spec(
  data = rhc,
  treatment = "treat",
  outcome = "outcome",
  covariates = covariates
)
#> ✔ Created causal specification: n=5735, 5 covariate(s)

Policy Evaluation

We compare two policies: 1. Treat All: Everyone gets RHC. 2. Risk-Based: Treat only if APACHE III score (aps1) > 50.

We estimate the observational value of these policies using IPW.

# Estimate propensity scores for adjustment
ps_model <- glm(treat ~ age + sex + race + aps1 + cat1, data = rhc, family = binomial)
rhc$ps <- predict(ps_model, type = "response")

# Define policies
policy_all <- rep(1, nrow(rhc))
policy_risk <- ifelse(rhc$aps1 > 50, 1, 0)

# Estimate Value (Inverse Propensity Weighted)
# Value = Mean of Y under policy. We want to MINIMIZE mortality.
# Equivalent to Maximizing Survival (1 - Y).
# Let's compute expected mortality.

get_policy_value <- function(policy, treat, outcome, ps) {
  # IPW estimator for policy value
  # Weight = I(A = \pi(X)) / P(A|X)
  w <- (treat == policy) / ifelse(policy == 1, ps, 1 - ps)
  mean(w * outcome) # Expected Mortality
}

val_all <- get_policy_value(policy_all, rhc$treat, rhc$outcome, rhc$ps)
val_risk <- get_policy_value(policy_risk, rhc$treat, rhc$outcome, rhc$ps)

cat("Estimated Mortality (Treat All):", round(val_all, 3), "\n")
#> Estimated Mortality (Treat All): 0.352
cat("Estimated Mortality (Risk-Based):", round(val_risk, 3), "\n")
#> Estimated Mortality (Risk-Based): 0.358

The Safety Floor

Even if the Risk-Based policy looks better, can we trust it? The Safety Floor tells us the worst-case error in our value estimate due to unmeasured confounding.

# 1. Estimate Deficiency of the dataset
defom <- estimate_deficiency(spec_rhc, methods = "iptw", n_boot = 0)
#> ℹ Estimating deficiency: iptw
delta <- defom$estimates["iptw"]

# 2. Compute Safety Floor
# Utility range is [0, 1] (Mortality 0 or 1)
bounds <- policy_regret_bound(defom, utility_range = c(0, 1), method = "iptw")
#> ℹ Transfer penalty: 0.0184 (delta = 0.0184)

print(bounds)
#> 
#> -- Policy Regret Bounds -------------------------------------------------
#> 
#> * Deficiency delta: 0.0184 
#> * Delta mode: point 
#> * Delta method: iptw 
#> * Delta selection: pre-specified method 
#> * Utility range: [0, 1]
#> * Transfer penalty: 0.0184 (additive regret upper bound)
#> * Minimax floor: 0.0092 (worst-case lower bound)
#> 
#> Note: this is a plug-in bound using a deficiency proxy rather than an identified exact deficiency.
#> 
#> Interpretation: Transfer penalty is 1.8 % of utility range given delta

Conclusion: * The Safety Floor represents the irreducible uncertainty. * If the difference between Treat All and Risk-Based is smaller than the safety floor, we cannot be confident the new policy is actually superior to the baseline, regardless of sample size. * This illustrates the fundamental limit of offline policy learning from observational data.