Background
I need to estimate the average treatment effect of being above vs. below the median on a continuous predictor that exhibits non-linearities. This quantity answers the research question:
“What is the causal effect of crossing a meaningful threshold while preserving the non-linear functional form?”
Problem Statement
The marginaleffects package handles point-to-point comparisons and categorical comparisons well, but lacks native support for range-to-range comparisons on continuous predictors.
Current Status
Below you can find the reproducible code for testing how the quantity that I want differs from the quantity that I can get using hypothesis = difference ~ pairwise
# Minimal Reproducible Example: Range-based comparisons for continuous predictors library(marginaleffects) # Load mtcars and create weight grouping variable data(mtcars) mtcars$weight_group <- ifelse(mtcars$wt > median(mtcars$wt), "heavy", "light") # Fit model with continuous wt (preserving non-linearities) mod <- lm(mpg ~ am * wt, data = mtcars) # Create counterfactual grid nd <- datagrid(newdata = mtcars, wt = unique, disp = unique , grid_type = "counterfactual") # Method 1: Vincent's suggested approach using grouping categorical variable # This works when you have a categorical grouping variable avg_predictions(mod, by = "weight_group", newdata = nd, hypothesis = difference ~ pairwise) # Method 2: What I actually need - comparing ranges of continuous predictor # Below median wt values (roughly wt <= 3.325) scene_1 <- avg_predictions(mod, newdata = datagrid(wt = mtcars$wt[mtcars$wt <= median(mtcars$wt)], am = unique, disp = unique, grid_type = "counterfactual")) # Above median wt values (roughly wt > 3.325) scene_2 <- avg_predictions(mod, newdata = datagrid(wt = mtcars$wt[mtcars$wt > median(mtcars$wt)], am = unique, disp = unique, grid_type = "counterfactual")) # Manual calculation of difference manual_diff <- scene_1$estimate - scene_2$estimate # Print results cat("Method 1 (categorical grouping):", avg_predictions(mod, by = "weight_group", newdata = nd, hypothesis = difference ~ pairwise)$estimate, "\n") cat("Method 2 (range-based manual):", manual_diff, "\n") # The problem: These should be equivalent but they're not! # Method 1 treats weight_group as categorical in the prediction grid # Method 2 preserves the continuous nature of wt while comparing ranges Method 1 (categorical grouping): -1.247234 Method 2 (range-based manual): 6.870543
Key Issues
Mathematical Contradiction: These two methods produce different results despite conceptually estimating the same threshold effect. Method 1 relies on a categorical grouping variable in the prediction grid, while Method 2 directly compares ranges of the continuous predictor while preserving its functional form.
Missing Uncertainty Quantification: The manual approach (Method 2) loses standard errors, confidence intervals, and p-values that are essential for statistical inference.
Questions
Why do these methods produce different results when they should estimate the effect of the same weight threshold?
Is there native support for range-to-range comparisons that preserves the continuous specification?
The manual approach appears methodologically correct for my research question, but I lose the statistical inference capabilities that make marginaleffects so valuable. Any guidance would be greatly appreciated!
NOTE: This is a toy example, my real model has more variables and meaningful non-linearities.