Sensitivity Analysis

When we apply causal inference to observational data, the most important assumption is that there is no unobserved confounding. Therefore, after estimating the treatment effect, investigators are advised to conduct a sensitivity analysis to examine how fragile the result is to possible unobserved confounders (Cinelli & Hazlett, 2020). In other words, we should examine how strong the effect of unobserved confounders would have to be to erase the estimated treatment effect.

Two methods are provided in our package: a sensitivity analysis based on omitted variable bias (Cinelli & Hazlett, 2020) and a sensitivity analysis based on Wilcoxon’s signed rank test (Rosenbaum, 2005).

1️⃣ OVB

Note

This method can be used if your outcome variable \(Y\) depends linearly on \(X\) and \(T\).

The omitted variable bias based sensitivity analysis provides the following information, all of which can be read from a single figure.

  • Robustness Value (RV):

The robustness value represents the threshold level of association that an unobserved confounder must reach, both with the treatment and with the outcome, in order to alter the conclusions of the research.

The result figure provides a convenient reference point to assess the overall robustness of a coefficient to unobserved confounders. If the confounder’s association with the treatment \(R_{Z\sim T|X}^2\) and with the outcome \(R_{Y\sim Z|T, X}^2\) are both assumed to be less than the \(RV\), then such confounders cannot “explain away” the observed effect.

  • Contour Line:

Points on the same contour line have the same adjusted estimated \(ATT\). The contour lines show the value of the adjusted estimated \(ATT\) when \(R_{Y\sim Z|T, X}^2 = a\) and \(R_{Z\sim T|X}^2 = b\).

  • Bound the strength of the hidden confounder using observed covariate:

We can choose an observed confounder \(X_j\) as a benchmark, and check the adjusted estimated \(ATT\) when

\[ \begin{align}\begin{aligned}\frac{R_{Y\sim Z|T, X_{-j}}^2}{R_{Y\sim X_j|T, X_{-j}}^2} = K_Y\\\frac{R_{T\sim Z|X_{-j}}^2}{R_{T\sim X_j|X_{-j}}^2} = K_T\end{aligned}\end{align} \]
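The robustness value and the adjusted estimate along a contour line can be sketched with the closed-form expressions from Cinelli & Hazlett (2020). This is a minimal illustration, not the package's implementation: the function names `robustness_value` and `adjusted_estimate` and the example numbers are hypothetical, and the inputs (estimate, standard error, residual degrees of freedom) are assumed to come from the fitted linear model.

```python
import math


def robustness_value(t_stat, dof, q=1.0):
    """RV_q: the partial R^2 a confounder needs with both treatment and outcome
    to reduce the estimate by a fraction q (q=1 brings it to zero)."""
    f_q = q * abs(t_stat) / math.sqrt(dof)
    return 0.5 * (math.sqrt(f_q ** 4 + 4 * f_q ** 2) - f_q ** 2)


def adjusted_estimate(estimate, se, dof, r2_yz, r2_tz):
    """Estimate after subtracting the omitted variable bias implied by
    r2_yz = R^2_{Y~Z|T,X} and r2_tz = R^2_{Z~T|X} (bias shrinks toward zero)."""
    bias = se * math.sqrt(r2_yz * r2_tz / (1.0 - r2_tz) * dof)
    return math.copysign(abs(estimate) - bias, estimate)


# Hypothetical numbers, for illustration only
estimate, se, dof = 3.0, 0.5, 100
rv = robustness_value(estimate / se, dof)
print("RV:", round(rv, 4))
# A confounder exactly as strong as the RV on both axes drives the estimate to zero
print("adjusted estimate at RV:", round(adjusted_estimate(estimate, se, dof, rv, rv), 6))
```

Evaluating `adjusted_estimate` over a grid of the two partial \(R^2\) values gives the surface whose level sets are the contour lines described above.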

Advantages

  • Having no parametric assumptions on the distribution of the confounder.

  • Having simple sensitivity measures for routine reporting.

  • Connecting sensitivity analysis to domain knowledge.

Limitations

  • Assuming that the unobserved confounder \(Z\) is linearly related to the outcome variable \(Y\) and the treatment \(T\).

Example

Here we choose \(X_1\) as our benchmark. When \(K_Y = K_T = 0.2\), the adjusted estimated \(ATT\) is 2.9722.

When \(R_{Y\sim Z|T, X}^2 = R_{Z\sim T|X}^2 = 0.6803\), the unobserved confounder \(Z\) will “explain away” the observed effect.

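from CEM_LinearInf.sensitivity_analysis import ovb
import statsmodels.api as sm
import numpy as np

# my_cem: the CEM object from the earlier matching step
# Design matrix: intercept, treatment column, and covariates X1..X9 from the matched data
X = sm.add_constant(my_cem.matched_df[[my_cem.col_t] + [f'X{i}' for i in range(1, 10)]])
y = np.asarray(my_cem.matched_df[my_cem.col_y])
# Weighted least squares with unit weights (equivalent to OLS)
model = sm.WLS(y.astype(float), X.astype(float), weights=1)

# Benchmark against X1, using 0.2 and 0.5 for both k_t and k_y
my_ovb = ovb(model=model, bench_variable='X1', k_t=[0.2, 0.5], k_y=[0.2, 0.5], measure='att')
my_ovb.plot_result()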

2️⃣ Wilcoxon’s signed rank test

Note

It is suitable for a 1-1 matched dataset, in which exactly one untreated sample is matched with each treated sample. This can be achieved by setting k2k_ratio = 1 in the match step.

Wilcoxon’s signed rank test based sensitivity analysis imagines that, in the population before matching, all samples are assigned to treatment or control independently with unknown probabilities. However, two samples with the same observed confounders may nonetheless differ in their unobserved confounders, so that one sample has odds of treatment up to \(\Gamma \geq 1\) times greater than the odds for the other sample.

The sensitivity analysis asks how large \(\Gamma\) must be to erase the estimated treatment effect.
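As a sketch of what \(\Gamma\) bounds in Rosenbaum’s model, write \(\pi_i\) and \(\pi_j\) for the treatment probabilities of two samples with identical observed confounders (this notation is introduced here for illustration only):

\[\frac{1}{\Gamma} \leq \frac{\pi_i (1-\pi_j)}{\pi_j (1-\pi_i)} \leq \Gamma\]

Within a matched pair in which exactly one sample is treated, the probability that a given sample is the treated one then lies between \(\frac{1}{1+\Gamma}\) and \(\frac{\Gamma}{1+\Gamma}\). With \(\Gamma = 1\) both bounds equal 0.5, as in a randomized experiment; with \(\Gamma = 4.25\) they are approximately 0.19 and 0.81.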

With \(S\) being the number of sample pairs and \(W\) being the sum of the ranks of the positive differences between pairs, we have

\[\lambda = \frac{\Gamma}{1 + \Gamma}, \mu_{max} = \frac{\lambda S (S+1)}{2}, \mu_{min} = \frac{(1-\lambda) S (S+1)}{2}, \sigma ^2 = \frac{\lambda(1-\lambda)S(S+1)(2S+1)}{6}\]

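Under the null hypothesis, \(Z=\frac{W-\mu}{\sigma}\) approximately follows the standard normal distribution. Taking \(\mu = \mu_{max}\) gives the upper bound of the \(p\)-value and taking \(\mu = \mu_{min}\) gives the lower bound. If the upper bound is greater than 0.05, then we cannot reject the null hypothesis of no treatment effect at the 5% level, which means that an unobserved confounder of that strength could erase the estimated treatment effect.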
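The bounds above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the package's code: the helper names `signed_rank_sum` and `wilcoxon_p_bounds` are hypothetical, the test is assumed to be one-sided (large \(W\) indicates a positive effect), and ties or zero differences get no special handling.

```python
import math


def signed_rank_sum(diffs):
    """W: sum of the ranks (by absolute value) of the positive pair differences."""
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    return sum(rank for rank, i in enumerate(order, start=1) if diffs[i] > 0)


def normal_sf(z):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def wilcoxon_p_bounds(w, s, gamma):
    """Lower and upper bounds of the one-sided p-value for a given gamma."""
    lam = gamma / (1.0 + gamma)
    mu_max = lam * s * (s + 1) / 2.0
    mu_min = (1.0 - lam) * s * (s + 1) / 2.0
    sigma = math.sqrt(lam * (1.0 - lam) * s * (s + 1) * (2 * s + 1) / 6.0)
    lower_p = normal_sf((w - mu_min) / sigma)
    upper_p = normal_sf((w - mu_max) / sigma)
    return lower_p, upper_p


# Hypothetical treated-minus-control outcome differences for 8 matched pairs
diffs = [2.1, 0.4, -0.3, 1.7, 3.2, -0.9, 1.1, 2.6]
w = signed_rank_sum(diffs)
for gamma in (1.0, 1.5, 2.0):
    lower_p, upper_p = wilcoxon_p_bounds(w, len(diffs), gamma)
    print(gamma, round(lower_p, 4), round(upper_p, 4))
```

At \(\Gamma = 1\) the two bounds coincide, and the interval widens as \(\Gamma\) grows; the smallest \(\Gamma\) at which the upper bound crosses 0.05 is the value reported as the sensitivity threshold.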

Example

In the following example, when \(\Gamma = 4.25\), the upper bound of the \(p\)-value interval (0.0575) is greater than 0.05, so we can no longer reject the null hypothesis of no treatment effect at the 5% level. In other words, an unobserved confounder with \(\Gamma = 4.25\) would be enough to explain away the estimated \(ATT\).

from CEM_LinearInf.sensitivity_analysis import wilcoxon

my_sen = wilcoxon(df=my_cem_k2k.matched_df, pair = my_cem_k2k.pair)
wilcoxon_df = my_sen.result([1, 2, 3, 4, 4.25, 5])
       lower_p  upper_p
gamma
1.00       0.0   0.0000
2.00       0.0   0.0000
3.00       0.0   0.0000
4.00       0.0   0.0112
4.25       0.0   0.0575
5.00       0.0   0.6223
The estimated ATT result is not reliable if there exists an unobservable confounder which makes the magnitude of probability
that a single subject will be interfered with is 4.25 times higher than that of the other subject.

⭐️ Reference

  • Cinelli, C., & Hazlett, C. (2020). Making Sense of Sensitivity: Extending Omitted Variable Bias. Journal of the Royal Statistical Society Series B: Statistical Methodology, 82(1), 39–67. https://doi.org/10.1111/rssb.12348

  • Rosenbaum, P. R. (2005). Sensitivity analysis in observational studies. Encyclopedia of statistics in behavioral science.