\antti{Here one can drop the do-operator already on the first line according to do-calculus rule 2, i.e., $P(Y=0|do(R=r))=P(Y=0|R=r)$. However, do-calculus formulas should be computed by first learning a graphical model and then computing the marginals using that graphical model. This gives a more accurate result. Michael's complicated formula essentially does this, including forcing $P(Y=0|T=0,X)=0$ (the model supports the context-specific independence $Y \perp X | T=0$).}
Expanding the above derivation for model \score{\featuresValue} learned from the data
\[
...
\]
For the analysis, we randomly assigned 500 subjects to each of the 100 judges.
Every judge's leniency rate $\leniency$ was sampled uniformly from the half-open interval $[0.1, 0.9)$.
Private features $\features$ were drawn as i.i.d.\ standard Gaussian random variables.
Next, probabilities for negative results $\outcome=0$ were calculated as $\prob{\outcome=0 | \features=\featuresValue} = \frac{1}{1+\exp\{-\featuresValue\}}$, and the result variable $\outcome$ was then sampled from a Bernoulli distribution with parameter $1-\frac{1}{1+\exp\{-\featuresValue\}}$.
The decision variable $\decision$ was set to 0 if the probability $\prob{\outcome=0| \features=\featuresValue}$ resided in the top $(1-\leniencyValue)\cdot100\%$ of the subjects assigned to that judge.
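As a rough illustration, the data-generating process described above could be implemented as follows. This is a minimal sketch, assuming the logistic form of $\prob{\outcome=0 | \features=\featuresValue}$ stated above and, following the earlier comment on forcing $\prob{\outcome=0 | \decision=0, \features}=0$, that a negative outcome is never recorded after a negative decision; it is not the exact code used for the experiments.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n_judges, n_subjects = 100, 500                  # 100 judges, 500 subjects each

# Judge leniencies R, sampled uniformly from the half-open interval [0.1, 0.9).
leniency = rng.uniform(0.1, 0.9, size=n_judges)

rows = []
for j, r in enumerate(leniency):
    x = rng.normal(size=n_subjects)              # private features X ~ N(0, 1)
    p_y0 = 1.0 / (1.0 + np.exp(-x))              # P(Y=0 | X=x), logistic form
    y = rng.binomial(1, 1.0 - p_y0)              # Y ~ Bernoulli(1 - P(Y=0 | X=x))

    # T = 0 for the top (1 - r)*100% of this judge's subjects ranked by P(Y=0 | X=x).
    cutoff = np.quantile(p_y0, r)
    t = (p_y0 <= cutoff).astype(int)             # T = 1 for the r*100% least risky subjects

    # Assumption: P(Y=0 | T=0, X) = 0, so Y is recorded as 1 whenever T = 0.
    y = np.where(t == 0, 1, y)

    rows.append(np.column_stack(
        [np.full(n_subjects, j), np.full(n_subjects, r), x, t, y]))

data = np.vstack(rows)                           # columns: judge, R, X, T, Y
\end{verbatim}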
Results for estimating the causal quantity $\prob{\outcome=0 | \doop{\leniency=\leniencyValue}}$ with various levels of leniency $\leniencyValue$ under this model are presented in Figure \ref{fig:without_unobservables}.
\caption{$\prob{\outcome=0 | \doop{\leniency=\leniencyValue}}$ with varying levels of acceptance rate, without unobservables. Error bars denote the standard error of the mean.}
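Because leniency $\leniency$ is randomized in this setup, $\prob{\outcome=0 | \doop{\leniency=\leniencyValue}}$ coincides with the observational $\prob{\outcome=0 | \leniency=\leniencyValue}$, so a crude empirical estimate of the plotted curve can be read off the synthetic data by binning judges according to their leniency. The snippet below is only such an illustrative estimator, operating on the \texttt{data} array from the previous sketch; it is not the estimation procedure used to produce Figure \ref{fig:without_unobservables}.
\begin{verbatim}
import numpy as np

# Empirical P(Y=0 | do(R=r)): with R randomized this equals P(Y=0 | R=r),
# estimated by the share of Y=0 among subjects of judges in each leniency bin.
bins = np.linspace(0.1, 0.9, 9)                  # leniency bins covering [0.1, 0.9)
bin_idx = np.digitize(data[:, 1], bins)          # column 1 holds the leniency R

for b in range(1, len(bins)):
    mask = bin_idx == b
    if mask.any():
        est = np.mean(data[mask, 4] == 0)        # column 4 holds the outcome Y
        print(f"R in [{bins[b-1]:.1f}, {bins[b]:.1f}): P(Y=0) ~ {est:.3f}")
\end{verbatim}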