We use the following structural equation model over the graph structure in Figure~2:
\noindent
\hrulefill
\begin{align}
R & := \epsilon_R, \quad%\epsilon_r \sim N(0,\sigma_z^2)
\nonumber\\
Z & := \epsilon_Z, \quad%\epsilon_z \sim N(0,\sigma_z^2)
\nonumber\\
X & := \epsilon_X, \quad%\epsilon_z \sim N(0,\sigma_z^2)
\nonumber\\
T & := g(R,X,Z,\epsilon_T), \nonumber\\
Y & := f(T,X,Z,\epsilon_Y). \nonumber
\end{align}
\hrulefill
For every case with $T=0$ in the data, we compute the counterfactual value of $Y$ that would have been observed had $T=1$. We follow Pearl's three-step approach: abduction, action, and prediction. We first describe the procedure for fixed parameters and later generalize to the case where the parameters are learned from data.
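As a schematic summary of these three steps under fixed parameters, and not necessarily the exact formulation used later in this paper, the counterfactual for a case with evidence $T=0$, $Y=1$, $X=x$ can be sketched as follows. The sketch assumes that the disturbances admit a joint posterior density $p(\cdot \mid \cdot)$; the symbol $Y_{T=1}$ for the counterfactual outcome is introduced here for illustration only.
\begin{align}
\text{Abduction:} \quad & p(\epsilon_Z, \epsilon_Y \mid T=0, Y=1, X=x), \nonumber\\
\text{Action:} \quad & T := 1 \text{ replaces } T := g(R,X,Z,\epsilon_T), \nonumber\\
\text{Prediction:} \quad & \mathrm{E}[Y_{T=1} \mid T=0, Y=1, X=x] \nonumber\\
& = \int f(1, x, \epsilon_Z, \epsilon_Y)\, p(\epsilon_Z, \epsilon_Y \mid T=0, Y=1, X=x)\, d\epsilon_Z\, d\epsilon_Y. \nonumber
\end{align}
Only $\epsilon_Z$ and $\epsilon_Y$ appear in the sketch because, once $T$ is set by intervention and $X=x$ is observed, $Y$ depends on the remaining disturbances only through $Z = \epsilon_Z$ and $\epsilon_Y$.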
In the abduction step, we update the distribution of the disturbance terms $(\epsilon_R, \epsilon_Z, \epsilon_X, \epsilon_T,\epsilon_Y)$ to take into account the evidence $T=0,Y=1,X=x$. At this point we make use of the additional information that a negative decision carries about the unobserved risk factor $Z$. We can directly update
\section{Counterfactual-Based Imputation For Selective Labels}
\label{sec:imputation}
...
...
@@ -160,23 +185,7 @@ Note that index $j$ refers to decision maker $\human_j$ and \invlogit is the sta
\hrulefill
\noindent
\hrulefill
\begin{align}
Z & := \epsilon_z, \quad\epsilon_z \sim N(0,1) \nonumber\\