If decision maker \machine makes a positive decision for a case where decision maker \human had made a negative decision, how can we infer the outcome \outcome that would have materialized had \machine's decision been followed?
%
Such questions fall squarely within the realm of causal analysis, and in particular the evaluation of counterfactuals -- the approach we follow in this paper.
The challenges we face are two-fold.
%
Firstly, we do not have direct observations for the outcome under \machine's positive decision.
%
A first thought, then, would be to simply {\it predict} the outcome based on the features of the case.
%
In the bail-or-jail scenario, for example, we could investigate whether certain features of the defendant (e.g., their age and marital status) are good predictors of whether they comply with the bail conditions -- and use them if they do.
%
However, not all features that are available to \human are available to \machine in the setting we consider -- and so, making direct predictions based on the available features can be suboptimal.
%
This is because some information regarding the unobserved features \unobservable can often be recovered via the decision of decision maker \human.
%
This is exactly what our counterfactual approach achieves.
For illustration, let us consider a defendant who received a negative decision from the judge.
%
Suppose also that the recorded features \features indicate that the defendant is safe to be released -- e.g., among defendants with similar features \features who were released, none violated the bail conditions.
%
If the judge is good, i.e., aims to make positive decisions that lead to successful outcomes, then it is quite possible that the negative decision is due to unfavorable non-recorded features \unobservable.
%
In turn, had a positive decision been followed, the above reasoning makes a negative outcome more likely than what would be predicted based on the features $\featuresValue$ of released defendants alone.
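Assuming, for the purposes of this illustration, that the decision and the outcome are conditionally independent given $\features$ and $\unobservable$, this reasoning can be written as
\begin{equation*}
P(\outcome_{\decisionValue=1} = 1 \mid \featuresValue, \decisionValue = 0)
  \;=\; \int P(\outcome = 1 \mid \featuresValue, z, \decisionValue = 1)\,
  p(z \mid \featuresValue, \decisionValue = 0)\, dz ,
\end{equation*}
where $p(z \mid \featuresValue, \decisionValue = 0) \propto P(\decisionValue = 0 \mid \featuresValue, z)\, p(z)$ places more probability mass on unfavorable values of $\unobservable$ than the prior $p(z)$, and therefore yields a lower success probability than a prediction based on $\featuresValue$ alone.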
$X$ and $Z$ are assumed to be continuous Gaussian variables, with the interpretation that they represent aggregated risk factors -- corresponding to the recorded and the non-recorded features, respectively -- such that higher values denote a higher risk of a negative outcome ($Y=0$).
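For concreteness, one simple data-generating process consistent with this description -- the logistic link functions and positive coefficients below are illustrative choices rather than requirements -- is
\begin{align*}
X &\sim \mathcal{N}(0, 1), \qquad Z \sim \mathcal{N}(0, 1),\\
P(\decisionValue = 1 \mid X, Z) &= \sigma(\alpha - \beta_X X - \beta_Z Z),\\
P(Y = 1 \mid X, Z, \decisionValue = 1) &= \sigma(\gamma - \delta_X X - \delta_Z Z),
\end{align*}
where $\sigma(\cdot)$ denotes the logistic function and $\beta_X, \beta_Z, \delta_X, \delta_Z > 0$, so that higher risk lowers both the probability of a positive decision and the probability of a successful outcome; the outcome $Y$ is observed only for cases that received a positive decision.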
\subsection{Imputation}
Our approach is based on the fact that, in almost all cases, some information regarding the latent variable is recoverable. For illustration, let us consider defendant $i$ who has been given a negative decision $\decisionValue_i = 0$. If the defendant's recorded features $\featuresValue_i$ indicate that the defendant would be safe to release, we can deduce that the unobserved variable $\unobservableValue_i$ most likely indicated high risk, since otherwise the defendant would not have been jailed. In turn, this makes $Y=0$ more likely than what would have been predicted based on $\featuresValue_i$ alone.
Conversely, when the features $\featuresValue_i$ clearly indicate risk and the defendant is subsequently jailed, the negative decision reveals little additional information about the latent variable.
\acomment{Could emphasize the above with a plot, x and z in the axis and point styles indicating the decision.}
\acomment{The above assumes that the decision maker in the data is not totally bad.}
\mcomment{Actually, the paragraph above describes a scenario where {\it labeled outcomes} and possibly {\it contraction} would fail. Specifically, create cases where:
(i) Z has a much larger coefficient than X, (ii) the judge is good (the two logistic functions for the judge decision and the outcome are the same), and (iii) the machine is trained on labeled outcomes. The machine will see that the outcome is successful regardless of X, so it will learn that everyone can be released. Labeled outcomes will evaluate the machine as good -- but our approach will uncover its true performance.}
\todo{Michael}{Create suggested plots and experiments above.}
In counterfactual-based imputation we use counterfactual values of the outcome $\outcome_{\decisionValue=1}$ to impute the missing labels. The SCM required to compute the counterfactuals is presented in Figure~\ref{fig:causalmodel}. Using Stan, we model the observed data as follows.
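As a minimal sketch of such a model -- in which the latent risk factor is given a standard-normal prior and its coefficient is fixed to one for identifiability, and in which all variable names and priors are illustrative choices rather than the exact specification -- the Stan program could take the following form:
\begin{verbatim}
data {
  int<lower=1> N;                      // number of cases
  vector[N] x;                         // recorded features (aggregated risk)
  array[N] int<lower=0, upper=1> t;    // decision (1 = positive, e.g. release)
  array[N] int<lower=0, upper=1> y;    // outcome (1 = success); a placeholder
                                       // value is passed when t[n] == 0
}
parameters {
  real alpha_t;                        // intercept of the decision model
  real alpha_y;                        // intercept of the outcome model
  real<lower=0> beta_xt;               // effect of x on the decision
  real<lower=0> beta_xy;               // effect of x on the outcome
  vector[N] z;                         // latent, non-recorded risk factor
}
model {
  // weakly informative priors (an assumption of this sketch)
  alpha_t ~ normal(0, 2);
  alpha_y ~ normal(0, 2);
  beta_xt ~ normal(0, 2);
  beta_xy ~ normal(0, 2);
  z ~ normal(0, 1);

  // decision model: higher recorded or latent risk lowers the
  // probability of a positive decision
  t ~ bernoulli_logit(alpha_t - beta_xt * x - z);

  // outcome model: the outcome is observed only after a positive decision
  for (n in 1:N)
    if (t[n] == 1)
      y[n] ~ bernoulli_logit(alpha_y - beta_xy * x[n] - z[n]);
}
generated quantities {
  // counterfactual outcome under a positive decision, used to impute
  // the missing labels of cases that received a negative decision
  array[N] int y_cf;
  for (n in 1:N)
    y_cf[n] = bernoulli_logit_rng(alpha_y - beta_xy * x[n] - z[n]);
}
\end{verbatim}
Posterior draws of \texttt{y\_cf} then provide imputed counterfactual labels $\outcome_{\decisionValue=1}$ for the cases that received a negative decision.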