@@ -202,9 +202,9 @@ Moreover, it is easy to see based on the derivations of Eq.\ref{eqn:gp} that our
Below we present our results in various settings. Models are compared against the following quantities:
\begin{itemize}
\item{\it True evaluation:} Depicts the true performance of the predictive model. Constructed by sorting all the labels in the test data (even the ones hidden from the models) by the predicted probabilities and then simulating the acceptance rate at the given level.
\item{\it True evaluation:} Depicts the true performance of the predictive model. Constructed by sorting all the labels in the test data (even the ones hidden from the models) by the predicted probabilities and then simulating the acceptance rate at the given level. (Note: true evaluation can only be computed on synthetic data sets.)
\item{\it Labeled outcomes:} Similar to {\it true evaluation}, but only the labels available under positive decisions $(\decision=1)$ are used.
\item{\it Human evaluation:} Human decision makers with similar leniency levels are grouped and treated as a single decision maker.
\item{\it Human evaluation:} Human decision makers with similar acceptance rates are grouped and treated as a single decision maker.
\item{\it Contraction:} The contraction curve was constructed as described by Lakkaraju et al.~\cite{lakkaraju2017selective}.
\item{\it Causal model, ep:} The curve presents the predicted probability $\prob{\outcome=0 | \doop{\leniency=\leniencyValue}}$ at various acceptance rates.
\end{itemize}
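
The construction of the {\it true evaluation} and {\it labeled outcomes} curves can be illustrated with a minimal sketch. It is not the implementation used for the experiments: the function names \texttt{failure\_rate} and \texttt{labeled\_outcomes\_failure\_rate} are hypothetical, and the sketch assumes that the reported quantity is the number of accepted subjects with $\outcome=0$ divided by the number of subjects considered, and that the acceptance count is rounded to the nearest integer.

```python
def failure_rate(outcomes, p_negative, acceptance_rate):
    """Share of the given subjects that are accepted and have outcome 0.

    outcomes        -- list of 0/1 labels (0 = negative outcome)
    p_negative      -- predicted probability of outcome 0 for each subject
    acceptance_rate -- fraction of subjects to accept, in [0, 1]
    """
    n = len(outcomes)
    # Assumption: the number of accepted subjects is rounded to the nearest integer.
    n_accepted = round(acceptance_rate * n)
    # Rank subjects from lowest to highest predicted probability of a negative outcome.
    ranked = sorted(range(n), key=lambda i: p_negative[i])
    accepted = ranked[:n_accepted]
    # Assumption: failures are normalized by all subjects considered, not only accepted ones.
    return sum(1 for i in accepted if outcomes[i] == 0) / n


def labeled_outcomes_failure_rate(outcomes, p_negative, decisions, acceptance_rate):
    """Same computation, restricted to subjects with a positive decision (decision == 1)."""
    kept = [i for i, t in enumerate(decisions) if t == 1]
    return failure_rate(
        [outcomes[i] for i in kept],
        [p_negative[i] for i in kept],
        acceptance_rate,
    )
```

Calling \texttt{failure\_rate} with all test labels corresponds to the true evaluation, which is why it is only possible when the hidden labels are known.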
...
...
@@ -215,13 +215,13 @@ The causal model for this scenario corresponds to that depicted in Figure \ref{f
For the analysis, we randomly assigned 500 subjects to each of 100 judges.
Every judge's leniency rate $\leniency$ was sampled uniformly from the half-open interval $[0.1; 0.9)$.
Private features $\features$ were defined as i.i.d.\ standard Gaussian random variables.
Next, probabilities for negative results $\outcome=0$ were calculated as
Next, probabilities for negative results $\outcome=0$ were modeled as Bernoulli distributed
and consequently $\outcome\sim\text{Bernoulli}(1- p_{y_0})$.
The decision variable $\decision$ was set to 0 if the value $p_{y_0}$ resided in the top $(1-\leniencyValue)\cdot100\%$ of the subjects appointed for that judge.
The decision variable $\decision$ was set to 0 if the probability $\prob{\outcome=0| \features=\featuresValue}$ resided in the top $(1-\leniencyValue)\cdot100\%$ of the subjects assigned to that judge.
Results for estimating the causal quantity $\prob{\outcome=0 | \doop{\leniency=\leniencyValue}}$ with various levels of leniency $\leniencyValue$ under this model are presented in Figure \ref{fig:without_unobservables}.
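
The data-generating process described above can be sketched as follows. This is an illustration under stated assumptions, not the code used to produce the figure. In particular, the functional form of $p_{y_0}$ is not shown in this excerpt, so the sketch assumes a logistic sigmoid of the feature. The function name \texttt{simulate\_without\_unobservables}, its parameters, and the rounding of the rejection count are likewise hypothetical.

```python
import math
import random


def simulate_without_unobservables(n_judges=100, n_per_judge=500, seed=0):
    """Generate one synthetic data set; returns a list of per-subject dicts."""
    rng = random.Random(seed)
    rows = []
    for judge in range(n_judges):
        # Leniency sampled uniformly from the half-open interval [0.1, 0.9).
        leniency = 0.1 + 0.8 * rng.random()
        # Features are i.i.d. standard Gaussian.
        xs = [rng.gauss(0.0, 1.0) for _ in range(n_per_judge)]
        # ASSUMPTION: probability of a negative outcome is a sigmoid of x.
        p_y0 = [1.0 / (1.0 + math.exp(-x)) for x in xs]
        # Y ~ Bernoulli(1 - p_y0), i.e. Y = 0 with probability p_y0.
        ys = [0 if rng.random() < p else 1 for p in p_y0]
        # The judge rejects (T = 0) the top (1 - leniency) share of subjects by p_y0.
        # ASSUMPTION: the number of rejected subjects is rounded to the nearest integer.
        n_reject = round((1.0 - leniency) * n_per_judge)
        ranked = sorted(range(n_per_judge), key=lambda i: p_y0[i], reverse=True)
        rejected = set(ranked[:n_reject])
        for i in range(n_per_judge):
            rows.append({
                "judge": judge,
                "leniency": leniency,
                "x": xs[i],
                "p_y0": p_y0[i],
                "decision": 0 if i in rejected else 1,
                "outcome": ys[i],
            })
    return rows
```

With the default arguments the function returns 100 judges with 500 subjects each, matching the setup described in the text.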