The inverse cumulative function $F^{-1}(r)$ in equations \ref{eq:Tprob} and \ref{eq:Tdet} is the inverse cumulative distribution function of the \emph{logit-normal distribution} with parameters $\mu=0$ and $\sigma^2=2$, i.e. of the distribution of the logistic transformation of the sum of two standard Gaussians. If $\beta_X \neq 1$ and/or $\beta_Z \neq 1$, the basic properties of variance give $\sigma^2=\mathrm{Var}(\beta_X X+\beta_Z Z)=\beta_X^2\mathrm{Var}(X)+\beta_Z^2\mathrm{Var}(Z)$. The inverse cumulative function is then
\begin{equation} \label{eq:cum_inv}
F^{-1}(r) = \invlogit\left(\mu + \sqrt{2\sigma^2}\,\text{erf}^{-1}(2r-1)\right)
\end{equation}
where $\mu$ and $\sigma^2$ are as discussed above and $\text{erf}^{-1}$ is the inverse of the error function.
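As a concrete illustration (not part of the algorithm listings), equation \ref{eq:cum_inv} can be evaluated numerically, assuming SciPy's \texttt{erfinv} and \texttt{expit} (the inverse logit) are available:
\begin{verbatim}
# Minimal sketch of equation (cum_inv), assuming SciPy is available.
import numpy as np
from scipy.special import erfinv, expit   # expit(x) = 1 / (1 + exp(-x))

def F_inv(r, mu=0.0, sigma2=2.0):
    """Inverse CDF of a logit-normal(mu, sigma2) distribution at r in (0, 1)."""
    return expit(mu + np.sqrt(2.0 * sigma2) * erfinv(2.0 * r - 1.0))
\end{verbatim}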
With this knowledge, we can state that if we observed $T=0$ with some $x$ and $r$, it must have been that $\invlogit(\beta_X x+\beta_Z z) \geq F^{-1}(r)$. Taking $\beta_X = \beta_Z = 1$ for brevity, basic algebra gives
\begin{equation} \label{eq:bounds}
\invlogit(x + z) \geq F^{-1}(r) \Leftrightarrow x+z \geq \text{logit}(F^{-1}(r)) \Leftrightarrow z \geq \text{logit}(F^{-1}(r)) - x
\end{equation}
as the logit function and its inverse are strictly increasing and hence order-preserving for all pairs of values in their domains. From equations \ref{eq:posterior_Z}, \ref{eq:Tprob} and \ref{eq:bounds} we can conclude that $\pr(Z < \text{logit}(F^{-1}(r)) - x | T=0, X=x, R=r) = 0$ and that elsewhere the distribution of $Z$ follows a standard Gaussian. The expectation of $Z$ can therefore be computed analytically from the formulas for truncated Gaussians: for a standard Gaussian, $E(Z | Z > a) = \phi(a)/(1-\Phi(a))$ and $E(Z | Z < a) = -\phi(a)/\Phi(a)$, where $\phi$ and $\Phi$ denote the standard Gaussian density and cumulative distribution function. All of this carries over analogously to the cases with $T=1$, with some of the inequalities reversed.
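A minimal sketch of these truncated-Gaussian expectations, assuming SciPy is available (again, not part of the algorithm listings):
\begin{verbatim}
# Sketch of the truncated standard Gaussian expectations (SciPy assumed).
from scipy.stats import norm

def E_upper(a):
    """E[Z | Z > a] for standard Gaussian Z."""
    return norm.pdf(a) / norm.sf(a)      # sf(a) = 1 - cdf(a)

def E_lower(a):
    """E[Z | Z < a] for standard Gaussian Z."""
    return -norm.pdf(a) / norm.cdf(a)
\end{verbatim}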
In practice, in lines 1--3 and 10--13 of algorithm \ref{alg:eval:mc} we proceed as in the True evaluation algorithm, with the distinction that some of the values of $Y$ are imputed using the corresponding counterfactual probabilities. In line 4 we compute the bounds as motivated above. In the for-loop (lines 5--8) we compute the expectation of $Z$ given the decision, using the fact that $Z$ follows a truncated Gaussian. The equation
\begin{equation}
\hat{z} = (1-t) \cdot E(Z | Z > Q_r) + t \cdot E(Z | Z < Q_r)
\end{equation}
selects the correct truncation side automatically for both decision values. Using the expectation, we then compute the probability of the counterfactual $\pr(Y(1) = 0)$, the probability of a negative outcome had a positive decision been given. In line 9 the imputation can be performed in a couple of ways: either by taking a random guess with probability $\pr(Y(1) = 0)$ or by assigning the most likely value for $Y$.
\begin{algorithm}[H] % enter the algorithm environment
\caption{Evaluator module: Analytic solution} % give the algorithm a caption
\label{alg:eval:mc} % and a label for \ref{} commands later in the document
\begin{algorithmic}[1] % enter the algorithmic environment
\REQUIRE Data $\D$ with properties $\{x_i, j_i, t_i, y_i\}$, acceptance rate $r$
\ENSURE Estimate of the failure rate at acceptance rate $r$
\STATE Split data to test set and training set.
\STATE Train a predictive model $\B$ on training data.
\STATE Estimate and assign probability scores $\s$ using $\B$ for all observations in test data.
\STATE Compute bounds $Q_r = \text{logit}(F^{-1}(r)) - x$ for all judges.
\FORALL{observations in test set}
\STATE Compute expectation $\hat{z} = (1-t) \cdot E(Z | Z > Q_r) + t \cdot E(Z | Z < Q_r)$. %
\STATE Compute $\pr(Y(1) = 0) = \invlogit(x + \hat{z})$.
\ENDFOR
\STATE Impute the missing outcomes using the estimates $\pr(Y(1) = 0)$.
\STATE Sort the data by the probabilities $\s$ to ascending order.
\STATE \hskip3.0em $\rhd$ Now the most dangerous subjects are last.
\STATE Calculate the number to release $N_{free} = |\D_{test}| \cdot r$.
\RETURN $\frac{1}{|\D_{test}|}\sum_{i=1}^{N_{free}}\delta\{y_i=0\}$, computed using the observed and imputed outcomes.
\end{algorithmic}
\end{algorithm}
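For illustration, a vectorized sketch of lines 4--9 of algorithm \ref{alg:eval:mc}, building on the helper sketches \texttt{F\_inv}, \texttt{E\_upper} and \texttt{E\_lower} above; the arrays \texttt{x}, \texttt{t}, \texttt{y} over the test set and the scalar acceptance rate \texttt{r} are hypothetical names, not the paper's code.
\begin{verbatim}
# Hypothetical sketch of lines 4--9 of the evaluator.
import numpy as np
from scipy.special import logit, expit

Q_r = logit(F_inv(r)) - x                            # line 4: bounds
z_hat = (1 - t) * E_upper(Q_r) + t * E_lower(Q_r)    # lines 5--8: E(Z | decision)
p_y0 = expit(x + z_hat)                              # P(Y(1) = 0)
y_hat = np.where(p_y0 >= 0.5, 0, 1)                  # most likely label
y_imp = np.where(t == 0, y_hat, y)                   # line 9: impute unobserved Y
\end{verbatim}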
The following table summarizes the different modules.
\begin{table}[H]
\caption{Summary of modules (under construction)}
\begin{tabular}{lll}
\toprule
\multicolumn{3}{c}{Module type} \\[.5\normalbaselineskip]
\textbf{Data generator} & \textbf{Decider} & \textbf{Evaluator} \\
\midrule
{\ul Without unobservables} & {\ul Independent decisions} & {\ul Labeled outcomes} \\
& 1. draw T from a Bernoulli & \tabitem Data $\D$ with properties $\{x_i, t_i, y_i\}$ \\
{\ul With unobservables} & with $P(T=0|X, Z)$ & \tabitem acceptance rate r \\
\tabitem $P(Y=0|X, Z, W)$ & & \tabitem knowledge that X affects Y \\[.5\normalbaselineskip]
{\ul With unobservables} & 2. determine with $F^{-1}(r)$ & {\ul True evaluation} \\
\tabitem assign $Y=1$ & & \tabitem Data $\D$ with properties $\{x_i, t_i, y_i\}$ \\
if $P(Y=0|X, Z, W) \geq 0.5$ & {\ul Non-independent decisions} & and \emph{all outcome labels} \\
& 3. sort by $P(T=0|X, Z)$ & \tabitem acceptance rate r \\
& and assign $t$ by $r$ & \tabitem knowledge that X affects Y \\[.5\normalbaselineskip]
& & {\ul Human evaluation} \\
& & \tabitem Data $\D$ with properties $\{x_i, j_i, t_i, y_i\}$ \\
& & \tabitem acceptance rate r \\[.5\normalbaselineskip]
& & \tabitem Data $\D$ with properties $\{x_i, j_i, t_i, y_i\}$ \\
& & \tabitem acceptance rate r \\
& & \tabitem knowledge that X affects Y \\[.5\normalbaselineskip]
% & & {\ul Causal model} \\
% & & \tabitem Data $\D$ with properties $\{x_i, t_i, y_i\}$ \\
% & & \tabitem acceptance rate r \\
% & & \tabitem knowledge that X affects Y \\[.5\normalbaselineskip]
& & \tabitem Data $\D$ with properties $\{x_i, j_i, t_i, y_i\}$ \\
& & \tabitem acceptance rate r \\
& & \tabitem knowledge that X affects Y \\
& & \tabitem more intricate knowledge about $\M$ ? \\[.5\normalbaselineskip]
& & {\ul Potential outcomes evaluator} \\
& & \tabitem Data $\D$ with properties $\{x_i, j_i, t_i, y_i\}$ \\
& & \tabitem acceptance rate r \\
& & \tabitem knowledge that X affects Y \\
\bottomrule
\end{tabular}
\end{table}
\section{Old results} \label{sec:results}
Results obtained from running algorithm \ref{alg:perf_comp} are presented in table \ref{tab:results} and figure \ref{fig:results}. All parameters were at their default values and a logistic regression model was trained.
\begin{table}[H]
\centering
\caption{Mean absolute error (MAE) w.r.t.\ true evaluation. \\ \emph{RL: Updated 26 June.}}
\begin{tabular}{l | c c}
Method & MAE without Z & MAE with Z \\ \hline
Labeled outcomes & 0.107249375 & 0.0827844\\
Human evaluation & 0.002383729 & 0.0042517\\
Contraction & 0.004633164 & 0.0075497\\
Causal model, ep & 0.000598624 & 0.0411532\\
\end{tabular}
\label{tab:results}
\end{table}
\begin{figure}[]
\centering
\begin{subfigure}[b]{0.5\textwidth}
\includegraphics[width=\textwidth]{sl_without_Z_8iter}
\caption{Results without unobservables}
\label{fig:results_without_Z}
\end{subfigure}
~ %add desired spacing between images, e. g. ~, \quad, \qquad, \hfill etc.
%(or a blank line to force the subfigure onto a new line)
\begin{subfigure}[b]{0.5\textwidth}
\includegraphics[width=\textwidth]{sl_with_Z_8iter_betaZ_1_0}
\caption{Results with unobservables, $\beta_Z=1$.}
\label{fig:results_with_Z}
\end{subfigure}
\caption{Failure rate vs. acceptance rate with varying levels of leniency. Logistic regression was trained on labeled training data. \emph{RL: Updated 26 June.}}
\label{fig:results}
\end{figure}
\subsection{$\beta_Z=0$ and data generated with unobservables}
If we set $\beta_Z=0$, all failure rates except that of human evaluation drop close to zero for acceptance rates in the interval $0.1, \ldots, 0.3$. Results are presented in figures \ref{fig:betaZ_1_5} and \ref{fig:betaZ_0}.
The disparities between figures \ref{fig:results_without_Z} and \ref{fig:betaZ_0} (results without unobservables and with $\beta_Z=0$, respectively) can be explained by a slight difference in the data generating process, namely the effect of $\epsilon$. The effect of adding $\epsilon$ (noise in the decisions) is explored further in section \ref{sec:epsilon}.
\begin{figure}[]
\centering
\begin{subfigure}[b]{0.475\textwidth}
\includegraphics[width=\textwidth]{sl_with_Z_4iter_betaZ_1_5}
\caption{Results with unobservables, $\beta_Z$ set to 1.5 in algorithm \ref{alg:data_with_Z}.}
\label{fig:betaZ_1_5}
\end{subfigure}
\quad %add desired spacing between images, e. g. ~, \quad, \qquad, \hfill etc.
%(or a blank line to force the subfigure onto a new line)
\begin{subfigure}[b]{0.475\textwidth}
\includegraphics[width=\textwidth]{sl_with_Z_4iter_beta0}
\caption{Results with unobservables, $\beta_Z$ set to 0 in algorithm \ref{alg:data_with_Z}.}
\label{fig:betaZ_0}
\end{subfigure}
\caption{Effect of $\beta_z$. Failure rate vs. acceptance rate with unobservables in the data (see algorithm \ref{alg:data_with_Z}). Logistic regression was trained on labeled training data. Results from algorithm \ref{alg:perf_comp}.}
\label{fig:betaZ_comp}
\end{figure}
\subsection{Noise added to the decision and data generated without unobservables} \label{sec:epsilon}
In this experiment, Gaussian noise with zero mean and variance 0.1 was added to the probabilities $P(Y=0|X=x)$ after sampling $Y$ but before ordering the observations in line 5 of algorithm \ref{alg:data_without_Z}. Results are presented in figure \ref{fig:sigma_figure}.
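As a sketch with hypothetical names, where \texttt{p} holds the probabilities $P(Y=0|X=x)$, the perturbation amounts to:
\begin{verbatim}
# Hypothetical sketch: perturb the probabilities with N(0, 0.1) noise
# (variance 0.1, i.e. standard deviation sqrt(0.1)) before ordering.
import numpy as np

rng = np.random.default_rng()
p_noisy = p + rng.normal(0.0, np.sqrt(0.1), size=p.shape)
order = np.argsort(p_noisy)   # ordering step of the data generator
\end{verbatim}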
\begin{figure}[]
\centering
\includegraphics[width=0.5\textwidth]{sl_without_Z_3iter_sigma_sqrt_01}
\caption{Failure rate with varying levels of leniency without unobservables. Noise has been added to the decision probabilities. Logistic regression was trained on labeled training data.}
\label{fig:sigma_figure}
\end{figure}
\subsection{Predictions with random forest classifier} \label{sec:random_forest}
In this section the predictive model was switched to a random forest classifier to examine the effect of changing the predictive model. The results, presented in figure \ref{fig:random_forest}, are practically identical to those presented previously in figure \ref{fig:results}.
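Assuming scikit-learn, the change amounts to swapping the predictive model $\B$; the split variables below are hypothetical names:
\begin{verbatim}
# Sketch of the model swap, assuming scikit-learn.
from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier()             # was: LogisticRegression()
model.fit(X_train, y_train)
s = model.predict_proba(X_test)[:, 0]        # probability scores, P(Y = 0)
\end{verbatim}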
\begin{figure}[]
\centering
\begin{subfigure}[b]{0.475\textwidth}
\includegraphics[width=\textwidth]{sl_withoutZ_4iter_randomforest}
\caption{Results without unobservables.}
\label{fig:results_without_Z_rf}
\end{subfigure}
\quad %add desired spacing between images, e. g. ~, \quad, \qquad, \hfill etc.
%(or a blank line to force the subfigure onto a new line)
\begin{subfigure}[b]{0.475\textwidth}
\includegraphics[width=\textwidth]{sl_withZ_6iter_betaZ_1_0_randomforest}
\caption{Results with unobservables, $\beta_Z=1$.}
\label{fig:results_with_Z_rf}
\end{subfigure}
\caption{Failure rate vs. acceptance rate with varying levels of leniency. Random forest classifier was trained on labeled training data.}
\label{fig:random_forest}
\end{figure}
\subsection{Sanity check for predictions}
The predictions were checked by plotting the predicted $Y$ against $X$; the results are presented in figure \ref{fig:sanity_check}. The figure indicates that the predicted class labels and their probabilities are consistent with the ground truth.
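A sketch of the check, assuming matplotlib and hypothetical arrays \texttt{x\_test}, \texttt{p\_y1} (predicted probabilities of $Y=1$) and \texttt{y\_test} (ground-truth labels):
\begin{verbatim}
# Hypothetical sketch of the sanity check plot, assuming matplotlib.
import matplotlib.pyplot as plt

plt.scatter(x_test, p_y1, c=y_test)   # point color = ground-truth label
plt.xlabel("X")
plt.ylabel("Predicted P(Y = 1)")
plt.show()
\end{verbatim}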
\begin{figure}[]
\centering
\includegraphics[width=0.5\textwidth]{sanity_check}
\caption{Predicted class label and probability of $Y=1$ versus $X$. Predictions were made with a logistic regression model. Colors of the points denote the ground truth (yellow = 1, purple = 0). The data set was created with unobservables.}
\label{fig:sanity_check}
\end{figure}
\subsection{Fully random model $\M$}
Given our framework defined in section \ref{sec:framework}, the results presented next use a model $\M$ that outputs probability 0.5 for every instance $x$. The labeling process is still as presented in algorithm \ref{alg:data_with_Z}.
\begin{figure}[]
\centering
\begin{subfigure}[b]{0.475\textwidth}
\includegraphics[width=\textwidth]{sl_without_Z_15iter_random_model}
\caption{Failure rate vs. acceptance rate. Data without unobservables. Machine predictions with random model.}
\label{fig:random_predictions_without_Z}
\end{subfigure}
\quad %add desired spacing between images, e. g. ~, \quad, \qquad, \hfill etc.
%(or a blank line to force the subfigure onto a new line)
\begin{subfigure}[b]{0.475\textwidth}
\includegraphics[width=\textwidth]{sl_with_Z_15iter_fully_random_model}
\caption{Failure rate vs. acceptance rate. Data with unobservables. Machine predictions with random model.}
\label{fig:random_predictions_with_Z}
\end{subfigure}
\caption{Failure rate vs. acceptance rate with varying levels of leniency. Machine predictions were made with a completely random model, i.e. $P(Y=0|X=x)=0.5$ for all $x$.}
\label{fig:random_predictions}
\end{figure}
\subsection{Modular framework -- Monte Carlo evaluator} \label{sec:modules_mc}
For these results, data was generated either with the module in algorithm \ref{alg:dg:coinflip_with_z} (drawing $Y$ from a Bernoulli distribution with parameter $\pr(Y=0|X, Z, W)$, as previously) or with the module in algorithm \ref{alg:dg:threshold_with_Z} (assigning $Y$ based on the value of $\invlogit(\beta_XX+\beta_ZZ)$). Decisions were determined using one of two modules: the module in algorithm \ref{alg:decider:quantile} (decisions based on quantiles) or the module in algorithm \ref{alg:decider:lakkaraju} (``human'' decision-maker as in \cite{lakkaraju17}). Curves were computed with the True evaluation (algorithm \ref{alg:eval:true_eval}), Labeled outcomes (\ref{alg:eval:labeled_outcomes}), Human evaluation (\ref{alg:eval:human_eval}), Contraction (\ref{alg:eval:contraction}) and Monte Carlo (\ref{alg:eval:mc}) evaluators. Results are presented in figure \ref{fig:modules_mc}; the corresponding MAEs are presented in table \ref{tab:modules_mc}.
From the result table we can see that the MAE is lowest when the data generating process corresponds closely to the assumptions of the Monte Carlo algorithm.
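Here the MAE of an evaluator is the mean absolute difference between its failure-rate curve and the true-evaluation curve over the same grid of acceptance rates; as a sketch with hypothetical arrays:
\begin{verbatim}
# Sketch: MAE of an evaluator w.r.t. true evaluation. fr_method and
# fr_true are failure-rate curves on the same grid of acceptance rates.
import numpy as np

mae = np.mean(np.abs(fr_method - fr_true))
\end{verbatim}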
\begin{table}[]
\centering
\caption{Mean absolute error w.r.t.\ true evaluation. See section \ref{sec:modules_mc} for the modules used. Bern = Bernoulli, indep. = independent, TH = threshold}
\begin{tabular}{l | c c c c}
Method & Bern + indep. & Bern + non-indep. & TH + indep. & TH + non-indep.\\ \hline
Labeled outcomes & 0.111075 & 0.103235 & 0.108506 & 0.0970325\\
Human evaluation & 0.027298 & NaN (TBA) & 0.049582 & 0.0033916\\
Contraction & 0.004206 & 0.004656 & 0.005557 & 0.0034591\\
Monte Carlo & 0.001292 & 0.016629 & 0.009429 & 0.0179825\\
\end{tabular}
\label{tab:modules_mc}
\end{table}
\begin{figure}[]
\centering
\begin{subfigure}[b]{0.475\textwidth}
\includegraphics[width=\textwidth]{sl_with_Z_10iter_coinflip_quantile_defaults_mc}
\caption{Outcome Y from Bernoulli, independent decisions using the quantiles.}
%\label{fig:modules_mc_without_Z}
\end{subfigure}
\quad %add desired spacing between images, e. g. ~, \quad, \qquad, \hfill etc.
%(or a blank line to force the subfigure onto a new line)
\begin{subfigure}[b]{0.475\textwidth}
\includegraphics[width=\textwidth]{sl_with_Z_20iter_threshold_quantile_defaults_mc}
\caption{Outcome Y from threshold rule, independent decisions using the quantiles.}
%\label{fig:modules_mc_with_Z}
\end{subfigure}
\begin{subfigure}[b]{0.475\textwidth}
\includegraphics[width=\textwidth]{sl_with_Z_10iter_coinflip_lakkarajudecider_defaults_mc}
\caption{Outcome Y from Bernoulli, non-independent decisions.}
%\label{fig:modules_mc_without_Z}
\end{subfigure}
\quad %add desired spacing between images, e. g. ~, \quad, \qquad, \hfill etc.
%(or a blank line to force the subfigure onto a new line)
\begin{subfigure}[b]{0.475\textwidth}
\includegraphics[width=\textwidth]{sl_with_Z_10iter_threshold_lakkarajudecider_defaults_mc}
\caption{Outcome Y from threshold rule, non-independent decisions.}
%\label{fig:modules_mc_with_Z}
\end{subfigure}
\caption{Failure rate vs. acceptance rate with varying levels of leniency for different combinations of decider and data generation modules. See section \ref{sec:modules_mc} for the other modules used.}
\label{fig:modules_mc}
\end{figure}
\section{Diagnostic figures} \label{sec:diagnostic}
Here we present supplementary figures for all the settings in the main results section.
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{expanded_model_1_sl_bernoulli_independent_without_Z_diagnostic_plot}
\caption{Results from estimating failure rate with different levels of leniency using different methods.}
%\label{fig:}
\end{figure}
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{expanded_model_2_sl_bernoulli_independent_with_Z_diagnostic_plot}
\caption{Results from estimating failure rate with different levels of leniency using different methods.}
%\label{fig:}
\end{figure}
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{expanded_model_3_sl_threshold_independent_with_Z_diagnostic_plot}
\caption{Results from estimating failure rate with different levels of leniency using different methods.}
%\label{fig:}
\end{figure}
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{expanded_model_4_sl_bernoulli_batch_with_Z_diagnostic_plot}
\caption{Results from estimating failure rate with different levels of leniency using different methods.}
%\label{fig:}
\end{figure}
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{expanded_model_5_sl_threshold_batch_with_Z_diagnostic_plot}
\caption{Results from estimating failure rate with different levels of leniency using different methods.}
%\label{fig:}
\end{figure}
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{expanded_model_6_sl_random_decider_with_Z_diagnostic_plot}
\caption{Results from estimating failure rate with different levels of leniency using different methods.}
%\label{fig:}
\end{figure}
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{expanded_model_7_sl_biased_decider_with_Z_diagnostic_plot}
\caption{Results from estimating failure rate with different levels of leniency using different methods.}
%\label{fig:}
\end{figure}
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{expanded_model_8_sl_bad_decider_with_Z_diagnostic_plot}
\caption{Results from estimating failure rate with different levels of leniency using different methods.}
%\label{fig:}
\end{figure}
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{expanded_model_9_sl_bernoulli_bernoulli_with_Z_diagnostic_plot}
\caption{Results from estimating failure rate with different levels of leniency using different methods.}
%\label{fig:}
\end{figure}
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{expanded_model_10_sl_threshold_batch_beta_Z_3_with_Z_diagnostic_plot}
\caption{Results from estimating failure rate with different levels of leniency using different methods.}
%\label{fig:}
\end{figure}
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{expanded_model_11_sl_threshold_batch_beta_Z_5_with_Z_diagnostic_plot}
\caption{Results from estimating failure rate with different levels of leniency using different methods.}
%\label{fig:}
\end{figure}
\begin{figure}[]
\centering
\includegraphics[width=\textwidth]{sl_compas_diagnostic_plot}
\caption{Results from estimating failure rate with different levels of leniency using different methods in the COMPAS data set.}
\label{fig:diagnostic_compas}
\end{figure}
%\begin{wrapfigure}{r}{0.3\textwidth} %this figure will be at the right
% \centering
% \begin{tikzpicture}[->,>=stealth',node distance=2cm, semithick]
%
% \tikzstyle{every state}=[fill=none,draw=black,text=black]
%
% \node[state] (R) {$R$};
% \node[state] (T) [right of=R] {$T$};
% \node[state] (X) [above of=T] {$X$};
% \node[state] (Z) [rectangle, below of=T] {$Z$};
% \node[state] (Y) [right of=T] {$Y$};
%
% \path (R) edge node [below] {$\alpha_j$} (T)
% (X) edge node [left] {$\beta_{xt}$} (T)
% edge node [right] {$\beta_{xy}$} (Y)
% (Z) edge node [left] {$\beta_{zt}$} (T)
% edge node [right] {$\beta_{zy}$} (Y)
% (T) edge node [below] {?} (Y);
%\end{tikzpicture}
%\caption{Initial model.}
%\end{wrapfigure}