@@ -388,6 +388,19 @@ We use a propensity score framework to model $X$ and $Z$: they are assumed conti
\end{itemize}
\end{itemize}
\begin{algorithm}
%\item Potential outcomes / CBI \acomment{Put this in section 3? Algorithm box with these?}
\begin{itemize}
\item Take the test set.
\item Compute the posterior for the parameters and variables presented in equation \ref{eq:data_model}.
\item Using the posterior predictive distribution, draw estimates for the counterfactuals.
\item Impute the missing outcomes using the estimates from the previous step.
\item Obtain a point estimate for the failure rate by computing the mean.
\item Estimates for the counterfactuals $Y(1)$ for the unobserved values of $Y$ were obtained using the posterior expectations from Stan. We used the NUTS sampler to estimate the posterior. When the values for...
\end{itemize}
\caption{Counterfactual-based imputation}
\end{algorithm}
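
As a sketch of the imputation and averaging steps, under notation that is assumed here for illustration and not taken from the model definition: let $t_i \in \{0,1\}$ indicate whether the outcome of subject $i$ was observed, let $f_i \in \{0,1\}$ be the observed failure indicator, and let $\hat{f}_i$ be the value imputed from the posterior predictive distribution. If the mean is taken over all $N$ subjects in the test set (an assumption), the point estimate can be written as
\[
\widehat{\mathrm{FR}} = \frac{1}{N} \sum_{i=1}^{N} \bigl( t_i f_i + (1 - t_i) \hat{f}_i \bigr).
\]
For example, with $N = 4$, observed indicators $f_1 = 1$, $f_2 = 0$ and imputed values $\hat{f}_3 = 1$, $\hat{f}_4 = 0$, the estimate is $2/4 = 0.5$.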
\section{Extension To Non-Linearity (2nd priority)}
% If X has multiple dimensions or the relationships between the features and the outcomes are clearly non-linear, the presented approach can be extended to accommodate non-linearity. Jung proposed that... Groups... etc etc.
...
...
@@ -396,6 +409,38 @@ We use a propensity score framework to model $X$ and $Z$: they are assumed conti
\begin{itemize}
\item Lakkaraju and contraction. \cite{lakkaraju2017selective}
\item Contraction
\begin{itemize}
\item Algorithm by Lakkaraju et al. It assumes that the subjects are assigned to the judges at random and requires that the judges differ in leniency.
\item Can estimate the true failure rate only up to the leniency of the most lenient decision-maker.
\item Performance is affected by the number of people judged by the most lenient decision-maker, the agreement rate and the leniency of the most lenient decision-maker. (Performance is guaranteed / better when ...)
\item Works only on binary outcomes.
\item (We show that our method is not constrained by any of these.)
\item The algorithm goes as follows...
%\begin{algorithm}[] % enter the algorithm environment
%\caption{Contraction algorithm \cite{lakkaraju17}} % give the algorithm a caption
%\label{alg:contraction} % and a label for \ref{} commands later in the document
%\begin{algorithmic}[1] % enter the algorithmic environment
%\REQUIRE Labeled test data $\D$ with probabilities $\s$ and \emph{missing outcome labels} for observations with $T=0$, acceptance rate r
%\ENSURE
%\STATE Let $q$ be the decision-maker with highest acceptance rate in $\D$.
%\STATE $\D_q = \{(x, j, t, y) \in \D|j=q\}$
%\STATE \hskip3.0em $\rhd$ $\D_q$ is the set of all observations judged by $q$
%\STATE
%\STATE $\RR_q = \{(x, j, t, y) \in \D_q|t=1\}$
%\STATE \hskip3.0em $\rhd$ $\RR_q$ is the set of observations in $\D_q$ with observed outcome labels
%\STATE
%\STATE Sort observations in $\RR_q$ in descending order of confidence scores $\s$ and assign to $\RR_q^{sort}$.
%\STATE \hskip3.0em $\rhd$ Observations deemed as high risk by the black-box model $\mathcal{B}$ are at the top of this list
%\STATE
%\STATE Remove the top $[(1.0-r)|\D_q |]-[|\D_q |-|\RR_q |]$ observations of $\RR_q^{sort}$ and call this list $\mathcal{R_B}$
%\STATE \hskip3.0em $\rhd$ $\mathcal{R_B}$ is the list of observations assigned to $t = 1$ by $\mathcal{B}$
\item Approach of Jung et al.\ for optimal policy construction. \cite{jung2018algorithmic}
\item Discussions of latent confounders in multiple contexts.
...
...
@@ -451,47 +496,8 @@ We treat the observations as independent and the still the leniency would be a g
\item Vanilla estimator of a model's performance. It is obtained by first ordering the observations by the predictions assigned by the decider in the modelling step.
\item Then the $(1-r) \cdot 100\,\%$ of subjects deemed most dangerous are detained, i.e.\ given a negative decision. The failure rate is computed as the ratio of negative outcomes to the number of subjects.
\end{itemize}
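
As a sketch of this computation, under notation that is assumed here for illustration: let $N$ be the number of subjects that are ordered, let $\mathcal{R}(r)$ be the set of subjects that remain after the $(1-r)$ fraction deemed most dangerous has been given a negative decision, and let $f_i \in \{0,1\}$ indicate a negative outcome for subject $i$. The estimate at leniency $r$ can then be written as
\[
\widehat{\mathrm{FR}}(r) = \frac{1}{N} \sum_{i \in \mathcal{R}(r)} f_i .
\]
For example, with $N = 10$ and $r = 0.6$, four subjects are detained and six remain; if two of those six have a negative outcome, the estimate is $2/10 = 0.2$. Whether $N$ counts all subjects or only those with an observed outcome is not fixed by the description above and is left open in this sketch.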
\item Contraction
\begin{itemize}
\item Algorithm by Lakkaraju et al. It assumes that the subjects are assigned to the judges at random and requires that the judges differ in leniency.
\item Can estimate the true failure rate only up to the leniency of the most lenient decision-maker.
\item Performance is affected by the number of people judged by the most lenient decision-maker, the agreement rate and the leniency of the most lenient decision-maker. (Performance is guaranteed / better when ...)
\item Works only on binary outcomes.
\item (We show that our method is not constrained by any of these.)
\item The algorithm goes as follows...
%\begin{algorithm}[] % enter the algorithm environment
%\caption{Contraction algorithm \cite{lakkaraju17}} % give the algorithm a caption
%\label{alg:contraction} % and a label for \ref{} commands later in the document
%\begin{algorithmic}[1] % enter the algorithmic environment
%\REQUIRE Labeled test data $\D$ with probabilities $\s$ and \emph{missing outcome labels} for observations with $T=0$, acceptance rate r
%\ENSURE
%\STATE Let $q$ be the decision-maker with highest acceptance rate in $\D$.
%\STATE $\D_q = \{(x, j, t, y) \in \D|j=q\}$
%\STATE \hskip3.0em $\rhd$ $\D_q$ is the set of all observations judged by $q$
%\STATE
%\STATE $\RR_q = \{(x, j, t, y) \in \D_q|t=1\}$
%\STATE \hskip3.0em $\rhd$ $\RR_q$ is the set of observations in $\D_q$ with observed outcome labels
%\STATE
%\STATE Sort observations in $\RR_q$ in descending order of confidence scores $\s$ and assign to $\RR_q^{sort}$.
%\STATE \hskip3.0em $\rhd$ Observations deemed as high risk by the black-box model $\mathcal{B}$ are at the top of this list
%\STATE
%\STATE Remove the top $[(1.0-r)|\D_q |]-[|\D_q |-|\RR_q |]$ observations of $\RR_q^{sort}$ and call this list $\mathcal{R_B}$
%\STATE \hskip3.0em $\rhd$ $\mathcal{R_B}$ is the list of observations assigned to $t = 1$ by $\mathcal{B}$
\item Compute the posterior for the parameters and variables presented in equation \ref{eq:data_model}.
\item Using the posterior predictive distribution, draw estimates for the counterfactuals.
\item Impute the missing outcomes using the estimates from the previous step.
\item Obtain a point estimate for the failure rate by computing the mean.
\item Estimates for the counterfactuals $Y(1)$ for the unobserved values of $Y$ were obtained using the posterior expectations from Stan. We used the NUTS sampler to estimate the posterior. When the values for...
\end{itemize}
\end{itemize}
\paragraph{Results}
(Target for this section, from the problem formulation: show that our evaluator is unbiased/accurate (report the mean absolute error) and robust to changes in data generation (perhaps with a table; at the least, discuss situations where the decisions are bad, biased or random, i.e.\ non-informative or misleading). Also show what happens if the decider in the modelling step is bad and its information is used as input.)