diff --git a/paper/sl.tex b/paper/sl.tex
index 598a773dc30b23bf5653aaaaac5da123b47b3242..a169dec7747e1f3cecfc5e0c7458c5162ce87bf9 100755
--- a/paper/sl.tex
+++ b/paper/sl.tex
@@ -388,6 +388,19 @@ We use a propensity score framework to model $X$ and $Z$: they are assumed conti
 \end{itemize}
 \end{itemize}
+\begin{algorithm}
+	%\item Potential outcomes / CBI \acomment{Put this in section 3? Algorithm box with these?}
+	\begin{itemize}
+	\item Take the test set.
+	\item Compute the posterior for the parameters and variables of the model in Equation~\ref{eq:data_model}.
+	\item Draw estimates for the counterfactuals from the posterior predictive distribution.
+	\item Impute the missing outcomes using the estimates from the previous step.
+	\item Obtain a point estimate for the failure rate by computing the mean (a sketch of the resulting estimator follows the algorithm box).
+	\item Estimates of the counterfactuals $Y(1)$ for the unobserved values of $Y$ are obtained as posterior expectations computed with Stan, using the NUTS sampler. When the values for...
+	\end{itemize}
+
+\caption{Counterfactual-based imputation}
+\end{algorithm}
+
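+% Sketch of the resulting estimator (editorial suggestion, notation to be checked): assume binary outcomes coded so that
+% $y = 0$ denotes a failure, let $\tilde{y}_i^{(s)}$, $s = 1, \dots, S$, be posterior predictive draws of $Y(1)$ for each
+% subject whose outcome is unobserved, and let $\mathcal{I}$ be the set of test subjects entering the estimate. The point
+% estimate of the failure rate would then be
+%\begin{equation*}
+%	\hat{\mathrm{FR}} = \frac{1}{|\mathcal{I}|} \sum_{i \in \mathcal{I}} \left( \delta\{t_i = 1\}\,\delta\{y_i = 0\}
+%	+ \delta\{t_i = 0\}\,\frac{1}{S} \sum_{s=1}^{S} \delta\{\tilde{y}_i^{(s)} = 0\} \right),
+%\end{equation*}
+% i.e.\ observed outcomes are used where available, and the posterior predictive mean of the failure indicator is used
+% where the outcome is missing.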
 \section{Extension To Non-Linearity (2nd priority)}
 
 % If X has multiple dimensions or the relationships between the features and the outcomes are clearly non-linear, the presented approach can be extended to accommodate non-linearity. Jung proposed that... Groups... etc etc.
@@ -396,6 +409,38 @@ We use a propensity score framework to model $X$ and $Z$: they are assumed conti
 \begin{itemize}
 \item Lakkaraju and contraction. \cite{lakkaraju2017selective}
+	\item Contraction
+	\begin{itemize}
+	\item Algorithm by Lakkaraju et al.~\cite{lakkaraju2017selective}. It assumes that the subjects are assigned to the judges at random and requires that the judges differ in leniency.
+	\item It can estimate the true failure rate only up to the leniency of the most lenient decision-maker.
+	\item Its performance is affected by the number of subjects judged by the most lenient decision-maker, the agreement rate, and the leniency of that decision-maker. (Performance is guaranteed / better when ...)
+	\item It works only for binary outcomes.
+	\item (We show that our method is not constrained by any of these.)
+	\item The algorithm goes as follows (see the commented listing below; its final estimate is restated after this list)...
+%\begin{algorithm}[] % enter the algorithm environment
+%\caption{Contraction algorithm \cite{lakkaraju2017selective}} % give the algorithm a caption
+%\label{alg:contraction} % and a label for \ref{} commands later in the document
+%\begin{algorithmic}[1] % enter the algorithmic environment
+%\REQUIRE Labeled test data $\D$ with probabilities $\s$ and \emph{missing outcome labels} for observations with $T=0$, acceptance rate r
+%\ENSURE
+%\STATE Let $q$ be the decision-maker with highest acceptance rate in $\D$.
+%\STATE $\D_q = \{(x, j, t, y) \in \D|j=q\}$
+%\STATE \hskip3.0em $\rhd$ $\D_q$ is the set of all observations judged by $q$
+%\STATE
+%\STATE $\RR_q = \{(x, j, t, y) \in \D_q|t=1\}$
+%\STATE \hskip3.0em $\rhd$ $\RR_q$ is the set of observations in $\D_q$ with observed outcome labels
+%\STATE
+%\STATE Sort observations in $\RR_q$ in descending order of confidence scores $\s$ and assign to $\RR_q^{sort}$.
+%\STATE \hskip3.0em $\rhd$ Observations deemed as high risk by the black-box model $\mathcal{B}$ are at the top of this list
+%\STATE
+%\STATE Remove the top $[(1.0-r)|\D_q |]-[|\D_q |-|\RR_q |]$ observations of $\RR_q^{sort}$ and call this list $\mathcal{R_B}$
+%\STATE \hskip3.0em $\rhd$ $\mathcal{R_B}$ is the list of observations assigned to $t = 1$ by $\mathcal{B}$
+%\STATE
+%\STATE Compute $\mathbf{u}=\sum_{i=1}^{|\mathcal{R_B}|} \dfrac{\delta\{y_i=0\}}{| \D_q |}$.
+%\RETURN $\mathbf{u}$
+%\end{algorithmic}
+%\end{algorithm}
+	\end{itemize}
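+% Sketch (editorial note, in the notation of the commented listing above): the contraction estimate of the black-box
+% model's failure rate at acceptance rate $r$ is
+%\begin{equation*}
+%	\hat{u}(r) = \frac{1}{|\D_q|} \sum_{i \in \mathcal{R_B}} \delta\{y_i = 0\},
+%\end{equation*}
+% i.e.\ the number of failures among the subjects that $\mathcal{B}$ would release at rate $r$, divided by the total
+% number of subjects judged by the most lenient decision-maker $q$.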
 \item Counterfactuals/Potential outcomes. \cite{pearl2010introduction} (also Rubin)
 \item Approach of Jung et al for optimal policy construction. \cite{jung2018algorithmic}
 \item Discussions of latent confounders in multiple contexts.
@@ -451,47 +496,8 @@ We treat the observations as independent and the still the leniency would be a g
 	\item Vanilla estimator of a model's performance. Obtained by first ordering the observations by the predictions assigned by the decider in the modelling step.
 	\item Then 1-r \% of the most dangerous are detained and given a negative decision. The failure rate is computed as the ratio of negative outcomes to the number of subjects.
 	\end{itemize}
-	\item Contraction
-	\begin{itemize}
-	\item Algorithm by Lakkaraju et al. Assumes that the subjects are assigned to the judges at random and requires that the judges differ in leniency.
-	\item Can estimate the true failure only up to the leniency of the most lenient decision-maker.
-	\item Performance is affected by the number of people judged by the most lenient decision-maker, the agreement rate and the leniency of the most lenient decision-maker. (Performance is guaranteed / better when ...)
-	\item Works only on binary outcomes
-	\item (We show that our method isn't constrained by any of these)
-	\item The algorithm goes as follows...
-%\begin{algorithm}[] % enter the algorithm environment
-%\caption{Contraction algorithm \cite{lakkaraju17}} % give the algorithm a caption
-%\label{alg:contraction} % and a label for \ref{} commands later in the document
-%\begin{algorithmic}[1] % enter the algorithmic environment
-%\REQUIRE Labeled test data $\D$ with probabilities $\s$ and \emph{missing outcome labels} for observations with $T=0$, acceptance rate r
-%\ENSURE
-%\STATE Let $q$ be the decision-maker with highest acceptance rate in $\D$.
-%\STATE $\D_q = \{(x, j, t, y) \in \D|j=q\}$
-%\STATE \hskip3.0em $\rhd$ $\D_q$ is the set of all observations judged by $q$
-%\STATE
-%\STATE $\RR_q = \{(x, j, t, y) \in \D_q|t=1\}$
-%\STATE \hskip3.0em $\rhd$ $\RR_q$ is the set of observations in $\D_q$ with observed outcome labels
-%\STATE
-%\STATE Sort observations in $\RR_q$ in descending order of confidence scores $\s$ and assign to $\RR_q^{sort}$.
-%\STATE \hskip3.0em $\rhd$ Observations deemed as high risk by the black-box model $\mathcal{B}$ are at the top of this list
-%\STATE
-%\STATE Remove the top $[(1.0-r)|\D_q |]-[|\D_q |-|\RR_q |]$ observations of $\RR_q^{sort}$ and call this list $\mathcal{R_B}$
-%\STATE \hskip3.0em $\rhd$ $\mathcal{R_B}$ is the list of observations assigned to $t = 1$ by $\mathcal{B}$
-%\STATE
-%\STATE Compute $\mathbf{u}=\sum_{i=1}^{|\mathcal{R_B}|} \dfrac{\delta\{y_i=0\}}{| \D_q |}$.
-%\RETURN $\mathbf{u}$
-%\end{algorithmic}
-%\end{algorithm}
-	\end{itemize}
-	\item Potential outcomes / CBI
-	\begin{itemize}
-	\item Take test set
-	\item Compute the posterior for parameters and variables presented in equation \ref{eq:data_model}.
-	\item Using the posterior predictive distribution, draw estimates for the counterfactuals.
-	\item Impute the missing outcomes using the estimates from previous step
-	\item Obtain a point estimate for the failure rate by computing the mean.
-	\item Estimates for the counterfactuals Y(1) for the unobserved values of Y were obtained using the posterior expectations from Stan. We used the NUTS sampler to estimate the posterior. When the values for...
-	\end{itemize}
+
+
 \end{itemize}
 
 \paragraph{Results} (Target for this section from problem formulation: show that our evaluator is unbiased/accurate (show mean absolute error), robust to changes in data generation (some table perhaps, at least should discuss situations when the decisions are bad/biased/random = non-informative or misleading), also if the decider in the modelling step is bad and its information is used as input, what happens.)
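+% Sketch (editorial note): for the accuracy claim, one option is to report the mean absolute error over $K$ synthetic
+% datasets with known ground truth,
+%\begin{equation*}
+%	\mathrm{MAE} = \frac{1}{K} \sum_{k=1}^{K} \left| \hat{\mathrm{FR}}_k - \mathrm{FR}_k^{\mathrm{true}} \right|,
+%\end{equation*}
+% where $\hat{\mathrm{FR}}_k$ is the evaluator's estimate on dataset $k$ and $\mathrm{FR}_k^{\mathrm{true}}$ is the failure
+% rate computed from the full synthetic outcome labels.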