%!TEX root = sl.tex
% The above command helps with compiling in TexShop on a Mac. Hitting typeset compiles sl.tex directly instead of producing an error here.
\section{Counterfactual-Based Imputation For Selective Labels}
\label{sec:imputation}

Problem~\ref{problem:the} is challenging because the dataset does not directly provide a way to evaluate \failurerate.
%
If decision maker \machine makes a positive decision for a case for which the dataset records a negative decision by a decision maker $\human_\judgeValue$, how can we infer the outcome \outcome in the hypothetical case where \machine's decision had been followed?
%
Such questions fall squarely within the realm of causal analysis and, in particular, the evaluation of counterfactuals~\cite{coston2020counterfactual} -- the approach that we follow in this paper.
%
The challenges we face are two-fold.
%
First, we generally do not have direct observations of the outcome under all of \machine's positive decisions.
%
A first thought is to simply {\it predict} the outcomes based on the features of the case.
%
In the bail-or-jail example, we could investigate whether certain features of the defendant (e.g., their age and marital status) are good predictors of whether they comply with the bail conditions -- and use them if they do.
%
However, not all features that are available to $\human_\judgeValue$ are available to \machine in the setting we consider, which forms our second major challenge.
%
These complications mean that making direct predictions based on the available features can be suboptimal and even biased.
%
However, important information regarding the unobserved features \unobservable can often be recovered through careful consideration of the decisions recorded in the data -- which is exactly what our counterfactual approach achieves~\cite{Jung2,mccandless2007bayesian}.

For illustration, let us consider a defendant who received a negative decision from a human judge $\human_\judgeValue$.
%
Suppose also that, among defendants with similar recorded features \obsFeatures who were released, none violated the bail conditions -- and therefore, judging from observations alone, the defendant should be considered safe to release based on \obsFeatures.
%
However, if the judge was both lenient and precise -- i.e., able to make exactly those positive decisions that lead to successful outcomes -- then it is quite possible that the negative decision is attributable to unfavorable non-recorded features \unobservable.
%
Therefore, if a positive decision were made, {\it the above reasoning suggests that a negative outcome is more likely than what would have been predicted based on the recorded features \obsFeatures of released defendants alone}.

Our approach for evaluating $\machine$ on cases where a negative decision by $\human_\judgeValue$ is recorded in the data unfolds over three steps: first, we learn a causal model over the dataset; then, we compute counterfactuals to predict the unobserved outcomes; and finally, we use these predictions to evaluate a set of decisions by \machine.
%
% \note{Michael}{Actually, the paragraph above describes a scenario where {\it labeled outcomes} and possibly {\it contraction} would fail. Specifically, create cases where:
% (i) Z has much larger coefficient than X, and (ii) the judge is good (the two logistic functions for judge decision and outcome are the same), and (iii) the machine is trained on labeled outcomes. The machine will see that the outcome is successful regardless of X, because Z will dominate the positive (and negative) decisions.
% So it will learn that everyone can be released. Labeled outcomes will evaluate the machine as good -- but our approach will uncover its true performance.}
\begin{figure}
\begin{center}
\begin{tikzpicture}[->,>=stealth',node distance=1.5cm, semithick]
  \tikzstyle{every state}=[fill=none,draw=black,text=black]
  \node[state] (R) [ellipse] at (0,1.0) {\hspace*{0mm}$\judge$: {\small Decision maker index}\hspace*{-4mm}};
  \node[state] (X) [ellipse] at (4.5,1) {\hspace*{0mm}$\obsFeatures$: {\small Observed features}\hspace*{-3mm}};
  \node[state] (T) [ellipse] at (2,0) {\hspace*{0mm}$\decision$: {\small Decision}\hspace*{-2mm}};
  \node[state] (Z) [rectangle] at (4.5,-1) {$\unobservable$: {\small Unobserved features}};
  \node[state] (Y) [ellipse] at (7,0) {\hspace*{0mm}$\outcome$: {\small Outcome}\hspace*{-2mm}};
  \path (R) edge (T)
        (X) edge (T) edge (Y)
        (Z) edge (T) edge (Y)
        (T) edge (Y);
% \draw [->] (2.2,1.6) to (1.8,0.4);
% \draw [->] (4.8,1.6) to (5.2,0.4);
\end{tikzpicture}
\end{center}
\caption{Causal diagram for the selective labels setting.}
% \caption{The causal diagram of decision making in the selective labels setting. $\decision$ is a binary decision, $\outcome$ is the outcome that is selectively labeled based on $\decision$. Background features $\obsFeatures$ for a subject affect the decision and the outcome. $\judge$ specifies the decision maker assignment, allowing us to model several decision makers with varying leniency. Importantly, decisions and outcomes may depend on additional latent background features $\unobservable$ not recorded in the data.}
\label{fig:causalmodel}
\end{figure}

\subsection{The Causal Model}
\label{sec:model_definition}

Recall from Section~\ref{sec:setting} that Figure~\ref{fig:causalmodel} shows the structure of causal relationships among the quantities of interest.
We use the following causal model over this structure, building on the models used by Lakkaraju et al.~\cite{lakkaraju2017selective} and others~\cite{Jung2,mccandless2007bayesian}.
%
First, we assume that the
% the observed feature vectors \obsFeatures and
unobserved features \unobservable can be modeled as a (continuous) one-dimensional risk factor~\cite{austin2011introduction,mccandless2007bayesian,rosenbaum1983central}.
Motivated by the central limit theorem, we use a Gaussian distribution for this factor; since $\unobservable$ is unobserved, we can assume its variance to be 1 without loss of generality, so that $\unobservable \sim N(0,1)$.
%, for example by using propensity scores~\cite{}.
%
Moreover, we present our modeling approach for the case of a single observed feature \obsFeatures -- this is done only for simplicity of presentation, as it is straightforward to extend the model to the case of multiple observed features.
%WELL ACTUALLY REASONING IS NOT THIS BUT THE GENERAL POINT
%ABOUT PROPENSITY SCORE; IT IS BETTER CONFIDENCE ALL OBSERVATIONS TO A SINGLE VARIABLE,
% we do this only in the compas section and it is not advertised there
%, as we do in the experiments (Section~\ref{sec:experiments}).
%
%Motivated by the central limit theorem we model both \obsFeatures and \unobservable with Gaussian distributions.
%%
% RL: LAST TWO SENTENCES ARE AND SHOULD BE IN EXPERIMENTS? SHOULD NOT MATTER WHILE MODELLING.
% (Any deviation from this can be achieved by adjusting intercepts and coefficients in the following).
In our setting, a negative decision $\decision=0$ always leads to a successful outcome $\outcome=1$.
%
When $\decision=1$, the outcome is modeled with a logistic regression over the features $\obsFeatures$ and $\unobservable$:
\begin{eqnarray}
\prob{\outcome=1~|~\decision, \obsFeaturesValue, \unobservableValue} & = & \begin{cases}
1, & \text{if}~\decision = 0 \\
\invlogit(\alpha_\outcome + \beta_\obsFeatures \obsFeaturesValue + \beta_\unobservable \unobservableValue), & \text{otherwise}
\end{cases} \label{eq:defendantmodel}
\end{eqnarray}
Here \invlogit is the standard logistic function.
%
% Since the decisions are ultimately based on expected behaviour,
We model the decisions in the data similarly with a logistic regression:
%, since the decisions are ultimately based on expected behaviour, according to a logistic regression over the features:
\begin{equation}
\prob{\decision = 1~|~\judgeValue, \obsFeaturesValue, \unobservableValue} = \invlogit(\alpha_\judgeValue + \gamma_\obsFeatures \obsFeaturesValue + \gamma_\unobservable \unobservableValue)
\label{eq:judgemodel}
\end{equation}
%
Although we model the decision makers probabilistically, we do not imply that their decisions are necessarily probabilistic: the probabilistic model arises because the specific details of the reasoning employed by each decision maker $\human_\judgeValue$ are unknown to us.
Note also that we make the simplifying assumption that the coefficients $\gamma_\obsFeatures, \gamma_\unobservable$ are the same for all decision makers, while decision makers are allowed to differ in their intercept $\alpha_\judgeValue$.
%
Parameter $\alpha_{\judgeValue}$ controls the leniency of decision maker $\human_\judgeValue \in \humanset$.

We take a Bayesian approach to learn the model from the dataset.
%
In particular, we consider the full probabilistic model defined in Equations~\ref{eq:defendantmodel} and \ref{eq:judgemodel} and obtain the posterior distribution of its parameters $\parameters = \{ \alpha_\outcome, \beta_\obsFeatures, \beta_\unobservable, \gamma_\obsFeatures, \gamma_\unobservable\} \cup \bigcup_{\human_\judgeValue \in \humanset} \{\alpha_\judgeValue\}$, which includes an intercept $\alpha_\judgeValue$ for every decision maker $\human_\judgeValue$ employed in the data.
We use suitable prior distributions to ensure the identifiability of the parameters (Appendix~\ref{sec:priors}).
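For concreteness, the following minimal sketch (in Python) simulates data from the model of Equations~\ref{eq:defendantmodel} and~\ref{eq:judgemodel}, assuming a single observed feature; all variable names and parameter values are illustrative only and are not part of the model definition.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

def invlogit(a):
    # Standard logistic function.
    return 1.0 / (1.0 + np.exp(-a))

# Illustrative parameters: one intercept per decision
# maker (leniency), shared decision-model coefficients,
# and outcome-model coefficients.
alpha_j = np.array([-1.0, 0.0, 1.0])
gamma_x, gamma_z = 1.0, 1.0
alpha_y, beta_x, beta_z = 1.0, 1.0, 1.0

n = 10000
# Decision maker assignment, observed and unobserved features.
j = rng.integers(0, len(alpha_j), size=n)
x = rng.normal(size=n)
z = rng.normal(size=n)          # Z ~ N(0, 1)

# Decision model: P(T = 1 | j, x, z).
p_t = invlogit(alpha_j[j] + gamma_x * x + gamma_z * z)
t = rng.binomial(1, p_t)

# Outcome model: Y = 1 whenever T = 0,
# otherwise logistic in x and z.
p_y = invlogit(alpha_y + beta_x * x + beta_z * z)
y = np.where(t == 0, 1, rng.binomial(1, p_y))
\end{verbatim}
Note that only $\judge$, $\obsFeatures$, $\decision$, and (selectively) $\outcome$ would be recorded in an actual dataset; $\unobservable$ remains unobserved.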
\subsection{Computing Counterfactual Outcomes}

Recall that our goal is to provide a solution to Problem~\ref{problem:the} -- and, to do that, we need to address those cases where $\machine$ makes a positive decision $\decision = 1$ while the data records a negative decision $\decision = 0$, since for these cases evaluation cannot be performed directly.
%
In other words, we wish to answer a `what-if' question: for each specific case where a decision maker $\human_\judgeValue$ decided $\decision = 0$, what if we had intervened to alter the decision to $\decision = 1$?
%
In the formalism of causal inference~\cite{pearl2010introduction}, we wish to evaluate the counterfactual expectation
\begin{align}
\cfoutcome & = \expectss{\decision \leftarrow 1}{\outcome~|~\obsFeaturesValue, \judgeValue, \decision = 0; \dataset} \label{eq:counterfactual}
\end{align}
The expression above concerns a specific entry in the dataset with features $\obsFeatures = \obsFeaturesValue$, for which decision maker $\human_\judgeValue$ made a decision $\decision = 0$.
%
It expresses the probability that the outcome would have been positive ($\outcome = 1$) had the decision been positive ($\decision = 1$), conditional on what we know from the data entry ($\obsFeatures = \obsFeaturesValue$, $\decision = 0$, $\judge = \judgeValue$) as well as from the entire dataset \dataset.
%
Notice that the presence of \dataset in the conditioning part of Equation~\ref{eq:counterfactual} gives us more information about the data entry than the entry-specific quantities
%($\obsFeatures = \obsFeaturesValue$, $\decision = 0$)
alone, and is thus not redundant.
%
In particular, it provides information about the leniency and other parameters of the decision maker $\human_\judgeValue$, which in turn is important for inferring information about the unobserved features \unobservable, as discussed at the beginning of this section.

For the model defined above, the counterfactual expectation \cfoutcome can be computed following the approach of Pearl~\cite{pearl2010introduction}.
For a fully defined model (with fixed parameters), the counterfactual expectation is given by the following expression:
\begin{align}
\expectss{\decision \leftarrow 1}{\outcome~|~\judgeValue, \decision=0, \obsFeaturesValue} & = \int \prob{\outcome=1~|~\decision=1, \obsFeaturesValue, \unobservableValue} \, \prob{\unobservableValue~|~\judgeValue, \decision=0, \obsFeaturesValue} \diff{\unobservableValue} \label{eq:counterfactual_eq}
\end{align}
In essence, we determine the distribution of the unobserved features $\unobservable$ from the decision, the observed features $\obsFeaturesValue$, and the leniency of the employed decision maker, and then determine the distribution of $\outcome$ conditional on all features, integrating over the unobserved features (Appendix~\ref{sec:counterfactuals}).
Note that the decision maker model in Equation~\ref{eq:judgemodel} affects the distribution of the unobserved features $\prob{\unobservableValue~|~\judgeValue, \decision=0, \obsFeaturesValue}$.
Having obtained a posterior probability distribution for the parameters \parameters, we can estimate the counterfactual outcome value based on the data:
\begin{equation}
\cfoutcome = \int \prob{\outcome=1~|~\decision=1, \obsFeaturesValue, \unobservableValue, \parameters} \, \prob{\unobservableValue~|~\judgeValue, \decision=0, \obsFeaturesValue, \parameters} \diff{\unobservableValue} \, \prob{\parameters~|~\dataset} \diff{\parameters}
\label{eq:theposterior}
\end{equation}
Note that for all data entries other than those with $\decision = 0$, we trivially have $\cfoutcome = \outcome$, where \outcome is the outcome recorded in the dataset \dataset.
%\spara{Implementation}
The result of Equation~\ref{eq:theposterior} can be computed numerically:
\begin{equation}
\cfoutcome \approx \frac{1}{N} \sum_{k=1}^{N} \prob{\outcome = 1~|~\decision = 1, \obsFeaturesValue, \unobservableValue_k, \parameters_k}
\label{eq:expandcf}
\end{equation}
where the sum is taken over $N$ samples of $\parameters$ and $\unobservable$ obtained from their respective posterior distributions.
%
In practice, we use the MCMC functionality of Stan
%\footnote{\url{https://mc-stan.org/}}
to obtain these samples.
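As a minimal illustration of how Equation~\ref{eq:expandcf} can be evaluated, the sketch below (in Python) averages the outcome probability over posterior draws; it assumes that draws of the outcome-model parameters and of the entry-specific unobserved feature have already been extracted from the MCMC output as arrays, and all names are illustrative.
\begin{verbatim}
import numpy as np

def invlogit(a):
    return 1.0 / (1.0 + np.exp(-a))

def impute_outcome(x, alpha_y, beta_x, beta_z, z):
    # Monte Carlo estimate of the counterfactual outcome
    # for one entry with observed feature x and a negative
    # decision in the data.
    #   alpha_y, beta_x, beta_z: N posterior draws of the
    #     outcome-model parameters;
    #   z: N draws of the entry's unobserved feature from
    #     its posterior given (j, T = 0, x, parameters).
    # Average P(Y = 1 | T <- 1, x, z_k, theta_k) over draws.
    return np.mean(
        invlogit(alpha_y + beta_x * x + beta_z * z))
\end{verbatim}
Note that the average is taken over outcome probabilities, exactly as in Equation~\ref{eq:expandcf}, rather than over sampled binary outcomes.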
\subsection{Evaluating Decision Makers}

Expression~\ref{eq:expandcf} gives us a direct way to evaluate the outcome of a positive decision for any data entry for which $\decision = 0$.
%
Note, though, that unlike $\outcome$, which takes values in $\{0, 1\}$, \cfoutcome may also take fractional values $\cfoutcome \in [0, 1]$.
%
Having obtained outcome estimates for all data entries, it is now straightforward to obtain an estimate of the failure rate $\failurerate$ of decision maker \machine: it can be computed as a simple average of $1 - \cfoutcome$ over all data entries.
%
Our approach is summarized in Figure~\ref{fig:approach}.
%
We will refer to it as \textbf{\cfbi}, for {\underline c}ounter{\underline f}actual-{\underline b}ased {\underline i}mputation.

\begin{figure}[t!]
\begin{center}
%\includegraphics[height=2in]{img/setting}
\includegraphics[width=0.95\columnwidth]{img/fig3_antti}
\end{center}
\caption{\cfbi.
%
Negative decisions ($\decision = 0$) by decision maker $\machine$ are evaluated as successful ($\cfoutcome = 1$), shown with dashed arrows.
%
Positive decisions ($\decision = 1$) by decision maker $\machine$ for which the decision in the data was also positive ($\decision = 1$) are evaluated according to the outcome $\outcome$ in the data, as marked by the solid arrow.
%
For the remaining cases (second and third), the evaluated outcomes $\cfoutcome$ are based on our counterfactual imputation technique.
The failure rate of decision maker $\machine$ here is $2.7/7 \approx 38.6\%$.
}
\label{fig:approach}
\end{figure}
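To make the evaluation step explicit, the following sketch (in Python) computes the failure rate of \machine from the evaluated outcomes, consistent with the description above and with Figure~\ref{fig:approach}; the function and argument names are illustrative only.
\begin{verbatim}
import numpy as np

def failure_rate(t_machine, t_data, y_data, y_imputed):
    # t_machine: decisions by M; t_data, y_data: decisions
    # and outcomes recorded in the data; y_imputed: imputed
    # counterfactual outcomes (used when the data has T = 0).
    # All arrays have one element per data entry.
    #
    # Outcome evaluated under M's decision: M's negative
    # decisions are successful by definition; its positive
    # decisions use the recorded outcome when available and
    # the imputed outcome otherwise.
    evaluated = np.where(t_machine == 0, 1.0,
                         np.where(t_data == 1, y_data,
                                  y_imputed))
    # Failure rate: average of (1 - evaluated outcome).
    return float(np.mean(1.0 - evaluated))
\end{verbatim}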