%!TEX root = sl.tex
% The above command helps compiling in TexShop on a Mac. Hitting typeset compiles sl.tex directly instead of producing an error here.

\section{Setting and problem statement}
\label{sec:setting}
We consider data recorded from a decision making process with the following characteristics~\cite{lakkaraju2017selective}.
Each case is decided by one decision maker, and we use $\judge$ to index the decision maker to whom a case is assigned.
For each such assignment, a decision maker $\human_\judgeValue$ (where $\judgeValue$ is a particular value for $\judge$)  considers a case described by a set of features \allFeatures and makes a binary decision $\decision \in\{0, 1\}$, nominally referred to as {\it positive} ($\decision = 1$) or {\it negative} ($\decision = 0$).
%
Intuitively, in our bail-or-jail example of Section~\ref{sec:introduction}, $\human_\judgeValue$ corresponds to the human judge deciding whether to grant bail ($\decision = 1$) or not ($\decision = 0$).
The decision is followed with a binary outcome $\outcome$, which is nominally referred to as {\it successful} ($\outcome = 1$) or {\it unsuccessful} ($\outcome = 0$).
An outcome can be {\it unsuccessful} ($\outcome = 0$) only if the decision that preceded it was positive ($\decision = 1$).
If the decision was not positive ($\decision = 0$), then the outcome is considered by default successful ($\outcome = 1$).
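In other words, the convention above can be stated formally as
\begin{equation}
	P(\outcome = 1~|~\decision = 0) = 1 .
\end{equation}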
Back in our example, the decision of the judge is unsuccessful only if the judge grants bail ($\decision = 1$) but the defendant violates its terms ($\outcome = 0$).
Otherwise, if the decision of the judge was to keep the defendant in jail ($\decision = 0$), the outcome is by default successful ($\outcome = 1$) since there can be no bail violation.
Moreover, we assume that each decision maker $\human_\judgeValue$ is associated with a leniency level $\leniency$, which determines the fraction of cases for which they produce a positive decision, in expectation.

For each case, a record $(\judgeValue, \obsFeaturesValue, \decisionValue, \outcomeValue)$ is produced that contains observations of only a subset $\obsFeatures\subseteq \allFeatures$ of the features of the case, the decision $\decision$ of the decision maker, and the outcome $\outcome$ -- but leaves no trace of the remaining features $\unobservable = \allFeatures \setminus \obsFeatures$.
Intuitively, in our example, $\obsFeatures$ corresponds to publicly recorded information about the bail-or-jail case decided by the judge (e.g., the harshness of the possible crime) and $\unobservable$ corresponds to features that are observed by the judge but do not appear on record (e.g., exact verbal response of the defendant in court).
The set of records $\dataset = \{(\judgeValue, \obsFeaturesValue, \decisionValue, \outcomeValue)\}$ comprises what we refer to as the {\bf dataset}.
The dataset generally includes records from \emph{more than one} decision maker, indexed by $\judgeValue$.
Figure~\ref{fig:causalmodel} shows the causal diagram of this decision making process.
Based on the recorded data, we wish to evaluate a decision maker \machine that considers a case from the dataset and makes its own binary decision $\decision$ based only on the recorded features $\obsFeatures$.
In our example, \machine corresponds to a machine-based automated decision making system that is considered for replacing the human judge in bail-or-jail decisions.
Notice that we assume \machine has access only to some of the features that were available to the human decision makers, to model cases where the system would use only the recorded features and not others that would be available to a human judge.
For decision maker \machine, the definition and semantics of the decision $\decision$ and the outcome $\outcome$ are the same as for the human decision makers $\human_\judgeValue$ described above.
The quality of a decision maker \machine is measured in terms of its {\bf failure rate} \failurerate, i.e., the fraction of undesired outcomes ($\outcome=0$) out of all the cases for which it makes a decision.
A good decision maker achieves as low a failure rate \failurerate as possible.
Note, however, that a decision maker that always makes a negative decision ($\decision=0$) trivially achieves failure rate $\failurerate = 0$, by definition.
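In the notation above, and since an unsuccessful outcome ($\outcome = 0$) can only follow a positive decision ($\decision = 1$), the failure rate can equivalently be written as
\begin{equation}
	\failurerate = P(\outcome = 0) = P(\outcome = 0, \decision = 1) ,
\end{equation}
which also makes explicit why an always-negative decision maker trivially attains $\failurerate = 0$.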
Thus, for the evaluation to be meaningful, we evaluate decision makers at different leniency levels $\leniency$.
\begin{problem}[Evaluation]
Given a dataset $\dataset = \{(\judgeValue, \obsFeaturesValue, \decisionValue, \outcomeValue)\}$ and a decision maker \machine, provide an estimate of the failure rate \failurerate of \machine at a given leniency level $\leniency = \leniencyValue$.
\label{problem:the}
\end{problem}
\noindent
The main challenge in estimating \failurerate is that, in general, the dataset does not contain the information required to compute it directly.
In particular, suppose that we wish to evaluate decision maker \machine, and that \machine makes a decision for the case corresponding to a record in the dataset, based on its recorded features \obsFeaturesValue.
Suppose also that the decision recorded in the data was negative, $\decision = 0$, in which case the recorded outcome is by default successful, $\outcome = 1$.
If the decision by \machine is $\decision = 1$, then it is not possible to tell directly from the dataset what its outcome $\outcome$ would be.
The approach we take to address this challenge is to use counterfactual reasoning to infer what the outcome $\outcome$ would have been had the decision been $\decision = 1$, as detailed in Section~\ref{sec:imputation} below.
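In potential-outcome notation, one possible way to express the quantity that must be inferred for such a record is
\begin{equation}
	P(\outcome(\decision = 1) = 1~|~\obsFeatures = \obsFeaturesValue) ,
\end{equation}
i.e., the probability of a successful outcome had the decision been positive, given the recorded features; this is only a sketch of the formalization, which is developed in Section~\ref{sec:imputation}.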


