% There is a rich literature on problems that arise in settings similar to ours.
%
% The works discussed herein are indicative, but not the complete literature.
% COUNTERFACTUALS: FUNDAMENTAL PROBLEM
At its core, our task is to answer a `what-if' question, i.e., ``what would the outcome have been if a different decision had been made'' (a counterfactual) -- often referred to as the `fundamental problem' of causal inference~\cite{holland1986statistics}.
% SELECTION BIAS
Settings where data samples are chosen through some filtering mechanism are said to exhibit {\it selection bias}~\cite{hernan2004structural}.
%
% In our setting, any model predicting outcomes can only directly use data samples where the decision was positive.
% MISSING DATA %IMPUTATION
Settings where some variables are not observed for all samples have \emph{missing data}~\cite{little2019statistical}.
%
% In our setting, the outcomes for samples with a negative decision are considered missing, or labeled with some default value.
%
\emph{Latent confounding} refers to the presence of unobserved variables that affect two or more of the observed variables~\cite{pearl2010introduction}.
%
% In our setting, there are generally features not recorded in the data that affect both the decision and the outcome.
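As a toy illustration (not taken from any of the cited works), the following sketch simulates data exhibiting all three issues at once: a latent confounder \texttt{z} affects both the decision and the outcome, the outcome is recorded only for positively decided cases (selective labels), and a naive estimate computed on the selected samples is therefore biased. All variable names and parameter values are hypothetical.

```python
import numpy as np

# Toy simulation (illustrative only): a latent confounder z affects both
# the decision t and the outcome y; y is recorded only when t is True.
rng = np.random.default_rng(0)
n = 100_000
z = rng.normal(size=n)                       # latent confounder (unobserved)
x = rng.normal(size=n)                       # observed feature
t = (x + z + rng.normal(size=n) > 0)         # decision: selection depends on z
y = (0.5 * x + z + rng.normal(size=n) > 0)   # binary outcome, confounded by z

# Selective labels: the outcome is only observed where the decision was positive.
y_obs = np.where(t, y, np.nan)

p_true = y.mean()             # population rate (unknowable in practice)
p_naive = np.nanmean(y_obs)   # naive estimate from the selected samples only

print(f"true  P(Y=1) = {p_true:.3f}")
print(f"naive on T=1 = {p_naive:.3f}")  # biased upward: selected cases tend to have high z
```

Because cases with high \texttt{z} are both more likely to be selected and more likely to have a positive outcome, the naive estimate overshoots the population rate, which is exactly why ignorability/MAR-style assumptions fail in this setting.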
%Research on selection bias has achieved results in recovery the structure of the generative model (i.e., the mechanism that results in bias) and estimating causal effects (e.g.,~\cite{pearl1995empirical} and~\cite{bareinboim2012controlling}).
%OFFLINE POLICY EVALUATION
\emph{Offline policy evaluation} refers to the assessment of a decision policy over a dataset recorded under another policy~\cite{Jung2} -- which also matches the setting of this work.
%COUNFOUNDING AND SENSITIVITY ANALYSIS
In this work, we adopted the setting of~\cite{lakkaraju2017selective}, and showed that causally informed counterfactual imputation can achieve accurate results.
%PERHAPS WE DONT NEED TIHS
% The setting allows for unobserved confounding, and so it cannot be addressed with standard methods for processing missing data, which typically make strong {\it ignorability} or {\it missing at random (MAR)} conditions~\cite{DBLP:conf/icml/DudikLL11,bang2005doubly,little2019statistical}.
%
% In our simulations we compared in particular to \contraction of~\cite{lakkaraju2017selective}, an approach that is appealing in its simplicity.
% %
% However, as our experiments confirm, it is quite sensitive to the number of subjects assigned to (the most) lenient decision makers.
% %
In addition, Kleinberg et al.~\cite{kleinberg2018human} present a detailed account of employing \contraction on real data.
%
In their experiments, they use a decision maker that is set up similarly to the \independent decision makers discussed in our work -- but one that makes decisions based not on leniency but on a threshold determined by cost or utility values.
In comparison, \cfbi uses rigorous causal modelling to account for leniency, and remains accurate even when the expert consistency assumption of~\cite{dearteaga2018learning} is violated.
%% MM: I take out the following, they are less related and we are out of space
\iffalse
In reinforcement learning, a related scenario is that of offline policy evaluation, where the objective is to determine the quality of a policy from data recorded under some other baseline policy \cite{Jung2,DBLP:conf/icml/ThomasB16}.
%
In particular, Jung et al. \cite{Jung2} consider sensitivity analysis in a similar scenario as ours, but without directly modelling decision makers with multiple leniencies.
%
McCandless et al. perform Bayesian sensitivity analysis while taking into account latent confounding~\cite{mccandless2007bayesian,mccandless2017comparison}.
%
\cite{kallus2018confounding} obtain improved policies from data possibly biased by a baseline policy.
\fi
\iffalse
The effectiveness of causal modelling and use of counterfactuals is also demonstrated in recent work on algorithmic fairness~\cite{DBLP:conf/icml/NabiMS19,DBLP:conf/icml/Kusner0LS19,coston2020counterfactual,madras2019fairness,corbett2017algorithmic,DBLP:journals/jmlr/BottouPCCCPRSS13,DBLP:conf/icml/JohanssonSS16}.
%
Several works study selection bias or missing data in the context of identifiability of causal effects and causal structure~\cite{bareinboim2012controlling,hernan2004structural,little2019statistical,Bareinboim2014:selectionbias,smr1999,Mohan2013,Shpitser2015}.
%
\fi
Finally, more applied work on automated decision making and risk scoring, related in particular to recidivism, can be found for example in~\cite{murder,tolan2019why,kleinberg2018human,chouldechova2017fair,brennan2009evaluating,royal}.