We adopted the overall setting as formulated by~\citet{lakkaraju2017selective}, and showed that causally informed counterfactual imputation can achieve accurate results.
The setting allows for unobserved confounding, and so it cannot be addressed with standard methods for processing missing data, which typically make strong {\it ignorability} or {\it missing at random} (MAR) assumptions~\cite{DBLP:conf/icml/DudikLL11,bang2005doubly,little2019statistical}.
In our simulations we compared in particular to the \contraction approach of~\citet{lakkaraju2017selective}, which is appealing in its simplicity.
%
...
...
However, as our experiments confirm, it is quite sensitive to the number of subjects assessed by the most lenient decision makers.
%
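As a rough sketch of the contraction idea (hypothetical function and variable names; a simplified illustration under our reading of the technique, not the implementation of \citet{lakkaraju2017selective}): starting from subjects released by the most lenient decision maker, whose outcomes are observed, the evaluated model detains the riskiest of them until the release rate contracts to the target leniency, and the failure rate is then estimated from the remaining released subjects.

```python
import numpy as np

def contraction(released, outcome, risk_score, target_rate):
    """Sketch of contraction-style evaluation (hypothetical interface).

    released:    boolean array, True if the most lenient decision maker
                 released the subject (outcomes observed only for these)
    outcome:     boolean array, True if a failure was observed
    risk_score:  evaluated model's predicted risk, higher = riskier
    target_rate: desired fraction of ALL subjects to release; must not
                 exceed the lenient decision maker's release rate
    """
    n_total = len(released)
    idx = np.where(released)[0]           # subjects with observed outcomes
    n_keep = int(target_rate * n_total)   # releases allowed at target rate
    if n_keep > len(idx):
        raise ValueError("target rate exceeds the lenient release rate")
    # Detain the riskiest released subjects until the rate contracts
    # to the target; keep the n_keep least risky ones.
    order = idx[np.argsort(risk_score[idx])]
    kept = order[:n_keep]
    # Failures among subjects still released, over the whole population.
    return outcome[kept].sum() / n_total
```

This also makes the sensitivity noted above concrete: the estimate is computed only from the lenient decision makers' released subjects, so a small caseload there leaves few observed outcomes to contract over.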
In addition, \citet{kleinberg2018human} present a detailed account of employing \contraction on real data.
%
In their experiments, they use a decision maker that is set up similarly to the \independent decision makers discussed in our work, but who makes decisions based not on leniency but on a threshold determined by cost or utility values.
Unlike our imputation approach (\cfbi), De-Arteaga et al.~\cite{dearteaga2018learning} directly impute decisions as outcomes and consider learning automatic decision makers from such augmented data.
%
\citet{kleinberg2018human} use a multiplicative correction term to adjust for the bias observed with more conventional imputation.
%
In comparison, \cfbi is based on a rigorous causal model that accounts for leniency and unobservables, and gives accurate results even with random decision makers that violate the expert consistency assumption of \cite{dearteaga2018learning}.
In reinforcement learning, a related scenario is that of offline policy evaluation, where the objective is to determine the quality of a policy from data recorded under some other baseline policy~\cite{Jung2,DBLP:conf/icml/ThomasB16}.
%
In particular, Jung et al.~\cite{Jung2} consider sensitivity analysis in a scenario similar to ours, but without directly modelling decision makers with multiple leniencies.
%
McCandless et al. perform Bayesian sensitivity analysis while taking into account latent confounding~\cite{mccandless2007bayesian,mccandless2017comparison}.
%
\citet{kallus2018confounding} obtain improved policies from data possibly biased by a baseline policy.
The effectiveness of causal modeling and use of counterfactuals is also demonstrated in recent work on algorithmic fairness~\cite{DBLP:conf/icml/NabiMS19,DBLP:conf/icml/Kusner0LS19,coston2020counterfactual,madras2019fairness,corbett2017algorithmic,DBLP:journals/jmlr/BottouPCCCPRSS13,DBLP:conf/icml/JohanssonSS16}.
%
Several works study selection bias or missing data in the context of identifiability of causal effects and causal structure~\cite{bareinboim2012controlling,hernan2004structural,little2019statistical,Bareinboim2014:selectionbias,smr1999,Mohan2013,Shpitser2015}.
%
Finally, more applied work on automated decision making and risk scoring, related in particular to recidivism, can be found, for example, in~\cite{murder,tolan2019why,kleinberg2018human,chouldechova2017fair,brennan2009evaluating,royal}.