In comparison, \cfbi uses rigorous causal modelling to account for leniency and
In reinforcement learning, a related scenario is that of offline policy evaluation, where the objective is to estimate the quality of a policy from data recorded under a different baseline policy \cite{Jung2,DBLP:conf/icml/ThomasB16}.
%
In particular, Jung et al.~\cite{Jung2} consider sensitivity analysis in a scenario similar to ours, but without directly modelling decision makers with multiple leniencies. McCandless et al. perform Bayesian sensitivity analysis that accounts for latent confounding~\cite{mccandless2007bayesian}.
%,mccandless2017comparison
Kallus et al. learn improved policies from data that may be biased by the baseline policy~\cite{kallus2018confounding}.
% There is a rich literature on problems that arise in settings similar to ours.
The effectiveness of causal modelling and counterfactuals has also been demonstrated in recent work on, for example, algorithmic fairness~\cite{DBLP:conf/icml/Kusner0LS19,coston2020counterfactual,madras2019fairness,corbett2017algorithmic,DBLP:conf/aaai/ZhangB18}.
%Several works study selection bias or missing data in the context of identifiability of causal effects and causal structure~\cite{bareinboim2012controlling,hernan2004structural,little2019statistical,Bareinboim2014:selectionbias,smr1999,Mohan2013,Shpitser2015}.
%
%Also identifiability questions in the presence of selection bias or missing data mechanisms require detailed causal modelling~\cite{bareinboim2012controlling,hernan2004structural,little2019statistical}.