@@ -34,7 +34,7 @@ In comparison, \cfbi uses rigorous causal modelling to account for leniency and
In reinforcement learning, a related scenario is that of offline policy evaluation, where the objective is to determine the quality of a policy from data recorded under some other baseline policy~\cite{Jung2,DBLP:conf/icml/ThomasB16}.
In particular, Jung et al.~\cite{Jung2} consider sensitivity analysis in a scenario similar to ours, but without directly modelling decision makers with multiple leniencies. McCandless et al. perform Bayesian sensitivity analysis while taking into account latent confounding~\cite{mccandless2007bayesian}.
Kallus et al. obtain improved policies from data possibly biased by a baseline policy~\cite{kallus2018confounding}.
The effectiveness of causal modelling and counterfactuals is also demonstrated in recent work on, e.g., fairness~\cite{DBLP:conf/icml/Kusner0LS19,coston2020counterfactual,madras2019fairness,corbett2017algorithmic,DBLP:conf/aaai/ZhangB18}.
