@@ -18,7 +18,7 @@ As digitalisation affects more and more aspects of life, so do the automated dec
This is especially prevalent in web services, where algorithms decide what results are shown by search engines, what news stories appear on social media platforms, or what products are recommended by online stores.
But automated decision systems are also starting to appear in other domains: credit scoring, choice of medical treatment, insurance pricing, and even judicial decisions (COMPAS~\cite{brennan2009evaluating} and RisCanvi~\cite{tolan2019why} are two examples of algorithmic tools used to evaluate the risk of recidivism in the US and Catalan prison systems, respectively).
In comparison to human decision making, automated decision making by AI systems holds the promise of better decision quality~\cite{kleinberg2018human,royal}, and possibly even guarantees of fairness~\cite{DBLP:conf/icml/Kusner0LS19}.
But before deploying machines to make automated decisions, their performance must be evaluated in realistic settings.
In practice, this is done by simulating their deployment over a log of past cases and measuring how well they would have performed had they been used in place of the human decision makers or other decision system currently in place.
Herein lies a challenge: previously made decisions, which may be based on case features not recorded in the data, determine which outcomes are observed, and thus bias the data in a way that prevents straightforward evaluation.
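To make the difficulty concrete, the following minimal Python sketch (all data, variable names, and the policy are hypothetical; pandas is assumed) replays a candidate decision model over a logged dataset in which outcomes were recorded only after positive decisions:
\begin{verbatim}
import pandas as pd

# Hypothetical log of past cases: 'decision' is the past decision
# maker's choice (1 = positive), and 'outcome' was only recorded
# when that decision was positive.
log = pd.DataFrame({
    "x":        [0.2, 0.7, 0.5, 0.9, 0.1],
    "decision": [1,   1,   0,   1,   0],
    "outcome":  [0.0, 1.0, None, 1.0, None],
})

def candidate_policy(x):
    # Stand-in for the automated system we would like to evaluate.
    return int(x > 0.4)

log["new_decision"] = log["x"].apply(candidate_policy)

# Naive replay: success rate over the cases the candidate accepts.
# Cases rejected by the *past* decision maker carry no outcome and
# silently drop out of the estimate; this selective labelling is
# what prevents a straightforward evaluation.
accepted = log[(log["new_decision"] == 1) & log["outcome"].notna()]
print(accepted["outcome"].mean())
\end{verbatim}
Note that the estimate is computed only over cases that the past decision maker happened to accept; if those cases differ systematically from the rest, the replayed success rate is biased.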
...
...
@@ -59,7 +59,9 @@ At its core, our task is to answer a `what-if' question, asking ``what would the
% SELECTION BIAS
Settings where data samples are selected through some non-random filtering mechanism are said to exhibit \emph{selection bias} (see, for example, \citet{hernan2004structural}).
%
In such settings, the direct approach to training a model that predicts outcomes is to train only on the data samples where the decision was positive; in our setting, these are the only samples whose outcomes a model can directly use.
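As a minimal sketch of this direct approach (hypothetical synthetic data; scikit-learn is assumed), the outcome model below is fitted only on the positively-decided subsample, which inherits whatever bias the past decisions induced:
\begin{verbatim}
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical data: features X, past decisions D, outcomes Y.
# Y is only observed where D == 1.
X = rng.normal(size=(1000, 3))
D = (X[:, 0] + rng.normal(size=1000) > 0).astype(int)
Y = (X @ np.array([1.0, -0.5, 0.2])
     + rng.normal(size=1000) > 0).astype(int)

# Direct approach: train only where the decision was positive.
# Since D depends on the features (and, in practice, also on
# unrecorded ones), this subsample is not representative of the
# full population, and the model inherits that selection bias.
mask = D == 1
outcome_model = LogisticRegression().fit(X[mask], Y[mask])
\end{verbatim}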
% MISSING DATA %IMPUTATION
Settings where some variables are not observed for all samples are said to have \emph{missing data} (see, for example, \citet{little2019statistical}). In our setting, the outcomes of samples that received a negative decision are considered missing, or are labeled with some default value.
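Concretely, writing $T \in \{0, 1\}$ for the decision and $Y$ for the outcome (symbols chosen here for illustration only), the recorded outcome is
\[
Y^{\mathrm{obs}} =
\begin{cases}
Y & \text{if } T = 1,\\
\text{missing (or a default value such as $0$)} & \text{if } T = 0.
\end{cases}
\]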