In this paper we considered the evaluation of (automatic) decision makers, which is vitally needed given current aims of replacing human decision making with various kinds of automatic decision making procedures. The challenge is that the evaluation often must be based on data in which the decisions present imply selective labeling and missing data, thus biasing the results of any standard statistical analysis. We showed that with proper causal modeling, automatic decision makers can be evaluated even on such selectively labeled data. Compared with previous methods, our proposed approach allows for more accurate evaluations with less variation, including in settings where evaluation was not possible before.
In the future we will examine further generalizations of the setting and the modeling assumptions: more intricate differences in decision makers' behaviour could be modeled, e.g., by hierarchical Bayesian modeling.
%
%
Since our approach predicts outcomes based on decisions made by educated decision makers, it remains an open question how much this information benefits the estimation of the statistical models that the automatic decision makers are ultimately based on~\cite{dearteaga2018learning}.
%
%
We believe such approaches will allow for better evaluations in new application fields, ensuring the accuracy and fairness of automatic decision making procedures that can then be adopted in society.
De-Arteaga et al. also note the possibility of using the decisions in the data to correct for selective labels, assuming expert consistency. They directly impute decisions as outcomes and consider learning automatic decision makers~\cite{dearteaga2018learning}. In contrast, our approach to decision maker evaluation is based on a rigorous probabilistic model that accounts for different leniencies and unobservables. Furthermore, our approach gives accurate results even with random decision makers, which clearly violate the expert consistency assumption. \acomment{We should refer to De-Arteaga et al. somewhere early on; they have made the same discovery as we but presented it poorly.}
\subsection{Counterfactuals}
Recent research has shown the value of counterfactual reasoning in settings similar to that of this paper, for fairness of decision making, and in applications in online advertising~\cite{DBLP:journals/jmlr/BottouPCCCPRSS13,DBLP:conf/icml/Kusner0LS19,DBLP:conf/icml/NabiMS19,DBLP:conf/icml/JohanssonSS16,pearl2000}.
\subsection{Imputation}
\subsection{Older}
%
%
Although contraction is computationally simple and efficient, and estimates the true failure rate well, it has some limitations.
...
@@ -29,7 +44,7 @@ The disadvantage of that is that it may delay the presentation of the main contr
On the other hand, we should make sure that competing methods like \citet{lakkaraju2017selective} are sufficiently described before they appear in the experiments.
}
}
Discuss this: \cite{DBLP:conf/icml/Kusner0LS19}
\begin{itemize}
\begin{itemize}
\item Lakkaraju and contraction. \cite{lakkaraju2017selective}