Commit cc100048 authored by Michael Mathioudakis

Make pass over abstract

parent a8a41dc1
@@ -62,19 +62,18 @@
\begin{abstract}
As an increasing number of decisions affecting people's lives are made by AI systems, automating the evaluation of such systems becomes increasingly important.
Today, AI systems replace humans in an increasing number of decisions affecting people's lives.
%
One major challenge for evaluation is that often decisions skew the data on which the evaluation is performed.
Therefore, it is important to evaluate the performance of such systems {\it offline}, i.e., before they are deployed in real settings --
and compare it to the performance of human decisions they aim to replace.
%
% For example, when deciding whether a defendant should be granted bail or rather be led to jail, a decision is deemed successful if it grants bail to defendants who would honor the conditions of the bail and leads to jail ones who would violate them.
One major challenge in such cases is that often past decisions have skewed the data on which the evaluation is performed.
%
% However, in such cases, we are only able to directly evaluate the mechanism when it grants bail, while we cannot observe the potential bail violations by defendants who were led to jail.
For example, when a bank decides whether a customer should be granted a loan, it is desired to grant loans to customers who would honor its conditions, but not to ones who would violate them.
%
For example, when a bank decides whether a customer should be granted a loan or not, a decision is deemed successful if it grants a loan to a customer who would honor its conditions, but not to one who would violate them.
%
However, in such cases, we are only able to directly evaluate the decision to grant the loan, while we cannot observe whether customers who were not granted the loan would indeed violate its conditions.
%
To evaluate the decision not to grant the loan, one approach is to infer the outcome in the hypothetical case that the loan were granted.
However, we can directly evaluate only the decision to grant the loan, while we cannot observe whether customers who were not granted the loan would indeed violate its conditions.
%
Such skew appears in the decisions of both human and AI decision makers -- and should be properly taken into account for evaluation.
%
In this paper, we develop a Bayesian approach towards this end that uses counterfactual-based imputation to infer unobserved outcomes.
%
@@ -82,7 +81,7 @@ Compared to previous state-of-the-art, the quality of decisions is estimated mor
%
The approach is also shown to be robust to different variations in the decision mechanisms in the data.
%
\mcomment{On one hand, since we use judicial data in our experiments, it makes sense to use the bail-or-jail case in the abstract. On the other hand, this does not connect with the motivation we provide to evaluate the decision of (computer/ML/AI) systems, since jail-or-bail decisions are not currently made by such systems. The bank loan example might look better in the abstract.}
\mcomment{On one hand, since we use judicial data in our experiments, it makes sense to use the bail-or-jail case in the abstract. On the other hand, this does not connect with the motivation we provide to evaluate the decision of (computer/ML/AI) systems, since jail-or-bail decisions are not currently made by such systems (risk scores are used as assisting tools). The bank loan example might look better in the abstract.}
%
\end{abstract}