@@ -220,9 +220,9 @@ Figure~\ref{fig:results_rmax05} shows the evaluation over leniencies similarly a
%
Contraction can only estimate the failure rate for leniencies up to $0.5$; for higher leniency rates it does not output any results.
%
Our method (counterfactuals) can produce failure rate estimates for all leniencies, although the accuracy of failure rate estimates for the largest leniencies is lower than with unlimited leniency.
Our method (counterfactuals) can produce failure rate estimates for all leniencies (although the accuracy of failure rate estimates for the largest leniencies is lower than with unlimited leniency).
%
This result is important in the sense that decision makers based on advanced machine learning techniques may well allow for the use of higher leniency rates than those of the (human) judges employed in the data.
This observation is vitally important in the sense that decision makers based on advanced machine learning techniques may well allow for the use of higher leniency rates than those of the (often human) decision makers employed in the data.
Thus overall, in these synthetic settings our method achieves more accurate results with considerably less variation than the state-of-the-art contraction, allowing for evaluation in situations where the strong assumptions of contraction inhibit evaluation altogether.