Skip to content
Snippets Groups Projects
Commit a878ca58 authored by Riku-Laine's avatar Riku-Laine
Browse files

fw minor tweak

parent fd96922f
No related branches found
No related tags found
No related merge requests found
......@@ -118,13 +118,13 @@ On the formalisation of R: We discussed how Lakkaraju's paper treats variable R
\section{Framework definition -- 13 June discussion}
First, data is generated through a \textbf{data generating process (DGP)}. DGP comprises of generating the private features and the acceptance rates for the judges. \textbf{Acceptance rate (AR)} is defined as the ratio of positive decisions to all decisions. As a formula \[ AR = \dfrac{\#\{Positive~decisions\}}{\#\{Decisions\}}. \] Data generation process is depicted in the first box of Figure \ref{fig:separation}.
First, data is generated through a \textbf{data generating process (DGP)}. DGP comprises of generating the private features for the subjects, generating the acceptance rates for the judges and assigning the subjects to the judges. \textbf{Acceptance rate (AR)} is defined as the ratio of positive decisions to all decisions that a judge will give. As a formula \[ AR = \dfrac{\#\{Positive~decisions\}}{\#\{Decisions\}}. \] Data generation process is depicted in the first box of Figure \ref{fig:separation}.
Next, the generated data goes to the \textbf{labeling process}. In the labeling process, it is determined which instances of the data will have an outcome label available. This is done by humans and is presented in lines 5--7 of algorithm \ref{alg:data_without_Z} and 5--8 of algorithm \ref{alg:data_with_Z}. % Tämä nähdään dataa generoivissa algoritmeissa riveillä X-Y ja A-B.
Next, the generated data goes to the \textbf{labeling process}. In the labeling process, it is determined which instances of the data will have an outcome label available. This is done by humans and is presented in lines 5--7 of algorithm \ref{alg:data_without_Z} and 5--8 of algorithm \ref{alg:data_with_Z}.
In the third step, the labeled data is given to a machine that will either make decisions or predictions using some features of the data. The machine will output either binary decisions (yes/no), probabilities (a real number in interval $[0, 1]$) or a metric for ordering all the instances. The machine will be denoted with $\M$.
Finally the decisions and/or predictions made by the machine $\M$ and human judges (see dashed arrow in figure \ref{fig:separation}) will be evaluated using an \textbf{evaluation algorithm}. Evaluation algorithms will take the decisions, probabilities or ordering generated in the previous steps as input and then output an estimate of the failure rate. \textbf{Failure rate (FR)} is defined as the ratio of the number undesired outcomes to the number of decisions made. One special characteristic of FR in this setting is that a failure can only occur with a positive decision. More explicitly \[ FR = \dfrac{\#\{Failures\}}{\#\{Decisions\}}. \] Second characteristic of FR is that the number of positive decisions and therefore FR itself can be controlled through acceptance rate defined above.
Finally the decisions and/or predictions made by the machine $\M$ and human judges (see dashed arrow in figure \ref{fig:separation}) will be evaluated using an \textbf{evaluation algorithm}. Evaluation algorithms will take the decisions, probabilities or ordering generated in the previous steps as input and then output an estimate of the failure rate. \textbf{Failure rate (FR)} is defined as the ratio of undesired outcomes to given decisions. One special characteristic of FR in this setting is that a failure can only occur with a positive decision. More explicitly \[ FR = \dfrac{\#\{Failures\}}{\#\{Decisions\}}. \] Second characteristic of FR is that the number of positive decisions and therefore FR itself can be controlled through acceptance rate defined above.
Given the above framework, the goal is to create an evaluation algorithm that can accurately estimate the failure rate of any model $\M$ if it were to replace human decision makers in the labeling process. The estimations have to be made using only data that human decision-makers have labeled. The failure rate has to be accurately estimated for various levels of acceptance rate. The accuracy of the estimates can be compared by computing e.g. mean absolute error w.r.t the estimates given by \nameref{alg:true_eval} algorithm.
......
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment