Commit fef6a8d9 authored by Riku-Laine's avatar Riku-Laine
Data flow explicated

\usepackage{pgf}
\usepackage{tikz}
\usetikzlibrary{arrows,automata,positioning}
\usepackage{algorithm}% http://ctan.org/pkg/algorithms
\usepackage{algorithmic}% http://ctan.org/pkg/algorithms
\section{Framework definition -- 13 June discussion} \label{sec:framework}
\emph{In this section we define some key terms and concepts and derive a more unambiguous framework for the selective labels problem. The framework is presented in writing and as a picture in Figures \ref{fig:framework} and \ref{fig:framework_data_flow}.}
First, data is generated through a \textbf{data generating process (DGP)}. The DGP comprises generating the private features for the subjects, generating the acceptance rates for the judges, and assigning the subjects to the judges. \textbf{Acceptance rate (AR)} is defined as the ratio of positive decisions to all decisions that a judge will give. As a formula, \[ AR = \dfrac{\#\{Positive~decisions\}}{\#\{Decisions\}}. \] The data generating process is depicted in the first box of Figure \ref{fig:framework}.
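As an illustration, the DGP could be sketched as follows; the feature and acceptance-rate distributions here are illustrative assumptions, not the ones used later in this paper:

```python
import numpy as np

def generate_data(n_subjects=1000, n_judges=10, seed=0):
    """Toy DGP: private features for subjects, acceptance rates for
    judges, and a random assignment of subjects to judges."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n_subjects)                    # private feature per subject
    acceptance_rate = rng.uniform(0.2, 0.8, n_judges)  # AR of each judge
    judge = rng.integers(0, n_judges, n_subjects)      # subject-to-judge assignment
    return x, acceptance_rate, judge
```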
Next, all of the generated data goes to the \textbf{labeling process}. In the labeling process, it is determined which instances of the data will have an outcome label available. This is done by humans and is presented in lines 5--7 of Algorithm \ref{alg:data_without_Z} and lines 5--8 of Algorithm \ref{alg:data_with_Z}. The data is then split randomly into training and test datasets, $\D_{train}$ and $\D_{test}$ respectively.
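A minimal sketch of the labeling and splitting steps, under two hypothetical assumptions (a risk-based judge who accepts the fraction AR of their least risky subjects, and a logistic failure model):

```python
import numpy as np

def label_and_split(x, acceptance_rate, judge, seed=0):
    """Toy labeling process: each judge gives a positive decision to the
    fraction AR of their subjects that look least risky, and the outcome
    label is revealed only for positive decisions (selective labels).
    Afterwards the data is split randomly in half."""
    rng = np.random.default_rng(seed)
    n = len(x)
    decision = np.zeros(n, dtype=bool)
    for j, ar in enumerate(acceptance_rate):
        idx = np.flatnonzero(judge == j)
        k = int(round(ar * len(idx)))
        decision[idx[np.argsort(x[idx])[:k]]] = True    # larger x = riskier
    failure = rng.random(n) < 1.0 / (1.0 + np.exp(-x))  # logistic failure model
    outcome = np.where(decision, failure.astype(float), np.nan)  # NaN = no label
    perm = rng.permutation(n)
    return decision, outcome, perm[:n // 2], perm[n // 2:]
```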
In the third step, the labeled data is given to a machine that will make either decisions or predictions using some features of the data. The machine is trained on the training dataset. Then, the machine outputs either binary decisions (yes/no), probabilities (real numbers in the interval $[0, 1]$) or a metric for ordering all the instances in the test dataset. The machine will be denoted with $\M$.
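As a stand-in for whatever model $\M$ actually is, the training and prediction steps could look like the sketch below: a single-feature logistic regression fit by gradient descent, where \texttt{x} and \texttt{y} contain only the training instances whose outcome label was revealed:

```python
import numpy as np

def train_machine(x, y, steps=2000, lr=0.1):
    """Toy stand-in for the machine M: single-feature logistic regression
    fit by gradient descent on the labeled training instances only."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w * x + b)))  # predicted failure probability
        w -= lr * np.mean((p - y) * x)          # gradient of log-loss w.r.t. w
        b -= lr * np.mean(p - y)                # gradient of log-loss w.r.t. b
    return w, b

def predict(w, b, x):
    """Output probabilities in [0, 1] for the test instances."""
    return 1.0 / (1.0 + np.exp(-(w * x + b)))
```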
Finally, the decisions and/or predictions made by the machine $\M$ and the human judges (see the dashed arrow in Figure \ref{fig:framework}) will be evaluated using an \textbf{evaluation algorithm}. An evaluation algorithm takes the decisions, probabilities or ordering generated in the previous steps as input and outputs an estimate of the failure rate. \textbf{Failure rate (FR)} is defined as the ratio of undesired outcomes to given decisions. One special characteristic of FR in this setting is that a failure can only occur with a positive decision. More explicitly, \[ FR = \dfrac{\#\{Failures\}}{\#\{Decisions\}}. \] A second characteristic of FR is that the number of positive decisions, and therefore FR itself, can be controlled through the acceptance rate defined above.
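The FR definition above can be computed directly from a vector of decisions and the (selectively observed) outcomes; a minimal sketch:

```python
import numpy as np

def failure_rate(decision, outcome):
    """FR = #failures / #decisions.  A failure (outcome == 1) can only
    occur together with a positive decision, so missing labels (NaN)
    on negative decisions never count as failures."""
    failures = np.sum(decision & (outcome == 1))
    return failures / len(decision)
```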
Given the above framework, the goal is to create an evaluation algorithm that can accurately estimate the failure rate of any model $\M$ if it were to replace the human decision makers in the labeling process. The estimates have to be made using only data that the human decision makers have labeled, and the failure rate has to be estimated accurately for various levels of the acceptance rate. The accuracy of the estimates can be compared by computing, e.g., the mean absolute error w.r.t.\ the estimates given by the \nameref{alg:true_eval} algorithm.
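The comparison of evaluators could be computed as sketched below, where \texttt{reference\_fr} stands in for the failure-rate estimates produced by the \nameref{alg:true_eval} algorithm over a grid of acceptance rates (the function name is illustrative):

```python
import numpy as np

def evaluator_mae(estimated_fr, reference_fr):
    """Mean absolute error of an evaluator's failure-rate estimates
    against reference estimates, over a grid of acceptance rates."""
    est = np.asarray(estimated_fr, dtype=float)
    ref = np.asarray(reference_fr, dtype=float)
    return float(np.mean(np.abs(est - ref)))
```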
\begin{figure} [H]
\centering
\begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,node distance=1.5cm,
                    semithick]
  \tikzstyle{every state}=[fill=none,draw=black,text=black, rectangle, minimum width=7.0cm]

  \node[state] (D) {Data generation};
  \node[state] (J) [below of=D] {Labeling process (human)};
  \node[state] (MP) [below of=J] {Machine decisions / predictions};
  \node[state] (EA) [below of=MP] {Evaluation algorithm};

  \path (D) edge (J)
        (J) edge (MP)
            edge [bend right=82, dashed] (EA)
        (MP) edge (EA);
\end{tikzpicture}
\caption{The selective labels framework. The dashed arrow indicates how human evaluations are evaluated without machine intervention using the \nameref{alg:human_eval} algorithm.}
\label{fig:framework}
\end{figure}
\begin{figure} [H]
\centering
\begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,node distance=1.5cm,
semithick]
\tikzstyle{every state}=[fill=none,draw=black,text=black, rectangle, minimum width=7.0cm]
\node[state] (DG) {Data generation};
\node[state] (LP) [below of = DG] {Labeling process (human)};
\node[state] (MT) [below left=1.0cm and -4cm of LP] {Model training};
\node[state] (MD) [below=1.0cm of MT] {$\mathcal{M}$ Machine decisions / predictions};
\node[state] (EA) [below right=0.75cm and -4cm of MD] {Evaluation algorithm};
\path (DG) edge (LP)
(LP) edge [bend left=-15] node [right, pos=0.6] {$\D_{train}$} (MT)
edge [bend left=45] node [right] {$\D_{test}$} (MD)
edge [bend left=70, dashed] node [right] {$\D_{test}$} (EA)
(MT) edge node {$\M$} (MD)
(MD) edge (EA);
\end{tikzpicture}
\caption{The selective labels framework with explicit data flow. The dashed arrow indicates how human evaluations are evaluated without machine intervention using the \nameref{alg:human_eval} algorithm. The evaluations are performed over the test set.}
\label{fig:framework_data_flow}
\end{figure}
\section{Data generation}