Skip to content
Snippets Groups Projects
notes.tex 70.3 KiB
Newer Older
\documentclass[11pt,a4paper]{amsart}
\usepackage[margin=1in]{geometry}                % See geometry.pdf to learn the layout options. There are lots.
%\geometry{a4paper}                   % ... or letterpaper or a5paper or ... 
Riku-Laine's avatar
Riku-Laine committed
%\geometry{landscape}                % Activate for for rotated page geometry 
%\usepackage[parfill]{parskip}    % Activate to begin paragraphs with an empty line rather than an indent
Riku-Laine's avatar
Riku-Laine committed
\usepackage{graphicx}
\usepackage{amssymb}
\usepackage{epstopdf}
Riku-Laine's avatar
Riku-Laine committed
\usepackage[hidelinks, colorlinks=true]{hyperref}
%\DeclareGraphicsRule{.tif}{png}{.png}{`convert #1 `dirname #1`/`basename #1 .tif`.png}
Riku-Laine's avatar
Riku-Laine committed

Riku-Laine's avatar
Riku-Laine committed
\usepackage[normalem]{ulem} %tables
\useunder{\uline}{\ul}{}

\usepackage{wrapfig} % wrap figures

Riku-Laine's avatar
Riku-Laine committed
\usepackage{booktabs}% http://ctan.org/pkg/booktabs
\newcommand{\tabitem}{~~\llap{\textbullet}~~}

Riku-Laine's avatar
Riku-Laine committed
\usepackage{pgf}
\usepackage{tikz}
\usepackage{tikz-cd}
Riku-Laine's avatar
Riku-Laine committed
\usetikzlibrary{arrows,automata, positioning}
Riku-Laine's avatar
Riku-Laine committed
\usepackage{algorithm}% http://ctan.org/pkg/algorithms
\usepackage{algorithmic}% http://ctan.org/pkg/algorithms
\renewcommand{\algorithmicrequire}{\textbf{Input:}}
\renewcommand{\algorithmicensure}{\textbf{Procedure:}}
\renewcommand{\algorithmicreturn}{\textbf{Return}}
Riku-Laine's avatar
Riku-Laine committed

\newcommand{\pr}{\mathbb{P}} % tn merkki
\newcommand{\D}{\mathcal{D}} % aineisto
\newcommand{\s}{\mathcal{S}} % "fancy S"
\newcommand{\M}{\mathcal{M}} % "fancy M"
\newcommand{\B}{\mathcal{B}} % "fancy B"
\newcommand{\RR}{\mathcal{R}} % supistusalgon R

\renewcommand{\descriptionlabel}[1]{\hspace{\labelsep}\textnormal{#1}}

Riku-Laine's avatar
Riku-Laine committed
\makeatletter
%Table of Contents
\setcounter{tocdepth}{3}

% Add bold to \section titles in ToC and remove . after numbers
\renewcommand{\tocsection}[3]{%
  \indentlabel{\@ifnotempty{#2}{\bfseries\ignorespaces#1 #2\quad}}\bfseries#3}
% Remove . after numbers in \subsection
\renewcommand{\tocsubsection}[3]{%
  \indentlabel{\@ifnotempty{#2}{\ignorespaces#1 #2\quad}}#3}
%\let\tocsubsubsection\tocsubsection% Update for \subsubsection
%...

\newcommand\@dotsep{4.5}
\def\@tocline#1#2#3#4#5#6#7{\relax
  \ifnum #1>\c@tocdepth % then omit
  \else
    \par \addpenalty\@secpenalty\addvspace{#2}%
    \begingroup \hyphenpenalty\@M
    \@ifempty{#4}{%
      \@tempdima\csname r@tocindent\number#1\endcsname\relax
    }{%
      \@tempdima#4\relax
    }%
    \parindent\z@ \leftskip#3\relax \advance\leftskip\@tempdima\relax
    \rightskip\@pnumwidth plus1em \parfillskip-\@pnumwidth
    #5\leavevmode\hskip-\@tempdima{#6}\nobreak
    \leaders\hbox{$\m@th\mkern \@dotsep mu\hbox{.}\mkern \@dotsep mu$}\hfill
    \nobreak
    \hbox to\@pnumwidth{\@tocpagenum{\ifnum#1=1\bfseries\fi#7}}\par% <-- \bfseries for \section page
    \nobreak
    \endgroup
  \fi}
\AtBeginDocument{%
\expandafter\renewcommand\csname r@tocindent0\endcsname{0pt}
}
\def\l@subsection{\@tocline{2}{0pt}{2.5pc}{5pc}{}}
\makeatother

Riku-Laine's avatar
Riku-Laine committed
%\makeatletter
%\def\listofalgorithms{\@starttoc{loa}\listalgorithmname}
%\def\l@algorithm{\@tocline{0}{3pt plus2pt}{0pt}{1.9em}{}}
%\renewcommand{\ALG@name}{AlGoRiThM}
%\renewcommand{\listalgorithmname}{List of \ALG@name s}
%\makeatother

\usepackage{subcaption}
\graphicspath{ {../figures/} }

Riku-Laine's avatar
Riku-Laine committed
\title{Notes}
Riku-Laine's avatar
Riku-Laine committed
\author{RL, 1 July 2019}
Riku-Laine's avatar
Riku-Laine committed
%\date{}                                           % Activate to display a given date or no date
Riku-Laine's avatar
Riku-Laine committed
\begin{document}

\maketitle

%\section*{Contents}
Riku-Laine's avatar
Riku-Laine committed
\tableofcontents

Riku-Laine's avatar
Riku-Laine committed
%\listofalgorithms

Riku-Laine's avatar
Riku-Laine committed
\begin{abstract}
This document presents the implementations of RL in pseudocode level. First, I present most of the nomenclature used in these notes. Then I proceed to give my personal views and comments on the motivation behind Selective labels paper. In chapter 2, I define the framework for this problem and give the required definitions. In the following sections, I present the data generating algorithms and algorithms for obtaining failure rates using different methods. Finally in the last section, I present results using multiple different settings.
Riku-Laine's avatar
Riku-Laine committed
\end{abstract}

\section*{Terms and abbreviations}

\begin{description}
Riku-Laine's avatar
Riku-Laine committed
\item[R :] acceptance rate, leniency of decision maker, $r \in [0, 1]$
\item[X :] personal features, observable to a predictive model
\item[Z :] some features of a subject, unobservable to a predictive model, latent variable
\item[W :] noise added to result variable Y
\item[T :] decision variable, bail/positive decision equal to 1, jail/negative decision equal to 0
\item[Y :] result variable, no crime/positive result equal to 1, crime/negative result equal to 0
\item[MAE :] Mean absolute error
\item[SL :] Selective labels, for more information see Lakkaraju's paper  \cite{lakkaraju17}
Riku-Laine's avatar
Riku-Laine committed
\item[Labeled data :] data that has been censored, i.e. if negative decision is given (T=0), then Y is set to NA.
\item[Full data :] data that has all labels available, i.e. \emph{even if} negative decision is given (T=0), Y will still be available.
\item[Unobservables :] unmeasured confounders, latent variables, Z
Riku-Laine's avatar
Riku-Laine committed
\end{description}

Mnemonic rule for the binary coding: zero bad (crime or jail), one good!
Riku-Laine's avatar
Riku-Laine committed

\section{RL's notes about the selective labels paper (optional reading)} \label{sec:comments}
Riku-Laine's avatar
Riku-Laine committed

Riku-Laine's avatar
Riku-Laine committed
\setcounter{figure}{-1}
\begin{wrapfigure}{r}{0.3\textwidth} %this figure will be at the right
    \centering
    \begin{tikzpicture}[->,>=stealth',node distance=1.5cm, semithick]

  \tikzstyle{every state}=[fill=none,draw=black,text=black]

  \node[state] (R)                    {$R$};
  \node[state] (X) [right of=R] {$X$};
  \node[state] (T) [below of=X] {$T$};
  \node[state] (Z) [rectangle, right of=X] {$Z$};
  \node[state] (Y) [below of=Z] {$Y$};

  \path (R) edge (T)
        (X) edge (T)
	     edge (Y)
        (Z) edge (T)
	     edge (Y)
        (T) edge (Y);
\end{tikzpicture}
\caption{Initial model.}
\label{fig:initial_model}
\end{wrapfigure}

Riku-Laine's avatar
Riku-Laine committed
\emph{This chapter is to present my comments and insight regarding the topic.}

The motivating idea behind the SL paper of Lakkaraju et al. \cite{lakkaraju17} is to evaluate if machines could improve on human performance. In general case, comparing the performance of human and machine evaluations is simple. In the domains addressed by Lakkaraju et al. simple comparisons would be unethical and therefore algorithms are required. (Other approaches, such as a data augmentation algorithm has been proposed by De-Arteaga \cite{dearteaga18}.)
Riku-Laine's avatar
Riku-Laine committed

The general idea of the SL paper is to train some predictive model with selectively labeled data. The question is then "how would this predictive model perform if it was to make independent bail-or-jail decisions?" That quantity can not be calculated from real-life data sets due to the ethical reasons and hidden labels. We can however use more selectively labeled data to estimate it's performance. But, because the available data is biased, the performance estimates are too good or "overly optimistic" if they are calculated in the conventional way ("labeled outcomes only"). This is why they are proposing the contraction algorithm.
Riku-Laine's avatar
Riku-Laine committed

One of the concepts to denote when reading the Lakkaraju paper is the difference between the global goal of prediction and the goal in this specific setting. The global goal is to have a low failure rate with high acceptance rate, but at the moment we are not interested in it. The goal in this setting is to estimate the true failure rate of the model with unseen biased data. That is, given only selectively labeled data and an arbitrary black-box model $\B$ we are interested in predicting the performance of model $\B$ in the whole data set with all ground truth labels.
Riku-Laine's avatar
Riku-Laine committed

On acceptance rate R: We discussed how Lakkaraju's paper treats variable R in a seemingly non-sensical way, it is as if a judge would have to let someone go today in order to detain some other defendant tomorrow to keep their acceptance rate at some $r$. A more intuitive way of thinking $r$ would be the "threshold perspective". That is, if a judge sees that a defendant has probability $p_x$ of committing a crime if let out, the judge would detain the defendant if $p_x > r$, the defendant would be too dangerous to let out. Now, it can be claimed that judges do have a threshold, even in Lakkaraju's setting. Let $c$ be a judge's acceptance threshold. Now decision $T$ can be defined with function $f_T(x, z)$ as follows:

\begin{equation}
  \pr(T=0|x, z)=\begin{cases}
    p_T, & \text{if $f_T(x, z) < c$}\\
    1-p_T, & \text{otherwise}.
  \end{cases}
\end{equation}

If $c$ is defined so that the ratio of positive decisions to all decisions will be equal to $r$, we will arrive at a similar data generation process as Lakkaraju  and as is presented in algorithm \ref{alg:data_with_Z}.
Finally, chapters from Lakkaraju \cite{lakkaraju17} about counterfactual inference, see references from their paper [sic]:
There has also been some work on inferring labels using counterfactual inference techniques [9, 12, 16, 36, 38] and leveraging these estimates when computing standard evaluation metrics. However, counterfactual inference techniques explicitly assume that there are no unmeasured confounders (that is, no unobservable variables Z) that could affect the outcome Y. This assumption does not typically hold in cases where human decisions are providing data labels [7, 21]. Thus, the combination of two ingredients -- selective labels and non-trivial unobservables -- poses problems for these existing techniques.

Counterfactual inference. Counterfactual inference techniques have been used extensively to estimate treatment effects in observational studies. These techniques have found applications in a variety of fields such as machine learning, epidemiology, and sociology [3, 8--10, 30, 34]. Along the lines of Johansson et al. [16], counterfactual inference techniques can be broadly categorized as: (1) parametric methods which model the relationship between observed features, treatments, and outcomes. Examples include any type of regression model such as linear and logistic regression, random forests and regression trees [12, 33, 42]. (2) non-parametric methods such as propensity score matching, nearest-neighbor matching, which do not explicitly model the relationship between observed features, treatments, and outcomes [4, 15, 35, 36, 41]. (3) doubly robust methods which combine the two aforementioned classes of techniques typically via a propensity score weighted regression [5, 10]. The effectiveness of parametric and non-parametric methods depends on the postulated regression model and the postulated propensity score model respectively. If the postulated models are not identical to the true models, then these techniques result in biased estimates of outcomes. Doubly robust methods require only one of the postulated models to be identical to the true model in order to generate unbiased estimates. However, due to the presence of unobservables, we cannot guarantee that either of the postulated models will be identical to the true models.
\section{Framework definition -- 13 June discussion} \label{sec:framework}
Riku-Laine's avatar
Riku-Laine committed
\emph{In this section we define some key terms and concepts and derive a more unambiguous framework for the selective labels problem. The framework is presented in writing and as a picture in figures \ref{fig:framework} and \ref{fig:framework_data_flow}.}
Riku-Laine's avatar
Riku-Laine committed
First, data is generated through a \textbf{data generating process (DGP)}. DGP comprises of generating the private features for the subjects, generating the acceptance rates for the judges and assigning the subjects to the judges. \textbf{Acceptance rate (AR)} is defined as the ratio of positive decisions to all decisions that a judge will give. As a formula \[ AR = \dfrac{\#\{Positive~decisions\}}{\#\{Decisions\}}. \]   Data generating process is depicted in the first box of Figure \ref{fig:framework}.
Riku-Laine's avatar
Riku-Laine committed
Next, the all of the generated data goes to the \textbf{labeling process}. In the labeling process, it is determined which instances of the data will have an outcome label available. This is done by humans and is presented in lines 5--7 of algorithm \ref{alg:data_without_Z} and 5--8 of algorithm \ref{alg:data_with_Z}. The data is then split randomly into training and test datasets, $\D_{train}$ and $\D_{test}$ respectively.

After labeling, the labeled training data is used to train a machine that will either make decisions or predictions using some features of the data. Then, the test data will be given to the machine and it will output either binary decisions (yes/no), probabilities (a real number in interval $[0, 1]$) or a metric for ordering for all the instances in the test data set. The machine will be denoted with $\M$.
Finally the decisions and/or predictions made by the machine $\M$ and human judges (see dashed arrow in figure \ref{fig:framework}) will be evaluated using an \textbf{evaluation algorithm}. Evaluation algorithms will take the decisions, probabilities or ordering generated in the previous steps as input and then output an estimate of the machine's failure rate in the test data. \textbf{Failure rate (FR)} is defined as the ratio of undesired outcomes to given decisions. One special characteristic of FR in this setting is that a failure can only occur with a positive decision. More explicitly \[ FR = \dfrac{\#\{Failures\}}{\#\{Decisions\}}. \] Second characteristic of FR is that the number of positive decisions and therefore FR itself can be controlled through acceptance rate defined above.
Given the above framework, the goal is to create an evaluation algorithm that can accurately estimate the failure rate of any model $\M$ if it were to replace human decision makers in the labeling process. The estimations have to be made using only data that human decision-makers have labeled. The failure rate has to be accurately estimated for various levels of acceptance rate. The accuracy of the estimates can be compared by computing e.g. mean absolute error w.r.t the estimates given by \nameref{alg:true_eval} algorithm.
Riku-Laine's avatar
Riku-Laine committed
\begin{figure} [H]
\centering
Riku-Laine's avatar
Riku-Laine committed
\begin{tikzpicture}[->, >=stealth, shorten >=1pt, auto, node distance=1.5cm, semithick]
Riku-Laine's avatar
Riku-Laine committed
  \tikzstyle{every state}=[fill=none, draw=black, text=black, rectangle, minimum width=6.0cm]
Riku-Laine's avatar
Riku-Laine committed
  \node[state] (DG) {Data generation};
  \node[state] (LP) [below of=DG] {Labeling process (human)};
  \node[state] (MP) [below of=LP] {Machine predictions};
  \node[state] (EA)  [below of=MP] {Evaluation algorithm};
Riku-Laine's avatar
Riku-Laine committed
  \path (DG) edge (LP)
                    edge [bend right=90, dashed] node [left] {(1)} (MP)
        (LP) edge (MP)
             edge [bend left=90, dashed] node {(2)} (EA)
        (MP) edge (EA);
Riku-Laine's avatar
Riku-Laine committed
\end{tikzpicture}
Riku-Laine's avatar
Riku-Laine committed
\caption{The selective labels framework. Dashed arrow (1) marks the flow of unlabeled data to \nameref{alg:true_eval} algorithm and (2) indicates how human evaluations are evaluated without machine intervention using \nameref{alg:human_eval} algorithm.}
\label{fig:framework}
Riku-Laine's avatar
Riku-Laine committed
\end{figure}

Riku-Laine's avatar
Riku-Laine committed
\begin{figure} [H]
\centering
\begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,node distance=1.5cm,
                            semithick]
        
Riku-Laine's avatar
Riku-Laine committed
          \tikzstyle{every state}=[fill=none,draw=black,text=black, rectangle, minimum width=6.0cm]
Riku-Laine's avatar
Riku-Laine committed
        
          \node[state] (DG) {Data generation};
          \node[state] (LP) [below of = DG] {Labeling process (human)};
          \node[state] (MT) [below left=1.0cm and -4cm of LP] {Model training};
Riku-Laine's avatar
Riku-Laine committed
          \node[state] (MP) [below=1.0cm of MT] {Machine predictions};
          \node[state] (EA)  [below right=0.75cm and -4cm of MP] {Evaluation algorithm};
Riku-Laine's avatar
Riku-Laine committed
        
          \path (DG) edge (LP)
                            edge [out=180, in=180, dashed] node [left]  {$\D_{test,~unlabeled}$ (1)}   (MP)
Riku-Laine's avatar
Riku-Laine committed
                (LP) edge [bend right=19] node [left] {$\D_{train}$} (MT)
                      edge [bend left=60] node [right]  {$\D_{test}$} (MP)
                     edge [bend left=75, dashed] node [right]  {$\D_{test}$ (2)}  (EA)
                (MT) edge node {$\M$} (MP)
                (MP) edge (EA);
Riku-Laine's avatar
Riku-Laine committed
        \end{tikzpicture}
Riku-Laine's avatar
Riku-Laine committed
\caption{The selective labels framework with explicit data flow. Dashed arrow (1) marks the flow of unlabeled data to \nameref{alg:true_eval} algorithm and (2) indicates how human evaluations are evaluated without machine intervention using \nameref{alg:human_eval} algorithm.}
Riku-Laine's avatar
Riku-Laine committed
\label{fig:framework_data_flow}
\end{figure}
Riku-Laine's avatar
Riku-Laine committed
\section{Modular framework -- based on 19 June discussion} \label{sec:modular_framework}
\emph{Below is the framework as was written on the whiteboard, then RL presents his own remarks on how he understood this.}

\begin{wrapfigure}{r}{0.25\textwidth} %this figure will be at the right
    \centering
    \begin{tikzcd}
       \arrow[d]   &  \arrow[d]  &  \arrow[d]   \\
    X \arrow[rd] & Z \arrow[d] & W \arrow[ld] \\
                       & Y                 &             
    \end{tikzcd}
    \caption{$\M$}
Riku-Laine's avatar
Riku-Laine committed
    \label{fig:dgm}
\end{wrapfigure}

\begin{description}

Loading
Loading full blame...