Monte Carlo results

Commit e7bbbbbf authored 5 years ago by Riku-Laine
--- a/analysis_and_scripts/notes.tex
+++ b/analysis_and_scripts/notes.tex
@@ -87,7 +87,7 @@
 \graphicspath{ {../figures/} }

 \title{Notes}
-\author{RL, 25 June 2019}
+\author{RL, 1 July 2019}
 %\date{}                                           % Activate to display a given date or no date

 \begin{document}
@@ -235,7 +235,7 @@ Given the above framework, the goal is to create an evaluation algorithm that ca
 \label{fig:framework_data_flow}
 \end{figure}

-\section{Modular framework -- based on 19 June discussion}
+\section{Modular framework -- based on 19 June discussion} \label{sec:modular_framework}

 \begin{wrapfigure}{r}{0.25\textwidth} %this figure will be at the right
    \centering
@@ -529,7 +529,7 @@ Causal model, ep 	& 0.000598624 	& 0.0411532\\
 \end{table}


-\begin{figure}[H]
+\begin{figure}[]
    \centering
    \begin{subfigure}[b]{0.5\textwidth}
        \includegraphics[width=\textwidth]{sl_without_Z_8iter}
@@ -553,7 +553,7 @@ If we assign $\beta_Z=0$, almost all failure rates drop to zero in the interval

 The disparities between figures \ref{fig:results_without_Z} and \ref{fig:betaZ_0} (result without unobservables and with $\beta_Z=0$) can be explained in the slight difference in the data generating process, namely the effect of $\epsilon$. The effect of adding $\epsilon$ (noise to the decisions) is further explored in section \ref{sec:epsilon}.

-\begin{figure}[H]
+\begin{figure}[]
    \centering
    \begin{subfigure}[b]{0.475\textwidth}
        \includegraphics[width=\textwidth]{sl_with_Z_4iter_betaZ_1_5}
@@ -575,7 +575,7 @@ The disparities between figures \ref{fig:results_without_Z} and \ref{fig:betaZ_0

 In this part, Gaussian noise with zero mean and 0.1 variance was added to the probabilities $P(Y=0|X=x)$ after sampling Y but before ordering the observations in line 5 of algorithm \ref{alg:data_without_Z}. Results are presented in Figure \ref{fig:sigma_figure}.

-\begin{figure}[H]
+\begin{figure}[]
    \centering
    \includegraphics[width=0.5\textwidth]{sl_without_Z_3iter_sigma_sqrt_01}
    \caption{Failure rate with varying levels of leniency without unobservables. Noise has been added to the decision probabilities. Logistic regression was trained on labeled training data with $N_{iter}$ set to 3.}
@@ -586,7 +586,7 @@ In this part, Gaussian noise with zero mean and 0.1 variance was added to the pr

 In this section the predictive model was switched to random forest classifier to examine the effect of changing the predictive model. Results are practically identical to those presented in figure \ref{fig:results} previously and are presented in figure \ref{fig:random_forest}.

-\begin{figure}[H]
+\begin{figure}[]
    \centering
    \begin{subfigure}[b]{0.475\textwidth}
        \includegraphics[width=\textwidth]{sl_withoutZ_4iter_randomforest}
@@ -608,7 +608,7 @@ In this section the predictive model was switched to random forest classifier to

 Predictions were checked by drawing a graph of predicted Y versus X, results are presented in figure \ref{fig:sanity_check}. The figure indicates that the predicted class labels and the probabilities for them are consistent with the ground truth.

-\begin{figure}[H]
+\begin{figure}[]
    \centering
    \includegraphics[width=0.5\textwidth]{sanity_check}
    \caption{Predicted class label and probability of $Y=1$ versus X. Prediction was done with a logistic regression model. Colors of the points denote ground truth (yellow = 1, purple = 0). Data set was created with the unobservables.}
@@ -619,7 +619,7 @@ Predictions were checked by drawing a graph of predicted Y versus X, results are

 Given our framework defined in section \ref{sec:framework}, the results presented next are with model $\M$ that outputs probabilities 0.5 for every instance of $x$. Labeling process is still as presented in algorithm \ref{alg:data_with_Z}.  

-\begin{figure}[H]
+\begin{figure}[]
    \centering
    \begin{subfigure}[b]{0.475\textwidth}
        \includegraphics[width=\textwidth]{sl_without_Z_15iter_random_model}
@@ -639,53 +639,57 @@ Given our framework defined in section \ref{sec:framework}, the results presente

 \subsection{Modular framework -- Monte Carlo evaluator} \label{sec:modules_mc}

-For these results, data was generated with module in algorithm \ref{alg:dg:coinflip_with_z} ("coin-flip results") and decisions were assigned using module in algorithm \ref{alg:decider:quantile}. Curves were computed with algorithms \ref{alg:eval:true_eval}, \ref{alg:eval:labeled_outcomes}, \ref{alg:eval:human_eval}, \ref{alg:eval:contraction} and \ref{alg:eval:mc} are presented in figure \ref{fig:modules_mc}. The corresponding MAEs are presented in table \ref{tab:modules_mc}.
+For these results, data was generated either with the module in algorithm \ref{alg:dg:coinflip_with_z} (drawing Y from a Bernoulli distribution with parameter $\pr(Y=0|X, Z, W)$ as previously) or with the module in algorithm \ref{alg:dg:threshold_with_Z} (assigning Y based on the value of $\sigma(\beta_XX+\beta_ZZ)$). Decisions were determined using one of two modules: the module in algorithm \ref{alg:decider:quantile} (decisions based on quantiles) or the module in algorithm \ref{alg:decider:human} ("human" decision-maker as in \cite{lakkaraju17}). Curves were computed with the True evaluation (algorithm \ref{alg:eval:true_eval}), Labeled outcomes (\ref{alg:eval:labeled_outcomes}), Human evaluation (\ref{alg:eval:human_eval}), Contraction (\ref{alg:eval:contraction}) and Monte Carlo (\ref{alg:eval:mc}) evaluators. Results are presented in figure \ref{fig:modules_mc}. The corresponding MAEs are presented in table \ref{tab:modules_mc}.
+
+From the result table we can see that the MAE of the Monte Carlo evaluator is lowest when the data generating process corresponds closely to the Monte Carlo algorithm.

 \begin{table}[H]
 \centering
-\caption{Mean absolute error (MAE) w.r.t true evaluation. See modules used in section \ref{sec:modules_mc}}
-\begin{tabular}{l | c c}
-Method & MAE with Z \\ \hline
-Labeled outcomes 	& 0.111075\\
-Human evaluation 	& 0.027298\\
-Contraction 		& 0.004206\\
-Monte Carlo	 	& 0.001292\\
+\caption{Mean absolute error w.r.t. true evaluation. See modules used in section \ref{sec:modules_mc}. Bern = Bernoulli, indep. = independent, TH = threshold.}
+\begin{tabular}{l | c c c c}
+Method & Bern + indep. & Bern + non-indep. & TH + indep. & TH + non-indep.\\ \hline
+Labeled outcomes 	& 0.111075	& 0.103235	& 0.108506 &\\
+Human evaluation 	& 0.027298	& NaN (TBA)	& 0.049582 &\\
+Contraction 		& 0.004206	& 0.004656	& 0.005557 &\\
+Monte Carlo	 	& 0.001292	& 0.016629	& 0.009429 &\\
 \end{tabular}
 \label{tab:modules_mc}
 \end{table}

+
 \begin{figure}[H]
    \centering
-    \includegraphics[width=0.75\textwidth]{sl_with_Z_10iter_coinflip_quantile_defaults_mc}
-    \caption{Failure rate vs. acceptance rate with varying levels of leniency. Data was generated with unobservables. See modules used in section \ref{sec:modules_mc}}
+    \begin{subfigure}[b]{0.475\textwidth}
+        \includegraphics[width=\textwidth]{sl_with_Z_10iter_coinflip_quantile_defaults_mc}
+        \caption{Outcome Y from Bernoulli, independent decisions using the quantiles and $N_{iter}=10$.}
+        %\label{fig:modules_mc_without_Z}
+    \end{subfigure}
+    \quad %add desired spacing between images, e. g. ~, \quad, \qquad, \hfill etc. 
+      %(or a blank line to force the subfigure onto a new line)
+    \begin{subfigure}[b]{0.475\textwidth}
+        \includegraphics[width=\textwidth]{sl_with_Z_20iter_threshold_quantile_defaults_mc}
+        \caption{Outcome Y from threshold rule, independent decisions using the quantiles and $N_{iter}=20$.}
+        %\label{fig:modules_mc_with_Z}
+    \end{subfigure}
+    \begin{subfigure}[b]{0.475\textwidth}
+        \includegraphics[width=\textwidth]{sl_with_Z_10iter_coinflip_lakkarajudecider_defaults_mc}
+        \caption{Outcome Y from Bernoulli, non-independent decisions and $N_{iter}=10$.}
+        %\label{fig:modules_mc_without_Z}
+    \end{subfigure}
+    \quad %add desired spacing between images, e. g. ~, \quad, \qquad, \hfill etc. 
+      %(or a blank line to force the subfigure onto a new line)
+    \begin{subfigure}[b]{0.475\textwidth}
+        \includegraphics[width=\textwidth]{sl_with_Z_4iter_threshold_lakkarajudecider_defaults_mc}
+        \caption{Outcome Y from threshold rule, non-independent decisions and $N_{iter}=4$.}
+        %\label{fig:modules_mc_with_Z}
+    \end{subfigure}
+    \caption{Failure rate vs. acceptance rate with varying levels of leniency. Different combinations of deciders and data generation modules. See other modules used in section \ref{sec:modules_mc}}
    \label{fig:modules_mc}
 \end{figure}

-%\begin{figure}[H]
-%    \centering
-%    \begin{subfigure}[b]{0.475\textwidth}
-%        \includegraphics[width=\textwidth]{sl_without_Z_10iter_coinflip_quantile_defaults_mc}
-%        \caption{Data without unobservables. PLACEHOLDER}
-%        \label{fig:modules_mc_without_Z}
-%    \end{subfigure}
-%    \quad %add desired spacing between images, e. g. ~, \quad, \qquad, \hfill etc. 
-%      %(or a blank line to force the subfigure onto a new line)
-%    \begin{subfigure}[b]{0.475\textwidth}
-%        \includegraphics[width=\textwidth]{sl_with_Z_10iter_coinflip_quantile_defaults_mc}
-%        \caption{Data with unobservables.}
-%        \label{fig:modules_mc_with_Z}
-%    \end{subfigure}
-%    \caption{Failure rate vs. acceptance rate with varying levels of leniency. See modules used in section \ref{sec:modules_mc}}
-%    \label{fig:modules_mc}
-%\end{figure}z
-
 \section{Modules}

-Different types of modules are presented in this section. Summary table is presented last.
-
-\begin{itemize}
-\item Data generation modules usually take only some generative parameters as input.
-\end{itemize}
+Different types of modules (data generation, decider and evaluator) are presented in this section. A summary table is presented last. See section \ref{sec:modular_framework} for a more thorough breakdown of the properties of each module.

 \begin{algorithm}[] 			% enter the algorithm environment
 \caption{Data generation module: "coin-flip results" without unobservables} 		% give the algorithm a caption
@@ -695,7 +699,7 @@ Different types of modules are presented in this section. Summary table is prese
 \ENSURE
 \FORALL{$i$ in $1, \ldots, N_{total}$}
 	\STATE Draw $x_i$ from from a standard Gaussian.
-	\STATE Draw $y_i$ from Bernoulli$(1-\sigma(X))$.
+	\STATE Draw $y_i$ from Bernoulli$(1-\sigma(x_i))$.
 	\STATE Attach to data.
 \ENDFOR 
 \RETURN data
@@ -711,7 +715,11 @@ Different types of modules are presented in this section. Summary table is prese
 \ENSURE
 \FORALL{$i$ in $1, \ldots, N_{total}$}
 	\STATE Draw $x_i, z_i$ and $w_i$ from from standard Gaussians independently.
-	\STATE Set Y to 0 if $P(Y = 0| X, Z, W) = \sigma(\beta_XX+\beta_ZZ+\beta_WW) \geq 0.5$ and \\to 1 otherwise.
+	\IF{$\sigma(\beta_Xx_i+\beta_Zz_i+\beta_Ww_i) \geq 0.5$}
+		\STATE {Set $y_i$ to 0.}
+	\ELSE
+		\STATE {Set $y_i$ to 1.}
+	\ENDIF
 	\STATE Attach to data.
 \ENDFOR 
 \RETURN data
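
Reviewer sketch (not part of the commit): the two data generation rules touched in the hunks above can be compared side by side in a few lines of Python. This is only an illustration under stated assumptions. The function name `generate_data`, the `rule` argument and the default coefficient values are hypothetical, and the coin-flip rule with unobservables is assumed to mirror the Bernoulli$(1-\sigma(\cdot))$ draw shown for the module without unobservables.

```python
import math
import random


def sigmoid(a):
    """Logistic function sigma(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))


def generate_data(n_total, rule, beta_x=1.0, beta_z=1.0, beta_w=0.2, seed=None):
    """Generate (x, z, w, y) records with either outcome rule.

    rule="threshold": y = 0 if sigma(...) >= 0.5, else 1 (as in the diff).
    rule="bernoulli": y ~ Bernoulli(1 - sigma(...)) (assumed form).
    The default coefficients are placeholders, not values from the notes.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_total):
        # x, z and w are drawn independently from standard Gaussians.
        x, z, w = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        p_y0 = sigmoid(beta_x * x + beta_z * z + beta_w * w)
        if rule == "threshold":
            y = 0 if p_y0 >= 0.5 else 1
        elif rule == "bernoulli":
            # y = 1 with probability 1 - p_y0.
            y = 1 if rng.random() < 1.0 - p_y0 else 0
        else:
            raise ValueError("rule must be 'threshold' or 'bernoulli'")
        data.append({"x": x, "z": z, "w": w, "y": y})
    return data
```

With the threshold rule the outcome is a deterministic function of the features, whereas the Bernoulli rule adds outcome noise; this difference may be relevant when reading the MAE columns of the result table.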
@@ -743,8 +751,8 @@ Different types of modules are presented in this section. Summary table is prese
 \ENSURE
 \STATE Sample acceptance rates for each M judges from Uniform$(0.1; 0.9)$ and round to tenth decimal place.
 \STATE Assign each observation to a judge at random.
-\STATE Calculate $P(T=0|X, Z) = \sigma(\beta_XX+\beta_ZZ) + \epsilon$ for each observation and attach to data.
-\STATE Sort the data by (1) the judges' and (2) by probabilities $P(T=0|X, Z)$ in descending order. 
+\STATE Calculate $\pr(T=0|X, Z) = \sigma(\beta_XX+\beta_ZZ) + \epsilon$ for each observation and attach to data.
+\STATE Sort the data by (1) the judges and (2) by the probabilities in descending order. 
 \STATE \hskip3.0em $\rhd$ Now the most dangerous subjects for each of the judges are at the top.
 \STATE If subject belongs to the top $(1-r) \cdot 100 \%$ of observations assigned to that judge, set $T=0$ else set $T=1$.
 \STATE Set $Y=$ NA if decision is negative ($T=0$). \emph{Might not be performed.}
@@ -759,7 +767,7 @@ Different types of modules are presented in this section. Summary table is prese
 \REQUIRE Data with features $X, Z$ of size $N_{total}$, knowledge that both of them affect the outcome Y and that they are independent / Parameters: $\beta_X=1, \beta_Z=1$.
 \ENSURE
 \FORALL{$i$ in $1, \ldots, N_{total}$}
-	\STATE Draw $t_i$ from Bernoulli$(\sigma(\beta_XX+\beta_ZZ)))$.
+	\STATE Draw $t_i$ from Bernoulli$(\sigma(\beta_Xx_i+\beta_Zz_i))$.
 	\STATE Attach to data.
 \ENDFOR 
 \STATE Set $Y=$ NA if decision is negative ($T=0$). \emph{Might not be performed.}
@@ -777,10 +785,10 @@ Different types of modules are presented in this section. Summary table is prese
 \STATE Assign each observation to a judge at random.
 \STATE Calculate $\pr(T=0|X, Z) = \sigma(\beta_XX+\beta_ZZ)$ for each observation and attach to data.
 \FORALL{$i$ in $1, \ldots, N_{total}$}
-	\IF{$\sigma(\beta_XX+\beta_ZZ) \geq F^{-1}_{\pr(T=0|X, Z)}(r)$ \footnotemark} % Footnote text below algorithm
-		\STATE {set $t_i=0$} 
+	\IF{$\sigma(\beta_Xx_i+\beta_Zz_i) \geq F^{-1}_{\pr(T=0|X, Z)}(r)$ \footnotemark} % Footnote text below algorithm
+		\STATE{Set $t_i=0$.} 
 	\ELSE 
-		\STATE{set $t_i=1$} 
+		\STATE{Set $t_i=1$.} 
 	\ENDIF
 	\STATE Attach to data.
 \ENDFOR 
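
Reviewer sketch (not part of the commit): the quantile-based decision rule in the hunk above could look roughly as follows in Python. Two simplifications are assumptions of this sketch: a single acceptance rate `r` is used instead of one rate per judge, and the inverse cdf $F^{-1}_{\pr(T=0|X, Z)}(r)$ is replaced by an empirical quantile of the computed probabilities. The helper names `empirical_quantile` and `quantile_decider` are hypothetical.

```python
import math
import random


def sigmoid(a):
    """Logistic function sigma(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))


def empirical_quantile(values, r):
    """Value below which roughly a fraction r of `values` fall."""
    ordered = sorted(values)
    idx = min(int(r * len(ordered)), len(ordered) - 1)
    return ordered[idx]


def quantile_decider(xs, zs, r, beta_x=1.0, beta_z=1.0):
    """Return decisions t_i: 0 (negative) if the risk score reaches the cutoff."""
    probs = [sigmoid(beta_x * x + beta_z * z) for x, z in zip(xs, zs)]
    # Empirical stand-in for the inverse cdf used in the algorithm.
    cutoff = empirical_quantile(probs, r)
    return [0 if p >= cutoff else 1 for p in probs]


# Usage: with r = 0.5 about half of the subjects receive t = 1.
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(1000)]
zs = [rng.gauss(0, 1) for _ in range(1000)]
decisions = quantile_decider(xs, zs, r=0.5)
```

Note that an empirical quantile makes each decision depend on the other observations in the sample, whereas a cutoff computed from the population distribution would keep the decisions independent, which appears to be the intent of the module.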
@@ -893,11 +901,12 @@ Different types of modules are presented in this section. Summary table is prese
 \STATE Compute the values of the inverse cdf of the observations in \texttt{quants} for the acceptance rates r of each judge and assign to $Q_r$.
 \FORALL{$i$ in $1, \ldots, N_{test}$}
 	\IF{$t_i = 0$}
-		\STATE {Take all $Z > logit(Q_{r,i})-x_i$ \footnotemark} 
+		\STATE{Take all $Z + \epsilon > logit(Q_{r,i})-x_i$ , \footnotemark~where $\epsilon \sim N(0, 0.1)$.} 
 	\ELSE 
-		\STATE{Take all $Z < logit(Q_{r,i})-x_i$}
+		\STATE{Take all $Z + \epsilon < logit(Q_{r,i})-x_i$ , where $\epsilon \sim N(0, 0.1)$.} 
 	\ENDIF
-	\STATE Draw predictions $\hat{p}_{i,y}$ from Bernoulli($1-logit^{-1}(x_i+\bar{Z})$).
+	\STATE Compute $\bar{z}=\frac{1}{n}\sum z$
+	\STATE Draw predictions $\hat{p}_{i,y}$ from Bernoulli($1-logit^{-1}(x_i+\bar{z})$) and assign to data.
 \ENDFOR
 \STATE Impute missing observations using $\hat{p}_y$.
 \STATE Sort the data by the probabilities $\s$ to ascending order.
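
Reviewer sketch (not part of the commit): the per-observation step of the Monte Carlo evaluator changed in the hunk above can be read as a simple rejection-sampling loop. The Python below is an illustration only. It assumes that the candidate values of $Z$ are drawn from a standard Gaussian, that $N(0, 0.1)$ denotes variance 0.1, and that $\beta_X=\beta_Z=1$; the function name `mc_predict` and the parameter `n_draws` are hypothetical.

```python
import math
import random


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def inv_logit(a):
    """Inverse of logit, i.e. the logistic function."""
    return 1.0 / (1.0 + math.exp(-a))


def mc_predict(x_i, t_i, q_r, n_draws=1000, noise_var=0.1, rng=None):
    """Return (z_bar, y_hat) for one observation, or None if no draw is kept.

    Keeps the sampled z whose noisy value z + eps is consistent with the
    observed decision t_i, averages them, and draws a predicted outcome
    from Bernoulli(1 - inv_logit(x_i + z_bar)).
    """
    rng = rng or random.Random()
    bound = logit(q_r) - x_i
    noise_sd = math.sqrt(noise_var)  # assumption: 0.1 is a variance
    kept = []
    for _ in range(n_draws):
        z = rng.gauss(0, 1)
        eps = rng.gauss(0, noise_sd)
        consistent = (z + eps > bound) if t_i == 0 else (z + eps < bound)
        if consistent:
            kept.append(z)
    if not kept:
        return None
    z_bar = sum(kept) / len(kept)
    y_hat = 1 if rng.random() < 1.0 - inv_logit(x_i + z_bar) else 0
    return z_bar, y_hat
```

For a negative decision ($t_i=0$) the retained draws lie mostly above the bound, so $\bar{z}$ is pushed upwards and the imputed outcome is more likely to be a failure ($Y=0$).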
@@ -945,7 +954,8 @@ Different types of modules are presented in this section. Summary table is prese
     &  & {\ul Monte Carlo evaluator} \\
     &  & \tabitem Data $\D$ with properties $\{x_i, j_i, t_i, y_i\}$ \\
     &  & \tabitem acceptance rate r \\
-     &  & \tabitem knowledge that X affects Y \\[.5\normalbaselineskip]
+     &  & \tabitem knowledge that X affects Y \\
+     &  & \tabitem more intricate knowledge about $\M$ ? \\[.5\normalbaselineskip]
    \bottomrule
  \end{tabular}
  \label{tab:modules}

--- a/figures/sl_with_Z_10iter_coinflip_lakkarajudecider_defaults_mc.png
+++ b/figures/sl_with_Z_10iter_coinflip_lakkarajudecider_defaults_mc.png
--- a/figures/sl_with_Z_20iter_threshold_quantile_defaults_mc.png
+++ b/figures/sl_with_Z_20iter_threshold_quantile_defaults_mc.png
--- a/figures/sl_with_Z_4iter_coinflip_lakkarajudecider_defaults_mc.png
+++ b/figures/sl_with_Z_4iter_coinflip_lakkarajudecider_defaults_mc.png
--- a/figures/sl_with_Z_4iter_threshold_lakkarajudecider_defaults_mc.png
+++ b/figures/sl_with_Z_4iter_threshold_lakkarajudecider_defaults_mc.png