diff --git a/paper/imputation.tex b/paper/imputation.tex index a362d03febe94fda322272911517b44b313bc0b9..311c88da2351d6a00d2423e55de27b01dbc19a56 100644 --- a/paper/imputation.tex +++ b/paper/imputation.tex @@ -71,14 +71,14 @@ Taking into account that we need to learn parameters from the data we integrate +% +% +%\subsection{Implementation} +% +%Stan allows us to directly sample from the posterior both of the parameters and the unobservable features. -\subsection{Implementation} - -Stan allows us to directly sample from the posterior both of the parameters and the unobservable features. - - -\subsection{Our approach} +\subsection{Overview of Our Approach} Having provided the intuition for our approach, in what follows we describe it in detail. % @@ -107,10 +107,13 @@ In particular, it provides information about the leniency and other parameters o Our approach for those cases unfolds over three steps: first, it learns a model over the dataset; then, it computes counterfactuals to predict unobserved outcomes; and finally, it uses predictions to evaluate a set of decisions. -%\subsection{Model} \label{sec:model_definition} +\subsection{The Causal Model} \label{sec:model_definition} %To make inference we obviously have to learn the parametric model from the data instead of fixed functions of the previous section. We can define the model as a probabilistic due to the simplification of the counterfactual expression in the previous section. -\spara{Model} The causal diagram of Figure~\ref{fig:causalmodel} provides the structure of causal relationships for quantities of interest. + +%\spara{Model} + + The causal diagram of Figure~\ref{fig:causalmodel} provides the structure of causal relationships for quantities of interest. We use the following causal model over the observed data, building on what is used by Lakkaraju et al.~\cite{lakkaraju2017selective}. 
We assume feature vectors $\obsFeaturesValue$ and $\unobservableValue$ representing risk can be condensed to unidimensional risk factors, for example by propensity scores. Furthermore, we assume their distributions are Gaussian. Since $Z$ is unobserved we can assume its variance to be 1 without loss of generality, thus $\unobservable \sim N(0,1)$. %\begin{eqnarray*} %\unobservable &\sim& N(0,1). @@ -211,8 +214,9 @@ We use prior distributions given in Appendix~X % \prob{\parameters | \dataset} = \frac{\prob{\dataset | \parameters} \prob{\parameters}}{\prob{\dataset}} . %\end{equation} +\subsection{Computing Counterfactuals} -\spara{Computing counterfactuals} +%\spara{Computing counterfactuals} For the model defined above, the counterfactual $\hat{Y}$ can be computed following the approach of Pearl. For a fully defined model (fixed parameters), $\hat{Y}$ can be determined by the following expression: \begin{align} @@ -239,8 +243,8 @@ where \outcome is the outcome recorded in the dataset \dataset. -\spara{Implementation} -The result of Equation~\ref{eq:theposterior} is computed numerically: +%\spara{Implementation} +The result of Equation~\ref{eq:theposterior} can be computed numerically: %\begin{equation} % \cfoutcome \approxeq \sum_{\parameters\in\sample}\prob{\outcome = 1 | \obsFeatures = \obsFeaturesValue, \decision = 1, \alpha, \beta_{_\obsFeatures}, \beta_{_\unobservable}, \unobservable} \label{eq:expandcf} %\end{equation} @@ -324,7 +328,9 @@ In practice, we use the MCMC functionality of Stan\footnote{\url{https://mc-stan %% %The Gaussians were restricted to the positive real numbers and both had mean $0$ and variance $\tau^2=1$ -- other values were tested but observed to have no effect. -\spara{Evaluation of decisions} +%\spara{Evaluation of decisions} + +\subsection{Evaluating Decision Makers} Expression~\ref{eq:expandcf} gives us a direct way to evaluate the outcome of decisions $\decision_\machine$ for any data entry for which $\decision_\human = 0$. 
% Note though that, unlike entries with $\decision_\human = 1$, for which \outcome takes integer values $\{0, 1\}$, \cfoutcome may take fractional values $\cfoutcome \in [0, 1]$.
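The numerical computation this patch describes (the commented-out Equation~\ref{eq:expandcf}) amounts to a Monte Carlo average over posterior draws of the parameters and the latent risk factor. Below is a minimal illustrative Python sketch of that averaging step, not the paper's Stan implementation: the logistic link and the names `alpha`, `beta_x`, `beta_z`, `z` are assumptions standing in for $\alpha$, $\beta_{\obsFeatures}$, $\beta_{\unobservable}$, and $\unobservable$, and the toy samples stand in for draws produced by Stan's MCMC.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def counterfactual_outcome(x, posterior_samples):
    """Monte Carlo estimate of P(Y = 1 | X = x, do(T = 1)):
    average P(Y = 1 | x, alpha, beta_x, beta_z, z) over posterior draws.
    (Logistic link is an assumption for illustration.)"""
    probs = [
        sigmoid(s["alpha"] + s["beta_x"] * x + s["beta_z"] * s["z"])
        for s in posterior_samples
    ]
    return float(np.mean(probs))

# Toy stand-in for posterior draws; in the paper these come from Stan's MCMC.
rng = np.random.default_rng(0)
samples = [
    {"alpha": 0.1, "beta_x": 1.0, "beta_z": 0.5, "z": z}
    for z in rng.normal(size=1000)  # Z ~ N(0, 1) as in the model
]
p = counterfactual_outcome(0.3, samples)
```

Consistent with the note above, the resulting estimate is a fractional value in $[0, 1]$ rather than an integer outcome.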