Commit 32139568 authored by Antti Hyttinen's avatar Antti Hyttinen
To make inference, we have to learn the parametric model from the data instead of using the fixed functions of the previous section. We can define the model probabilistically thanks to the simplification of the counterfactual expression in the previous section.
We assume that the feature vectors $\obsFeaturesValue$ and $\unobservableValue$ representing risk can be condensed into one-dimensional risk factors, for example by propensity scores. Furthermore, we assume their distributions are Gaussian. Since $Z$ is unobserved, we can set its variance to 1 without loss of generality.
\begin{eqnarray*}
\unobservable &\sim& N(0,1), \quad \obsFeatures \sim N(0,\sigma_\obsFeatures^2)
\end{eqnarray*}
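For concreteness, these prior assumptions can be sampled directly. A minimal sketch in Python; the value of $\sigma_\obsFeatures^2$ below is hypothetical, chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# sigma_x is a free parameter of the model; 1.0 is an illustrative value.
sigma_x = 1.0

n = 10_000
z = rng.normal(0.0, 1.0, size=n)      # unobserved risk factor Z, variance fixed to 1
x = rng.normal(0.0, sigma_x, size=n)  # observed risk factor X
```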
%
According to the selective labels setting, we have $\outcome = 0$ whenever $\decision = 0$. When $\decision = 1$, subject behaviour is modeled as a logistic regression over the risk factors and a noise term:
\begin{eqnarray}
\prob{\outcome=0~|~\decision, \obsFeatures=\obsFeaturesValue, \unobservable=\unobservableValue} & = &
\begin{cases}
0, & \text{if}~\decision = 0, \\
\invlogit(\alpha_\outcomeValue + \beta_\obsFeaturesValue \obsFeaturesValue + \beta_\unobservableValue \unobservableValue + \epsilon_\outcomeValue), & \text{otherwise}
\end{cases} \label{eq:defendantmodel}
\end{eqnarray}
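A sketch of this outcome model in Python. The parameter values are hypothetical, and the noise term $\epsilon_\outcomeValue$ is absorbed into the Bernoulli draw here for simplicity:

```python
import numpy as np

def sigmoid(u):
    """Standard logistic function (inverse logit)."""
    return 1.0 / (1.0 + np.exp(-u))

# Hypothetical parameter values, for illustration only.
alpha_y, beta_x, beta_z = 0.5, 1.0, 1.0

def sample_outcome(t, x, z, rng):
    """Selective labels: Y = 0 deterministically when T = 0; otherwise
    P(Y = 0 | T = 1, x, z) is logistic in the risk factors."""
    if t == 0:
        return 0
    p_y0 = sigmoid(alpha_y + beta_x * x + beta_z * z)  # P(Y = 0 | T = 1, x, z)
    return int(rng.random() >= p_y0)                   # Y = 1 w.p. 1 - p_y0

rng = np.random.default_rng(1)
print(sample_outcome(0, 0.3, -0.2, rng))  # prints 0: Y is always 0 when T = 0
```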
Here \invlogit denotes the standard logistic function, i.e., the inverse of the logit.
Since the decisions are ultimately based on the risk factors for behaviour, we model the decisions similarly, as a logistic regression over the risk factors, the leniency of the decision maker, and a noise term:
\begin{eqnarray}
\prob{\decision = 0~|~\leniency_j = \leniencyValue, \obsFeatures = \obsFeaturesValue, \unobservable = \unobservableValue} & = & \invlogit(\alpha_j + \gamma_\obsFeaturesValue\obsFeaturesValue + \gamma_\unobservableValue \unobservableValue + \epsilon_\decisionValue) \label{eq:judgemodel}
\end{eqnarray}%
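The decision model can be sketched in the same way, using the stated relation $\alpha_j \approx \logit(\leniencyValue_j)$; the coefficient values below are hypothetical:

```python
import numpy as np

def sigmoid(u):
    """Standard logistic function (inverse logit)."""
    return 1.0 / (1.0 + np.exp(-u))

def logit(p):
    return np.log(p / (1.0 - p))

# Hypothetical coefficient values, for illustration only.
gamma_x, gamma_z = 1.0, 1.0

def sample_decision(r_j, x, z, rng):
    """Decision by judge j with leniency r_j: the intercept is
    alpha_j = logit(r_j), and P(T = 0 | x, z) is logistic in the
    risk factors (noise absorbed into the Bernoulli draw)."""
    alpha_j = logit(r_j)
    p_t0 = sigmoid(alpha_j + gamma_x * x + gamma_z * z)  # P(T = 0 | x, z)
    return int(rng.random() >= p_t0)                     # T = 1 w.p. 1 - p_t0
```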
Note that index $j$ refers to decision maker $\human_j$. The parameter $\alpha_{j}$ captures the leniency of decision maker $\human_j$ through $\alpha_j \approx \logit(\leniencyValue_j)$; the decision makers in the data differ from each other only in their leniency.
This gives us the parameters $\parameters = \{ \sigma_\obsFeatures^2, \alpha_\outcomeValue, \alpha_j, \beta_\obsFeaturesValue, \beta_\unobservableValue, \gamma_\obsFeaturesValue, \gamma_\unobservableValue\}$; our estimate is obtained by integrating over their posterior. We use the prior distributions given in the Appendix for all parameters to ensure their identifiability.
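The posterior-integration step can be approximated by Monte Carlo averaging. A sketch assuming posterior draws of the parameters and of the unobserved risk factor $z$ are already available (e.g. from an MCMC run, which is outside this snippet); the function and argument names are hypothetical:

```python
import numpy as np

def sigmoid(u):
    """Standard logistic function (inverse logit)."""
    return 1.0 / (1.0 + np.exp(-u))

def estimate_expected_outcome(x, theta_samples, z_samples):
    """Monte Carlo approximation of E[Y] for a subject with observed
    risk factor x: average P(Y = 1 | T = 1, x, z) over joint posterior
    draws of the outcome parameters and the unobserved risk factor z.

    theta_samples: array of shape (m, 3) with draws of (alpha_y, beta_x, beta_z)
    z_samples:     array of shape (m,) with matching draws of z."""
    alpha_y, beta_x, beta_z = theta_samples.T
    p_y0 = sigmoid(alpha_y + beta_x * x + beta_z * z_samples)  # P(Y = 0 | T = 1, x, z)
    return float(np.mean(1.0 - p_y0))

# Toy illustration using draws from the priors instead of a real posterior:
rng = np.random.default_rng(2)
theta = rng.normal(0.0, 1.0, size=(1000, 3))
z = rng.normal(0.0, 1.0, size=1000)
print(estimate_expected_outcome(0.0, theta, z))
```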