Commit 59251fec authored by Antti Hyttinen's avatar Antti Hyttinen
According to the selective labels setting, we have $\outcome = 1$ whenever $\decision = 0$. When $\decision = 1$, subject behaviour is modeled as a logistic regression over the features:
\begin{eqnarray}
\prob{\outcome=0~|~\decision, \obsFeaturesValue, \unobservableValue} & =&
\begin{cases}
0,~\text{if}~\decision = 0\\ \invlogit(\alpha_\outcome + \beta_\obsFeatures^T \obsFeaturesValue + \beta_\unobservable \unobservableValue),~\text{o/w} \label{eq:defendantmodel}
\end{cases}
\end{eqnarray}
Here \invlogit is the standard logistic function, i.e., the inverse of the \logit function. The observed features $\obsFeatures$ form a vector of feature values.
Since the decisions are ultimately based on expected behaviour, we model the decisions in the data similarly, as a logistic regression over the features:
\begin{equation}
\prob{\decision = 0~|~\humanValue,\obsFeaturesValue, \unobservableValue} = \invlogit(\alpha_\humanValue + \gamma_\obsFeatures^T\obsFeaturesValue + \gamma_\unobservable \unobservableValue) \label{eq:judgemodel}
\end{equation}%
Note that we make the simplifying assumption that the coefficients $\gamma_\obsFeatures, \gamma_\unobservable$ are the same for all defendants, while decision makers are allowed to differ in the intercept $\alpha_\humanValue$, so as to model varying leniency levels among them.
Parameter $\alpha_{\humanValue}$ controls the leniency of decision maker $\humanValue$.
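The generative process defined by Equations (\ref{eq:defendantmodel}) and (\ref{eq:judgemodel}) can be sketched in simulation. The following is a minimal sketch, not the paper's actual experimental setup: all parameter values, the scalar feature dimensions, and the number of decision makers are hypothetical, and the unobserved feature is drawn here only because a simulation must generate it.

```python
import numpy as np

rng = np.random.default_rng(0)

def invlogit(a):
    # standard logistic function, the inverse of logit
    return 1.0 / (1.0 + np.exp(-a))

# Hypothetical parameter values, for illustration only.
n, n_judges = 5000, 10
alpha_y, beta_x, beta_z = 0.0, 1.0, 1.0       # outcome model, Eq. (defendantmodel)
gamma_x, gamma_z = 1.0, 1.0                   # decision model, Eq. (judgemodel)
alpha_j = rng.normal(0.0, 1.0, n_judges)      # per-judge leniency intercepts

x = rng.normal(size=n)            # observed feature (scalar here for simplicity)
z = rng.normal(size=n)            # unobserved feature
j = rng.integers(0, n_judges, n)  # decision maker assigned to each subject

# Decision model: P(T = 0 | j, x, z) = invlogit(alpha_j + gamma_x x + gamma_z z)
p_t0 = invlogit(alpha_j[j] + gamma_x * x + gamma_z * z)
t = (rng.random(n) >= p_t0).astype(int)       # T = 0 with probability p_t0

# Selective labels: P(Y = 0 | T = 0) = 0, i.e. Y = 1 whenever T = 0;
# otherwise Y follows the logistic outcome model.
p_y0 = np.where(t == 0, 0.0, invlogit(alpha_y + beta_x * x + beta_z * z))
y = (rng.random(n) >= p_y0).astype(int)       # Y = 0 with probability p_y0
```

By construction, every subject with $\decision = 0$ has $\outcome = 1$ in the simulated data, which is exactly the selective-labels convention stated above.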
%\noindent
%\spara{Parameter estimation}
We take a Bayesian approach to learn the model over the dataset \dataset.
%
In particular, we consider the full probabilistic model defined in Equations~\ref{eq:defendantmodel} and~\ref{eq:judgemodel} and obtain the posterior distribution of its parameters $\parameters = \{ \alpha_\outcome, \beta_\obsFeatures, \beta_\unobservable, \gamma_\obsFeatures, \gamma_\unobservable\} \cup \{\alpha_\humanValue\}_\humanValue$, which includes an intercept $\alpha_\humanValue$ for each decision maker $\humanValue$ employed in the data.
%
%Notice that by ``parameters'' here we refer to all quantities that are not considered as known with certainty from the input, and so parameters include unobserved features \unobservable.
%
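The paper's estimation is fully Bayesian (the comments in the source point to Stan). As a rough, hypothetical stand-in for that machinery, the sketch below fits the decision-model coefficients by maximum likelihood with plain gradient ascent on simulated data in which the unobserved feature is treated as observed; it is only a sanity check of the logistic form, not the paper's posterior inference, and all parameter values are invented.

```python
import numpy as np

rng = np.random.default_rng(1)

def invlogit(a):
    # standard logistic function
    return 1.0 / (1.0 + np.exp(-a))

# Simulated toy data for the decision model; gamma_true = [alpha, gamma_x, gamma_z]
# are hypothetical values, and here both features are visible to the fitter.
n = 4000
gamma_true = np.array([0.5, -1.0, 1.5])
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
d0 = (rng.random(n) < invlogit(X @ gamma_true)).astype(int)  # indicator of T = 0

# Maximum-likelihood fit: gradient ascent on the mean Bernoulli log-likelihood,
# whose gradient for logistic regression is X^T (d0 - p) / n.
gamma = np.zeros(3)
for _ in range(3000):
    p = invlogit(X @ gamma)
    gamma += 1.0 * (X.T @ (d0 - p)) / n
```

With a few thousand samples the estimate lands close to the generating coefficients; the full Bayesian treatment additionally yields uncertainty over $\parameters$ and handles the fact that $\unobservable$ is latent.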