@@ -99,12 +100,7 @@ It is read as follows: conditional on what we know from the data entry ($\obsFea
, consider the probability that the outcome would have been positive ($\outcome=1$) in the hypothetical case that %we had intervened to make
the decision had been positive.
%
Notice that the presence of \dataset in the conditioning part of~(\ref{eq:counterfactual}) gives us more information about the data entry than the entry-specific quantities ($\obsFeatures=\obsFeaturesValue$, $\decision_\human=0$) alone, and is thus not redundant.
%
In particular, it provides information about the leniency and other parameters of decision maker \human, which in turn helps us infer the unobserved variables \unobservable, as discussed at the beginning of this section.
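%
Schematically, and only as a reading aid, the quantity in~(\ref{eq:counterfactual}) can be thought of as averaging the probability of a positive outcome under a positive decision over everything that remains uncertain about the entry, namely the unobserved features \unobservable and the model parameters (denoted \parameters below):
\[
\int P(\outcome = 1 \mid \decision = 1, \obsFeatures = \obsFeaturesValue, \unobservable = \unobservableValue, \parameters)\,
     p(\unobservableValue, \parameters \mid \obsFeatures = \obsFeaturesValue, \decision_\human = 0, \dataset)\,
     \mathrm{d}\unobservableValue\, \mathrm{d}\parameters .
\]
The precise expansion, in terms of the model specified below, is derived at the end of this section.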
...
...
@@ -133,9 +129,9 @@ Here \invlogit is the standard \acomment{inverse?} logistic function.
Since the decisions are ultimately based on the risk factors for behaviour, we similarly model the decisions in the data with a logistic regression over the risk factors, the leniency of the decision maker, and a noise term:
Note that index $j$ refers to decision maker $\human_j$, and that parameter $\alpha_{j}$ captures the leniency of that decision maker via $\alpha_j \approx \logit(\leniencyValue_j)$.
We make the simplifying assumption that the coefficients $\gamma$ are the same for all defendants, but allow decision makers to differ in their intercept $\alpha_j$ so as to model varying levels of leniency among them. % (Eq. \ref{eq:leniencymodel}).
We use the prior distributions given in the Appendix for all parameters to ensure their identifiability.
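%
To make the model concrete, the following is a minimal sketch of it in Python using PyMC, chosen here purely for illustration. The variable names, the placeholder priors, and the assumption that the outcome is recorded only for cases with a positive decision are ours for the purposes of the sketch, not the paper's exact specification.
\begin{verbatim}
# Illustrative sketch only -- not the exact specification of the paper.
# Hierarchical logistic model: a latent risk factor z per case, a
# per-decision-maker intercept alpha_j (leniency), shared coefficients.
import numpy as np
import pymc as pm

def build_model(x, t, y, judge_idx, n_judges):
    """x: observed features, t: decisions, y: outcomes (used where t == 1),
    judge_idx: decision-maker index per case.  All priors are placeholders."""
    n = len(x)
    pos = np.flatnonzero(t == 1)   # outcome assumed observed only when t == 1
    with pm.Model() as model:
        z = pm.Normal("z", 0.0, 1.0, shape=n)                 # unobserved features
        alpha = pm.Normal("alpha", 0.0, 1.0, shape=n_judges)  # leniency intercepts
        gamma_x = pm.Normal("gamma_x", 0.0, 1.0)
        gamma_z = pm.Normal("gamma_z", 0.0, 1.0)
        alpha_y = pm.Normal("alpha_y", 0.0, 1.0)
        beta_x = pm.Normal("beta_x", 0.0, 1.0)
        beta_z = pm.Normal("beta_z", 0.0, 1.0)

        # Decision model: logistic regression over the risk factors plus the
        # decision maker's leniency intercept (the extra noise term of the
        # text is absorbed into z in this sketch).
        p_t = pm.math.invlogit(alpha[judge_idx] + gamma_x * x + gamma_z * z)
        pm.Bernoulli("t_obs", p=p_t, observed=t)

        # Outcome model: logistic regression over the same risk factors,
        # evaluated only for cases with a positive decision.
        p_y = pm.math.invlogit(alpha_y + beta_x * x[pos] + beta_z * z[pos])
        pm.Bernoulli("y_obs", p=p_y, observed=y[pos])
    return model
\end{verbatim}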
...
...
@@ -207,7 +203,7 @@ We use prior distributions given in Appendix for all parameters to ensure their
%
In particular, we consider the full probabilistic model defined in Equations \ref{eq:judgemodel} -- \ref{eq:defendantmodel} and obtain the posterior distribution of its parameters $\parameters=\{\alpha_\outcomeValue, \beta_\obsFeaturesValue, \beta_\unobservableValue, \alpha_j, \gamma_\obsFeaturesValue, \gamma_\unobservableValue\}$. %, where $i = 1, \ldots, \datasize$, conditional on the dataset.
%
Notice that by ``parameters'' here we refer to all quantities that are not known with certainty from the input; in particular, the parameters include the unobserved features \unobservable.
%
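As an illustration, continuing the PyMC sketch above, drawing such a posterior sample amounts to running MCMC on that model; the sampler settings below are arbitrary placeholders, not the paper's configuration.
\begin{verbatim}
# Continues the sketch above: the data arrays x, t, y, judge_idx and
# n_judges are as defined there.  Draws a joint posterior sample over
# the coefficients, the leniency intercepts alpha_j and the per-case z.
with build_model(x, t, y, judge_idx, n_judges):
    idata = pm.sample(draws=2000, tune=1000, chains=4, random_seed=0)

posterior = idata.posterior        # one value per parameter per posterior draw
z_draws = posterior["z"].stack(sample=("chain", "draw")).values
\end{verbatim}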
Formally, we obtain
\begin{equation}
...
...
@@ -220,13 +216,13 @@ Sample \sample can now be used to compute various probabilistic quantities of in
\spara{Computing counterfactuals}
Having obtained a posterior probability distribution for parameters \parameters in parameter space \parameterSpace, we can now expand expression~(\ref{eq:counterfactual}) as follows.