@@ -96,9 +96,10 @@ In the formalism of causal inference~\cite{pearl2010introduction}, we wish to ev
The expression above concerns a specific entry in the dataset with features $\obsFeatures=\obsFeaturesValue$, for which $\human_j$ made a decision $\decision_{\human_j}=0$.
It is read as follows: conditional on what we know from the data entry ($\obsFeatures=\obsFeaturesValue$, $\decision_{\human_j}=0$) as well as from the entire dataset \dataset
, the probability that the outcome would have been positive ($\outcome=1$) %in the hypothetical case %we had intervened to make
had the decision been positive ($\decision_{\human_j}=1$).
%It is read as follows: conditional on what we know from the data entry ($\obsFeatures = \obsFeaturesValue$, $\decision_{\human_j} = 0$) as well as from the entire dataset \dataset
%, the probability that the outcome would have been positive ($\outcome = 1$) %in the hypothetical case %we had intervened to make
%had the decision been positive ($\decision_{H_j} = 1$).
It expresses the probability that the outcome would have been positive ($\outcome=1$) had the decision been positive ($\decision_{\human_j}=1$), conditional on what we know from the data entry ($\obsFeatures=\obsFeaturesValue$, $\decision_{\human_j}=0$) as well as from the entire dataset \dataset.
Notice that the presence of \dataset in the conditioning part of Expression~(\ref{eq:counterfactual}) gives us more information about the data entry than the entry-specific quantities ($\obsFeatures=\obsFeaturesValue$, $\decision_{\human_j}=0$) alone, and is thus not redundant.
@@ -204,7 +205,7 @@ In particular, we consider the full probabilistic model defined in Equations \re
%Notice that by ``parameters'' here we refer to all quantities that are not considered as known with certainty from the input, and so parameters include unobserved features \unobservable.
We use prior distributions given in Appendix~X to ensure the identifiability of the parameters.
as derived in detail in Appendix~X. In essence, we determine the distribution of the unobserved features $Z$ using the decision, observed features, and the leniency of the employed decision maker, and then determine the distribution of $Y$ conditional on all features.
Having obtained a posterior probability distribution for parameters \parameters: % in parameter space \parameterSpace, we can now expand expression~(\ref{eq:counterfactual}) as follows.
@@ -228,22 +229,18 @@ Having obtained a posterior probability distribution for parameters \parameters:
%%Antti dont want to put specific parameters since P(z|...) depends on so many?
%The value of the first factor in the integrand of the expression above is provided by the model in Equation~\ref{eq:defendantmodel}, while the second is sampled by MCMC, as explained above.
Note that, for all data entries other than the ones with $\decision_\human=0$ and $\decision_\machine=1$, we trivially have
Note that, for all data entries other than the ones with $\decision_{\human_j}=0$ and $\decision_\machine=1$, we trivially have $\cfoutcome = \outcome$,
where \outcome is the outcome recorded in the dataset \dataset.
The result is computed numerically over the sample.
The result of Equation~\ref{eq:theposterior} is computed numerically:
@@ -255,9 +252,10 @@ The result is computed numerically over the sample.
where the sums are taken over samples of $\parameters$ and $z$ obtained from their respective posteriors.
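As a purely illustrative sketch of this averaging step, assume equally weighted draws (the symbols $S$ and $p_s$ are introduced here for illustration only and are not part of the model). Suppose that $S$ posterior draws of $\parameters$ and $z$ are available, and let $p_s$ denote the probability of a positive outcome that the model assigns under the $s$-th draw. The numerical estimate is then the sample mean
\[
  \frac{1}{S}\sum_{s=1}^{S} p_s ,
\]
so that, for instance, $S=4$ draws with $p_1=0.2$, $p_2=0.4$, $p_3=0.3$ and $p_4=0.5$ would give $(0.2+0.4+0.3+0.5)/4 = 0.35$.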
In practice, we use the MCMC functionality of Stan\footnote{\url{https://mc-stan.org/}} to obtain a sample \sample of this posterior distribution, where each element of \sample contains one instance of parameters \parameters.
In practice, we use the MCMC functionality of Stan\footnote{\url{https://mc-stan.org/}}.
% to obtain a sample \sample of this posterior distribution, where each element of \sample contains one instance of parameters \parameters.
Sample \sample can now be used to compute various probabilistic quantities of interest, including a (posterior) distribution of \unobservable for each entry in dataset \dataset.
%Sample \sample can now be used to compute various probabilistic quantities of interest, including a (posterior) distribution of \unobservable for each entry in dataset \dataset.