Figure updates, synt removed from the text, some changes to the analyses

872b44fb · Riku-Laine · dc107546 · 872b44fb · 872b44fb · 872b44fb
Commit 872b44fb authored 5 years ago by Riku-Laine
--- a/Kandi.pdf
+++ b/Kandi.pdf
--- a/Kandi.synctex.gz
+++ b/Kandi.synctex.gz
--- a/Kandi.tex
+++ b/Kandi.tex
--- a/analysis_and_scripts/Bachelors_thesis_analyses.ipynb
+++ b/analysis_and_scripts/Bachelors_thesis_analyses.ipynb
--- a/analysis_and_scripts/Compas Analysis.ipynb
+++ b/analysis_and_scripts/Compas Analysis.ipynb
 {
 "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "toc": true
+   },
+   "source": [
+    "<h1>Table of Contents<span class=\"tocSkip\"></span></h1>\n",
+    "<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#Loading-the-Data\" data-toc-modified-id=\"Loading-the-Data-1\"><span class=\"toc-item-num\">1&nbsp;&nbsp;</span>Loading the Data</a></span></li><li><span><a href=\"#Racial-Bias-in-Compas\" data-toc-modified-id=\"Racial-Bias-in-Compas-2\"><span class=\"toc-item-num\">2&nbsp;&nbsp;</span>Racial Bias in Compas</a></span><ul class=\"toc-item\"><li><span><a href=\"#Risk-of-Violent-Recidivism\" data-toc-modified-id=\"Risk-of-Violent-Recidivism-2.1\"><span class=\"toc-item-num\">2.1&nbsp;&nbsp;</span>Risk of Violent Recidivism</a></span></li></ul></li><li><span><a href=\"#Predictive-Accuracy-of-COMPAS\" data-toc-modified-id=\"Predictive-Accuracy-of-COMPAS-3\"><span class=\"toc-item-num\">3&nbsp;&nbsp;</span>Predictive Accuracy of COMPAS</a></span></li><li><span><a href=\"#Directions-of-the-Racial-Bias\" data-toc-modified-id=\"Directions-of-the-Racial-Bias-4\"><span class=\"toc-item-num\">4&nbsp;&nbsp;</span>Directions of the Racial Bias</a></span></li><li><span><a href=\"#Risk-of-Violent-Recidivism\" data-toc-modified-id=\"Risk-of-Violent-Recidivism-5\"><span class=\"toc-item-num\">5&nbsp;&nbsp;</span>Risk of Violent Recidivism</a></span></li><li><span><a href=\"#Gender-differences-in-Compas-scores\" data-toc-modified-id=\"Gender-differences-in-Compas-scores-6\"><span class=\"toc-item-num\">6&nbsp;&nbsp;</span>Gender differences in Compas scores</a></span></li></ul></div>"
+   ]
+  },
  {
   "cell_type": "markdown",
   "metadata": {},
@@ -2383,6 +2393,48 @@
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.7"
+  },
+  "toc": {
+   "base_numbering": 1,
+   "nav_menu": {},
+   "number_sections": true,
+   "sideBar": true,
+   "skip_h1_title": true,
+   "title_cell": "Table of Contents",
+   "title_sidebar": "Contents",
+   "toc_cell": true,
+   "toc_position": {},
+   "toc_section_display": true,
+   "toc_window_display": true
+  },
+  "varInspector": {
+   "cols": {
+    "lenName": 16,
+    "lenType": 16,
+    "lenVar": 40
+   },
+   "kernels_config": {
+    "python": {
+     "delete_cmd_postfix": "",
+     "delete_cmd_prefix": "del ",
+     "library": "var_list.py",
+     "varRefreshCmd": "print(var_dic_list())"
+    },
+    "r": {
+     "delete_cmd_postfix": ") ",
+     "delete_cmd_prefix": "rm(",
+     "library": "var_list.r",
+     "varRefreshCmd": "cat(var_dic_list()) "
+    }
+   },
+   "types_to_exclude": [
+    "module",
+    "function",
+    "builtin_function_or_method",
+    "instance",
+    "_Feature"
+   ],
+   "window_display": false
  }
 },
 "nbformat": 4,

 %% Cell type:markdown id: tags:
+<h1>Table of Contents<span class="tocSkip"></span></h1>
+<div class="toc"><ul class="toc-item"><li><span><a href="#Loading-the-Data" data-toc-modified-id="Loading-the-Data-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Loading the Data</a></span></li><li><span><a href="#Racial-Bias-in-Compas" data-toc-modified-id="Racial-Bias-in-Compas-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Racial Bias in Compas</a></span><ul class="toc-item"><li><span><a href="#Risk-of-Violent-Recidivism" data-toc-modified-id="Risk-of-Violent-Recidivism-2.1"><span class="toc-item-num">2.1&nbsp;&nbsp;</span>Risk of Violent Recidivism</a></span></li></ul></li><li><span><a href="#Predictive-Accuracy-of-COMPAS" data-toc-modified-id="Predictive-Accuracy-of-COMPAS-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Predictive Accuracy of COMPAS</a></span></li><li><span><a href="#Directions-of-the-Racial-Bias" data-toc-modified-id="Directions-of-the-Racial-Bias-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Directions of the Racial Bias</a></span></li><li><span><a href="#Risk-of-Violent-Recidivism" data-toc-modified-id="Risk-of-Violent-Recidivism-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Risk of Violent Recidivism</a></span></li><li><span><a href="#Gender-differences-in-Compas-scores" data-toc-modified-id="Gender-differences-in-Compas-scores-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Gender differences in Compas scores</a></span></li></ul></div>
+%% Cell type:markdown id: tags:
 # Compas Analysis
 What follows are the calculations performed for ProPublica's analysis of the COMPAS Recidivism Risk Scores. It might be helpful to open [the methodology](https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm/) in another tab to understand the following.
 ## Loading the Data
 We select fields for severity of charge, number of priors, demographics, age, sex, compas scores, and whether each person was accused of a crime within two years.
 %% Cell type:code id: tags:
 ``` python
 # filter dplyr warnings
 %load_ext rpy2.ipython
 import warnings
 warnings.filterwarnings('ignore')
 ```
 %% Cell type:code id: tags:
 ``` python
 %%R
 library(dplyr)
 library(ggplot2)
 raw_data <- read.csv("./compas-scores-two-years.csv")
 nrow(raw_data)
 ```
 %% Output
 %% Cell type:markdown id: tags:
 However, not all of the rows are usable for the first round of analysis.
 There are a number of reasons to remove rows because of missing data:
 * If the charge date of a defendant's Compas-scored crime was not within 30 days of when the person was arrested, we assume that, for data quality reasons, we do not have the right offense.
 * We coded the recidivist flag -- `is_recid` -- to be -1 if we could not find a compas case at all.
 * In a similar vein, ordinary traffic offenses -- those with a `c_charge_degree` of 'O' -- will not result in jail time and are removed (only two of them).
 * We filtered the underlying data from Broward county to include only those rows representing people who had either recidivated in two years, or had at least two years outside of a correctional facility.
 %% Cell type:code id: tags:
 ``` python
 %%R
 df <- dplyr::select(raw_data, age, c_charge_degree, race, age_cat, score_text, sex, priors_count,
                    days_b_screening_arrest, decile_score, is_recid, two_year_recid, c_jail_in, c_jail_out) %>%
        filter(days_b_screening_arrest <= 30) %>%
        filter(days_b_screening_arrest >= -30) %>%
        filter(is_recid != -1) %>%
        filter(c_charge_degree != "O") %>%
        filter(score_text != 'N/A')
 nrow(df)
 ```
 %% Output
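 %% Cell type:markdown id: tags:
 The row filters above can also be sketched in plain Python. This is a minimal illustration and not part of the original analysis: the helper names `keep_row` and `filter_rows` are hypothetical, and the sketch assumes each row is a dict of strings keyed by the same column names as the R code (as `csv.DictReader` would produce), with missing values written as an empty string or `NA`.
 %% Cell type:code id: tags:
```python
def keep_row(row):
    """Return True if a row passes the same filters as the R pipeline above."""
    days = row.get("days_b_screening_arrest", "")
    if days in ("", "NA"):
        # Assumption: a missing value fails the comparison, as in dplyr::filter
        return False
    if not -30 <= float(days) <= 30:
        # Charge date more than 30 days away from the arrest date
        return False
    if int(row["is_recid"]) == -1:
        # No compas case could be found
        return False
    if row["c_charge_degree"] == "O":
        # Ordinary traffic offense
        return False
    if row["score_text"] == "N/A":
        # No usable score category
        return False
    return True


def filter_rows(rows):
    """Keep only the rows that pass every filter."""
    return [row for row in rows if keep_row(row)]
```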
 %% Cell type:markdown id: tags:
 Higher COMPAS scores are slightly correlated with a longer length of stay.
 %% Cell type:code id: tags:
 ``` python
 %%R
 df$length_of_stay <- as.numeric(as.Date(df$c_jail_out) - as.Date(df$c_jail_in))
 cor(df$length_of_stay, df$decile_score)
 ```
 %% Output
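 %% Cell type:markdown id: tags:
 For readers less familiar with R, the length-of-stay calculation and the correlation can be sketched with the Python standard library. The function names are hypothetical, and the sketch assumes that the jail timestamps begin with a `YYYY-MM-DD` date, so that only the date part is used, which is how `as.Date` is used in the cell above.
 %% Cell type:code id: tags:
```python
from datetime import datetime
from math import sqrt


def length_of_stay_days(jail_in, jail_out):
    """Whole days between two timestamps, using only their date part."""
    fmt = "%Y-%m-%d"
    start = datetime.strptime(jail_in[:10], fmt).date()
    end = datetime.strptime(jail_out[:10], fmt).date()
    return (end - start).days


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```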
 %% Cell type:markdown id: tags:
 After filtering we have the following demographic breakdown:
 %% Cell type:code id: tags:
 ``` python
 %%R
 summary(df$age_cat)
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R
 summary(df$race)
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 print("Black defendants: %.2f%%" %            (3175 / 6172 * 100))
 print("White defendants: %.2f%%" %            (2103 / 6172 * 100))
 print("Hispanic defendants: %.2f%%" %         (509  / 6172 * 100))
 print("Asian defendants: %.2f%%" %            (31   / 6172 * 100))
 print("Native American defendants: %.2f%%" %  (11   / 6172 * 100))
 ```
 %% Output
    Black defendants: 51.44%
    White defendants: 34.07%
    Hispanic defendants: 8.25%
    Asian defendants: 0.50%
    Native American defendants: 0.18%
 %% Cell type:code id: tags:
 ``` python
 %%R
 summary(df$score_text)
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R
 xtabs(~ sex + race, data=df)
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R
 summary(df$sex)
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 print("Men: %.2f%%" %   (4997 / 6172 * 100))
 print("Women: %.2f%%" % (1175 / 6172 * 100))
 ```
 %% Output
    Men: 80.96%
    Women: 19.04%
 %% Cell type:code id: tags:
 ``` python
 %%R
 nrow(filter(df, two_year_recid == 1))
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R
 nrow(filter(df, two_year_recid == 1)) / nrow(df) * 100
 ```
 %% Output
 %% Cell type:markdown id: tags:
 Judges are often presented with two sets of scores from the Compas system -- one that classifies people into High, Medium and Low risk, and a corresponding decile score. For white defendants, the number of defendants in each decile clearly decreases as the decile score increases.
 %% Cell type:code id: tags:
 ``` python
 %%R -w 900 -h 363 -u px
 library(grid)
 library(gridExtra)
 pblack <- ggplot(data=filter(df, race =="African-American"), aes(ordered(decile_score))) +
          geom_bar() + xlab("Decile Score") +
          ylim(0, 650) + ggtitle("Black Defendant's Decile Scores")
 pwhite <- ggplot(data=filter(df, race =="Caucasian"), aes(ordered(decile_score))) +
          geom_bar() + xlab("Decile Score") +
          ylim(0, 650) + ggtitle("White Defendant's Decile Scores")
 grid.arrange(pblack, pwhite,  ncol = 2)
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R
 xtabs(~ decile_score + race, data=df)
 ```
 %% Output
 %% Cell type:markdown id: tags:
 ## Racial Bias in Compas
 After filtering out bad rows, our first question is whether there is a significant difference in Compas scores between races. To do so we need to change some variables into factors, and run a logistic regression, comparing low scores to high scores.
 %% Cell type:code id: tags:
 ``` python
 %%R
 df <- mutate(df, crime_factor = factor(c_charge_degree)) %>%
      mutate(age_factor = as.factor(age_cat)) %>%
      within(age_factor <- relevel(age_factor, ref = 1)) %>%
      mutate(race_factor = factor(race)) %>%
      within(race_factor <- relevel(race_factor, ref = 3)) %>%
      mutate(gender_factor = factor(sex, labels= c("Female","Male"))) %>%
      within(gender_factor <- relevel(gender_factor, ref = 2)) %>%
      mutate(score_factor = factor(score_text != "Low", labels = c("LowScore","HighScore")))
 model <- glm(score_factor ~ gender_factor + age_factor + race_factor +
                            priors_count + crime_factor + two_year_recid, family="binomial", data=df)
 summary(model)
 ```
 %% Output
 %% Cell type:markdown id: tags:
 Black defendants are 45% more likely than white defendants to receive a higher score, after correcting for the seriousness of their crime, previous arrests, and future criminal behavior.
 %% Cell type:code id: tags:
 ``` python
 %%R
 control <- exp(-1.52554) / (1 + exp(-1.52554))
 exp(0.47721) / (1 - control + (control * exp(0.47721)))
 ```
 %% Output
 %% Cell type:markdown id: tags:
 Women are 19.4% more likely than men to get a higher score.
 %% Cell type:code id: tags:
 ``` python
 %%R
 exp(0.22127) / (1 - control + (control * exp(0.22127)))
 ```
 %% Output
 %% Cell type:markdown id: tags:
 Most surprisingly, people under 25 are 2.5 times as likely to get a higher score as middle aged defendants.
 %% Cell type:code id: tags:
 ``` python
 %%R
 exp(1.30839) / (1 - control + (control * exp(1.30839)))
 ```
 %% Output
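 %% Cell type:markdown id: tags:
 The three calculations above share one formula: a logistic regression coefficient is turned into an odds ratio with `exp`, and the odds ratio is then converted into a relative risk using the predicted probability of a high score for the reference group (the intercept). A minimal Python sketch of that conversion follows; the function names are hypothetical, and the coefficients in the usage note are the ones typed into the R cells above.
 %% Cell type:code id: tags:
```python
from math import exp


def baseline_probability(intercept):
    """Probability of a high score for the reference group (inverse logit)."""
    return exp(intercept) / (1 + exp(intercept))


def relative_risk(coef, intercept):
    """Convert a logistic regression coefficient into a relative risk."""
    p0 = baseline_probability(intercept)
    odds_ratio = exp(coef)
    return odds_ratio / (1 - p0 + p0 * odds_ratio)
```
 %% Cell type:markdown id: tags:
 With the intercept `-1.52554`, `relative_risk(0.47721, -1.52554)` gives roughly 1.45, `relative_risk(0.22127, -1.52554)` roughly 1.19, and `relative_risk(1.30839, -1.52554)` roughly 2.5, in line with the figures quoted in the text.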
 %% Cell type:markdown id: tags:
 ### Risk of Violent Recidivism
 Compas also offers a score that aims to measure a person's risk of violent recidivism, which has a similar overall accuracy to the Recidivism score. As before, we can use a logistic regression to test for racial bias.
 %% Cell type:code id: tags:
 ``` python
 %%R
 raw_data <- read.csv("./compas-scores-two-years-violent.csv")
 nrow(raw_data)
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R
 df <- dplyr::select(raw_data, age, c_charge_degree, race, age_cat, v_score_text, sex, priors_count,
                    days_b_screening_arrest, v_decile_score, is_recid, two_year_recid) %>%
        filter(days_b_screening_arrest <= 30) %>%
        filter(days_b_screening_arrest >= -30) %>%
        filter(is_recid != -1) %>%
        filter(c_charge_degree != "O") %>%
        filter(v_score_text != 'N/A')
 nrow(df)
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R
 summary(df$age_cat)
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R
 summary(df$race)
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R
 summary(df$v_score_text)
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R
 nrow(filter(df, two_year_recid == 1)) / nrow(df) * 100
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R
 nrow(filter(df, two_year_recid == 1))
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R -w 900 -h 363 -u px
 library(grid)
 library(gridExtra)
 pblack <- ggplot(data=filter(df, race =="African-American"), aes(ordered(v_decile_score))) +
          geom_bar() + xlab("Violent Decile Score") +
          ylim(0, 700) + ggtitle("Black Defendant's Violent Decile Scores")
 pwhite <- ggplot(data=filter(df, race =="Caucasian"), aes(ordered(v_decile_score))) +
          geom_bar() + xlab("Violent Decile Score") +
          ylim(0, 700) + ggtitle("White Defendant's Violent Decile Scores")
 grid.arrange(pblack, pwhite,  ncol = 2)
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R
 df <- mutate(df, crime_factor = factor(c_charge_degree)) %>%
      mutate(age_factor = as.factor(age_cat)) %>%
      within(age_factor <- relevel(age_factor, ref = 1)) %>%
      mutate(race_factor = factor(race,
                                  labels = c("African-American",
                                             "Asian",
                                             "Caucasian",
                                             "Hispanic",
                                             "Native American",
                                             "Other"))) %>%
      within(race_factor <- relevel(race_factor, ref = 3)) %>%
      mutate(gender_factor = factor(sex, labels= c("Female","Male"))) %>%
      within(gender_factor <- relevel(gender_factor, ref = 2)) %>%
      mutate(score_factor = factor(v_score_text != "Low", labels = c("LowScore","HighScore")))
 model <- glm(score_factor ~ gender_factor + age_factor + race_factor +
                            priors_count + crime_factor + two_year_recid, family="binomial", data=df)
 summary(model)
 ```
 %% Output
 %% Cell type:markdown id: tags:
 The violent score overpredicts recidivism for black defendants by 77.3% compared to white defendants.
 %% Cell type:code id: tags:
 ``` python
 %%R
 control <- exp(-2.24274) / (1 + exp(-2.24274))
 exp(0.65893) / (1 - control + (control * exp(0.65893)))
 ```
 %% Output
 %% Cell type:markdown id: tags:
 Defendants under 25 are 7.4 times as likely to get a higher score as middle aged defendants.
 %% Cell type:code id: tags:
 ``` python
 %%R
 exp(3.14591) / (1 - control + (control * exp(3.14591)))
 ```
 %% Output
 %% Cell type:markdown id: tags:
 ## Predictive Accuracy of COMPAS
 In order to test whether Compas scores do an accurate job of deciding whether an offender is Low, Medium or High risk, we ran a Cox Proportional Hazards model. Northpointe, the company that created COMPAS and markets it to law enforcement, also ran a Cox model in their [validation study](http://cjb.sagepub.com/content/36/1/21.abstract).
 We used the counting model and removed people when they were incarcerated. Due to errors in the underlying jail data, we need to filter out 32 rows whose end date is not later than their start date. Considering that there are 13,334 total rows in the data, such a small number of errors will not affect the results.
 %% Cell type:code id: tags:
 ``` python
 %%R
 library(survival)
 library(ggfortify)
 data <- filter(filter(read.csv("./cox-parsed.csv"), score_text != "N/A"), end > start) %>%
        mutate(race_factor = factor(race,
                                  labels = c("African-American",
                                             "Asian",
                                             "Caucasian",
                                             "Hispanic",
                                             "Native American",
                                             "Other"))) %>%
        within(race_factor <- relevel(race_factor, ref = 3)) %>%
        mutate(score_factor = factor(score_text)) %>%
        within(score_factor <- relevel(score_factor, ref=2))
 grp <- data[!duplicated(data$id),]
 nrow(grp)
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R
 summary(grp$score_factor)
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R
 summary(grp$race_factor)
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R
 f <- Surv(start, end, event, type="counting") ~ score_factor
 model <- coxph(f, data=data)
 summary(model)
 ```
 %% Output
 %% Cell type:markdown id: tags:
 People placed in the High category are 3.5 times as likely to recidivate, and the COMPAS system's concordance is 63.6%. This is lower than the accuracy of 68% quoted in the Northpointe study.
 %% Cell type:code id: tags:
 ``` python
 %%R
 decile_f <- Surv(start, end, event, type="counting") ~ decile_score
 dmodel <- coxph(decile_f, data=data)
 summary(dmodel)
 ```
 %% Output
 %% Cell type:markdown id: tags:
 COMPAS's decile scores are a bit more accurate at 66%.
 We can test whether the algorithm behaves differently across races by including a race interaction term in the Cox model.
 %% Cell type:code id: tags:
 ``` python
 %%R
 f2 <- Surv(start, end, event, type="counting") ~ race_factor + score_factor + race_factor * score_factor
 model <- coxph(f2, data=data)
 print(summary(model))
 ```
 %% Output
 %% Cell type:markdown id: tags:
 The interaction term shows a disparity similar to that in the logistic regression above.
 High risk white defendants are 3.61 times as likely to recidivate as low risk white defendants, while High risk black defendants are 2.99 times as likely as low risk black defendants.
 %% Cell type:code id: tags:
 ``` python
 import math
 print("Black High Hazard: %.2f" % (math.exp(-0.18976 + 1.28350)))
 print("White High Hazard: %.2f" % (math.exp(1.28350)))
 print("Black Medium Hazard: %.2f" % (math.exp(0.84286-0.17261)))
 print("White Medium Hazard: %.2f" % (math.exp(0.84286)))
 ```
 %% Output
    Black High Hazard: 2.99
    White High Hazard: 3.61
    Black Medium Hazard: 1.95
    White Medium Hazard: 2.32
 %% Cell type:code id: tags:
 ``` python
 %%R -w 900 -h 563 -u px
 fit <- survfit(f, data=data)
 plotty <- function(fit, title) {
  return(autoplot(fit, conf.int=T, censor=F) + ggtitle(title) + ylim(0,1))
 }
 plotty(fit, "Overall")
 ```
 %% Output
 %% Cell type:markdown id: tags:
 Black defendants do recidivate at higher rates according to race-specific Kaplan-Meier plots.
 %% Cell type:code id: tags:
 ``` python
 %%R -w 900 -h 363 -u px
 white <- filter(data, race == "Caucasian")
 white_fit <- survfit(f, data=white)
 black <- filter(data, race == "African-American")
 black_fit <- survfit(f, data=black)
 grid.arrange(plotty(white_fit, "White defendants"),
             plotty(black_fit, "Black defendants"), ncol=2)
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R
 summary(fit, times=c(730))
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R
 summary(black_fit, times=c(730))
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R
 summary(white_fit, times=c(730))
 ```
 %% Output
 %% Cell type:markdown id: tags:
 Race-specific models have similar concordance values.
 %% Cell type:code id: tags:
 ``` python
 %%R
 summary(coxph(f, data=white))
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R
 summary(coxph(f, data=black))
 ```
 %% Output
 %% Cell type:markdown id: tags:
 Compas's violent recidivism score has a slightly higher overall concordance score of 65.1%.
 %% Cell type:code id: tags:
 ``` python
 %%R
 violent_data <- filter(filter(read.csv("./cox-violent-parsed.csv"), score_text != "N/A"), end > start) %>%
        mutate(race_factor = factor(race,
                                  labels = c("African-American",
                                             "Asian",
                                             "Caucasian",
                                             "Hispanic",
                                             "Native American",
                                             "Other"))) %>%
        within(race_factor <- relevel(race_factor, ref = 3)) %>%
        mutate(score_factor = factor(score_text)) %>%
        within(score_factor <- relevel(score_factor, ref=2))
 vf <- Surv(start, end, event, type="counting") ~ score_factor
 vmodel <- coxph(vf, data=violent_data)
 vgrp <- violent_data[!duplicated(violent_data$id),]
 print(nrow(vgrp))
 summary(vmodel)
 ```
 %% Output
 %% Cell type:markdown id: tags:
 In this case, there isn't a significant coefficient on African Americans with High scores.
 %% Cell type:code id: tags:
 ``` python
 %%R
 vf2 <- Surv(start, end, event, type="counting") ~ race_factor + race_factor * score_factor
 vmodel <- coxph(vf2, data=violent_data)
 summary(vmodel)
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R
 summary(coxph(vf, data=filter(violent_data, race == "African-American")))
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R
 summary(coxph(vf, data=filter(violent_data, race == "Caucasian")))
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R -w 900 -h 363 -u px
 white <- filter(violent_data, race == "Caucasian")
 white_fit <- survfit(vf, data=white)
 black <- filter(violent_data, race == "African-American")
 black_fit <- survfit(vf, data=black)
 grid.arrange(plotty(white_fit, "White defendants"),
             plotty(black_fit, "Black defendants"), ncol=2)
 ```
 %% Output
 %% Cell type:markdown id: tags:
 ## Directions of the Racial Bias
 The above analysis shows that the Compas algorithm does overpredict African-American defendants' future recidivism, but we haven't yet explored the direction of the bias. We can discover fine differences in overprediction and underprediction by comparing Compas scores across racial lines.
 %% Cell type:code id: tags:
 ``` python
 from truth_tables import PeekyReader, Person, table, is_race, count, vtable, hightable, vhightable
 from csv import DictReader
 people = []
 with open("./cox-parsed.csv") as f:
    reader = PeekyReader(DictReader(f))
    try:
        while True:
            p = Person(reader)
            if p.valid:
                people.append(p)
    except StopIteration:
        pass
 pop = list(filter(lambda i: ((i.recidivist == True and i.lifetime <= 730) or
                              i.lifetime > 730), list(filter(lambda x: x.score_valid, people))))
 recid = list(filter(lambda i: i.recidivist == True and i.lifetime <= 730, pop))
 rset = set(recid)
 surv = [i for i in pop if i not in rset]
 ```
 %% Cell type:code id: tags:
 ``` python
 print("All defendants")
 table(list(recid), list(surv))
 ```
 %% Output
    All defendants
               	Low	High
    Survived   	2681	1282	0.55
    Recidivated	1216	2035	0.45
    Total: 7214.00
    False positive rate: 32.35
    False negative rate: 37.40
    Specificity: 0.68
    Sensitivity: 0.63
    Prevalence: 0.45
    PPV: 0.61
    NPV: 0.69
    LR+: 1.94
    LR-: 0.55
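 %% Cell type:markdown id: tags:
 The `table` helper comes from the `truth_tables` module, whose source is not shown here. As a hedged sketch of what the printed figures mean, the function below (hypothetical name `confusion_metrics`) recomputes them from the four cell counts using the standard definitions. It assumes "High" is the positive prediction and "Recidivated" is the positive outcome, and it returns fractions rather than the percentages printed above.
 %% Cell type:code id: tags:
```python
def confusion_metrics(tn, fp, fn, tp):
    """Standard 2x2 table metrics.

    tn: survived, scored Low      fp: survived, scored High
    fn: recidivated, scored Low   tp: recidivated, scored High
    """
    total = tn + fp + fn + tp
    specificity = tn / (tn + fp)
    sensitivity = tp / (tp + fn)
    return {
        "total": total,
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
        "specificity": specificity,
        "sensitivity": sensitivity,
        "prevalence": (tp + fn) / total,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_plus": sensitivity / (1 - specificity),
        "lr_minus": (1 - sensitivity) / specificity,
    }


# Counts taken from the "All defendants" table above
all_defendants = confusion_metrics(tn=2681, fp=1282, fn=1216, tp=2035)
```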
 %% Cell type:code id: tags:
 ``` python
 print("Total pop: %i" % (2681 + 1282 + 1216 + 2035))
 ```
 %% Output
    Total pop: 7214
 %% Cell type:code id: tags:
 ``` python
 import statistics
 print("Average followup time %.2f (sd %.2f)" % (statistics.mean(map(lambda i: i.lifetime, pop)),
                                                statistics.stdev(map(lambda i: i.lifetime, pop))))
 print("Median followup time %i" % (statistics.median(map(lambda i: i.lifetime, pop))))
 ```
 %% Output
    Average followup time 622.87 (sd 392.19)
    Median followup time 766
 %% Cell type:markdown id: tags:
 Overall, the false positive rate is 32.35%.
 %% Cell type:code id: tags:
 ``` python
 print("Black defendants")
 is_afam = is_race("African-American")
 table(list(filter(is_afam, recid)), list(filter(is_afam, surv)))
 ```
 %% Output
    Black defendants
               	Low	High
    Survived   	990	805	0.49
    Recidivated	532	1369	0.51
    Total: 3696.00
    False positive rate: 44.85
    False negative rate: 27.99
    Specificity: 0.55
    Sensitivity: 0.72
    Prevalence: 0.51
    PPV: 0.63
    NPV: 0.65
    LR+: 1.61
    LR-: 0.51
 %% Cell type:markdown id: tags:
 That number is higher for African Americans at 44.85%.
 %% Cell type:code id: tags:
 ``` python
 print("White defendants")
 is_white = is_race("Caucasian")
 table(list(filter(is_white, recid)), list(filter(is_white, surv)))
 ```
 %% Output
    White defendants
               	Low	High
    Survived   	1139	349	0.61
    Recidivated	461	505	0.39
    Total: 2454.00
    False positive rate: 23.45
    False negative rate: 47.72
    Specificity: 0.77
    Sensitivity: 0.52
    Prevalence: 0.39
    PPV: 0.59
    NPV: 0.71
    LR+: 2.23
    LR-: 0.62
 %% Cell type:markdown id: tags:
 And lower for whites at 23.45%.
 %% Cell type:code id: tags:
 ``` python
 44.85 / 23.45
 ```
 %% Output
    1.9125799573560769
 %% Cell type:markdown id: tags:
 This means that, under COMPAS, black defendants are 91% more likely than white defendants to get a higher score and not go on to commit more crimes after two years.
 %% Cell type:markdown id: tags:
 COMPAS scores misclassify white reoffenders as low risk 70.4% more often than black reoffenders.
 %% Cell type:code id: tags:
 ``` python
 47.72 / 27.99
 ```
 %% Output
    1.7048946052161487
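 %% Cell type:markdown id: tags:
 The two ratios above divide rounded percentages. As a cross-check, the same ratios can be computed directly from the cell counts in the Black and White defendant tables. This is a minimal sketch; the helper name `rate_ratio` is hypothetical, and small differences in the last digits compared with the cells above come from rounding.
 %% Cell type:code id: tags:
```python
def rate_ratio(errors_a, total_a, errors_b, total_b):
    """Ratio of two error rates: (errors_a / total_a) / (errors_b / total_b)."""
    return (errors_a / total_a) / (errors_b / total_b)


# False positives: survived but scored High (Black vs. White defendants)
fpr_ratio = rate_ratio(805, 990 + 805, 349, 1139 + 349)

# False negatives: recidivated but scored Low (White vs. Black defendants)
fnr_ratio = rate_ratio(461, 461 + 505, 532, 532 + 1369)

print("FPR ratio (Black / White): %.2f" % fpr_ratio)
print("FNR ratio (White / Black): %.2f" % fnr_ratio)
```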
 %% Cell type:code id: tags:
 ``` python
 hightable(list(filter(is_white, recid)), list(filter(is_white, surv)))
 ```
 %% Output
               	Low	High
    Survived   	1407	81	0.61
    Recidivated	771	195	0.39
    Total: 2454.00
    False positive rate: 5.44
    False negative rate: 79.81
    Specificity: 0.95
    Sensitivity: 0.20
    Prevalence: 0.39
    PPV: 0.71
    NPV: 0.65
    LR+: 3.71
    LR-: 0.84
 %% Cell type:code id: tags:
 ``` python
 hightable(list(filter(is_afam, recid)), list(filter(is_afam, surv)))
 ```
 %% Output
               	Low	High
    Survived   	1511	284	0.49
    Recidivated	1160	741	0.51
    Total: 3696.00
    False positive rate: 15.82
    False negative rate: 61.02
    Specificity: 0.84
    Sensitivity: 0.39
    Prevalence: 0.51
    PPV: 0.72
    NPV: 0.57
    LR+: 2.46
    LR-: 0.72
 %% Cell type:markdown id: tags:
 ## Risk of Violent Recidivism
 Compas also offers a score that aims to measure a person's risk of violent recidivism, which has a similar overall accuracy to the Recidivism score.
 %% Cell type:code id: tags:
 ``` python
 vpeople = []
 with open("./cox-violent-parsed.csv") as f:
    reader = PeekyReader(DictReader(f))
    try:
        while True:
            p = Person(reader)
            if p.valid:
                vpeople.append(p)
    except StopIteration:
        pass
 vpop = list(filter(lambda i: ((i.violent_recidivist == True and i.lifetime <= 730) or
                              i.lifetime > 730), list(filter(lambda x: x.vscore_valid, vpeople))))
 vrecid = list(filter(lambda i: i.violent_recidivist == True and i.lifetime <= 730, vpeople))
 vrset = set(vrecid)
 vsurv = [i for i in vpop if i not in vrset]
 ```
 %% Cell type:code id: tags:
 ``` python
 print("All defendants")
 vtable(list(vrecid), list(vsurv))
 ```
 %% Output
    All defendants
               	Low	High
    Survived   	4121	1597	0.89
    Recidivated	347	389	0.11
    Total: 6454.00
    False positive rate: 27.93
    False negative rate: 47.15
    Specificity: 0.72
    Sensitivity: 0.53
    Prevalence: 0.11
    PPV: 0.20
    NPV: 0.92
    LR+: 1.89
    LR-: 0.65
 %% Cell type:markdown id: tags:
 Even more so for Black defendants.
 %% Cell type:code id: tags:
 ``` python
 print("Black defendants")
 is_afam = is_race("African-American")
 vtable(list(filter(is_afam, vrecid)), list(filter(is_afam, vsurv)))
 ```
 %% Output
    Black defendants
               	Low	High
    Survived   	1692	1043	0.86
    Recidivated	170	273	0.14
    Total: 3178.00
    False positive rate: 38.14
    False negative rate: 38.37
    Specificity: 0.62
    Sensitivity: 0.62
    Prevalence: 0.14
    PPV: 0.21
    NPV: 0.91
    LR+: 1.62
    LR-: 0.62
 %% Cell type:code id: tags:
 ``` python
 print("White defendants")
 is_white = is_race("Caucasian")
 vtable(list(filter(is_white, vrecid)), list(filter(is_white, vsurv)))
 ```
 %% Output
    White defendants
               	Low	High
    Survived   	1679	380	0.91
    Recidivated	129	77	0.09
    Total: 2265.00
    False positive rate: 18.46
    False negative rate: 62.62
    Specificity: 0.82
    Sensitivity: 0.37
    Prevalence: 0.09
    PPV: 0.17
    NPV: 0.93
    LR+: 2.03
    LR-: 0.77
 %% Cell type:markdown id: tags:
 Black defendants are twice as likely to be false positives for a Higher violent score than white defendants.
 %% Cell type:code id: tags:
 ``` python
 38.14 / 18.46
 ```
 %% Output
    2.066088840736728
 %% Cell type:markdown id: tags:
 White defendants are 63% more likely to get a lower score and commit another crime than Black defendants.
 %% Cell type:code id: tags:
 ``` python
 62.62 / 38.37
 ```
 %% Output
    1.63200416992442
 %% Cell type:markdown id: tags:
 ## Gender differences in Compas scores
 In terms of underlying recidivism rates, we can look at gender-specific Kaplan-Meier estimates. There is a striking difference between women and men.
 %% Cell type:code id: tags:
 ``` python
 %%R
 female <- filter(data, sex == "Female")
 male   <- filter(data, sex == "Male")
 male_fit <- survfit(f, data=male)
 female_fit <- survfit(f, data=female)
 ```
 %% Cell type:code id: tags:
 ``` python
 %%R
 summary(male_fit, times=c(730))
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R
 summary(female_fit, times=c(730))
 ```
 %% Output
 %% Cell type:code id: tags:
 ``` python
 %%R -w 900 -h 363 -u px
 grid.arrange(plotty(female_fit, "Female"), plotty(male_fit, "Male"),ncol=2)
 ```
 %% Output
 %% Cell type:markdown id: tags:
 As these plots show, the Compas score treats a High risk woman the same as a Medium risk man.
 %% Cell type:code id: tags:
 ``` python
 ```

--- a/analysis_and_scripts/tree.dot
+++ b/analysis_and_scripts/tree.dot
--- a/analysis_and_scripts/tree.png
+++ b/analysis_and_scripts/tree.png
--- a/figures/valikoitumis_iso.jpg
+++ b/figures/valikoitumis_iso.jpg
--- a/figures/valikoitumisharha.png
+++ b/figures/valikoitumisharha.png
--- a/figures/valikoitumisharha_kaaavio.drawio
+++ b/figures/valikoitumisharha_kaaavio.drawio
+<mxfile modified="2019-04-21T15:58:55.500Z" host="www.draw.io" agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:66.0) Gecko/20100101 Firefox/66.0" etag="NAxnU8qYrcP3EQhgaOI9" version="10.6.3" type="device"><diagram id="3qIiofentZYx9hMsOPwP" name="Page-1">7VpZc+MoEP41qpp9yJaErvjRcTKT2a05qrJXHlmLsRhLwoOQj/31CxJIQsfasWXF2fJLLBpoaPi+7gZi2LN4+4HCVfiJBCgygBlsDfveAMAyLZP/CMmukLieWwgWFAeyUSV4wv8g1VNKMxygVGvICIkYXunCOUkSNGeaDFJKNnqzbyTSR13BBWoJnuYwakv/xAELC+kt8Cv5I8KLUI1seZOiJoaqsbQkDWFANjWR/WDYM0oIK77i7QxFYvHUuhT93vfUlhOjKGGHdPjds25+QRb+uHx+frynn39zl3/cSC1rGGXSYAN4Edd3941wtXzWbCeXwvuREVVxk+YbNeUNLH+1rSr510L+5lr+VoKvxsw2pk7xl4mfO2+Z8qmjJf5eSBNDWFD0w6rfdyXhhuFKuxmiZImjQo9U3ewqbK+prCtYknSJ1lCsX8e80q6JMCVJs9IoXNNdE/dMuVwT2tGwq3OxBUoMtN0AlGRJgMTeWrx6E2KGnlZwLmo3nIpcFrI4ktVC03sY40iwcEbiVcYQFSDlfKVJqb4OJoUMRBna1kQSXB8QiRGjO95E1toK6JLp1q0sbyreAEfKwhpnPCmDkqqLUnWFZv4hAf0CcIMzgfuxhr1O0OIeBLAsIp3g2p0KrjeKGeABDTO2BVqYsdwOzDjnwoxzxcyFY8ZRGcTFYMbtwExjFVESTEU2wkvzCKYpnusLh7aY/cW/zZ9dWXqu1dwL001V2KlCwidfdPJdVX6uV1b98pLq2LsJKcnoHO1PFxikC8T2e14UaMlVe0trW9a1Y0pGUQQZXuspWdc2yhG+EpxzVUWmiR6ZWkgo7Ja96ilUU5HTUDRpKCoWpqUoR1Vp9vFA814daODCgOZcNtDcY4HW9HHmuEDze6NggNcHBUGPB8E8yDTi4KedjIEM4QQl/Rl4GZ3yEfWYdeokarH0BC3v9CPBCZp6w7aZqzOtn+pnhwEM70kEtKVuOBYenZnuSlJGyRLNSEQolyQkQWImOIoaIhjhRSL8kTjtcfmdiPWYH7GnsiLGQSCG6cwuqvzDlIbKSwLLe6UkFbgHJhxN9g+WcNyejZ4vTnb7FP2KUXQwxfUEdAzmn2zfW3Mh5kgu5JCtvFjP8p+epOl6BjnK9OQLr+ZZJoN4lhZoviQJTlkWF/5gKIIPQMo2I3fHMHKciH4N54OEc1snne13hHPTH5F06gJjaNY9yLhLruwbIBhe2TfIK4Ht72efZY/JPvtM7BM8COEaYsayOt6GznPfMBEto/t9rhx8BZPjR2cQ15aiHLrn4tzsmWs+0c/TPWwvZjrY+l8d1Gs5qNbzQpeDGjUntwZ9XwBH3fuOdO0rDdt77SuPKZdy7asSOAUZdUH00mtf4OrYc0BD0Zmvfa1BHxj8/wPS1NZeCtT8yTBQs4GO2dGh1nXTMO5b1khIcw5Fmn1RSHPdxjHZOxJprms3AupQb1m8WP37XNG8+idE++Ff</diagram></mxfile>
\ No newline at end of file
--- a/viitteet.bib
+++ b/viitteet.bib
@@ -140,4 +140,12 @@
  year = "2016",
  language={finnish},
  note = {viitattu 5.4.2019}
+} 
+@article{madras18,
+  title={Fairness Through Causal Awareness: Learning Latent-Variable Models for Biased Data},
+  author={Madras, David and Creager, Elliot and Pitassi, Toniann and Zemel, Richard},
+  journal={arXiv preprint arXiv:1809.02519},
+  year={2018},
+  language={finnish}
 } 
\ No newline at end of file