Causal approach to selective labels
Repository for the research project "Causal approach to selective labels", together with the required data sets. This repository was originally forked from the ProPublica repository for the COMPAS analysis.
Structure of the repository
The contents of the repository are divided into four main folders:
- analysis_and_scripts contains the scripts and notebooks for performing the analysis. The folder also contains the file notes.tex, which documents much of the research done.
- data contains the original data sets from the ProPublica analysis (see below for more information).
- figures contains the figures used in the notes file mentioned above. This folder is also used for the figures for the BSocSc thesis.
- paper contains the draft for a research publication.
Original README:
                +---------+
Didn't Reoffend |____|____|
Reoffended      |    |    |
                +---------+
This repository contains a Jupyter notebook and data for the ProPublica story "Machine Bias."
Story:
https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing/
Methodology:
https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm/
Notebook (you'll probably want to follow along in the methodology):
https://github.com/propublica/compas-analysis/blob/master/Compas%20Analysis.ipynb
Main Dataset:
compas.db - a sqlite3 database containing criminal history, jail and prison time, demographics and COMPAS risk scores for defendants from Broward County.
Other files as needed for the analysis.
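
As a starting point for exploring the database, the following minimal sketch lists the tables it contains using Python's standard sqlite3 module. The path in DB_PATH and the helper name list_tables are assumptions made for illustration; neither comes from the repository, and the README does not document the table names, so the sketch discovers them rather than assuming any.

```python
import os
import sqlite3
from contextlib import closing


def list_tables(db_path):
    """Return the names of all tables in the SQLite database at db_path."""
    # closing() ensures the connection is closed even if the query fails.
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


# Assumed location of the database; adjust to wherever compas.db
# lives in your checkout.
DB_PATH = "compas.db"

# Only connect if the file exists, because sqlite3.connect would
# otherwise create a new, empty database at that path.
if os.path.exists(DB_PATH):
    for table in list_tables(DB_PATH):
        print(table)
```

Once the table names are known, individual tables can be inspected with ordinary SELECT queries through the same connection pattern.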