# *Causal approach to selective labels*

Repository for the research project *Causal approach to selective labels*, together with the required data sets. This repository was originally forked from [the ProPublica repository for COMPAS analysis](https://github.com/propublica/compas-analysis).

## Structure of the repository

The contents of the repository are divided into four main folders:

* `analysis_and_scripts` contains the scripts and notebooks for performing the analysis. The folder also contains the file `notes.tex`, which documents much of the research done.
* `data` contains the original data sets from the ProPublica analysis (see below for more information).
* `figures` contains the figures used in the notes file mentioned above. This folder is also used for the figures of the BSocSc thesis.
* `paper` contains the draft of a research publication.

Original README:

```
                      Low  High
                    +---------+
    Didn't Reoffend |____|____|
    Reoffended      |    |    |
                    +---------+

This repository contains a Jupyter notebook and data for the ProPublica story "Machine Bias."

Story:

https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing/

Methodology:

https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm/

Notebook (you'll probably want to follow along in the methodology):

https://github.com/propublica/compas-analysis/blob/master/Compas%20Analysis.ipynb

Main Dataset:

compas.db - a sqlite3 database containing criminal history, jail and prison time, demographics and COMPAS risk scores for defendants from Broward County.

Other files as needed for the analysis.