From 3137ed44a059938d92d99cbc2028bbf763b1bc0d Mon Sep 17 00:00:00 2001
From: Riku-Laine <28960190+Riku-Laine@users.noreply.github.com>
Date: Mon, 17 Jun 2019 14:56:24 +0300
Subject: [PATCH] Typo correction and abstract additionsd

---
 analysis_and_scripts/notes.tex | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/analysis_and_scripts/notes.tex b/analysis_and_scripts/notes.tex
index 9896a64..ec0d21e 100644
--- a/analysis_and_scripts/notes.tex
+++ b/analysis_and_scripts/notes.tex
@@ -71,7 +71,7 @@
 \graphicspath{ {../figures/} }
 \title{Notes}
-\author{RL, 14 June 2019}
+\author{RL, 17 June 2019}
 %\date{} % Activate to display a given date or no date
 \begin{document}
@@ -83,7 +83,7 @@
 \tableofcontents
 \begin{abstract}
-This document presents the implementations of RL in pseudocode level. First, I present the nomenclature used in these notes. Then I proceed to give my personal views and comments on the motivation behind Selective labels paper. In the following sections, I present the data generating algorithms and algorithms for obtaining failure rates using different methods. In the end I present some some results that I was asked to present in the meeting Friday $7^{th}$.
+This document presents the implementations of RL in pseudocode level. First, I present the nomenclature used in these notes. Then I proceed to give my personal views and comments on the motivation behind Selective labels paper. In chapter 2, I define the framework for this problem and give the required definitions. In the following sections, I present the data generating algorithms and algorithms for obtaining failure rates using different methods. Finally in the last section, I present results using multiple different settings.
 \end{abstract}
 \section*{Terms and abbreviations}
@@ -128,7 +128,7 @@ If $c$ is defined so that the ratio of positive decisions to all decisions will
 Finally, chapter from Lakkaraju \cite{lakkaraju17} about counterfactual inference, see references from their paper [sic]:
 \begin{quote}
-Counterfactual inference. Counterfactual inference techniques have been used extensively to estimate treatment effects in observational studies. These techniques have found applications in a variety of fields such as machine learning, epidemiology, and sociology [3, 8–10, 30, 34]. Along the lines of Johansson et al. [16], counterfactual inference techniques can be broadly categorized as: (1) parametric methods which model the relationship between observed features, treatments, and outcomes. Examples include any type of regression model such as linear and logistic regression, random forests and regression trees [12, 33, 42]. (2) non-parametric methods such as propensity score matching, nearest-neighbor matching, which do not explicitly model the relationship between observed features, treatments, and outcomes [4, 15, 35, 36, 41]. (3) doubly robust methods which combine the two aforementioned classes of techniques typically via a propensity score weighted regression [5, 10]. The effectiveness of parametric and non-parametric methods depends on the postulated regression model and the postulated propensity score model respectively. If the postulated models are not identical to the true models, then these techniques result in biased estimates of outcomes. Doubly robust methods require only one of the postulated models to be identical to the true model in order to generate unbiased estimates. However, due to the presence of unobservables, we cannot guarantee that either of the postulated models will be identical to the true models.
+Counterfactual inference. Counterfactual inference techniques have been used extensively to estimate treatment effects in observational studies. These techniques have found applications in a variety of fields such as machine learning, epidemiology, and sociology [3, 8--10, 30, 34]. Along the lines of Johansson et al. [16], counterfactual inference techniques can be broadly categorized as: (1) parametric methods which model the relationship between observed features, treatments, and outcomes. Examples include any type of regression model such as linear and logistic regression, random forests and regression trees [12, 33, 42]. (2) non-parametric methods such as propensity score matching, nearest-neighbor matching, which do not explicitly model the relationship between observed features, treatments, and outcomes [4, 15, 35, 36, 41]. (3) doubly robust methods which combine the two aforementioned classes of techniques typically via a propensity score weighted regression [5, 10]. The effectiveness of parametric and non-parametric methods depends on the postulated regression model and the postulated propensity score model respectively. If the postulated models are not identical to the true models, then these techniques result in biased estimates of outcomes. Doubly robust methods require only one of the postulated models to be identical to the true model in order to generate unbiased estimates. However, due to the presence of unobservables, we cannot guarantee that either of the postulated models will be identical to the true models.
 \end{quote}
 \section{Framework definition -- 13 June discussion} \label{sec:framework}
--
GitLab