Data for good talk at Columbia Data Science Institute

(Note: links don’t work in this preview; click through to the post to view them.) I’m happy to be speaking at 1pm EST today at Columbia University on the topics of causal inference and selection bias in algorithmic fairness. I believe video will be available at the webinar link, and here are my slides. The talk is based on work described in this survey with my coauthors Matt Kusner, Chris Russell, and Ricardo Silva. See here for the video of Matt’s oral presentation of our first paper in this line of work at NIPS 2017. [Read More]

A conditional approach to inference after model selection

Model selection can invalidate inferences such as significance tests, but statisticians have recently made progress developing methods to adjust for this bias. One approach uses conditional probability: inferences are adjusted by conditioning on the event that the chosen model was selected. This post motivates the conditional approach with a simple screening rule example and introduces the selectiveInference R package, which can compute adjusted significance tests after popular model selection methods like forward stepwise and the lasso. [Read More]
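
To make the conditional idea concrete, here is a minimal simulation sketch in Python (rather than the selectiveInference R package itself) of a screening rule of my own choosing: keep a z-statistic only if it exceeds a cutoff in absolute value, then compare the usual p-value with one computed from the null distribution truncated to that selection event. The cutoff, simulation sizes, and variable names are illustrative assumptions, not details from the post.

```python
# A minimal sketch of conditional (post-selection) p-values for a simple screening rule.
# Assumptions: z-statistics are N(0, 1) under the null, and a variable is "selected"
# when |z| exceeds a cutoff c; these choices are illustrative, not from the post.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n_sim, p, c = 2000, 50, 2.0          # datasets, variables per dataset, screening cutoff

naive, conditional = [], []
for _ in range(n_sim):
    z = rng.standard_normal(p)       # global null: every true effect is zero
    for zi in z[np.abs(z) > c]:      # screening rule: report only the large |z|'s
        tail = norm.sf(abs(zi))      # P(Z >= |z_i|) for a standard normal
        naive.append(2 * tail)                 # usual two-sided p-value, ignores selection
        conditional.append(tail / norm.sf(c))  # p-value conditional on the event |Z| > c

# A valid test rejects about 5% of the time under the null; the naive test rejects
# nearly always among selected variables, while the conditional test stays calibrated.
print("naive rejection rate at 0.05:      ", np.mean(np.array(naive) < 0.05))
print("conditional rejection rate at 0.05:", np.mean(np.array(conditional) < 0.05))
```

The conditional p-value here is just the naive tail probability rescaled by the probability of surviving the screen, which is what makes it uniform among the variables that are selected.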

Model selection bias invalidates significance tests

People often do regression model selection, either by hand or using algorithms like forward stepwise or the lasso. Sometimes they also report significance tests for the variables in the chosen model. After all, a significant p-value means they’ve found something real. But there’s a problem: the reason for that significant p-value may just be something called model selection bias. This bias can invalidate inferences done after model selection, and may be one of the contributors to the reproducibility crisis in science. Adjusting inference methods to account for model selection is an area of ongoing research where I have done some work. [Read More]
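
As a rough illustration of the bias (a toy setup I chose, not the post’s own example), the following Python sketch selects the single pure-noise predictor most correlated with a pure-noise response and then tests it as if it had been chosen in advance; the false positive rate ends up far above the nominal 5%.

```python
# A rough sketch of model selection bias: pick the "best" of several pure-noise
# predictors, then run an ordinary significance test on it as if it were pre-specified.
# The sizes (n, p) and the best-single-variable rule are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, p, n_sim, alpha = 100, 20, 2000, 0.05

false_positives = 0
for _ in range(n_sim):
    X = rng.standard_normal((n, p))
    y = rng.standard_normal(n)                  # y is independent of every column of X
    best = np.argmax(np.abs(X.T @ y))           # select the most correlated predictor
    _, p_value = stats.pearsonr(X[:, best], y)  # naive test, ignoring the selection step
    false_positives += p_value < alpha

# With no real signal, a valid test should reject about 5% of the time;
# selecting the winner first inflates the rate well beyond that.
print("rejection rate after selection:", false_positives / n_sim)
```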