class: center, middle, inverse, title-slide

# Orientation To Machine Learning

### Jacob M. Montgomery
Washington University in St. Louis
Department of Political Science
### 2020

---

## Orientation for today

1. We will not take the whole class, so stay with me for a bit.

2. I will then give you the midterm and talk through all of the rules.

---

## Overview of machine learning

In the broadest overview, data science is a complex field that mixes together several inter-related tasks.

1. Data collection. Intentional and observational.

--

2. Algorithms for data reduction.
    + "Supervised"
    + "Unsupervised"
    + Some gray areas in between

--

3. Inference (probability theory).
    + Frequentist
    + Maximum likelihood
    + Bayesian
    + Non-parametric

--

4. Prediction.
    + Within the data you have.
    + For new data observed from the same process.
    + For new data collected under new/unobserved circumstances.

---

## Start with the end goal, and think backwards.

- There is no way to do this "right." But there are better and worse ways to go about this.

- The appropriate tools/approaches are going to change depending on what you are actually trying to do.

---

## Example 1:

> Question: What percent of the Facebook ads that I have collected ask for donations/contributions?

A reasonable approach:

- Randomly sample from the set of Facebook ads I have.
- Label that subsample.
- Use an algorithm that can:
    + Identify important features (e.g., "Donate now!")
    + "Learn" how they are associated with the outcome
- Use the resulting model to label/predict the rest of the sample.
    + Might also draw on some branch of statistics to include estimates of uncertainty.

---

## Example 2:

> Question: What percent of Facebook ads ask for donations/contributions?

- Draw a sample from the broader set of relevant Facebook ads. Do this so you know how "representative" it may be.
    + Contrary to popular belief, *more* data is not always as valuable as *better* data in this setting.
- Label that subsample.
- Use an algorithm that can:
    + Identify important features (e.g., "Donate now!")
    + "Learn" how they are associated with the outcome
- Use knowledge of how the sample was constructed, combined with probability theory, to draw conclusions about the composition of the remaining (unseen) ads.

---

## Example 3:

> Question: What percent of Facebook ads will ask for donations and contributions if Facebook starts banning microtargeting by political campaigns?

- Now you are using data you have to draw conclusions about data you do not have.
- And the question is fundamentally unanswerable from the data alone.
    + We need a theory of how this system works.
    + We are in the end going to be basing our conclusion on some sort of causal model.
    + We want to know how variables are related to each other, including in situations we have not yet observed.
- There are several different approaches:
    + Survey experiments of ad purchasers.
    + Build a theory of how purchasers are targeting (based on your data) and hypothesize how they will respond to fewer options.
    + Intervene in one geographic area and see how the system responds.
- Note that none of these involves fitting a model to some dataset and building an atheoretic prediction model.

---

## Discussion

1. The tough issue is that if you use the approach from Example 1 to answer the question in Example 3, you are much more likely to come up with the wrong conclusion.

2. Machine learning/statistics have a ton of options for Example 1 and some very good options for Example 2 (provided you have the right kind of sample).

3. But most CEOs, policymakers, and scientists are in the end interested in questions like Example 3. And here is where options are thinner, mistakes are easy, and no amount of technical skill can replace knowledge, theory, and research design.

---

## Overview of the next part of this class

- We are going to take a running glance through these issues over the next few weeks.
- The goal is to get you ready to ask and answer a reasonable question for your final project.
- We will focus on the Baumer, Kaplan, Horton book.
- Be aware that some of these lectures could be entire classes or even curricula.

--

1. Basics of probability theory and relationship to statistical learning.
2. Overview of supervised machine learning.
3. Clustering, data reduction, and measurement.
4. Causality in experiments.
5. Causality in observational settings.
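
---

## A rough sketch of Example 1 in code

The workflow from Example 1 (sample, label, learn, predict the rest) can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the toy ads are invented, `hand_label` is a hypothetical stand-in for a human coder, and a small naive Bayes word model is just one of many algorithms that could fill the "learn" step. It is not the method used in the course materials, and it only speaks to the ads already collected, not to the questions in Examples 2 or 3.

```python
import math
import random
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and pull out runs of letters, digits, or apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def train_naive_bayes(labeled):
    """Count documents and words per class from (text, is_donation) pairs."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, label in labeled:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return True if the donation class gets the higher smoothed log score."""
    word_counts, doc_counts, vocab = model
    n_docs = sum(doc_counts.values())
    scores = {}
    for label in (True, False):
        # Add-one smoothing on the class prior and on every word probability.
        score = math.log((doc_counts[label] + 1) / (n_docs + 2))
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word in vocab:  # words never seen in the labeled sample are skipped
                score += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocab))
                )
        scores[label] = score
    return scores[True] > scores[False]


def estimate_donation_share(ads, hand_label, n_labeled, seed=0):
    """Label a random subsample by hand, model-label the rest, return the percent."""
    rng = random.Random(seed)
    labeled_idx = set(rng.sample(range(len(ads)), n_labeled))
    labeled = [(ads[i], hand_label(ads[i])) for i in sorted(labeled_idx)]
    model = train_naive_bayes(labeled)
    labels = [
        hand_label(ad) if i in labeled_idx else predict(model, ad)
        for i, ad in enumerate(ads)
    ]
    return 100 * sum(labels) / len(labels), labels


# Invented toy data: half of these ads ask for money.
ads = (
    ["Donate now to keep us on the air!"] * 30
    + ["Chip in $5 and contribute before the deadline."] * 20
    + ["Vote early on Tuesday."] * 30
    + ["Watch our new video about the debate."] * 20
)


def hand_label(ad):
    """Hypothetical stand-in for a human coder reading the ad."""
    return "donate" in ad.lower() or "contribute" in ad.lower()


share, labels = estimate_donation_share(ads, hand_label, n_labeled=25)
print(f"Estimated share asking for donations: {share:.1f}%")
```

Note what the sketch leaves out: it reports a single number with no estimate of uncertainty, and nothing in it tells you how the collected ads relate to the ads you never saw.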