What's the Risk? | Ben Cunningham

What's the Risk?

Written by Ben Cunningham · on November 26, 2016

I’ve been using R for a little over two years now. One of the most redeeming things about learning it has always been the little eureka moments along the way where I’ve used it to put some received wisdom to the test. Sometimes just stemming from daydreams of my undergrad stats lectures, I’ll find myself wondering to what degree I really understand XYZ (and now that I work in industry, I’ll also wonder if I understand it enough to explain to clients and colleagues). But previously without a tool, I never knew how to “prove” these things to myself.

One bit of received wisdom I’ve been thinking about a lot lately is my implanted intuition for analyzing confidence intervals (and maybe more generally, significance). It was drilled into my head as a student that population parameters are fixed, while it’s statistics that “move”, failing to capture the true population characteristic at the rate of α. Good instructors (and being shown this xkcd strip endlessly every syllabus week) made the concept easy to grasp, but it wasn’t until recently that I realized I hadn’t ever seen it plainly in action. With R, it just dawned on me, I could pave my own path.

So in my newfound spirit of putting conceptual faces to names, here’s a mini-experiment I designed in R to test this out:

library(broom)
library(dplyr)
library(ggplot2)
library(purrr)
library(tidyr)

sample_t_test <- function(x, mu, size, conf.level) {
  sample(x, size) %>% t.test(mu = mu, conf.level = conf.level)
}

set.seed(1126)

p <- rnorm(1e6)
conf.level <- 0.95
mu <- mean(p)

df <-
  data_frame(
    iter = 1:50,
    test = map(iter, function(x) {
      sample_t_test(p, mu, 500, conf.level) %>% glance()
    })
  ) %>%
  unnest(test) %>%
  mutate(
    type_i = (p.value < (1 - conf.level))
  )

In simple English, what I did was simulate a population of a million observations and perform 50 t-tests (sample n = 500) where the null hypothesis is that μ equals, well… μ (the known mean of our population, calculated ahead of time as mean(p)). If we believe the intuition handed down by my beloved stats profs, at our chosen confidence level of 95%, we should expect around two or three tests to have not captured the population mean (on their own, these would ultimately mislead us to reject H0 and commit a Type I error). Let’s visualize the results and see if that’s the case:

ggplot(df, aes(x = iter)) +
  geom_hline(aes(yintercept = mean(p)), col = "black") +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high, col = type_i)) +
  geom_point(aes(y = estimate, col = type_i)) +
  scale_color_manual(values=c("grey", "red")) +
  labs(x = "Test Iteration", y = "Estimated Mean (95% Confidence Bars)") +
  theme(legend.position = "none")

plot of chunk unnamed-chunk-2

And there we go. I’m not sure how helpful this is to others, but (for as obvious as it may sound to say) this is just another reminder to me that R is a convenient vehicle for fiddling with the “basic” concepts that I accept but don’t necessarily understand.