Hello! 👋
I study how children with motor disorders learn to speak and communicate.
Bayesian stats let me handle repeated-measures, time-series data from heterogeneous populations.
To get my cool model to work, I needed diagnostics…
Try to predict race time from race distance and hill height.
stan_glm(time_min ~ distance_km, data = races, ...)
races
#> # A tibble: 90 x 4
#>    race                   distance_km climb_km time_min
#>    <chr>                        <dbl>    <dbl>    <dbl>
#>  1 Alva Games Hill Race           2.5    0.385     18.6
#>  2 Aonach Mor Uphill Race         4      0.61      22.2
#>  3 Arrochar Alps                 25      2.4      188.
#>  4 Beinn Lora Hill Race           5      0.34      26.8
#>  5 Ben Aigan Hill Race            6.4    0.326     28.5
#>  6 Ben Lomond Hill Race          12.6    0.98      62.3
#>  7 Ben Nevis Race                14      1.36      85.6
#>  8 Ben Rinnes Hill Race          22.4    1.57     117
#>  9 Ben Sheann Hill Race           4      0.426     22.9
#> 10 Bennachie Hill Race           12.8    0.55      55.2
#> # ... with 80 more rows
Classical regression: line of best fit (maximum likelihood)
Bayesian regression: all plausible lines given data and data-generating process (posterior distribution)
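To make the contrast concrete, here is a minimal sketch, not the model from this talk. It assumes the rstanarm package is installed, uses only the ten rows printed above (as displayed, so rounded) in a hypothetical data frame named `races_head`, and relies on rstanarm's default priors. The names `fit_ml`, `fit_bayes`, and `draws` are illustrative.

```r
library(rstanarm)

# Hypothetical subset: the ten rows shown in the printout, values as displayed.
races_head <- data.frame(
  distance_km = c(2.5, 4, 25, 5, 6.4, 12.6, 14, 22.4, 4, 12.8),
  time_min    = c(18.6, 22.2, 188, 26.8, 28.5, 62.3, 85.6, 117, 22.9, 55.2)
)

# Classical regression: one intercept and one slope, i.e. a single line.
fit_ml <- lm(time_min ~ distance_km, data = races_head)
coef(fit_ml)

# Bayesian regression: a posterior distribution over lines.
# Assumption: default priors; `seed` and `refresh` only control
# reproducibility and console output.
fit_bayes <- stan_glm(
  time_min ~ distance_km,
  data = races_head,
  family = gaussian(),
  seed = 1,
  refresh = 0
)

# Each row of `draws` is one plausible line (intercept, slope) plus sigma.
draws <- as.matrix(fit_bayes)
head(draws)
```

Each row of `draws` could be drawn as its own regression line. The spread of those lines is what the posterior summaries describe.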
mcmc_intervals_data(m1_draws) %>%
  glimpse()
#> Observations: 3
#> Variables: 9
#> $ parameter <fct> (Intercept), distance_km, sigma
#> $ outer_width <dbl> 0.9, 0.9, 0.9
#> $ inner_width <dbl> 0.5, 0.5, 0.5
#> $ point_est <chr> "median", "median", "median"
#> $ ll <dbl> -11.001341, 5.503966, 11.515591
#> $ l <dbl> -8.614441, 5.688012, 12.376087
#> $ m <dbl> -6.994535, 5.805850, 13.016453
#> $ h <dbl> -5.393244, 5.921104, 13.701015
#> $ hh <dbl> -3.084268, 6.105477, 14.862735
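The columns `ll`, `l`, `m`, `h`, `hh` are quantiles of the posterior draws: the ends of the 90% outer interval, the ends of the 50% inner interval, and the median. Here is a rough base-R sketch of that summary. It uses simulated stand-in draws (the matrix `fake_draws` and its values are made up for illustration and are not the `m1_draws` from this talk).

```r
set.seed(1)

# Made-up stand-in for a draws matrix: one column per parameter.
fake_draws <- cbind(
  "(Intercept)" = rnorm(4000, mean = 0, sd = 2),
  "distance_km" = rnorm(4000, mean = 1, sd = 0.5),
  "sigma"       = abs(rnorm(4000, mean = 3, sd = 1))
)

outer_width <- 0.9
inner_width <- 0.5

# Quantile probabilities implied by the two interval widths.
probs <- c(
  0.5 - outer_width / 2,  # ll
  0.5 - inner_width / 2,  # l
  0.5,                    # m (median)
  0.5 + inner_width / 2,  # h
  0.5 + outer_width / 2   # hh
)

# One row per parameter, one column per quantile.
q <- t(apply(fake_draws, 2, quantile, probs = probs))
colnames(q) <- c("ll", "l", "m", "h", "hh")

intervals <- data.frame(parameter = rownames(q), q, row.names = NULL)
intervals
```

Having the numbers in a plain data frame like this makes it straightforward to build a custom plot on top of them.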
Plus dozens more plots