Create an ROC curve from observed data

Usage

compute_empirical_roc(
  data,
  response,
  predictor,
  direction = "auto",
  best_weights = c(1, 0.5),
  ...
)

Arguments

data

a dataframe containing the response (grouping) variable and the predictor variable

response

a bare column name with the group status (control vs. cases)

predictor

a bare column name with the predictor to use for classification

direction

direction to pass to pROC::roc(). Defaults to "auto".

best_weights

weights for computing the best ROC curve points. Defaults to c(1, 0.5), which are the defaults used by pROC::coords().

...

additional arguments passed to pROC::roc().
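
As a rough illustration of what the best_weights argument controls, the sketch below reproduces the weighting documented for pROC::coords(), where the two weights are the relative cost of a false negative and the prevalence. It assumes best_weights is forwarded unchanged as best.weights; the helper best_weight_ratio() and the sensitivity and specificity values are made up for the example.

```r
# Ratio r that pROC::coords() documents for its "best" criteria,
# given best.weights = c(cost, prevalence).
# best_weight_ratio() is a hypothetical helper, not part of any package.
best_weight_ratio <- function(best_weights = c(1, 0.5)) {
  cost <- best_weights[1]
  prevalence <- best_weights[2]
  (1 - prevalence) / (cost * prevalence)
}

# Made-up ROC coordinates, one pair per candidate threshold
sens <- c(1.00, 0.90, 0.70, 0.40, 0.00)
spec <- c(0.00, 0.60, 0.85, 0.95, 1.00)

r <- best_weight_ratio(c(1, 0.5))

# Youden criterion: maximise sensitivity + r * specificity
youden <- sens + r * spec

# Closest-to-top-left criterion: minimise the weighted squared distance
closest_topleft <- (1 - sens)^2 + r * (1 - spec)^2

best_youden_index <- which.max(youden)
best_topleft_index <- which.min(closest_topleft)
```

With the default weights the ratio is 1, so both criteria reduce to their unweighted forms.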

Value

a new dataframe of ROC coordinates with columns for the predictor variable, .sensitivities, .specificities, .auc, .direction, .controls, .cases, .n_controls, .n_cases, .is_best_youden, and .is_best_closest_topleft.
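
To give a feel for what several of these columns hold, here is a minimal base R sketch that builds a similar table by hand. It is not this function's implementation (which relies on pROC::roc()), it covers only a subset of the columns, and the data and the names scores and is_case are invented for illustration.

```r
# Hypothetical data: higher scores are more typical of cases
scores <- c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8)
is_case <- c(FALSE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE)

# Classify an observation as a case when its score is at or above
# each observed value, then score that rule against the true groups
thresholds <- sort(unique(scores))
sens <- vapply(thresholds, function(t) mean(scores[is_case] >= t), numeric(1))
spec <- vapply(thresholds, function(t) mean(scores[!is_case] < t), numeric(1))

# AUC as the probability that a random case outscores a random control
# (ties count as one half)
diffs <- outer(scores[is_case], scores[!is_case], "-")
auc <- mean((diffs > 0) + 0.5 * (diffs == 0))

roc_df <- data.frame(
  score = thresholds,
  .sensitivities = sens,
  .specificities = spec,
  .auc = auc,
  .n_controls = sum(!is_case),
  .n_cases = sum(is_case)
)

# Flag every row that attains the maximum (unweighted) Youden index
youden <- roc_df$.sensitivities + roc_df$.specificities
roc_df$.is_best_youden <- youden == max(youden)
```

Constant values such as the AUC and the group sizes are repeated on every row, which keeps the result a single flat dataframe.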

Examples

set.seed(100)
x1 <- rnorm(100, 4, 1)
x2 <- rnorm(100, 2, .5)
both <- c(x1, x2)
steps <- seq(min(both), max(both), length.out = 200)
d1 <- dnorm(steps, mean(x1), sd(x1))
d2 <- dnorm(steps, mean(x2), sd(x2))
data <- tibble::tibble(
  y = steps,
  d1 = d1,
  d2 = d2,
  outcome = rbinom(200, 1, prob = 1 - (d1 / (d1 + d2))),
  group = ifelse(outcome, "case", "control")
)

# get an ROC on the fake data
compute_empirical_roc(data, outcome, y)
#> Setting levels: control = 0, case = 1
#> Setting direction: controls > cases
# this guesses the cases and controls from the group names and gets them wrong
compute_empirical_roc(data, group, y)
#> Setting levels: control = case, case = control
#> Setting direction: controls < cases
# better
compute_empirical_roc(data, group, y, levels = c("control", "case"))
#> Setting direction: controls > cases
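
The wrong guess in the second call follows from how pROC::roc() picks its default levels: it takes them from the response converted to a factor and treats the first level as the control group. A small base R sketch with a made-up group vector shows the ordering:

```r
# Made-up group labels
group <- c("control", "case", "case", "control")

# Factor levels are sorted alphabetically, so "case" comes first
default_levels <- levels(as.factor(group))
default_levels
#> [1] "case"    "control"

# Stating the order explicitly puts the control group first
explicit_levels <- levels(factor(group, levels = c("control", "case")))
explicit_levels
#> [1] "control" "case"
```

Because "case" sorts before "control", the default treats "case" as the control level, which is why passing levels = c("control", "case") fixes the result.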