scR estimates empirical sample complexity bounds for
supervised learning tasks. The core workflow is to call
estimate_accuracy() and then interpolate_scb():

library(scR)
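
# Fit a logistic regression and tag the result with the "svrclass"
# class in addition to "glm".
mylogit <- function(formula, data) {
  structure(
    glm(formula = formula, data = data, family = binomial(link = "logit")),
    class = c("svrclass", "glm")
  )
}

# Turn predicted probabilities into class labels. Fixing the levels keeps
# both classes in the factor even when every prediction lands in one class.
mypred <- function(m, newdata) {
  p <- predict.glm(m, newdata, type = "response")
  factor(ifelse(p > 0.5, 1, 0), levels = c("0", "1"))
}
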
# In applied work, pass your observed data instead of generating synthetic data.
dat <- gendata(mylogit, dim = 3, maxn = 250, predictfn = mypred)

results <- estimate_accuracy(
  y ~ .,
  mylogit,
  data = dat,
  predictfn = mypred,
  nsample = 10,
  steps = 25,
  parallel = FALSE,
  backend = "sequential"
)

scbhat <- interpolate_scb(
  list(results),
  epsilon = 0.05,
  delta = 0.05,
  maxN = nrow(dat)
)

summary(scbhat)
plot(scbhat, list(results), plot_type = "Delta")

The package also includes the monotone-integrated Gaussian process
extrapolator used in the paper appendix. This is an optional
nonparametric robustness check. It requires a working CmdStan
installation plus the cmdstanr and posterior
packages. These are not hard dependencies of scR, so the
core package can be installed and checked without a Stan toolchain.
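
Because these dependencies are optional, a script may want to check for them before calling the GP extrapolator. The following is a minimal sketch, not part of scR: the helper name `has_gp_deps` is hypothetical, and it relies only on base R's `requireNamespace()` and cmdstanr's `cmdstan_version()`.

```r
# Hypothetical helper (not part of scR): returns TRUE only if both optional
# packages are installed and cmdstanr can locate a CmdStan installation.
has_gp_deps <- function() {
  pkgs_ok <- requireNamespace("cmdstanr", quietly = TRUE) &&
    requireNamespace("posterior", quietly = TRUE)
  if (!pkgs_ok) {
    return(FALSE)
  }
  # With error_on_NA = FALSE, cmdstan_version() returns NULL when CmdStan
  # is not found; tryCatch guards against any other lookup failure.
  tryCatch(
    !is.null(cmdstanr::cmdstan_version(error_on_NA = FALSE)),
    error = function(e) FALSE
  )
}

if (has_gp_deps()) {
  message("GP extrapolator dependencies found.")
} else {
  message("Skipping the GP check: cmdstanr, posterior, or CmdStan is missing.")
}
```

Guarding the call this way lets the same script run on machines with and without a Stan toolchain.
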
# Requires cmdstanr, posterior, and CmdStan.
gp_delta <- interpolate_scb_gp(
  results,
  epsilon = 0.05,
  delta = 0.05,
  maxN = nrow(dat),
  curve = "delta",
  M_grid = 80
)

summary(gp_delta)
plot(gp_delta, plot_type = "Delta")

The GP implementation uses the paper’s monotone-integrated
construction: a Gaussian process is placed on an unconstrained latent
field, a softplus transform produces a nonnegative derivative, the
derivative is integrated on a fixed grid, and the resulting latent
curve is mapped to either the delta or epsilon mean curve.
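
The construction described above can be illustrated with a single prior draw in base R. This is a rough sketch under stated assumptions, not the package's Stan model: the squared-exponential kernel, its length scale, the trapezoid rule, and the final `exp(-latent)` link are all illustrative choices that the text above does not specify.

```r
# Illustrative sketch only: base R, not the Stan model shipped with scR.
set.seed(1)

# Fixed grid, rescaled to [0, 1]. The grid size mirrors M_grid = 80 above.
M <- 80
x <- seq(0, 1, length.out = M)

# Assumed squared-exponential covariance for the unconstrained latent field.
sq_exp <- function(x, ell = 0.2, sigma = 1) {
  sigma^2 * exp(-0.5 * outer(x, x, "-")^2 / ell^2)
}
K <- sq_exp(x) + diag(1e-6, M)  # small jitter for numerical stability

# One draw of the latent field z ~ N(0, K) via the Cholesky factor.
z <- drop(t(chol(K)) %*% rnorm(M))

# Softplus maps the unconstrained field to a nonnegative derivative.
softplus <- function(u) log1p(exp(u))
d <- softplus(z)

# Integrate the derivative on the grid (trapezoid rule), starting at 0.
latent <- c(0, cumsum(diff(x) * (d[-M] + d[-1]) / 2))

# Hypothetical link: turn the nondecreasing latent curve into a
# decreasing curve in (0, 1]. The package's actual mapping may differ.
delta_curve <- exp(-latent)

# The integrated curve is monotone by construction.
stopifnot(all(diff(latent) >= 0), all(diff(delta_curve) <= 0))
```

Because the derivative is nonnegative everywhere on the grid, monotonicity holds for every draw and does not need to be imposed as a separate constraint.
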