Title: Stratified Evaluation of Subgroup Classification Accuracy
Description: Enables simultaneous statistical inference for the accuracy of multiple classifiers in multiple subgroups (strata). For instance, allows performing multiple comparisons in diagnostic accuracy studies with co-primary endpoints sensitivity and specificity (Westphal M, Zapf A. Statistical inference for diagnostic test accuracy studies with multiple comparisons. Statistical Methods in Medical Research. 2024;0(0). <doi:10.1177/09622802241236933>).
Authors: Max Westphal [aut, cre]
Maintainer: Max Westphal <[email protected]>
License: MIT + file LICENSE
Version: 0.2.0
Built: 2024-10-16 05:33:20 UTC
Source: https://github.com/maxwestphal/cases
cases
Enables simultaneous statistical inference for the accuracy of multiple classifiers in multiple subgroups (strata). For instance, allows performing multiple comparisons in diagnostic accuracy studies with co-primary endpoints sensitivity (true positive rate, TPR) and specificity (true negative rate, TNR).
The package functionality and syntax are described in the vignettes, see examples.
Maintainer: Max Westphal [email protected] (ORCID)
Westphal M, Zapf A. Statistical inference for diagnostic test accuracy studies with multiple comparisons. Statistical Methods in Medical Research. 2024;0(0). doi:10.1177/09622802241236933
Useful links:
Report bugs at https://github.com/maxwestphal/cases/issues
# overview of package functionality:
vignette("package_overview")
# a real-world data example:
vignette("example_wdbc")
This function allows splitting continuous values, e.g. (risk) scores or (bio)markers, into two or more categories by specifying one or more cutoff values.
categorize(
  values,
  cutoffs = rep(0, ncol(values)),
  map = 1:ncol(values),
  labels = NULL
)
values (matrix)
cutoffs (numeric | matrix)
map (numeric)
labels (character)
(matrix)
numeric (n x k) matrix with categorical outcomes after categorizing.
set.seed(123)
M <- as.data.frame(mvtnorm::rmvnorm(20, mean = rep(0, 3), sigma = 2 * diag(3)))
M
categorize(M)
C <- matrix(rep(c(-1, 0, 1, -2, 0, 2), 3), ncol = 3, byrow = TRUE)
C
w <- c(1, 1, 2, 2, 3, 3)
categorize(M, C, w)
Compare predictions and labels
compare(
  predictions,
  labels,
  partition = TRUE,
  names = c(specificity = 0, sensitivity = 1)
)
predictions (numeric | character)
labels (numeric | character)
partition (logical)
names (character | NULL)
(list | matrix)
list of matrices (one for each unique value of labels) with values 1 (correct prediction) and 0 (false prediction). If partition=TRUE, the matrices are combined into a single matrix with rbind.
pred <- matrix(c(1, 1, 0), 5, 3)
labels <- c(1, 1, 0, 0, 1)
compare(pred, labels, FALSE)
compare(pred, labels, TRUE)
Create an AR(1) correlation matrix
cormat_ar1(m, rho, d = TRUE)
m (numeric)
rho (numeric)
d (logical | numeric)
(matrix)
AR(1) correlation matrix R with entries R_ij = rho^|i - j|.
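The structure of such a matrix can be sketched in base R. This is an illustrative reconstruction of the AR(1) formula, not the package source; the helper name ar1_matrix is hypothetical.

```r
# Illustrative construction of an AR(1) correlation matrix in base R:
# entry R[i, j] = rho^|i - j|, so correlation decays with index distance.
ar1_matrix <- function(m, rho) {
  rho^abs(outer(1:m, 1:m, "-"))
}

ar1_matrix(4, 0.5)
# first row: 1.000 0.500 0.250 0.125
```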
Create an equicorrelation matrix
cormat_equi(m, rho, d = TRUE)
m (numeric)
rho (numeric)
d (logical | numeric)
(matrix)
equicorrelation matrix R with entries R_ij = rho for i != j and R_ii = 1.
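The returned structure can likewise be sketched in base R. Illustrative only; equi_matrix is a hypothetical helper, not the package source.

```r
# Illustrative construction of an equicorrelation matrix in base R:
# all off-diagonal entries equal rho, the diagonal is 1.
equi_matrix <- function(m, rho) {
  R <- matrix(rho, nrow = m, ncol = m)
  diag(R) <- 1
  R
}

equi_matrix(3, 0.25)
# rows: (1, 0.25, 0.25), (0.25, 1, 0.25), (0.25, 0.25, 1)
```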
Dataset documentation can be found at the source website and references below.
data_wdbc
A data frame with 569 rows (patients) and 31 columns (1 target, 30 features).
The ID variable was removed. Diagnosis (1 = malignant, 0 = benign). Feature variables have been renamed.
https://archive.ics.uci.edu/ml/datasets/breast+cancer+wisconsin+(diagnostic)
W.N. Street, W.H. Wolberg and O.L. Mangasarian. Nuclear feature extraction for breast tumor diagnosis. IS&T/SPIE 1993 International Symposium on Electronic Imaging: Science and Technology, volume 1905, pages 861-870, San Jose, CA, 1993.
O.L. Mangasarian, W.N. Street and W.H. Wolberg. Breast cancer diagnosis and prognosis via linear programming. Operations Research, 43(4), pages 570-577, July-August 1995.
Define a contrast (matrix) to specify the exact hypothesis system
define_contrast(type = c("raw", "one", "all"), comparator = NA)
type (character)
comparator (numeric | character)
"raw" contrast: compare all candidates against specified benchmark values
"one" ('all vs. one' or 'Dunnett') contrast: compare all candidates to a single comparator.
"all" ('all vs. all' or 'Tukey') contrast: compare all candidates against each other.
(cases_contrast)
object to be passed to evaluate
define_contrast("one", 1)
define_contrast("one", 1)
Generate binary data
draw_data(
  n = 200,
  prev = c(0.5, 0.5),
  random = FALSE,
  m = 10,
  method = c("roc", "lfc", "pr"),
  pars = list(),
  ...
)
n (numeric)
prev (numeric)
random (logical)
m (numeric)
method (character)
pars (list)
... (any)
(matrix)
generated binary data (possibly stratified for subgroups)
draw_data()
Generate binary data (LFC model)
draw_data_lfc(
  n = 100,
  prev = c(0.5, 0.5),
  random = FALSE,
  m = 10,
  se = 0.8,
  sp = 0.8,
  B = round(m/2),
  L = 1,
  Rse = diag(rep(1, m)),
  Rsp = diag(rep(1, m)),
  modnames = paste0("model", 1:m),
  ...
)
n (numeric)
prev (numeric)
random (logical)
m (numeric)
se (numeric)
sp (numeric)
B (numeric)
L (numeric)
Rse (matrix)
Rsp (matrix)
modnames (character)
... (any)
(list)
list of matrices including generated binary datasets
(1: correct prediction, 0: incorrect prediction) for each subgroup (specificity, sensitivity)
data <- draw_data_lfc()
head(data)
This function is a wrapper for rmvbin.
draw_data_prb(n = 100, pr = c(0.8, 0.8), R = diag(length(pr)))
n (numeric)
pr (numeric)
R (matrix)
(matrix)
matrix with n rows and length(pr) columns of randomly generated binary (0, 1) data
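For the special case R = diag(length(pr)) (no correlation between columns), the generated data reduce to independent Bernoulli draws per column, which can be sketched in base R without the rmvbin dependency. Illustrative only; draw_indep_binary is a hypothetical helper.

```r
# Independent special case of correlated binary data generation:
# column j holds n Bernoulli(pr[j]) draws.
draw_indep_binary <- function(n, pr) {
  sapply(pr, function(p) rbinom(n, size = 1, prob = p))
}

set.seed(42)
head(draw_indep_binary(100, pr = c(0.8, 0.8)))
```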
Generate binary data (ROC model)
draw_data_roc(
  n = 100,
  prev = c(0.5, 0.5),
  random = FALSE,
  m = 10,
  auc = seq(0.85, 0.95, length.out = 5),
  rho = c(0.25, 0.25),
  dist = c("normal", "exponential"),
  e = 10,
  k = 100,
  delta = 0,
  modnames = paste0("model", 1:m),
  corrplot = FALSE,
  ...
)
n (numeric)
prev (numeric)
random (logical)
m (numeric)
auc (numeric)
rho (numeric)
dist (character)
e (numeric)
k (numeric)
delta (numeric)
modnames (character)
corrplot (logical)
... (any)
(list)
list of matrices including generated binary datasets
(1: correct prediction, 0: incorrect prediction) for each subgroup (specificity, sensitivity)
data <- draw_data_roc()
head(data)
Assess the classification accuracy of multiple classification rules stratified by subgroups, e.g. in diseased (sensitivity) and healthy (specificity) individuals.
evaluate(
  data,
  contrast = define_contrast("raw"),
  benchmark = 0.5,
  alpha = 0.05,
  alternative = c("two.sided", "greater", "less"),
  adjustment = c("none", "bonferroni", "maxt", "bootstrap", "mbeta"),
  transformation = c("none", "logit", "arcsin"),
  analysis = c("co-primary", "full"),
  regu = FALSE,
  pars = list(),
  ...
)
data (list)
contrast (cases_contrast)
benchmark (numeric)
alpha (numeric)
alternative (character)
adjustment (character)
transformation (character)
analysis (character)
regu (numeric | logical)
pars (list)
... (any)
Adjustment methods (adjustment) and additional parameters (pars or ...):
"none" (default): no adjustment for multiplicity
"bonferroni": Bonferroni adjustment
"maxt": maxT adjustment, based on a multivariate normal approximation of the vector of test statistics
"bootstrap": Bootstrap approach
nboot: number of bootstrap draws (default: 2000)
type: type of bootstrap, "pairs" (default) or "wild"
dist: residual distribution for wild bootstrap, "Normal" (default) or "Rademacher"
proj_est: should bootstrapped estimates for wild bootstrap be projected into unit interval? (default: TRUE)
res_tra: type of residual transformation for wild bootstrap, 0, 1, 2, or 3 (default: 0 = no transformation) (for details on the res_tra options, see the presentation by James G. MacKinnon (2012) and references therein)
"mbeta": A heuristic Bayesian approach which is based on a multivariate beta-binomial model.
nrep: number of posterior draws (default: 5000)
lfc_pr: prior probability of 'least-favorable parameter configuration' (default: 1 if analysis == "co-primary", 0 if analysis == "full").
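As an illustration of the simplest option above: a Bonferroni adjustment divides alpha by the number of simultaneous comparisons. The sketch below is a generic Wald-type construction for m proportions, not the package implementation; the helper name bonferroni_wald_ci is hypothetical.

```r
# Generic Bonferroni-adjusted Wald confidence intervals for m proportions:
# each interval is computed at level 1 - alpha/m instead of 1 - alpha.
bonferroni_wald_ci <- function(x, n, alpha = 0.05) {
  m <- length(x)
  p <- x / n
  z <- qnorm(1 - (alpha / m) / 2)   # adjusted two-sided quantile
  se <- sqrt(p * (1 - p) / n)
  cbind(estimate = p, lower = p - z * se, upper = p + z * se)
}

bonferroni_wald_ci(x = c(80, 85, 90), n = c(100, 100, 100))
```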
(cases_results)
list of analysis results including (adjusted) confidence intervals and p-values
data <- draw_data_roc()
evaluate(data)
Generates a (simulation) instance, a list of multiple datasets to be processed (analyzed) with process_instance. Ground truth parameters (sensitivity and specificity) are least favorable in the sense that the type I error rate of the subsequently applied multiple test procedures is maximized.
This function is only needed for simulation via batchtools, not relevant in interactive use!
generate_instance_lfc(
  nrep = 10,
  n = 100,
  prev = 0.5,
  random = FALSE,
  m = 10,
  se = 0.8,
  sp = 0.8,
  L = 1,
  rhose = 0,
  rhosp = 0,
  cortype = "equi",
  ...,
  data = NULL,
  job = NULL
)
nrep (numeric)
n (numeric)
prev (numeric)
random (logical)
m (numeric)
se (numeric)
sp (numeric)
L (numeric)
rhose (numeric)
rhosp (numeric)
cortype (character)
... (any)
data (NULL)
job (NULL)
Uses the same arguments as draw_data_lfc unless stated otherwise above.
(list)
a single (LFC) simulation instance of length nrep
Generates a (simulation) instance, a list of multiple datasets to be processed (analyzed) with process_instance. Ground truth parameters (sensitivity and specificity) are initially generated according to a generative model whereby multiple decision rules (with different parameter values) are derived by thresholding multiple biomarkers.
This function is only needed for simulation via batchtools, not relevant in interactive use!
generate_instance_roc(
  nrep = 10,
  n = 100,
  prev = 0.5,
  random = FALSE,
  m = 10,
  auc = "seq(0.85, 0.95, length.out = 5)",
  rhose = 0.5,
  rhosp = 0.5,
  dist = "normal",
  e = 10,
  k = 100,
  delta = 0,
  ...,
  data = NULL,
  job = NULL
)
nrep (numeric)
n (numeric)
prev (numeric)
random (logical)
m (numeric)
auc (numeric)
rhose (numeric)
rhosp (numeric)
dist (character)
e (numeric)
k (numeric)
delta (numeric)
... (any)
data (NULL)
job (NULL)
Uses the same arguments as draw_data_roc unless stated otherwise above.
(list)
a single (ROC) simulation instance of length nrep
Process data instances, a list of multiple datasets generated via generate_instance_lfc or generate_instance_roc. This function applies evaluate to all datasets.
This function is only needed for simulation via batchtools, not relevant in interactive use!
process_instance(
  instance = NULL,
  contrast = "cases::define_contrast('raw', NA)",
  benchmark = 0.5,
  alpha = 0.05,
  alternative = "greater",
  adjustment = "none",
  transformation = "none",
  analysis = "co-primary",
  regu = "c(1,1/2,1/4)",
  pars = "list()",
  ...,
  data = NULL,
  job = list(id = NA)
)
instance (list)
contrast
benchmark (numeric)
alpha (numeric)
alternative (character)
adjustment (character)
transformation (character)
analysis (character)
regu (numeric | logical)
pars (list)
... (any)
data (NULL)
job (NULL)
Uses the same arguments as evaluate unless stated otherwise above.
(list)
standardized evaluation results
Currently, this implementation is only intended for situations with:
two groups (e.g. healthy (<-> specificity) and diseased (<-> sensitivity))
alternative = "greater"
contrast = define_contrast("raw")
visualize(x, ...)
x
...
a ggplot object