| Title: | Directly Adjusted Estimates |
|---|---|
| Description: | Compute estimates and confidence intervals of weighted averages quickly and easily. Weighted averages are computed using data.table for speed. Confidence intervals are approximated using the delta method, either with known formulae or via algorithmic or numerical differentiation. |
| Authors: | Joonas Miettinen [cre, aut] (ORCID: <https://orcid.org/0000-0001-8624-6754>) |
| Maintainer: | Joonas Miettinen <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.6.1 |
| Built: | 2026-06-04 11:10:25 UTC |
| Source: | https://github.com/finnishcancerregistry/directadjusting |
Functions to compute confidence intervals.
delta_method_confidence_intervals(
  statistics,
  variances,
  conf_lvl = 0.95,
  conf_method = "identity"
)
| Argument | Description |
|---|---|
| statistics | Statistics for which to calculate confidence intervals. |
| variances | Variance estimates of statistics. |
| conf_lvl | Confidence level of the confidence intervals, e.g. 0.95 (the default). |
| conf_method | Delta method transformation to be applied. |
directadjusting::delta_method_confidence_intervals
Returns a data.table with columns
c("statistic", "variance", "ci_lo", "ci_hi").
directadjusting::delta_method_confidence_intervals computes confidence
intervals using the delta method. It performs the following steps:

- Compute confidence intervals based on conf_method, statistics, variances, and conf_lvl.
  - If conf_method is a string, a pre-defined set of mathematical expressions is used to compute the confidence intervals.
  - If conf_method is a call, it is evaluated with the variables theta, theta_variance, theta_standard_error, and z. This is done once for the lower and once for the upper bound of the confidence interval, so for the lower bound with conf_lvl = 0.95 we use z = stats::qnorm(p = (1 - conf_lvl) / 2).
  - If conf_method is a list, it must contain elements g and g_inv, e.g. list(g = quote(log(theta)), g_inv = quote(exp(g))).
    - g is passed to stats::deriv. If that fails, a numerical derivative is computed.
    - With the derivative known, the variance after the transformation is variance * g_gradient ^ 2.
    - With the transformed variance known, the confidence interval on the transformed scale is g(theta) + g_standard_error * z.
    - These transformation-scale confidence intervals are then converted back to the original scale using g_inv.
- Collect a data.table with the confidence intervals and the columns statistic = statistics and variance = variances.
- Add an attribute named ci_meta to the data.table. This attribute is a list which contains elements conf_lvl and conf_method.
- Return a data.table with columns c("statistic", "variance", "ci_lo", "ci_hi").
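The list form of conf_method described above can be illustrated with a short base-R sketch. This is not the package's implementation: the helper name delta_ci_sketch is hypothetical, only stats::deriv is used (the numerical-derivative fallback is left out), and inputs are assumed to be scalars.

```r
# Hypothetical helper, not part of directadjusting: a base-R sketch of the
# delta method with a transformation g and its inverse g_inv.
delta_ci_sketch <- function(theta, theta_variance, conf_lvl, g, g_inv) {
  # Algorithmic derivative of g with respect to theta via stats::deriv.
  g_deriv <- stats::deriv(g, "theta")
  g_value <- eval(g_deriv, list(theta = theta))
  g_gradient <- as.vector(attr(g_value, "gradient"))
  # Variance on the transformed scale: variance * g_gradient ^ 2.
  g_standard_error <- sqrt(theta_variance * g_gradient ^ 2)
  # One z per bound: negative for the lower bound, positive for the upper.
  z <- stats::qnorm(p = c((1 - conf_lvl) / 2, 1 - (1 - conf_lvl) / 2))
  g_theta <- as.vector(g_value)
  # Interval on the transformed scale, converted back with g_inv.
  c(
    ci_lo = eval(g_inv, list(g = g_theta + g_standard_error * z[1])),
    ci_hi = eval(g_inv, list(g = g_theta + g_standard_error * z[2]))
  )
}

ci <- delta_ci_sketch(
  theta = 0.9,
  theta_variance = 0.1,
  conf_lvl = 0.95,
  g = quote(log(theta)),
  g_inv = quote(exp(g))
)
```

With g = log(theta) the gradient is 1 / theta, so the result should agree with the closed-form interval theta * exp(z * theta_standard_error / theta).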
# directadjusting::delta_method_confidence_intervals
dt_1 <- directadjusting::delta_method_confidence_intervals(
  statistics = 0.9,
  variances = 0.1,
  conf_lvl = 0.95,
  conf_method = "log"
)

# you can also supply your own math for computing the confidence intervals
dt_2 <- directadjusting::delta_method_confidence_intervals(
  statistics = 0.9,
  variances = 0.1,
  conf_lvl = 0.95,
  conf_method = quote(theta * exp(z * theta_standard_error / theta))
)

dt_3 <- directadjusting::delta_method_confidence_intervals(
  statistics = 0.9,
  variances = 0.1,
  conf_lvl = 0.95,
  conf_method = list(
    g = quote(log(theta)),
    g_inv = quote(exp(g))
  )
)

dt_4 <- directadjusting::delta_method_confidence_intervals(
  statistics = 0.9,
  variances = 0.1,
  conf_lvl = 0.95,
  conf_method = list(
    g = quote(stats::qnorm(theta)),
    g_inv = quote(stats::pnorm(g))
  )
)

stopifnot(
  all.equal(dt_1, dt_2, check.attributes = FALSE),
  all.equal(dt_1, dt_3, check.attributes = FALSE)
)
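The dt_2 example passes a call as conf_method. A rough base-R sketch of how such a call might be evaluated once per bound is given here; the helper eval_bound is hypothetical and the package may organise this differently.

```r
# Sketch only: evaluate a user-supplied call with the documented variables
# theta, theta_variance, theta_standard_error, and z.
conf_method <- quote(theta * exp(z * theta_standard_error / theta))
conf_lvl <- 0.95
theta <- 0.9
theta_variance <- 0.1

# Hypothetical helper: evaluates conf_method for one value of z.
eval_bound <- function(z) {
  eval(conf_method, list(
    theta = theta,
    theta_variance = theta_variance,
    theta_standard_error = sqrt(theta_variance),
    z = z
  ))
}

# Lower bound uses the lower-tail quantile, upper bound the upper-tail one.
ci_lo <- eval_bound(stats::qnorm(p = (1 - conf_lvl) / 2))
ci_hi <- eval_bound(stats::qnorm(p = 1 - (1 - conf_lvl) / 2))
```

Because z only changes sign between the two bounds, this particular expression gives an interval that is symmetric around theta on the log scale.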
Compute direct adjusted estimates from a table of statistics.
directly_adjusted_estimates(
  stats_dt,
  stat_col_nms,
  var_col_nms,
  stratum_col_nms = NULL,
  adjust_col_nms = NULL,
  conf_lvls = 0.95,
  conf_methods = "identity",
  weights = NULL
)
stats_dt |
a |
stat_col_nms |
names of columns in |
var_col_nms |
|
stratum_col_nms |
names of columns in |
adjust_col_nms |
Names of columns in
|
conf_lvls |
confidence levels for confidence intervals; you may specify each statistic
(see |
conf_methods |
Method(s) to compute confidence intervals. Either one method for all stats
( Can also be |
weights |
The weights need not sum to one as this is ensured internally. You may supply weights in one of the following ways:
|
directadjusting::directly_adjusted_estimates computes weighted
averages and their confidence intervals. It performs the following steps:

- Makes a new data.table from stats_dt without copying any column data, so that stats_dt itself is not modified.
- Handles argument weights in order to produce a data.table of weights if it wasn't one already.
- Inserts the weights into stats_dt. The weights are merged in place: weights_dt is left-joined using stats_dt, and the column weight resulting from this join is added to stats_dt.
- Re-scales the weights to sum to one within each stratum defined by stratum_col_nms.
- Computes weighted averages of stat_col_nms and var_col_nms (the latter with squared weights because they are variances) over adjust_col_nms. This results in a data.table without column(s) adjust_col_nms.
- For each i in seq_along(stat_col_nms):
  - If conf_methods[[i]] is "none", doesn't compute confidence intervals.
  - Otherwise calls delta_method_confidence_intervals.
- Sets attribute directly_adjusted_estimates_meta. It is a list containing:
  - call: The call to directadjusting::directly_adjusted_estimates.
  - stat_col_nms: The argument as given by the user.
  - var_col_nms: The argument as given by the user.
  - stratum_col_nms: The argument as given by the user.
  - adjust_col_nms: The argument as given by the user.
  - conf_lvls: The argument, but always of length length(stat_col_nms).
  - conf_methods: The argument, but always of length length(stat_col_nms).
- Returns a data.table. Returned columns are those given via stratum_col_nms, stat_col_nms, and var_col_nms.
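The weighting steps can be sketched in base R as follows. This is an illustration under assumptions, not the package's data.table code: the data values and the names stats_df, weights_df, merged, and adjusted are made up for the example.

```r
# Hypothetical data: one statistic e with variance v, by sex and age group.
stats_df <- data.frame(
  sex = rep(1:2, each = 4),
  ag = rep(1:4, times = 2),
  e = c(0.07, 0.09, 0.11, 0.13, 0.15, 0.18, 0.22, 0.25),
  v = c(0.07, 0.09, 0.11, 0.13, 0.15, 0.18, 0.22, 0.25) / 1000
)
weights_df <- data.frame(ag = 1:4, weight = c(200, 300, 400, 100))

# Join the weights onto the statistics by the adjusting column.
merged <- merge(stats_df, weights_df, by = "ag", all.x = TRUE)

# Re-scale the weights to sum to one within each stratum (here: sex).
merged$weight <- merged$weight / ave(merged$weight, merged$sex, FUN = sum)

# Weighted average of the statistic; squared weights for its variance.
adjusted <- do.call(rbind, lapply(split(merged, merged$sex), function(d) {
  data.frame(
    sex = d$sex[1],
    e = sum(d$weight * d$e),
    v = sum(d$weight ^ 2 * d$v)
  )
}))
```

The squared weights follow from the variance of a weighted sum of independent estimates, which is the sum of the squared weights times the individual variances.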
# directadjusting::directly_adjusted_estimates
library("data.table")
set.seed(1337)

offsets <- rnorm(8, mean = 1000, sd = 100)
baseline <- 100
hrs_by_sex <- rep(1:2, each = 4)
hrs_by_ag <- rep(c(0.75, 0.90, 1.10, 1.25), times = 2)
counts <- rpois(8, baseline * hrs_by_sex * hrs_by_ag)

# raw estimates
my_stats <- data.table::data.table(
  sex = rep(1:2, each = 4),
  ag = rep(1:4, times = 2),
  e = counts / offsets,
  v = counts / (offsets ** 2)
)

# adjusted by age group
my_adj_stats <- directly_adjusted_estimates(
  stats_dt = my_stats,
  stat_col_nms = "e",
  var_col_nms = "v",
  conf_lvls = 0.95,
  conf_methods = "log",
  stratum_col_nms = "sex",
  adjust_col_nms = "ag",
  weights = c(200, 300, 400, 100)
)

# adjusted by smaller age groups, stratified by larger age groups
my_stats[, "ag2" := c(1,1, 2,2, 1,1, 2,2)]
my_adj_stats <- directly_adjusted_estimates(
  stats_dt = my_stats,
  stat_col_nms = "e",
  var_col_nms = "v",
  conf_lvls = 0.95,
  conf_methods = "log",
  stratum_col_nms = c("sex", "ag2"),
  adjust_col_nms = "ag",
  weights = c(200, 300, 400, 100)
)

# with no adjusting columns defined you get the same table as input
# but with confidence intervals. this for the sake of
# convenience for programming cases where sometimes you want to adjust,
# sometimes not.
stats_dt_2 <- data.table::data.table(
  sex = 0:1,
  e = 0.0,
  v = 0.1
)
dt_2 <- directadjusting::directly_adjusted_estimates(
  stats_dt = stats_dt_2,
  stat_col_nms = "e",
  var_col_nms = "v",
  conf_lvls = 0.95,
  conf_methods = "identity",
  stratum_col_nms = "sex"
)
stopifnot(
  dt_2[["e"]] == stats_dt_2[["e"]],
  dt_2[["v"]] == stats_dt_2[["v"]],
  dt_2[["sex"]] == stats_dt_2[["sex"]]
)

# sometimes when adjusting rates or counts, there can be strata where the
# statistic is zero. these should be included in your statistics dataset
# if you still want the weighted average be influenced by the zero.
# otherwise you will get the wrong result. sometimes when naively tabulating
# a dataset with e.g. dt[, .N, keyby = "stratum"] one does not get a result
# row for a stratum that does not appear in the dataset even if we know that
# the stratum exists, for instance only the age groups 1-17 are present in
# the dataset.
stats_dt_3 <- data.table::data.table(
  age_group = 1:18,
  count = 17:0,
  var = 17:0
)

# this goes as intended
dt_3 <- directadjusting::directly_adjusted_estimates(
  stats_dt = stats_dt_3,
  stat_col_nms = "count",
  var_col_nms = "var",
  stratum_col_nms = NULL,
  adjust_col_nms = "age_group",
  weights = data.table::data.table(
    age_group = 1:18,
    weight = 18:1
  )
)

# this does not
dt_4 <- directadjusting::directly_adjusted_estimates(
  stats_dt = stats_dt_3[1:17, ],
  stat_col_nms = "count",
  var_col_nms = "var",
  stratum_col_nms = NULL,
  adjust_col_nms = "age_group",
  weights = data.table::data.table(
    age_group = 1:18,
    weight = 18:1
  )
)

# the weighted average that included the zero is smaller
stopifnot(
  dt_3[["count"]] < dt_4[["count"]]
)

# NAs are allowed and produce in turn NAs silently.
stats_dt_5 <- data.table::data.table(
  age_group = 1:18,
  count = c(NA, 16:0),
  var = c(NA, 16:0)
)
dt_5 <- directadjusting::directly_adjusted_estimates(
  stats_dt = stats_dt_5,
  stat_col_nms = "count",
  var_col_nms = "var",
  adjust_col_nms = "age_group",
  weights = data.table::data.table(
    age_group = 1:18,
    weight = 18:1
  )
)
stopifnot(
  is.na(dt_5)
)

stats_dt_6 <- data.table::data.table(
  age_group = 1:4,
  survival = c(0.20, 0.40, 0.60, 0.80),
  var = 0.05 ^ 2
)

# you can use conf_method to pass whatever to
# `delta_method_confidence_intervals`.
dt_6 <- directadjusting::directly_adjusted_estimates(
  stats_dt = stats_dt_6,
  stat_col_nms = "survival",
  var_col_nms = "var",
  adjust_col_nms = "age_group",
  weights = data.table::data.table(
    age_group = 1:4,
    weight = 1:4
  ),
  conf_methods = list(
    list(
      g = quote(stats::qnorm(theta)),
      g_inv = quote(stats::pnorm(g))
    )
  )
)
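The examples warn that a naive tabulation can drop strata with zero events. One possible way to keep them, sketched here in base R with made-up observations, is to declare all known strata as factor levels before tabulating. The final comparison assumes that weights are re-scaled over the rows that are present, which is an assumption about the behaviour rather than a statement of the package's internals.

```r
# Hypothetical observations: age group 18 exists but has no events.
observed_age_groups <- rep(1:17, times = 17:1)

# Naive tabulation: the empty age group 18 gets no row at all.
naive <- table(observed_age_groups)

# Declaring every known level keeps the empty stratum with a count of zero.
complete <- table(factor(observed_age_groups, levels = 1:18))
stats_df <- data.frame(
  age_group = as.integer(names(complete)),
  count = as.vector(complete)
)

# Weighted averages with and without the zero row (weights 18:1,
# re-scaled to sum to one over the rows actually present).
w <- 18:1
with_zero <- sum(w * stats_df$count) / sum(w)
without_zero <- sum(w[1:17] * stats_df$count[1:17]) / sum(w[1:17])
```

The numerators are identical, but leaving out the zero row shrinks the sum of weights, so the average without the zero comes out larger.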