Compute a curve of post hoc bounds based on a reference family with forest structure
Source: R/tree-functions.R
Computes the post hoc upper bound \(V^*(S_t)\) on the number of false positives in each set of a given sequence of selection sets \(S_t\) of hypotheses, such that \(S_t\subset S_{t+1}\) and \(|S_t| = t\), using a reference family \((R_k, \zeta_k)\) that possesses a forest structure (see References).
Usage
curve.V.star.forest.naive(
  perm,
  C,
  ZL,
  leaf_list,
  pruning = FALSE,
  delete.gaps = FALSE
)

curve.V.star.forest.fast(
  perm,
  C,
  ZL,
  leaf_list,
  pruning = FALSE,
  is.pruned = FALSE,
  is.complete = FALSE,
  delete.gaps = FALSE
)
Arguments
- perm
An integer vector of elements in 1:m, all different, and of size up to m (in which case it's a permutation, hence the name). The set \(S_t\) is represented by perm[1:t].
- C
A list of lists representing the forest structure. See V.star() for more information.
- ZL
A list of integer vectors representing the upper bounds \(\zeta_k\) of the forest structure. See V.star() for more information.
- leaf_list
A list of vectors representing the atoms of the forest structure. See V.star() for more information.
- pruning
A boolean, FALSE by default. Whether to prune the forest (see pruning()) before computing the bounds. Ignored if is.pruned is TRUE.
- delete.gaps
A boolean, FALSE by default. If TRUE, will also delete the gaps in the structure induced by the pruning, see delete.gaps(). Ignored if pruning is FALSE.
- is.pruned
A boolean, FALSE by default. If TRUE, assumes that the forest structure has already been completed (see forest.completion()) and then pruned (see pruning()), and so skips the completion step and the optional pruning step. Must be set to TRUE if giving a pruned forest, see Details.
- is.complete
A boolean, FALSE by default. If TRUE, assumes that the forest structure has already been completed (see forest.completion()) and so skips the completion step. Ignored if is.pruned is TRUE.
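As an illustration of how perm encodes the nested sets \(S_t\), here is a minimal base R sketch. It assumes hypothetical simulated p-values (p_values is not part of the package) and orders the hypotheses by increasing p-value, which is only one possible way to build perm.

```r
# Hypothetical data: simulated p-values for m hypotheses (an assumption made
# only for this illustration; any vector of distinct indices in 1:m works).
set.seed(42)
m <- 20
p_values <- runif(m)

# Ordering hypotheses by increasing p-value gives a permutation of 1:m.
perm <- order(p_values)

# S_t is simply the first t entries of perm.
S <- lapply(seq_len(m), function(t) perm[seq_len(t)])

# The sets are nested and |S_t| = t, as required by the curve functions.
sizes_ok  <- all(lengths(S) == seq_len(m))
nested_ok <- all(vapply(
  seq_len(m - 1),
  function(t) all(S[[t]] %in% S[[t + 1]]),
  logical(1)
))
stopifnot(sizes_ok, nested_ok, !anyDuplicated(perm))
```

The resulting perm can then be passed as the first argument of either curve function.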
Details
Two functions are available:
- curve.V.star.forest.naive
Repeatedly calls V.star() on each \(S_t\). This is not optimized and is time-consuming, so it should not be used in practice.
- curve.V.star.forest.fast
A fast, optimized version that leverages the fact that \(S_{t+1}\) is the union of \(S_t\) and a single hypothesis index. The algorithm needs to work on a complete forest, so this version first completes the forest (see forest.completion()) unless told that the forest has already been completed. The completion fails if the input is a pruned forest (see pruning()), so if a pruned forest is given as input, this MUST be declared with the is.pruned argument so that the function skips completion; the pruned forest given as input must therefore also be complete.
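The difference between recomputing the bound for every \(S_t\) and updating it one hypothesis at a time can be sketched in a much simpler setting than the one handled by the package. The code below is not the package's algorithm: it assumes a flat family of disjoint regions (a forest of depth one), for which the bound is the sum over regions of min(zeta_k, |S ∩ R_k|) plus the number of hypotheses of S lying outside every region. All names (regions, zeta, v_star_flat) are hypothetical.

```r
# Toy setting (an assumption for illustration): disjoint regions R_k, given as
# integer vectors, each with a bound zeta_k on its number of false positives.
regions <- list(2:5, 8:15, 16:19)
zeta    <- c(2, 3, 1)
m       <- 20
perm    <- 1:m

# Bound for a single set S: each region contributes at most zeta_k, and
# hypotheses outside every region are counted in full.
v_star_flat <- function(S, regions, zeta) {
  in_region <- vapply(regions, function(R) sum(S %in% R), numeric(1))
  sum(pmin(in_region, zeta)) + (length(S) - sum(in_region))
}

# Naive curve: recompute the bound from scratch for every S_t = perm[1:t].
curve_naive <- vapply(
  seq_along(perm),
  function(t) v_star_flat(perm[seq_len(t)], regions, zeta),
  numeric(1)
)

# Incremental curve: S_{t+1} adds one index to S_t, so only the counter of
# the region containing that index (if any) needs updating.
region_of <- rep(NA_integer_, m)
for (k in seq_along(regions)) region_of[regions[[k]]] <- k

counts     <- integer(length(regions))
curve_fast <- numeric(length(perm))
current    <- 0
for (t in seq_along(perm)) {
  k <- region_of[perm[t]]
  if (is.na(k)) {
    current <- current + 1            # index outside every region
  } else {
    if (counts[k] < zeta[k]) current <- current + 1  # region not yet saturated
    counts[k] <- counts[k] + 1L
  }
  curve_fast[t] <- current
}

stopifnot(identical(curve_naive, curve_fast))
```

In this toy case the naive loop does work proportional to t at each step, while the incremental loop does constant work per added hypothesis; the package's fast version exploits the same one-index-at-a-time structure on a complete forest.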
References
Durand, G., Blanchard, G., Neuvial, P., & Roquain, E. (2020). Post hoc false positive control for structured hypotheses. Scandinavian Journal of Statistics, 47(4), 1114-1148.
Durand G., preprint to appear with the description of pruning and of the fast algorithm to compute the curve.
Examples
m <- 20
C <- list(
  list(c(2, 5), c(8, 15), c(16, 19)),
  list(c(3, 5), c(8, 10), c(12, 15), c(16, 16), c(17, 19)),
  list(c(4, 5), c(8, 9), c(10, 10), c(12, 12), c(13, 15), c(17, 17), c(18, 19)),
  list(c(8, 8), c(9, 9), c(13, 13), c(14, 15), c(18, 18), c(19, 19))
)
ZL <- list(
  c(4, 8, 4),
  c(3, 3, 4, 1, 3),
  c(2, 2, 1, 1, 2, 1, 2),
  c(1, 1, 1, 2, 1, 1)
)
leaf_list <- as.list(1:m)
curve.V.star.forest.naive(1:m, C, ZL, leaf_list, pruning = FALSE)
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 14 15 16 17 18 19
curve.V.star.forest.naive(1:m, C, ZL, leaf_list, pruning = TRUE)
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 14 15 16 17 18 19
curve.V.star.forest.fast(1:m, C, ZL, leaf_list, pruning = FALSE)
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 14 15 16 17 18 19
curve.V.star.forest.fast(1:m, C, ZL, leaf_list, pruning = TRUE)
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 14 15 16 17 18 19
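The structure used in the example above can be inspected with base R alone. The sketch below assumes that each element c(i, j) of C denotes the region made of leaves i to j, and relies on the leaves being singletons here (leaf_list <- as.list(1:m)), so that a region's size is j - i + 1; see V.star() for the authoritative description. The helper name region_size is hypothetical.

```r
# Same C and ZL as in the example above.
C <- list(
  list(c(2, 5), c(8, 15), c(16, 19)),
  list(c(3, 5), c(8, 10), c(12, 15), c(16, 16), c(17, 19)),
  list(c(4, 5), c(8, 9), c(10, 10), c(12, 12), c(13, 15), c(17, 17), c(18, 19)),
  list(c(8, 8), c(9, 9), c(13, 13), c(14, 15), c(18, 18), c(19, 19))
)
ZL <- list(
  c(4, 8, 4),
  c(3, 3, 4, 1, 3),
  c(2, 2, 1, 1, 2, 1, 2),
  c(1, 1, 1, 2, 1, 1)
)

# Each level of C should carry exactly one bound per region in ZL.
stopifnot(identical(lengths(C), lengths(ZL)))

# Region sizes, assuming singleton leaves (true for this example only).
region_size <- lapply(C, function(level) {
  vapply(level, function(iv) iv[2] - iv[1] + 1, numeric(1))
})

# A region only constrains the bound when zeta_k is below its size.
informative <- Map(function(z, s) which(z < s), ZL, region_size)
informative
```

Under these assumptions only one region is informative, the one spanning leaves 13 to 15 with bound 2, which is consistent with the single plateau at 14 in the curves printed above.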