
Computes the post hoc upper bound \(V^*(S_t)\) on the number of false positives in each set of a given sequence of selection sets \(S_t\) of hypotheses, such that \(S_t\subset S_{t+1}\) and \(|S_t| = t\), using a reference family \((R_k, \zeta_k)\) that possesses the forest structure (see References).

Usage

curve.V.star.forest.naive(
  perm,
  C,
  ZL,
  leaf_list,
  pruning = FALSE,
  delete.gaps = FALSE
)

curve.V.star.forest.fast(
  perm,
  C,
  ZL,
  leaf_list,
  pruning = FALSE,
  is.pruned = FALSE,
  is.complete = FALSE,
  delete.gaps = FALSE
)

Arguments

perm

An integer vector of elements in 1:m, all different, and of size up to m (in which case it's a permutation, hence the name). The set \(S_t\) is represented by perm[1:t].

C

A list of lists representing the forest structure. See V.star() for more information.

ZL

A list of integer vectors representing the upper bounds \(\zeta_k\) of the forest structure. See V.star() for more information.

leaf_list

A list of vectors representing the atoms of the forest structure. See V.star() for more information.

pruning

A boolean, FALSE by default. Whether to prune the forest (see pruning()) before computing the bounds. Ignored if is.pruned is TRUE.

delete.gaps

A boolean, FALSE by default. If TRUE, will also delete the gaps in the structure induced by the pruning, see delete.gaps(). Ignored if pruning is FALSE.

is.pruned

A boolean, FALSE by default. If TRUE, assumes that the forest structure has already been completed (see forest.completion()) and then pruned (see pruning()) and so skips the completion step and optional pruning step. Must be set to TRUE if giving a pruned forest, see Details.

is.complete

A boolean, FALSE by default. If TRUE, assumes that the forest structure has already been completed (see forest.completion()) and so skips the completion step. Ignored if is.pruned is TRUE.

Value

A vector of the same length as perm, where the t-th element is \(V^*(S_t)\).
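To make the indexing concrete, the following base R sketch builds the nested sets \(S_t\) = perm[1:t] for a selection order of length 5 among m = 8 hypotheses. It calls no package functions; the values of `m` and `perm` and the names `S`, `sizes` and `nested` are illustrative assumptions only.

```r
# Illustrative only: base R, no sanssouci functions are called here.
m <- 8
perm <- c(3L, 7L, 1L, 8L, 2L)   # distinct indices in 1:m, length <= m

# S_t is made of the first t entries of perm, so the sets are nested by construction.
S <- lapply(seq_along(perm), function(t) perm[seq_len(t)])

# |S_t| = t for every t
sizes <- lengths(S)

# Check that S_t is included in S_{t+1} for every t
nested <- all(vapply(
  seq_len(length(S) - 1),
  function(t) all(S[[t]] %in% S[[t + 1]]),
  logical(1)
))
```

Since the returned curve has one value per \(S_t\), a call with this perm would return a vector of length 5 even though m is 8.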

Details

Two functions are available:

curve.V.star.forest.naive

Repeatedly calls V.star() on each \(S_t\). This is not optimized and is time-consuming, so it should not be used in practice.

curve.V.star.forest.fast

A fast, optimized version that leverages the fact that \(S_{t+1}\) is the union of \(S_t\) and a single hypothesis index. The algorithm needs to work on a complete forest, so this version first completes the forest, unless told via is.complete that the forest has already been completed (see forest.completion()). Completion fails if the input is a pruned forest (see pruning()). If a pruned forest is given as input, this MUST be declared with the is.pruned argument so that the function skips completion; the pruned forest given as input must therefore also be complete.
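The difference between the two strategies can be illustrated in a much simpler setting than a general forest: a family of disjoint regions, where the bound reduces to \(\sum_k \min(\zeta_k, |S \cap R_k|)\) plus the number of selected indices lying outside every region. The sketch below is not the package's algorithm and does not handle nested regions, completion or pruning. The names `region_of`, `zeta`, `toy_bound`, `curve_naive` and `curve_incremental` are hypothetical and exist only for this illustration.

```r
# Toy setting (an assumption for illustration): disjoint regions only.
# region_of[i] is the region containing hypothesis i, or NA if it is in none.
region_of <- c(1, 1, 1, NA, 2, 2, 2, 2, NA, NA)
zeta <- c(2, 1)   # zeta[k]: at most zeta[k] false positives in region k

# Bound for a single set S: cap the count inside each region at zeta[k],
# and count every selected index outside all regions as a possible false positive.
toy_bound <- function(S) {
  r <- region_of[S]
  inside <- tabulate(r[!is.na(r)], nbins = length(zeta))
  sum(pmin(inside, zeta)) + sum(is.na(r))
}

# Naive curve: recompute the bound from scratch for every S_t.
curve_naive <- function(perm) {
  vapply(seq_along(perm), function(t) toy_bound(perm[seq_len(t)]), numeric(1))
}

# Incremental curve: S_{t+1} adds a single index, so only one counter changes.
curve_incremental <- function(perm) {
  counts <- integer(length(zeta))
  out <- numeric(length(perm))
  current <- 0
  for (t in seq_along(perm)) {
    k <- region_of[perm[t]]
    if (is.na(k)) {
      current <- current + 1                 # outside every region: always adds 1
    } else {
      if (counts[k] < zeta[k]) {
        current <- current + 1               # region k is not saturated yet
      }
      counts[k] <- counts[k] + 1
    }
    out[t] <- current
  }
  out
}

perm <- c(5, 1, 6, 4, 2, 3, 7, 9)
naive <- curve_naive(perm)
fast <- curve_incremental(perm)
```

Both functions return the same curve on this input. The naive one rescans all of \(S_t\) at each step, while the incremental one does a constant amount of work per added index, which is the kind of saving the fast version aims for on forests.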

References

Durand, G., Blanchard, G., Neuvial, P., & Roquain, E. (2020). Post hoc false positive control for structured hypotheses. Scandinavian Journal of Statistics, 47(4), 1114-1148.

Durand, G. Preprint to appear, with the description of pruning and of the fast algorithm to compute the curve.

Examples

m <- 20
C <- list(
  list(c(2, 5), c(8, 15), c(16, 19)),
  list(c(3, 5), c(8, 10), c(12, 15), c(16, 16), c(17, 19)),
  list(c(4, 5), c(8, 9), c(10, 10), c(12, 12), c(13, 15), c(17, 17), c(18, 19)),
  list(c(8, 8), c(9, 9), c(13, 13), c(14, 15), c(18, 18), c(19, 19))
)
ZL <- list(
  c(4, 8, 4),
  c(3, 3, 4, 1, 3),
  c(2, 2, 1, 1, 2, 1, 2),
  c(1, 1, 1, 2, 1, 1)
)
leaf_list <- as.list(1:m)
curve.V.star.forest.naive(1:m, C, ZL, leaf_list, pruning = FALSE)
#>  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 14 15 16 17 18 19

curve.V.star.forest.naive(1:m, C, ZL, leaf_list, pruning = TRUE)
#>  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 14 15 16 17 18 19

curve.V.star.forest.fast(1:m, C, ZL, leaf_list, pruning = FALSE)
#>  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 14 15 16 17 18 19

curve.V.star.forest.fast(1:m, C, ZL, leaf_list, pruning = TRUE)
#>  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 14 15 16 17 18 19