Estimate of the proportion of true nulls in each node of a tree

Takes a forest structure as input, given by the pair (C, leaf_list), and returns the corresponding \(\zeta_k\)'s according to the chosen method(s).

Usage

zetas.tree(
  C,
  leaf_list,
  method,
  pvalues,
  alpha,
  refine = FALSE,
  verbose = FALSE,
  ...
)

Arguments

C: A list of lists representing the forest structure. See V.star() for more information.
leaf_list: A list of vectors representing the atoms of the forest structure. See V.star() for more information.
method: A function with arguments (pval, lambda) that computes an upper bound on the number of false positives in the region associated with the \(p\)-values pval, at confidence level 1 - lambda. It can also be a list of such functions, where the h-th function is used at depth h in the tree structure, that is, on the \(R_k\)'s represented by the elements found in C[[h]]. Finally, it can also be a list of lists of such functions, mimicking the structure of C itself, that is, method[[h]][[j]] is applied to the \(R_k\) represented by C[[h]][[j]].
pvalues: A vector of \(p\)-values of length m, where m is the largest element found in the vectors of leaf_list.
alpha: A target error level in \(]0,1[\).
refine: A boolean, FALSE by default. Whether to use the step-down refinement to try to produce smaller \(\zeta_k\)'s, see Details.
verbose: A boolean, FALSE by default. Whether to print information about the (possibly multiple) round(s) of step-down refinement.
...: Additional arguments that may be passed to specific zeta functions.
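
As an illustration of the (pval, lambda) signature expected for method, here is a minimal sketch of a Bonferroni-type bound and of the three shapes method can take. The names zeta_bonf_sketch and C_toy are hypothetical and not part of the package, and C_toy only mimics the nesting of C, not its actual content.

```r
# Hypothetical Bonferroni-type bound; zeta_bonf_sketch is an illustrative name,
# not a function of the package.
# With probability at least 1 - lambda, no true null has a p-value at or below
# lambda / n, so the count of p-values above that threshold bounds the number
# of true nulls in the region.
zeta_bonf_sketch <- function(pval, lambda) {
  n <- length(pval)
  if (n == 0) return(0L)
  sum(pval > lambda / n)
}

# Toy stand-in for C: only its nesting (2 depths, with 1 and 2 nodes) matters here.
C_toy <- list(list("node 1.1"), list("node 2.1", "node 2.2"))

# Shape 1: a single function used for every region.
method_single <- zeta_bonf_sketch
# Shape 2: one function per depth h.
method_by_depth <- rep(list(zeta_bonf_sketch), length(C_toy))
# Shape 3: one function per node, mirroring the nesting of C_toy.
method_by_node <- lapply(C_toy, function(level) {
  lapply(level, function(node) zeta_bonf_sketch)
})

# Threshold is 0.05 / 4 = 0.0125, so three p-values lie above it.
zeta_bonf_sketch(c(0.001, 0.2, 0.5, 0.9), lambda = 0.05)  # 3
```

Any function with the same two arguments that returns a valid upper bound at confidence level 1 - lambda could be substituted for the sketch.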

Value

ZL: A list of integer vectors representing the upper bounds \(\zeta_k\) of the forest structure. See V.star() for more information.

Details

The number of true nulls in each node is bounded through a union bound over the regions. That is, the provided method(s) is/are applied at level \(\frac{\alpha}{K}\), where \(K\) is the number of regions. In the step-down refinement, if some \(R_k\) has an associated \(\zeta_k=0\), that is, the region is deemed to contain only false null hypotheses, it can be removed and the \(\zeta_k\)'s recomputed using \(K-1\) instead of \(K\) in the union bound. This is repeated until the "effective" number of regions no longer decreases.
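
The step-down loop described here can be sketched on a flat list of regions as follows. This is a simplified illustration under stated assumptions, not the package's implementation: stepdown_sketch and zeta_bonf_sketch are hypothetical names, regions are plain index vectors rather than the (C, leaf_list) pair, and a single method is used for all regions.

```r
# Hypothetical Bonferroni-type bound with the (pval, lambda) signature.
zeta_bonf_sketch <- function(pval, lambda) sum(pval > lambda / length(pval))

# Hypothetical sketch of the step-down refinement on a flat list of regions.
stepdown_sketch <- function(regions, pvalues, alpha, method) {
  K <- length(regions)
  K_eff <- K  # "effective" number of regions used in the union bound
  repeat {
    # Apply the method to every region at level alpha / K_eff.
    zetas <- vapply(
      regions,
      function(idx) method(pvalues[idx], alpha / K_eff),
      numeric(1)
    )
    # Regions with zeta_k = 0 are dropped from the union bound.
    K_new <- K - sum(zetas == 0)
    # Stop when the effective number no longer decreases (or nothing is left).
    if (K_new >= K_eff || K_new == 0) break
    K_eff <- K_new
  }
  list(zetas = zetas, K_eff = K_eff)
}

# Two regions of four hypotheses each; the first holds only tiny p-values.
pvalues <- c(rep(1e-6, 4), 0.01, 0.5, 0.7, 0.9)
regions <- list(1:4, 5:8)
res <- stepdown_sketch(regions, pvalues, alpha = 0.05, method = zeta_bonf_sketch)
res$zetas  # 0 3
res$K_eff  # 1
```

In this toy run the first round uses \(\frac{\alpha}{2}\) and gives bounds 0 and 4. Since the first region has \(\zeta_k=0\), the second round uses \(\alpha\) itself and tightens the second bound to 3.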

References

Durand, G., Blanchard, G., Neuvial, P., & Roquain, E. (2020). Post hoc false positive control for structured hypotheses. Scandinavian Journal of Statistics, 47(4), 1114-1148.

Durand, G. (2018). Multiple testing and post hoc bounds for heterogeneous data. PhD thesis, see Appendix B.2 for the step-down refinement.

Durand, G. (2025). A fast algorithm to compute a curve of confidence upper bounds for the False Discovery Proportion using a reference family with a forest structure. arXiv:2502.03849.

Examples

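# Number of hypotheses
m <- 1000
# Build a dyadic forest structure over the m hypotheses
dd <- dyadic.from.window.size(m, s = 10, method = 2)
leaf_list <- dd$leaf_list
C <- dd$C
# Simulated p-values (uniform on [0, 1])
pvalues <- runif(m)
# Trivial bound: each zeta_k equals the size of its region (see output below)
method <- zeta.trivial
ZL <- zetas.tree(C, leaf_list, method, pvalues, alpha = 0.05)
ZL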
#> [[1]]
#> [1] 1000
#> 
#> [[2]]
#> [1] 500 500
#> 
#> [[3]]
#> [1] 250 250 250 250
#> 
#> [[4]]
#> [1] 130 120 130 120 130 120 130 120
#> 
#> [[5]]
#>  [1] 70 60 60 60 70 60 60 60 70 60 60 60 70 60 60 60
#> 
#> [[6]]
#>  [1] 40 30 30 30 30 30 30 30 40 30 30 30 30 30 30 30 40 30 30 30 30 30 30 30 40
#> [26] 30 30 30 30 30 30 30
#> 
#> [[7]]
#>  [1] 20 20 20 10 20 10 20 10 20 10 20 10 20 10 20 10 20 20 20 10 20 10 20 10 20
#> [26] 10 20 10 20 10 20 10 20 20 20 10 20 10 20 10 20 10 20 10 20 10 20 10 20 20
#> [51] 20 10 20 10 20 10 20 10 20 10 20 10 20 10
#> 
#> [[8]]
#>  [1] 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
#> [26] 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
#> [51] 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
#>