Estimate of the proportion of true nulls in each node of a tree
Source: R/tree-functions.R
zetas.tree.Rd
Takes a forest structure as input, given by the pair (C, leaf_list), and returns the corresponding \(\zeta_k\)'s according to the method(s) chosen.
Arguments
- C
A list of lists representing the forest structure. See V.star() for more information.
- leaf_list
A list of vectors representing the atoms of the forest structure. See V.star() for more information.
- method
A function with arguments (pval, lambda) that computes an upper bound on the number of false positives in the region associated with the \(p\)-values pval, at confidence level 1 - lambda. It can also be a list of such functions, where the h-th function is used at depth h in the tree structure, that is, on the \(R_k\)'s represented by the elements of C[[h]]. Finally, it can be a list of lists of such functions, mimicking the structure of C itself, so that method[[h]][[j]] is applied to the \(R_k\) represented by C[[h]][[j]].
- pvalues
A vector of \(p\)-values. It must have length m, where m is the highest element found in the vectors of leaf_list.
- alpha
A target error level in \(]0,1[\).
- refine
A boolean, FALSE by default. Whether to use the step-down refinement to try to produce smaller \(\zeta_k\)'s; see Details.
- verbose
A boolean, FALSE by default. Whether to print information about the (possibly multiple) round(s) of step-down refinement.
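The three accepted shapes of the method argument can be illustrated with a minimal base-R sketch. The helper pick_method() and the bounds zeta_all and zeta_bonf are hypothetical names introduced here for illustration only; they are not part of the package, and the package's own dispatch may be written differently.

```r
# Hypothetical helper (not part of the package): resolve which function
# applies to the region represented by C[[h]][[j]].
pick_method <- function(method, h, j) {
  if (is.function(method)) return(method)            # one function for all regions
  if (is.function(method[[h]])) return(method[[h]])  # one function per depth
  method[[h]][[j]]                                    # one function per region
}

# Two toy bounds with the required (pval, lambda) signature.
# Trivial bound: every hypothesis in the region may be a true null.
zeta_all <- function(pval, lambda) length(pval)
# Bonferroni-type bound: count the p-values above lambda / n.
zeta_bonf <- function(pval, lambda) sum(pval > lambda / length(pval))

# Method specifications for a two-level structure (one root, two children).
by_depth  <- list(zeta_all, zeta_bonf)
by_region <- list(list(zeta_all), list(zeta_bonf, zeta_all))

pval <- c(0.001, 0.2, 0.5, 0.9)
f1 <- pick_method(zeta_bonf, 2, 1)  # single function
f2 <- pick_method(by_depth, 2, 1)   # function chosen by depth
f3 <- pick_method(by_region, 2, 2)  # function chosen by depth and position
c(f1(pval, 0.05), f2(pval, 0.05), f3(pval, 0.05))
```

In this sketch the first two calls resolve to the Bonferroni-type bound and the third to the trivial one, so the same region can receive a different bound depending on how method is specified.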
Value
ZL: A list of integer vectors representing the upper bounds \(\zeta_k\) of the forest structure. See V.star() for more information.
Details
The proportion of true nulls in each node is estimated through a union bound over the regions: the provided method(s) is/are applied at level \(\frac{\alpha}{K}\), where \(K\) is the number of regions. In the step-down refinement, if some \(R_k\) has an associated \(\zeta_k=0\), that is, the region is believed to contain only false null hypotheses, it can be removed and the \(\zeta_k\)'s recomputed using \(K-1\) instead of \(K\) in the union bound. This is repeated until the "effective" number of regions stops decreasing.
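The step-down loop described above can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's implementation: regions are passed directly as index vectors instead of through C and leaf_list, a single method is used for all regions, and the names zetas_stepdown and zeta_bonf are hypothetical.

```r
# Hypothetical Bonferroni-type bound with the (pval, lambda) signature.
zeta_bonf <- function(pval, lambda) sum(pval > lambda / length(pval))

# Simplified sketch: `regions` is a plain list of index vectors, which
# ignores the tree structure encoded by C and leaf_list.
zetas_stepdown <- function(regions, method, pvalues, alpha, refine = TRUE) {
  K <- length(regions)  # initial number of regions in the union bound
  repeat {
    # Apply the method to every region at level alpha / K.
    zetas <- vapply(regions,
                    function(idx) method(pvalues[idx], alpha / K),
                    numeric(1))
    # Regions with zeta = 0 no longer count toward the union bound.
    K_new <- sum(zetas > 0)
    # Stop when not refining, when nothing was removed, or when no region is left.
    if (!refine || K_new >= K || K_new == 0) break
    K <- K_new
  }
  zetas
}

regions <- list(1:4, 5:8)
pvalues <- c(1e-4, 2e-4, 3e-4, 0.004, 0.01, 0.5, 0.7, 0.9)
plain   <- zetas_stepdown(regions, zeta_bonf, pvalues, alpha = 0.05, refine = FALSE)
refined <- zetas_stepdown(regions, zeta_bonf, pvalues, alpha = 0.05, refine = TRUE)
plain    # 0 4
refined  # 0 3
```

In this toy example the first region gets \(\zeta_k=0\) in the first round, so the second round runs at level \(\alpha\) instead of \(\frac{\alpha}{2}\), which lowers the bound of the second region from 4 to 3.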
References
Durand, G., Blanchard, G., Neuvial, P., & Roquain, E. (2020). Post hoc false positive control for structured hypotheses. Scandinavian Journal of Statistics, 47(4), 1114-1148.
Durand, G. (2018). Multiple testing and post hoc bounds for heterogeneous data. PhD thesis, see Appendix B.2 for the step-down refinement.
Examples
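m <- 1000
# Dyadic forest structure over m hypotheses, with leaves of size s = 10.
dd <- dyadic.from.window.size(m, s = 10, method = 2)
leaf_list <- dd$leaf_list
pvalues <- runif(m)
C <- dd$C
# zeta.trivial bounds each region by its size, so the result does not depend on pvalues.
method <- zeta.trivial
ZL <- zetas.tree(C, leaf_list, method, pvalues, alpha = 0.05)
ZL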
#> [[1]]
#> [1] 1000
#>
#> [[2]]
#> [1] 500 500
#>
#> [[3]]
#> [1] 250 250 250 250
#>
#> [[4]]
#> [1] 130 120 130 120 130 120 130 120
#>
#> [[5]]
#> [1] 70 60 60 60 70 60 60 60 70 60 60 60 70 60 60 60
#>
#> [[6]]
#> [1] 40 30 30 30 30 30 30 30 40 30 30 30 30 30 30 30 40 30 30 30 30 30 30 30 40
#> [26] 30 30 30 30 30 30 30
#>
#> [[7]]
#> [1] 20 20 20 10 20 10 20 10 20 10 20 10 20 10 20 10 20 20 20 10 20 10 20 10 20
#> [26] 10 20 10 20 10 20 10 20 20 20 10 20 10 20 10 20 10 20 10 20 10 20 10 20 20
#> [51] 20 10 20 10 20 10 20 10 20 10 20 10 20 10
#>
#> [[8]]
#> [1] 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
#> [26] 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
#> [51] 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
#> [76] 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
#>