This vignettes illustrates the following points

  • simulation of one- and two-sample Gaussian equi-correlated observations
  • computation of test statistics by randomization
  • calibration of Joint-Family-Wise Error Rate (JER) thresholds

We start with two-sample tests because we believe they are used more frequently.

library("sanssouci")
#set.seed(0xBEEF) # for reproducibility
library("ggplot2")

Parameters:

m <- 5e2
n <- 30
pi0 <- 0.8
rho <- 0.3
SNR <- 3

We use the function SansSouciSim to generate Gaussian equi-correlated samples. We then ‘fit’ the resulting SansSouci object

Two-sample tests

Simulation

obj <- SansSouciSim(m, rho, n, pi0, SNR = SNR, prob = 0.5)
obj
## 'SansSouci' object:
##  Number of hypotheses:    500 
##  Number of observations:  30 
##  2-sample data
## 
## Truth: 
##   100 false null hypotheses (signals) out of 500 (pi0=0.8)

We perform JER calibration using the linear (Simes) template \(\mathfrak{R(\lambda)}=(R_1(\lambda),\) \(\ldots,R_K(\lambda)\), where for \(1 \leq k \leq K\)

\[R_k(\lambda) = \left\{i\in \{1, \dots , m\}\::\: \bar{\Phi}(Z_i) > \frac{\lambda k}{m} \right\}\,\]

where \(\bar{\Phi}\) is the cdf of the \(\mathcal{N}(0,1)\) distribution. Note that \(p_i = \bar{\Phi}(Z_i)\) is the one-sided \(p\)-value associated to the test statistics \(Z_i\).

Calibration

B <- 1000
alpha <- 0.2
cal <- fit(obj, alpha = alpha, B = B, family = "Simes")
cal
## 'SansSouci' object:
##  Number of hypotheses:    500 
##  Number of observations:  30 
##  2-sample data
## 
## Truth: 
##   100 false null hypotheses (signals) out of 500 (pi0=0.8)
## Parameters: 
##  Test function:      rowWelchTests
##  Number of permutations: B=1000
##  Significance level: alpha=0.2
##  Reference family:   Simes
##      (of size:   K=500)
## 
## Output:
##  Calibration parameter:  lambda=0.269668

The output of the calibration is the following SansSouci object. In particular:

  • cal$input contains the input simulated data (and the associated truth)
  • cal$param contains the calibration parameters
  • cal$output contains the output of the calibration, including
    • p.values: \(m\) test statistics calculated by permutation
    • thr : A JER-controlling family of \(K\) (here \(K=m\)) elements
    • lambda: the \(\lambda\)-calibration parameter

Because we are under positive equi-correlation, we expect \(\lambda > \alpha\) for the Simes family.

cal$output$lambda
##      20% 
## 0.269668
cal$output$lambda > alpha
##  20% 
## TRUE

Post hoc confidence bounds

The fitted SansSouci object contains post hoc confidence bounds:

env <- predict(cal, all = TRUE)
head(env)
##   x label stat bound
## 1 1 Simes   TP     1
## 2 2 Simes   TP     2
## 3 3 Simes   TP     3
## 4 4 Simes   TP     4
## 5 5 Simes   TP     5
## 6 6 Simes   TP     6

We compare it to the true number of false positives among the most significant items, and to the Simes bound without calibration

cal0 <- fit(obj, alpha = alpha, B = 0, family = "Simes")
oracle <- fit(obj, alpha = alpha, family = "Oracle")

confs <- list(Simes = predict(cal0, all = TRUE),
              "Simes+calibration" = predict(cal, all = TRUE),
              "Oracle" = predict(oracle, all = TRUE))
plotConfCurve(confs, xmax = 200)

One sample tests

The code is identical, except for the line to generate the observations (where we do not specify a probability of belonging to one of the two populations using the prob argument); moreover it is not necessary to specify a vector of categories ‘categ’ in ‘calibrateJER’.

obj <- SansSouciSim(m, rho, n, pi0, SNR = SNR)
obj
## 'SansSouci' object:
##  Number of hypotheses:    500 
##  Number of observations:  30 
##  1-sample data
## 
## Truth: 
##   100 false null hypotheses (signals) out of 500 (pi0=0.8)
cal <- fit(obj, alpha = alpha, B = B, family = "Simes")
cal
## 'SansSouci' object:
##  Number of hypotheses:    500 
##  Number of observations:  30 
##  1-sample data
## 
## Truth: 
##   100 false null hypotheses (signals) out of 500 (pi0=0.8)
## Parameters: 
##  Test function:      rowZTests
##  Number of permutations: B=1000
##  Significance level: alpha=0.2
##  Reference family:   Simes
##      (of size:   K=500)
## 
## Output:
##  Calibration parameter:  lambda=0.2749622

Again we expect \(\lambda > \alpha\).

Confidence curves

The associated confidence curves are displayed below:

cal0 <- fit(obj, alpha = alpha, B = 0, family = "Simes")
oracle <- fit(obj, alpha = alpha, family = "Oracle")

confs <- list(Simes = predict(cal0, all = TRUE),
              "Simes+calibration" = predict(cal, all = TRUE),
              "Oracle" = predict(oracle, all = TRUE))

Session information

## R version 4.2.2 (2022-10-31)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.5 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
## 
## locale:
##  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
##  [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
##  [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
## [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] ggplot2_3.4.0    sanssouci_0.12.8
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.9          highr_0.9           pillar_1.8.1       
##  [4] bslib_0.4.1         compiler_4.2.2      jquerylib_0.1.4    
##  [7] tools_4.2.2         digest_0.6.30       tibble_3.1.8       
## [10] jsonlite_1.8.3      evaluate_0.18       memoise_2.0.1      
## [13] lifecycle_1.0.3     gtable_0.3.1        lattice_0.20-45    
## [16] pkgconfig_2.0.3     rlang_1.0.6         Matrix_1.5-3       
## [19] cli_3.4.1           yaml_2.3.6          pkgdown_2.0.6      
## [22] xfun_0.34           fastmap_1.1.0       withr_2.5.0        
## [25] stringr_1.4.1       knitr_1.40          vctrs_0.5.0        
## [28] desc_1.4.2          generics_0.1.3      fs_1.5.2           
## [31] sass_0.4.2          systemfonts_1.0.4   rprojroot_2.0.3    
## [34] grid_4.2.2          glue_1.6.2          R6_2.5.1           
## [37] textshaping_0.3.6   fansi_1.0.3         rmarkdown_2.18     
## [40] farver_2.1.1        purrr_0.3.5         magrittr_2.0.3     
## [43] scales_1.2.1        htmltools_0.5.3     matrixStats_0.62.0 
## [46] matrixTests_0.1.9.1 colorspace_2.0-3    labeling_0.4.2     
## [49] ragg_1.2.4          utf8_1.2.2          stringi_1.7.8      
## [52] munsell_0.5.0       cachem_1.0.6