Simulate equi-correlated data

Simulate equi-correlated data coming from one or two populations

Usage

gaussianSamples(m, rho, n, pi0, SNR = 1, prob = 1, w = NULL)

Arguments

m: Number of hypotheses
rho: Level of equi-correlation between pairs of variables
n: Number of observations, i.e. sample size
pi0: Proportion of true null hypotheses
SNR: Signal to noise ratio. Either a numeric value (a measure of distance between H0 and H1) or a vector of length m*(1-pi0)
prob: A numeric value in (0, 1], the frequency of the population under H1. If \(prob=1\) (the default), a data set from a single population is generated.
w: An optional vector of length n, the underlying factor driving equi-correlation

Value

A list with three elements:

X: An \(m x n\) matrix of \(m\)-dimensional Gaussian observations, where \(m1\) rows are under H1, and \(m0\) rows are under H0, with \(m0/m = pi0\)
categ: A numeric vector of length \(n\) matching each observation to a sample (in 0 or 1)
H: A vector of length \(m\), the status of each hypothesis: 0 for true null hypothesis, and 1 for true alternative hypothesis

Details

If prob = 1, each of \(m_1\) variables under \(H_1\) has mean \(SNR/\sqrt(n)\). If 0 < p < 1, then \(n_0\) samples are drawn from a \(\mathcal{N}(0,1)\) distribution while \(n_1\) are drawn from a \(\mathcal{N}(\mu, 1)\) distribution, where \(\mu = SNR*\sqrt(1/(n_0) + 1/n_1)\). The argument p is the probability of a sample to belong to the non-zero-mean population.

Author

Gilles Blanchard, Pierre Neuvial and Etienne Roquain

Examples


m <- 23
rho <- 0.2
n <- 100
pi0 <- 0.5
B <- 1e3

## two-sample data
sim <- gaussianSamples(m, rho, n, pi0, SNR = 2, prob = 0.5)
tests <- testByRandomization(sim$X, sim$categ, B = B)

## show test statistics
pch <- 20
colStat <- 1+sim$H
plot(tests$T, col=colStat, main="Test statistics", pch=pch)
legend("topleft", c("H0", "H1"), pch=pch, col=1:2)


## one-sample data
sim <- gaussianSamples(m, rho, n, pi0, SNR=2)
tests <- testByRandomization(sim$X, B = B)

## show test statistics
pch <- 20
colStat <- 1+sim$H
plot(tests$T, col=colStat, main="Test statistics", pch=pch)
legend("topleft", c("H0", "H1"), pch=pch, col=1:2)