Skip to contents

Simulate equi-correlated data coming from one or two populations

Usage

gaussianSamples(m, rho, n, pi0, SNR = 1, prob = 1, w = NULL)

Arguments

m

Number of hypotheses

rho

Level of equi-correlation between pairs of variables

n

Number of observations, i.e. sample size

pi0

Proportion of true null hypotheses

SNR

Signal to noise ratio. Either a numeric value (a measure of distance between H0 and H1) or a vector of length m*(1-pi0)

prob

A numeric value in (0, 1], the frequency of the population under H1. If \(prob=1\) (the default), a data set from a single population is generated.

w

An optional vector of length n, the underlying factor driving equi-correlation

Value

A list with three elements:

X

An \(m x n\) matrix of \(m\)-dimensional Gaussian observations, where \(m1\) rows are under H1, and \(m0\) rows are under H0, with \(m0/m = pi0\)

categ

A numeric vector of length \(n\) matching each observation to a sample (in 0 or 1)

H

A vector of length \(m\), the status of each hypothesis: 0 for true null hypothesis, and 1 for true alternative hypothesis

.

Details

If prob = 1, each of \(m_1\) variables under \(H_1\) has mean \(SNR/\sqrt(n)\). If 0 < p < 1, then \(n_0\) samples are drawn from a \(\mathcal{N}(0,1)\) distribution while \(n_1\) are drawn from a \(\mathcal{N}(\mu, 1)\) distribution, where \(\mu = SNR*\sqrt(1/(n_0) + 1/n_1)\). The argument p is the probability of a sample to belong to the non-zero-mean population.

See also

Author

Gilles Blanchard, Pierre Neuvial and Etienne Roquain

Examples


m <- 23
rho <- 0.2
n <- 100
pi0 <- 0.5
B <- 1e3

## two-sample data
sim <- gaussianSamples(m, rho, n, pi0, SNR = 2, prob = 0.5)
tests <- testByRandomization(sim$X, sim$categ, B = B)

## show test statistics
pch <- 20
colStat <- 1+sim$H
plot(tests$T, col=colStat, main="Test statistics", pch=pch)
legend("topleft", c("H0", "H1"), pch=pch, col=1:2)


## one-sample data
sim <- gaussianSamples(m, rho, n, pi0, SNR=2)
tests <- testByRandomization(sim$X, B = B)

## show test statistics
pch <- 20
colStat <- 1+sim$H
plot(tests$T, col=colStat, main="Test statistics", pch=pch)
legend("topleft", c("H0", "H1"), pch=pch, col=1:2)