Welch T-tests for rows of a matrix

Usage

rowWelchTests(X, categ, alternative = c("two.sided", "less", "greater"))

Arguments

X

An m x n numeric matrix whose rows correspond to variables and whose columns correspond to observations

categ

Either a numeric vector of n categories in {0, 1} for the observations, or an n x B matrix stacking B such vectors as columns (typically permutations of an original vector of size n)

alternative

A character string specifying the alternative hypothesis. Must be one of "two.sided" (default), "greater" or "less". As in t.test, alternative = "greater" is the alternative that class 1 has a larger mean than class 0.

Value

A list containing the following components:

statistic

the values of the t-statistics

parameter

the degrees of freedom for the t-statistics

p.value

the p-values for the tests

estimate

the mean difference between groups

Each of these elements is a matrix of size m x B, coerced to a vector of length m if B = 1.
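
For reference, the textbook Welch statistic and Welch-Satterthwaite degrees of freedom are written out below. The examples on this page check that the p-values agree with those of t.test, so the statistic and parameter components presumably follow these formulas. The notation is introduced here for illustration and is not taken from the package: \(\bar{x}_k\), \(s_k^2\) and \(n_k\) denote the sample mean, unbiased sample variance and size of class \(k \in \{0, 1\}\) for one row of X.

```latex
t = \frac{\bar{x}_1 - \bar{x}_0}{\sqrt{s_1^2/n_1 + s_0^2/n_0}},
\qquad
\nu = \frac{\left(s_1^2/n_1 + s_0^2/n_0\right)^2}
           {\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_0^2/n_0\right)^2}{n_0 - 1}}
```

With alternative = "greater", the p-value is the upper tail probability of \(t\) under a Student distribution with \(\nu\) degrees of freedom.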

Details

This function performs m x B Welch T-tests on n observations using matrix operations, with time complexity O(mBn). It is much faster than loops or 'apply' functions, especially for high-dimensional problems (small n and large m), because the code is vectorized and the overhead of repeated calls to the 't.test' function is avoided.
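
To make the "matrix operations" idea concrete, here is a minimal sketch of how m x B Welch tests can be computed without calling t.test in a loop. The function name row_welch_sketch is hypothetical and this is not the package's actual implementation; it only illustrates one way to reach O(mBn) cost, by obtaining per-class sums and sums of squares from two matrix products.

```r
# Hypothetical helper for illustration only (not the package's code).
# X: m x n numeric matrix; categ: length-n 0/1 vector or n x B 0/1 matrix.
row_welch_sketch <- function(X, categ,
                             alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  categ <- as.matrix(categ)          # n x B (B = 1 when categ is a vector)
  n1 <- colSums(categ)               # class-1 sizes, one per column of categ
  n0 <- nrow(categ) - n1             # class-0 sizes

  # Per-class sums and sums of squares: two m x n by n x B products, O(mBn)
  s1  <- X %*% categ
  ss1 <- X^2 %*% categ
  s0  <- rowSums(X) - s1             # length-m vector recycled down each column
  ss0 <- rowSums(X^2) - ss1

  mean1 <- sweep(s1, 2, n1, "/")
  mean0 <- sweep(s0, 2, n0, "/")

  # Squared standard errors s_k^2 / n_k. The "sum of squares" shortcut is
  # compact but can lose precision when means are large relative to the spread.
  v1 <- sweep(ss1 - sweep(mean1^2, 2, n1, "*"), 2, n1 * (n1 - 1), "/")
  v0 <- sweep(ss0 - sweep(mean0^2, 2, n0, "*"), 2, n0 * (n0 - 1), "/")

  estimate  <- mean1 - mean0
  statistic <- estimate / sqrt(v1 + v0)
  # Welch-Satterthwaite degrees of freedom
  parameter <- (v1 + v0)^2 /
    (sweep(v1^2, 2, n1 - 1, "/") + sweep(v0^2, 2, n0 - 1, "/"))

  p.value <- switch(alternative,
    two.sided = 2 * pt(-abs(statistic), parameter),
    less      = pt(statistic, parameter),
    greater   = pt(statistic, parameter, lower.tail = FALSE))

  res <- list(statistic = statistic, parameter = parameter,
              p.value = p.value, estimate = estimate)
  if (ncol(categ) == 1) res <- lapply(res, as.vector)  # drop to vectors if B = 1
  res
}

# Small check against row-by-row calls to t.test
set.seed(1)
X <- matrix(rnorm(20 * 12), ncol = 12)
categ <- rep(c(0, 1), times = c(7, 5))
res <- row_welch_sketch(X, categ, alternative = "greater")
ref <- apply(X, 1, function(x) {
  t.test(x[categ == 1], x[categ == 0], alternative = "greater")$p.value
})
stopifnot(all(abs(res$p.value - ref) < 1e-8))
```

The design choice worth noting is that the class labels only enter through X %*% categ and X^2 %*% categ, so stacking B label vectors as columns of categ handles all permutations in the same two products.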

References

B. L. Welch (1951), On the comparison of several mean values: an alternative approach. Biometrika, 38, 330-336

Author

Pierre Neuvial

Examples


m <- 300
n <- 38
mat <- matrix(rnorm(m * n), ncol = n)
categ <- rep(c(0, 1), times = c(27, n - 27))
system.time(fwt <- rowWelchTests(mat, categ, alternative = "greater"))
#>    user  system elapsed 
#>   0.003   0.001   0.001 
str(fwt)
#> List of 4
#>  $ statistic: num [1:300] 0.0104 1.6174 -0.1244 0.9773 0.9468 ...
#>  $ parameter: num [1:300] 15.1 23.1 22.6 21.4 25.4 ...
#>  $ p.value  : num [1:300] 0.4959 0.0597 0.5489 0.1697 0.1763 ...
#>  $ estimate : num [1:300] 0.00387 0.577 -0.04127 0.35439 0.26165 ...

# compare with ordinary t.test:
system.time(pwt <- apply(mat, 1, FUN = function(x) {
  t.test(x[categ == 1], x[categ == 0], alternative = "greater")$p.value
}))
#>    user  system elapsed 
#>   0.035   0.000   0.035 
all(abs(fwt$p.value - pwt) < 1e-10) ## same results
#> [1] TRUE

# with several contrasts/permutations
B <- 100
categ_perm <- replicate(B, sample(categ))
system.time(fwt_perm <- rowWelchTests(mat, categ_perm, alternative = "greater"))
#>    user  system elapsed 
#>   0.024   0.028   0.013 
str(fwt_perm)
#> List of 4
#>  $ statistic: num [1:300, 1:100] 0.251 0.662 -0.916 1.41 0.188 ...
#>  $ parameter: num [1:300, 1:100] 22.8 31.6 26.5 17.8 22.1 ...
#>  $ p.value  : num [1:300, 1:100] 0.402 0.2565 0.8159 0.0879 0.4265 ...
#>  $ estimate : num [1:300, 1:100] 0.0764 0.213 -0.2829 0.5479 0.0553 ...