Welch T-tests for rows of a matrix

rowWelchTests(X, categ, alternative = c("two.sided", "less", "greater"))



X: An m x n numeric matrix whose rows correspond to variables and columns to observations.


categ: Either a numeric vector of n categories in {0, 1} for the observations, or an n x B matrix stacking B such vectors (typically permutations of an original vector of size n).


alternative: A character string specifying the alternative hypothesis. Must be one of "two.sided" (default), "greater" or "less". As in t.test, alternative = "greater" is the alternative that class 1 has a larger mean than class 0.
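To make the role of the alternative argument concrete, the following minimal sketch shows how a p-value is usually derived from a t-statistic and its degrees of freedom under each alternative. The helper name p_from_t is hypothetical and is not part of the documented API; it only illustrates the standard convention followed by t.test.

```r
# Hypothetical helper (an assumption for illustration, not the package's code):
# turn a t-statistic and its degrees of freedom into a p-value.
p_from_t <- function(stat, df,
                     alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  switch(alternative,
         two.sided = 2 * pt(-abs(stat), df),             # both tails
         less      = pt(stat, df),                       # lower tail
         greater   = pt(stat, df, lower.tail = FALSE))   # upper tail
}

# Usage on a single variable
set.seed(42)
x1 <- rnorm(11, mean = 0.5)  # observations of class 1
x0 <- rnorm(27)              # observations of class 0
tt <- t.test(x1, x0)         # Welch's test is the default (var.equal = FALSE)
p_two_sided <- p_from_t(tt$statistic, tt$parameter, "two.sided")
p_greater   <- p_from_t(tt$statistic, tt$parameter, "greater")
```

With "greater", a large positive statistic (class 1 mean above class 0 mean) gives a small p-value; with "less", the same statistic gives a p-value close to 1.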


A list containing the following components:


statistic: the values of the t-statistics


parameter: the degrees of freedom of the t-statistics


p.value: the p-values of the tests


estimate: the difference in means between the two groups (class 1 minus class 0)

Each of these elements is a matrix of size m x B, coerced to a vector of length m if B = 1.


This function performs m x B Welch t-tests on n observations using matrix operations. Its time complexity is O(mBn). The code is much faster than using loops or 'apply' functions, especially for high-dimensional problems (small n and large m), because the overhead of repeated calls to the 't.test' function is avoided and the code is vectorized.
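As an illustration of what "using matrix operations" can mean here, the sketch below computes all m x B statistics from a few matrix products. It is a minimal sketch under stated assumptions and not the package's actual implementation: the name welch_sketch is hypothetical, only the "greater" alternative is handled, and the variance uses the sum-of-squares shortcut, which can lose precision when means are large relative to standard deviations.

```r
# Hypothetical sketch (not the package's code): vectorized Welch t-tests.
# X is an m x n matrix; categ is a 0/1 vector of length n or an n x B matrix.
welch_sketch <- function(X, categ) {
  categ <- as.matrix(categ)             # n x B indicator of class 1
  n1 <- colSums(categ)                  # class-1 sample sizes (length B)
  n0 <- nrow(categ) - n1                # class-0 sample sizes (length B)
  # Per-row sums and sums of squares in each class: m x B matrices,
  # each obtained with one O(mBn) matrix product
  s1 <- X %*% categ
  s0 <- X %*% (1 - categ)
  q1 <- X^2 %*% categ
  q0 <- X^2 %*% (1 - categ)
  # Class means and unbiased variances (sweep acts column by column)
  m1 <- sweep(s1, 2, n1, "/")
  m0 <- sweep(s0, 2, n0, "/")
  v1 <- sweep(q1 - sweep(m1^2, 2, n1, "*"), 2, n1 - 1, "/")
  v0 <- sweep(q0 - sweep(m0^2, 2, n0, "*"), 2, n0 - 1, "/")
  # Squared standard errors of the class means
  se1 <- sweep(v1, 2, n1, "/")
  se0 <- sweep(v0, 2, n0, "/")
  stat <- (m1 - m0) / sqrt(se1 + se0)
  # Welch-Satterthwaite approximation of the degrees of freedom
  df <- (se1 + se0)^2 /
    (sweep(se1^2, 2, n1 - 1, "/") + sweep(se0^2, 2, n0 - 1, "/"))
  list(statistic = stat,
       parameter = df,
       p.value = pt(stat, df, lower.tail = FALSE),  # alternative = "greater"
       estimate = m1 - m0)
}

# Usage: compare against row-by-row calls to t.test on a small matrix
set.seed(1)
X <- matrix(rnorm(20 * 12), ncol = 12)
categ <- rep(c(0, 1), times = c(7, 5))
res <- welch_sketch(X, categ)
ref <- apply(X, 1, function(x) {
  t.test(x[categ == 1], x[categ == 0], alternative = "greater")$p.value
})
max_abs_diff <- max(abs(as.vector(res$p.value) - ref))
```

The four matrix products dominate the cost, which is consistent with the O(mBn) complexity stated above; everything else is elementwise work on m x B matrices.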


B. L. Welch (1951), On the comparison of several mean values: an alternative approach. Biometrika, 38, 330-336


Pierre Neuvial


m <- 300
n <- 38
mat <- matrix(rnorm(m * n), ncol = n)
categ <- rep(c(0, 1), times = c(27, n - 27))
system.time(fwt <- rowWelchTests(mat, categ, alternative = "greater"))
#>    user  system elapsed 
#>   0.002   0.000   0.002 
str(fwt)
#> List of 4
#>  $ statistic: num [1:300] 0.0104 1.6174 -0.1244 0.9773 0.9468 ...
#>  $ parameter: num [1:300] 15.1 23.1 22.6 21.4 25.4 ...
#>  $ p.value  : num [1:300] 0.4959 0.0597 0.5489 0.1697 0.1763 ...
#>  $ estimate : num [1:300] 0.00387 0.577 -0.04127 0.35439 0.26165 ...

# compare with ordinary t.test:
system.time(pwt <- apply(mat, 1, FUN = function(x) {
  t.test(x[categ == 1], x[categ == 0], alternative = "greater")$p.value
}))
#>    user  system elapsed 
#>   0.066   0.000   0.066 
all(abs(fwt$p.value - pwt) < 1e-10) ## same results
#> [1] TRUE

# with several contrasts/permutations
B <- 100
categ_perm <- replicate(B, sample(categ))
system.time(fwt_perm <- rowWelchTests(mat, categ_perm, alternative = "greater"))
#>    user  system elapsed 
#>    0.02    0.00    0.02 
str(fwt_perm)
#> List of 4
#>  $ statistic: num [1:300, 1:100] 0.251 0.662 -0.916 1.41 0.188 ...
#>  $ parameter: num [1:300, 1:100] 22.8 31.6 26.5 17.8 22.1 ...
#>  $ p.value  : num [1:300, 1:100] 0.402 0.2565 0.8159 0.0879 0.4265 ...
#>  $ estimate : num [1:300, 1:100] 0.0764 0.213 -0.2829 0.5479 0.0553 ...