Welch T-tests for rows of a matrix

Usage

rowWelchTests(X, categ, alternative = c("two.sided", "less", "greater"))

Arguments

X: An m x n numeric matrix whose rows correspond to variables and columns to observations
categ: Either a numeric vector of n categories in {0, 1} for the observations, or an n x B matrix stacking B such vectors (typically permutations of an original vector of size n)
alternative: A character string specifying the alternative hypothesis. Must be one of "two.sided" (default), "greater" or "less". As in t.test, alternative = "greater" is the alternative that class 1 has a larger mean than class 0.

Value

A list containing the following components:

statistic: the values of the t-statistics
parameter: the degrees of freedom for the t-statistics
p.value: the p-values for the tests
estimate: the difference in means between groups (class 1 minus class 0, matching the sign of the statistic)

Each of these elements is a matrix of size m x B, coerced to a vector of length m if B = 1.

Details

This function performs m x B Welch T-tests on n observations using matrix operations. Its time complexity is O(mBn). The code is much faster than explicit loops or 'apply' calls, especially for high-dimensional problems (small n and large m), because it is vectorized and avoids the overhead of calling 't.test' once per row.
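
The matrix computation described above can be sketched in base R for a single 0/1 vector. This is only an illustration under stated assumptions, not the package's implementation: the name `welch_rows_sketch` is hypothetical, it handles only the case B = 1, and it assumes each class has at least two observations. It computes the per-row group means and variances with `rowMeans` and `rowSums`, then applies the Welch statistic and the Welch-Satterthwaite degrees of freedom to all rows at once.

```r
# Hypothetical helper (not part of the package): Welch tests for every row
# of X against a single 0/1 vector `categ`, without calling t.test.
welch_rows_sketch <- function(X, categ,
                              alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  X0 <- X[, categ == 0, drop = FALSE]
  X1 <- X[, categ == 1, drop = FALSE]
  n0 <- ncol(X0)
  n1 <- ncol(X1)
  m0 <- rowMeans(X0)
  m1 <- rowMeans(X1)
  # Squared standard errors of the group means: unbiased variance / group size.
  # Subtracting a length-m vector from an m-row matrix recycles it down columns,
  # so each row is centered by its own mean.
  se2_0 <- rowSums((X0 - m0)^2) / (n0 - 1) / n0
  se2_1 <- rowSums((X1 - m1)^2) / (n1 - 1) / n1
  stat <- (m1 - m0) / sqrt(se2_0 + se2_1)
  # Welch-Satterthwaite approximation of the degrees of freedom
  df <- (se2_0 + se2_1)^2 / (se2_0^2 / (n0 - 1) + se2_1^2 / (n1 - 1))
  p <- switch(alternative,
    two.sided = 2 * pt(-abs(stat), df),
    less      = pt(stat, df),
    greater   = pt(stat, df, lower.tail = FALSE)
  )
  list(statistic = stat, parameter = df, p.value = p, estimate = m1 - m0)
}
```

Extending this sketch to an n x B matrix of categories would replace the column subsetting by matrix products with the 0/1 indicator matrix, which is one way to obtain the O(mBn) cost quoted above.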

References

Welch, B. L. (1951). On the comparison of several mean values: an alternative approach. Biometrika, 38, 330-336.

Author

Pierre Neuvial

Examples


m <- 300
n <- 38
mat <- matrix(rnorm(m * n), ncol = n)
categ <- rep(c(0, 1), times = c(27, n - 27))
system.time(fwt <- rowWelchTests(mat, categ, alternative = "greater"))
#>    user  system elapsed 
#>   0.003   0.002   0.002 
str(fwt)
#> List of 4
#>  $ statistic: num [1:300] 0.579 -0.94 0.563 -1.242 3.333 ...
#>  $ parameter: num [1:300] 20.9 22.3 16.4 22.5 21.9 ...
#>  $ p.value  : num [1:300] 0.28442 0.82143 0.29044 0.88648 0.00152 ...
#>  $ estimate : num [1:300] 0.184 -0.296 0.149 -0.421 1.054 ...

# compare with ordinary t.test:
system.time(pwt <- apply(mat, 1, FUN = function(x) {
  t.test(x[categ == 1], x[categ == 0], alternative = "greater")$p.value
}))
#>    user  system elapsed 
#>   0.036   0.000   0.036 
all(abs(fwt$p.value - pwt) < 1e-10) ## same results
#> [1] TRUE

# with several contrasts/permutations
B <- 100
categ_perm <- replicate(B, sample(categ))
system.time(fwt_perm <- rowWelchTests(mat, categ_perm, alternative = "greater"))
#>    user  system elapsed 
#>   0.018   0.034   0.014 
str(fwt_perm)
#> List of 4
#>  $ statistic: num [1:300, 1:100] -0.95 -1.652 0.495 -0.653 -1.585 ...
#>  $ parameter: num [1:300, 1:100] 17.9 20.3 19.8 13.8 18 ...
#>  $ p.value  : num [1:300, 1:100] 0.823 0.943 0.313 0.738 0.935 ...
#>  $ estimate : num [1:300, 1:100] -0.322 -0.528 0.12 -0.288 -0.594 ...