Welch t-tests for the rows of a matrix
rowWelchTests(X, categ, alternative = c("two.sided", "less", "greater"))
X: An m x n numeric matrix whose rows correspond to variables and columns to observations.
categ: Either a numeric vector of n categories in {0, 1} for the observations, or an n x B matrix stacking B such vectors (typically permutations of an original vector of size n).
alternative: A character string specifying the alternative hypothesis. Must be one of "two.sided" (default), "greater" or "less". As in t.test, alternative = "greater" is the alternative that class 1 has a larger mean than class 0.
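To make the direction of the one-sided alternatives concrete, here is a small sketch that uses only base R's t.test on a single variable. The data and the names x0, x1, p_greater, p_less and p_two are made up for illustration. Class 1 plays the role of the first argument of t.test, so "greater" tests whether class 1 has the larger mean.

```r
set.seed(42)
x0 <- rnorm(20, mean = 0)  # illustrative observations from class 0
x1 <- rnorm(20, mean = 1)  # illustrative observations from class 1

# t.test uses Welch's test by default (var.equal = FALSE)
p_greater <- t.test(x1, x0, alternative = "greater")$p.value
p_less    <- t.test(x1, x0, alternative = "less")$p.value
p_two     <- t.test(x1, x0, alternative = "two.sided")$p.value

# The two one-sided p-values sum to one, and the two-sided p-value
# is twice the smaller of the two.
all.equal(p_greater + p_less, 1)
all.equal(p_two, 2 * min(p_greater, p_less))
```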
A list containing the following components:
statistic: the values of the t-statistics
parameter: the degrees of freedom for the t-statistics
p.value: the p-values for the tests
estimate: the mean difference between groups
Each of these elements is a matrix of size m x B, coerced to a vector of length m if B = 1.
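For reference, the standard Welch statistic and its Welch-Satterthwaite degrees of freedom can be written as follows for a single variable and a single labeling. The notation is introduced here and is not taken from the package: \(\bar{x}_k\), \(s_k^2\) and \(n_k\) denote the sample mean, sample variance and size of class \(k \in \{0, 1\}\). The sign is assumed to follow the convention stated for the alternative argument (class 1 minus class 0).

\[
t = \frac{\bar{x}_1 - \bar{x}_0}{\sqrt{s_1^2/n_1 + s_0^2/n_0}},
\qquad
\nu = \frac{\left(s_1^2/n_1 + s_0^2/n_0\right)^2}
           {\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_0^2/n_0\right)^2}{n_0 - 1}}
\]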
This function performs m x B Welch t-tests on n observations using matrix operations. Its time complexity is O(mBn). The code is much faster than looping over rows with 'apply'-type functions, especially for high-dimensional problems (small n and large m), because the overhead of the call to the 't.test' function is avoided and the code is vectorized.
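The following sketch shows one way such a vectorization might look. It is not the package's source code: row_welch_sketch is a hypothetical helper written for illustration, and it always returns m x B matrices rather than dropping to vectors when B = 1. The idea is that per-class sums and sums of squares for all rows and all labelings can be obtained with two matrix products, each costing O(mBn).

```r
# Hypothetical sketch (not the package's source): Welch t-tests for every
# row of X and every column of categ, using only matrix operations.
row_welch_sketch <- function(X, categ,
                             alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  categ <- as.matrix(categ)  # n x B matrix of 0/1 labels
  m <- nrow(X)
  B <- ncol(categ)
  # Class sizes for each labeling, expanded to m x B for elementwise use
  N1 <- matrix(colSums(categ), nrow = m, ncol = B, byrow = TRUE)
  N0 <- nrow(categ) - N1
  # Per-class sums and sums of squares via matrix products (m x B each)
  S1 <- X %*% categ
  S0 <- rowSums(X) - S1
  Q1 <- (X^2) %*% categ
  Q0 <- rowSums(X^2) - Q1
  # Class means and unbiased class variances
  M1 <- S1 / N1
  M0 <- S0 / N0
  V1 <- (Q1 - S1 * M1) / (N1 - 1)
  V0 <- (Q0 - S0 * M0) / (N0 - 1)
  # Welch statistic and Welch-Satterthwaite degrees of freedom
  A1 <- V1 / N1
  A0 <- V0 / N0
  stat <- (M1 - M0) / sqrt(A1 + A0)
  df <- (A1 + A0)^2 / (A1^2 / (N1 - 1) + A0^2 / (N0 - 1))
  p <- switch(alternative,
              two.sided = 2 * pt(-abs(stat), df),
              greater   = pt(stat, df, lower.tail = FALSE),
              less      = pt(stat, df))
  list(statistic = stat, parameter = df, p.value = p, estimate = M1 - M0)
}

# Usage on small simulated data, checked against row-wise calls to t.test
set.seed(1)
X <- matrix(rnorm(50 * 12), ncol = 12)
categ <- rep(c(0, 1), times = c(7, 5))
res <- row_welch_sketch(X, categ, alternative = "greater")
ref <- apply(X, 1, function(x) {
  t.test(x[categ == 1], x[categ == 0], alternative = "greater")$p.value
})
all.equal(as.vector(res$p.value), ref)
```

The sum-of-squares formula for the variance keeps the sketch short but can lose precision when the row means are large relative to the spread. Centering each row of X first is one possible remedy.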
B. L. Welch (1951), On the comparison of several mean values: an alternative approach. Biometrika, 38, 330-336
m <- 300
n <- 38
mat <- matrix(rnorm(m * n), ncol = n)
categ <- rep(c(0, 1), times = c(27, n - 27))
system.time(fwt <- rowWelchTests(mat, categ, alternative = "greater"))
#> user system elapsed
#> 0.002 0.000 0.002
str(fwt)
#> List of 4
#> $ statistic: num [1:300] 0.0104 1.6174 -0.1244 0.9773 0.9468 ...
#> $ parameter: num [1:300] 15.1 23.1 22.6 21.4 25.4 ...
#> $ p.value : num [1:300] 0.4959 0.0597 0.5489 0.1697 0.1763 ...
#> $ estimate : num [1:300] 0.00387 0.577 -0.04127 0.35439 0.26165 ...
# compare with ordinary t.test:
system.time(pwt <- apply(mat, 1, FUN = function(x) {
    t.test(x[categ == 1], x[categ == 0], alternative = "greater")$p.value
}))
#> user system elapsed
#> 0.066 0.000 0.066
all(abs(fwt$p.value - pwt) < 1e-10) ## same results
#> [1] TRUE
# with several contrasts/permutations
B <- 100
categ_perm <- replicate(B, sample(categ))
system.time(fwt_perm <- rowWelchTests(mat, categ_perm, alternative = "greater"))
#> user system elapsed
#> 0.02 0.00 0.02
str(fwt_perm)
#> List of 4
#> $ statistic: num [1:300, 1:100] 0.251 0.662 -0.916 1.41 0.188 ...
#> $ parameter: num [1:300, 1:100] 22.8 31.6 26.5 17.8 22.1 ...
#> $ p.value : num [1:300, 1:100] 0.402 0.2565 0.8159 0.0879 0.4265 ...
#> $ estimate : num [1:300, 1:100] 0.0764 0.213 -0.2829 0.5479 0.0553 ...