Wilcoxon rank sum tests for each row of a matrix

Usage

rowWilcoxonTests(
  mat,
  categ,
  alternative = c("two.sided", "less", "greater"),
  correct = TRUE
)

Arguments

mat: An m x n numeric matrix whose rows correspond to variables and columns to observations
categ: Either a numeric vector of n categories in {0, 1} for the observations, or an n x B matrix stacking B such vectors (typically permutations of an original vector of size n)
alternative: A character string specifying the alternative hypothesis. Must be one of "two.sided" (default), "greater" or "less". As in wilcox.test, alternative = "greater" is the alternative that class 1 is shifted to the right of class 0.
correct: A logical indicating whether to apply continuity correction in the normal approximation for the p-value.

Value

A list containing the following components:

statistic: the values of the test statistics
p.value: the p-values for the tests
estimate: the difference between the group medians (only calculated if B=1 for computational efficiency)

Each of these elements is a matrix of size m x B, coerced to a vector of length m if B=1.

Details

This function performs m x B Wilcoxon rank sum tests on n observations. It is vectorized along the rows of mat, which makes it much faster than loops or 'apply' calls, especially for high-dimensional problems (small n and large m), because the overhead of repeated calls to 'wilcox.test' is avoided. Note that it is not vectorized along the columns of categ (if any): a basic 'for' loop is used over them.

The p-values are computed using the normal approximation as described in the wilcox.test function. The exact p-values (which can be useful for small samples with no ties) are not implemented (yet).
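
The normal approximation mentioned above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: the helper name row_wilcoxon_sketch is hypothetical, only the two-sided alternative is covered, and ranks are computed with 'apply' for simplicity. The formulas follow those used by wilcox.test when exact = FALSE, with class 1 playing the role of x and class 0 the role of y.

```r
# Hypothetical helper for illustration; not the package's implementation.
row_wilcoxon_sketch <- function(mat, categ, correct = TRUE) {
  nx <- sum(categ == 1)  # size of class 1 (plays the role of 'x')
  ny <- sum(categ == 0)  # size of class 0 (plays the role of 'y')

  # Rank the observations within each row; apply() returns one column
  # per row of 'mat', hence the transpose.
  rks <- t(apply(mat, 1, rank))

  # Rank sum statistic W of class 1, one value per row
  W <- rowSums(rks[, categ == 1, drop = FALSE]) - nx * (nx + 1) / 2

  # Variance of W under the null, including the correction for ties
  ties <- apply(rks, 1, function(r) {
    nt <- table(r)
    sum(nt^3 - nt)
  })
  sigma <- sqrt(nx * ny / 12 *
                ((nx + ny + 1) - ties / ((nx + ny) * (nx + ny - 1))))

  # Centered statistic with optional continuity correction (two-sided case)
  z <- W - nx * ny / 2
  cc <- if (correct) sign(z) * 0.5 else 0
  z <- (z - cc) / sigma

  p <- 2 * pmin(pnorm(z), pnorm(z, lower.tail = FALSE))
  list(statistic = W, p.value = p)
}

set.seed(1)
mat <- matrix(rnorm(20 * 12), ncol = 12)
cls <- rep(c(0, 1), each = 6)
res <- row_wilcoxon_sketch(mat, cls)

# Reference: one wilcox.test call per row, forcing the normal approximation
ref <- apply(mat, 1, function(x) {
  wt <- wilcox.test(x[cls == 1], x[cls == 0], exact = FALSE, correct = TRUE)
  c(wt$statistic, wt$p.value)
})
all.equal(res$statistic, unname(ref[1, ]))
all.equal(res$p.value, unname(ref[2, ]))
```

The per-row reference loop is the kind of computation whose call overhead the vectorized function is designed to avoid.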

For simplicity, 'estimate' returns the difference between the group medians, which does not match the component 'estimate' output by wilcox.test.
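
To make the distinction concrete, the toy example below (made-up data, base R only) contrasts the difference between group medians with the quantity reported by wilcox.test when conf.int = TRUE, which in the exact case is the median of all pairwise differences between the two samples. The variable names are illustrative assumptions.

```r
x <- c(1.1, 2.5, 3.9, 7.2)  # observations in class 1 (made-up values)
y <- c(0.4, 2.0, 2.8)       # observations in class 0 (made-up values)

# Difference between the group medians, as described above for 'estimate'
med_diff <- median(x) - median(y)

# Median of all pairwise differences x[i] - y[j]
pairwise_med <- median(outer(x, y, "-"))

# 'estimate' component of wilcox.test (exact case: small samples, no ties)
wt_est <- unname(wilcox.test(x, y, conf.int = TRUE)$estimate)

c(med_diff = med_diff, pairwise_med = pairwise_med, wilcox_test = wt_est)
```

On these data the first value differs from the other two, which coincide.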

Author

Gilles Blanchard, Pierre Neuvial and Etienne Roquain

Examples


p <- 200
n <- 50
mat <- matrix(rnorm(p*n), ncol = n)
cls <- rep(c(0, 1), each = n/2)

stats <- rowWilcoxonTests(mat, categ = cls, alternative = "two.sided")
str(stats)
#> List of 3
#>  $ p.value  : num [1:200] 0.587 0.461 0.985 0.669 0.742 ...
#>  $ statistic: num [1:200] 284 274 314 335 330 340 246 310 331 277 ...
#>  $ estimate : num [1:200] -0.207 -0.1505 0.0411 -0.1154 0.2022 ...

# permutation of class labels
cls_perm <- replicate(11, sample(cls))
stats <- rowWilcoxonTests(mat, categ = cls_perm, alternative = "two.sided")
str(stats)
#> List of 3
#>  $ p.value  : num [1:200, 1:11] 0.8159 0.0808 0.0625 0.9536 0.2143 ...
#>  $ statistic: num [1:200, 1:11] 300 403 409 316 377 290 386 307 393 228 ...
#>  $ estimate : num [1:200, 1:11] NA NA NA NA NA NA NA NA NA NA ...

# several unrelated contrasts
cls2 <- cls
cls[1:10] <- 1 # varying nx, ny
cls_mat <- cbind(cls, cls2)
stats <- rowWilcoxonTests(mat, categ = cls_mat, alternative = "two.sided")
str(stats)
#> List of 3
#>  $ p.value  : num [1:200, 1:2] 0.512 0.409 0.767 0.108 0.799 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : NULL
#>   .. ..$ : chr [1:2] "cls" "cls2"
#>  $ statistic: num [1:200, 1:2] 231 223 277 339 250 318 238 292 231 228 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : NULL
#>   .. ..$ : chr [1:2] "cls" "cls2"
#>  $ estimate : num [1:200, 1:2] NA NA NA NA NA NA NA NA NA NA ...