R Syntax Issues and Bugs

A collection of detected R inconsistencies and bugs.

Format Deviations

Unexpected deviations of type or dimensions.

0 dimension matrix

Subsetting a matrix by selecting 0 rows and 0 columns preserves the matrix format even without drop=FALSE, while selecting 1 row and 0 columns drops to a vector.

m <- matrix(1:9, nrow=3, ncol=3)

m[1,0]
## integer(0)

m[0,0]
## <0 x 0 matrix>
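A minimal sketch of how passing drop=FALSE makes both selections behave the same way. It rebuilds m from the example above and relies only on base R:

```r
m <- matrix(1:9, nrow = 3, ncol = 3)

# Default subsetting: the 1 x 0 result is dropped to a plain vector
dim(m[1, 0])
## NULL

# drop = FALSE keeps the matrix structure for any selection
dim(m[1, 0, drop = FALSE])
## [1] 1 0
dim(m[0, 0, drop = FALSE])
## [1] 0 0
```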

qnorm on empty matrix

qnorm() returns an empty vector when input is an empty matrix.

qnorm(matrix(0.1, nrow=1, ncol=2))
##           [,1]      [,2]
## [1,] -1.281552 -1.281552

qnorm(matrix(0, nrow=0, ncol=0))
## numeric(0)
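One possible workaround is to restore the dimensions after the call. The helper below is a sketch; with_dim is a made-up name, not part of base R, and it only re-applies dim when the length is unchanged:

```r
# Hypothetical helper: apply f to x and put the dimensions back if they were lost
with_dim <- function(f, x, ...) {
  out <- f(x, ...)
  if (!identical(dim(out), dim(x)) && length(out) == length(x)) {
    dim(out) <- dim(x)
  }
  out
}

dim(with_dim(qnorm, matrix(0, nrow = 0, ncol = 0)))
## [1] 0 0
```

When the dimensions are already intact the helper leaves the result untouched, so existing dimnames are not discarded.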

is.nan on data.frame

is.nan doesn’t work on data.frames while is.na does.

is.na(iris)
## [works]

is.nan(iris)
## Error in is.nan(iris) : default method not implemented for type 'list'
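A column-wise application is one way around the missing method. The sketch below assumes every column is an atomic vector; is_nan_df is a hypothetical name chosen for illustration:

```r
# Hypothetical helper: column-wise is.nan() for a data.frame with atomic columns
is_nan_df <- function(df) {
  do.call(cbind, lapply(df, is.nan))
}

df  <- data.frame(a = c(1, NaN, NA), b = c(0, 1, NaN))
res <- is_nan_df(df)

# One column per variable; NA is not reported as NaN
colSums(res)
```

The result is a logical matrix with one column per variable, similar in shape to what is.na() returns for a data.frame.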

rbind on zero-column data.frames

rbind() on matrices works as expected:

mat1 <- matrix(nrow=2, ncol=0)
mat2 <- matrix(nrow=2, ncol=0)

dim(rbind(mat1, mat2))
## [1] 4 0

But the same operation on data.frames loses the number of rows:

dim(rbind(as.data.frame(mat1), as.data.frame(mat2)))
## [1] 0 0

The number of columns, however, is preserved:

dim(rbind(as.data.frame(t(mat1)), as.data.frame(t(mat2))))
## [1] 0 2

The number of rows after cbind() is also preserved:

dim(cbind(as.data.frame(mat1), as.data.frame(mat2)))
## [1] 2 0

This makes rbind() and the number of rows the only exception.
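If the row count matters, the zero-column case can be handled separately. This is only a sketch: rbind_keep_rows is a hypothetical wrapper, and it builds the result with as.data.frame() on an empty matrix, the same construction used in the examples above:

```r
# Hypothetical wrapper: rbind() that keeps the row count for zero-column inputs
rbind_keep_rows <- function(...) {
  dfs <- list(...)
  if (all(vapply(dfs, ncol, integer(1)) == 0L)) {
    n <- sum(vapply(dfs, nrow, integer(1)))
    return(as.data.frame(matrix(nrow = n, ncol = 0)))
  }
  do.call(rbind, dfs)
}

empty <- as.data.frame(matrix(nrow = 2, ncol = 0))
dim(rbind_keep_rows(empty, empty))
## [1] 4 0
```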

any doesn’t work on data.frames

any() works on matrices but not on data.frames.

any(data.frame(A=TRUE, B=FALSE))
## Error in FUN(X[[i]], ...) :
##   only defined on a data frame with all numeric variables

Surprisingly, it works on numeric data.frames, although with a coercion warning:

any(data.frame(A=0, B=1))
## [1] TRUE
##
## Warning message:
## In any(c(0, 1), na.rm = FALSE) :
##   coercing argument of type 'double' to logical
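A column-wise reduction avoids the error for logical data.frames. The helper name any_df is made up for this sketch, and it assumes all columns are logical:

```r
# Hypothetical helper: any() across all columns of a logical data.frame
any_df <- function(df, na.rm = FALSE) {
  any(vapply(df, any, logical(1), na.rm = na.rm))
}

any_df(data.frame(A = TRUE, B = FALSE))
## [1] TRUE
any_df(data.frame(A = FALSE, B = FALSE))
## [1] FALSE
```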

rbind on nested data.frames

rbind() on data.frames does not adjust row names to be unique when one data.frame is nested inside another.

df    <- data.frame(a=1:3)
df$df <- data.frame(a=1:3)

rbind(df, df)
## Error in `.rowNamesDF<-`(x, value = value) :
##   duplicate 'row.names' are not allowed
## In addition: Warning message:
## non-unique values when setting 'row.names': ‘1’, ‘2’, ‘3’
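To spot the situation before calling rbind(), the columns can be checked for nested data.frames. A small sketch using only base R:

```r
df    <- data.frame(a = 1:3)
df$df <- data.frame(a = 1:3)

# TRUE for every column that is itself a data.frame
nested <- vapply(df, is.data.frame, logical(1))
names(df)[nested]
## [1] "df"
```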

Statistical Hypothesis Tests

Inconsistencies when applying null hypothesis tests.

fligner.test and constant values

fligner.test() can return a significant p-value when the values within each group are constant.

fligner.test(c(1,1,2,2), c("a","a","b","b"))
## 
## 	Fligner-Killeen test of homogeneity of variances
## 
## data:  c(1, 1, 2, 2) and c("a", "a", "b", "b")
## Fligner-Killeen:med chi-squared = NaN, df = 1, p-value = NA

fligner.test(c(1,1,1,2,2,2), c("a","a","a","b","b","b"))
## 
## 	Fligner-Killeen test of homogeneity of variances
## 
## data:  c(1, 1, 1, 2, 2, 2) and c("a", "a", "a", "b", "b", "b")
## Fligner-Killeen:med chi-squared = Inf, df = 1, p-value < 2.2e-16
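Both inputs above have no variation inside any group, which can be checked before running the test. The function below is a hypothetical guard, not part of the stats package:

```r
# Hypothetical guard: TRUE when every group consists of a single repeated value
has_constant_groups <- function(x, g) {
  all(tapply(x, g, function(v) length(unique(v)) == 1L))
}

has_constant_groups(c(1, 1, 2, 2), c("a", "a", "b", "b"))
## [1] TRUE
has_constant_groups(c(1, 2, 2, 3), c("a", "a", "b", "b"))
## [1] FALSE
```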

paired wilcox.test and ties

The paired version of wilcox.test() has floating-point tolerance issues when detecting whether ties are present.

wilcox.test(c(4, 3, 2), c(3, 2, 1), paired=TRUE)
## Warning in wilcox.test.default(c(4, 3, 2), c(3, 2, 1), paired = TRUE): cannot compute
## exact p-value with ties
## 
## 	Wilcoxon signed rank test with continuity correction
## 
## data:  c(4, 3, 2) and c(3, 2, 1)
## V = 6, p-value = 0.1489
## alternative hypothesis: true location shift is not equal to 0

wilcox.test(c(0.4,0.3,0.2), c(0.3,0.2,0.1), paired=TRUE)
## 
## 	Wilcoxon signed rank test
## 
## data:  c(0.4, 0.3, 0.2) and c(0.3, 0.2, 0.1)
## V = 6, p-value = 0.25
## alternative hypothesis: true location shift is not equal to 0
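The two calls differ only in scale, yet the paired differences compare differently under exact equality. The sketch below illustrates this; has_near_ties and its tol default are assumptions made for illustration, not how wilcox.test() is implemented:

```r
d_int <- c(4, 3, 2) - c(3, 2, 1)
d_dec <- c(0.4, 0.3, 0.2) - c(0.3, 0.2, 0.1)

# Exact comparison: the integer differences tie, the decimal ones do not
anyDuplicated(abs(d_int)) > 0
## [1] TRUE
anyDuplicated(abs(d_dec)) > 0
## [1] FALSE

# Hypothetical tolerance-based check (tol is an arbitrary choice)
has_near_ties <- function(d, tol = 1e-8) {
  any(diff(sort(abs(d))) < tol)
}
has_near_ties(d_dec)
## [1] TRUE
```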

paired wilcox.test and Inf

All versions of wilcox.test() remove infinite values before proceeding, except when paired=TRUE.

With the non-paired version, Inf values are removed and the results differ:

wilcox.test(c(1,2,3,4), c(0,9,8,7))
## 
## 	Wilcoxon rank sum test
## 
## data:  c(1, 2, 3, 4) and c(0, 9, 8, 7)
## W = 4, p-value = 0.3429
## alternative hypothesis: true location shift is not equal to 0

wilcox.test(c(1,2,3,4), c(0,9,8,Inf))
## 
## 	Wilcoxon rank sum test
## 
## data:  c(1, 2, 3, 4) and c(0, 9, 8, Inf)
## W = 4, p-value = 0.6286
## alternative hypothesis: true location shift is not equal to 0

The paired version keeps Inf and includes it in the ranks, so these two results are identical:

wilcox.test(c(1,2,3,4), c(0,9,8,7), paired=TRUE)
## 
## 	Wilcoxon signed rank test
## 
## data:  c(1, 2, 3, 4) and c(0, 9, 8, 7)
## V = 1, p-value = 0.25
## alternative hypothesis: true location shift is not equal to 0

wilcox.test(c(1,2,3,4), c(0,9,8,Inf), paired=TRUE)
## 
## 	Wilcoxon signed rank test
## 
## data:  c(1, 2, 3, 4) and c(0, 9, 8, Inf)
## V = 1, p-value = 0.25
## alternative hypothesis: true location shift is not equal to 0
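The identical V values follow from the ranks of the paired differences. The function below is a simplified sketch of the signed rank statistic (it ignores zero differences and the mu argument); signed_rank_v is a hypothetical name:

```r
# Signed rank statistic V: sum of the ranks of |d| over the positive differences
signed_rank_v <- function(x, y) {
  d <- x - y
  sum(rank(abs(d))[d > 0])
}

x <- c(1, 2, 3, 4)
signed_rank_v(x, c(0, 9, 8, 7))     # differences: 1 -7 -5 -3
## [1] 1
signed_rank_v(x, c(0, 9, 8, Inf))   # differences: 1 -7 -5 -Inf
## [1] 1

# Possible pre-filter: keep only pairs where both values are finite
ok <- is.finite(x) & is.finite(c(0, 9, 8, Inf))
sum(ok)
## [1] 3
```

Only the first difference is positive and it has the smallest absolute value in both cases, so V is 1 whether or not the last pair is infinite.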

paired wilcox.test and warning

Paired wilcox.test() reports an error about 'x' when it is the y observations that are missing.

wilcox.test(c(1,2), c(NA_integer_,NA_integer_), paired=TRUE)
## Error in wilcox.test.default(c(1, 2), c(NA_integer_, NA_integer_), paired = TRUE): not enough (finite) 'x' observations
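Counting the complete pairs beforehand gives a clearer picture of which side is missing. A short sketch using base R only:

```r
x <- c(1, 2)
y <- c(NA_integer_, NA_integer_)

# Number of pairs where both x and y are present
n_pairs <- sum(complete.cases(x, y))
n_pairs
## [1] 0

# Which side is responsible for the missing pairs
c(x_missing = sum(is.na(x)), y_missing = sum(is.na(y)))
## x_missing y_missing 
##         0         2 
```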

wilcox.test output details

The wilcox.test() result does not indicate whether an exact test was used.

wilcox.test(rnorm(10), exact=FALSE, correct=FALSE)
## 
## 	Wilcoxon signed rank test
## 
## data:  rnorm(10)
## V = 32, p-value = 0.6465
## alternative hypothesis: true location is not equal to 0

wilcox.test(rnorm(10), exact=TRUE, correct=FALSE)
## 
## 	Wilcoxon signed rank test
## 
## data:  rnorm(10)
## V = 50, p-value = 0.01953
## alternative hypothesis: true location is not equal to 0
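Since the returned object carries no such field, one option is to store the requested setting alongside the result. The wrapper and the exact.requested element below are hypothetical, and they record only what was asked for, not what wilcox.test() ended up using (for example after falling back because of ties):

```r
# Hypothetical wrapper: remember the 'exact' argument that was requested
wilcox_with_exact_flag <- function(..., exact = NULL) {
  res <- stats::wilcox.test(..., exact = exact)
  res$exact.requested <- exact   # extra list element, ignored when printing
  res
}

res <- wilcox_with_exact_flag(c(1.1, 2.2, -0.3, 4.5, 3.4), exact = TRUE)
res$exact.requested
## [1] TRUE
```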

wilcox.test and dead lines

The source of wilcox.test.default() contains a few correct <- FALSE assignments that seem to do nothing.

Example:

tail(stats:::wilcox.test.default, 22)
##                                                                                           
## 332                 correct <- FALSE                                                      
## 333                 ESTIMATE <- c(`difference in location` = uniroot(W,                   
## 334                   lower = mumin, upper = mumax, tol = 0.0001)$root)                   
## 335             }                                                                         
## 336             if (exact && TIES) {                                                      
## 337                 warning("cannot compute exact p-value with ties")                     
## 338                 if (conf.int)                                                         
## 339                   warning("cannot compute exact confidence intervals with ties")      
## 340             }                                                                         
## 341         }                                                                             
## 342     }                                                                                 
## 343     names(mu) <- if (paired || !is.null(y))                                           
## 344         "location shift"                                                              
## 345     else "location"                                                                   
## 346     RVAL <- list(statistic = STATISTIC, parameter = NULL, p.value = as.numeric(PVAL), 
## 347         null.value = mu, alternative = alternative, method = METHOD,                  
## 348         data.name = DNAME)                                                            
## 349     if (conf.int)                                                                     
## 350         RVAL <- c(RVAL, list(conf.int = cint, estimate = ESTIMATE))                   
## 351     class(RVAL) <- "htest"                                                            
## 352     RVAL                                                                              
## 353 }
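To locate every such assignment rather than reading the listing by eye, the deparsed source can be searched. This is a sketch; the positions it reports refer to the deparsed text and will vary between R versions:

```r
# Deparse the function body into a character vector, one element per line
src  <- deparse(stats:::wilcox.test.default)

# Indices of the lines that contain the assignment
hits <- grep("correct <- FALSE", src, fixed = TRUE)
hits
trimws(src[hits])
```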

var.test() and conf.level

var.test() does not accept a conf.level of exactly 0 or 1, while t.test() does.

t.test(rnorm(10), rnorm(10), conf.level=0)
## 
## 	Welch Two Sample t-test
## 
## data:  rnorm(10) and rnorm(10)
## t = 0.02365, df = 16.365, p-value = 0.9814
## alternative hypothesis: true difference in means is not equal to 0
## 0 percent confidence interval:
##  0.008323452 0.008323452
## sample estimates:
##  mean of x  mean of y 
## -0.4080222 -0.4163456

var.test(rnorm(10), rnorm(10), conf.level=0)
## Error in var.test.default(rnorm(10), rnorm(10), conf.level = 0): 'conf.level' must be a single number between 0 and 1
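Code that calls both tests can apply the stricter rule up front so that the two behave consistently. The check below is a hypothetical pre-check that treats 0 and 1 as invalid, matching the var.test() behaviour shown above:

```r
# Hypothetical pre-check: a single finite number strictly between 0 and 1
conf_level_ok <- function(conf.level) {
  length(conf.level) == 1L && is.finite(conf.level) &&
    conf.level > 0 && conf.level < 1
}

conf_level_ok(0)
## [1] FALSE
conf_level_ok(0.95)
## [1] TRUE
```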