8.3. DEVILS
CIRCLE 8. BELIEVING IT DOES AS INTENDED
1 1 a b c
2 2 d e f
3 3 a i j
4 4 a b c
5 5 d e f
6 6 g h i
7 j k l m
8 n
> read.csv("test.csv", fill=FALSE)
Error in scan(file = file, what = what, ... :
  line 6 did not have 4 elements
The first 5 lines of the file are checked for consistency of the number of fields. Use count.fields to check the whole file.
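A minimal sketch of such a check follows. The file contents are made up for illustration and written to a temporary file; only count.fields itself comes from the text above.

```r
# Hypothetical comma-separated file: the last two lines are short.
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b,c,d",
             "1,2,3,4",
             "5,6,7,8",
             "9,10,11,12",
             "13,14,15,16",
             "17,18,19,20",
             "21,22,23",
             "24"), tmp)

# Number of fields on every line of the file, not just the first five.
nfields <- count.fields(tmp, sep = ",")

# Lines whose field count differs from that of the first line.
bad <- which(nfields != nfields[1])
bad
```

Here bad holds the line numbers that would trip up read.csv, so they can be inspected before reading.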
8.3.9 reading messy files
read.table and its relatives are designed for files that are arranged in a tabular form. Not all files are in tabular form. Trying to use read.table or a relative on a file that is not tabular is folly: you can end up with mangled data.
Two functions used to read files with a more general layout are scan and readLines.
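As a rough illustration, assume a file in which comment-like header lines are mixed with lines holding varying numbers of values. The layout and the variable names are invented for this sketch.

```r
# Hypothetical messy file: header lines interleaved with ragged numeric lines.
tmp <- tempfile()
writeLines(c("# run 1", "3 5 8", "# run 2", "1 2", "13"), tmp)

# readLines returns each line as one string, whatever its layout.
lines <- readLines(tmp)
is_header <- grepl("^#", lines)

# scan parses the remaining text into a flat numeric vector.
values <- scan(text = lines[!is_header], quiet = TRUE)
values
```

Once the lines are in a character vector, any mix of string manipulation and scan can be used to pull out the pieces.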
8.3.10 imperfection of writing then reading
Do not expect to write data to a file (such as with write.table), read the data back into R and have that be precisely the same as the original. That is doing two translations, and there is often something lost in translation.
You do have some choices to get the behavior that you want:
• Use save to store the object and use attach or load to use it. This works with multiple objects.
• Use dput to write an ASCII representation of the object and use dget to bring it back into R.
• Use serialize to write and unserialize to read it back. (But the help file warns that the format is subject to change.)
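The difference can be seen with a small data frame. This is only a sketch: the data frame is invented, and a date column is used because it is one of the things a text round trip does not bring back unaided.

```r
# Hypothetical data: an identifier with a leading zero and a Date column.
orig <- data.frame(id = c("01", "02"),
                   when = as.Date(c("2020-01-01", "2020-06-30")),
                   stringsAsFactors = FALSE)

# Round trip through a text file: two translations.
txt <- tempfile(fileext = ".txt")
write.table(orig, txt)
back_text <- read.table(txt)
same_text <- identical(orig, back_text)      # FALSE: the Date class is lost

# Round trip through save and load.
rda <- tempfile(fileext = ".RData")
save(orig, file = rda)
restored <- new.env()
load(rda, envir = restored)
same_binary <- identical(orig, restored$orig)  # TRUE

# Round trip through dput and dget; compare with all.equal in your own session.
dp <- tempfile()
dput(orig, file = dp)
back_dput <- dget(dp)
```

Loading into a separate environment, as done here, avoids overwriting an object of the same name in the workspace.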
8.3.11 non-vectorized function in integrate
The integrate function expects a vectorized function. When it gives an argument of length 127, it expects to get an answer that is of length 127. It shares its displeasure if that is not what it gets:
> fun1 <- function(x) sin(x) + sin(x-1) + sin(x-2) + sin(x-3)
> integrate(fun1, 0, 2)
-1.530295 with absolute error < 2.2e-14
> fun2 <- function(x) sum(sin(x - 0:3))
> integrate(fun2, 0, 2)
Error in integrate(fun2, 0, 2) :
  evaluation of function gave a result of wrong length
In addition: Warning message:
longer object length
  is not a multiple of shorter object length in: x - 0:3
> fun3 <- function(x) rowSums(sin(outer(x, 0:3, '-')))
> integrate(fun3, 0, 2)
-1.530295 with absolute error < 2.2e-14
fun1 is a straightforward implementation of what was wanted, but not easy to generalize. fun2 is an ill-conceived attempt at mimicking fun1. fun3 is a proper implementation of the function using outer as a step in getting the vectorization correct.
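If rewriting the function is not convenient, another possibility is to wrap the scalar version so that it is applied to each element in turn. The sketch below is not from the original text; fun2v is a name made up here, and the closed form cos(3) - cos(1) is used only as a check.

```r
# The scalar-only function from above.
fun2 <- function(x) sum(sin(x - 0:3))

# Hypothetical wrapper: call fun2 once per element, so that an input of
# length n gives an output of length n, as integrate requires.
fun2v <- function(x) vapply(x, fun2, numeric(1))

res <- integrate(fun2v, 0, 2)

# Closed form of the integral, for comparison with res$value.
exact <- cos(3) - cos(1)
c(numerical = res$value, exact = exact)
```

This is slower than fun3 because it loops over the elements, but it leaves the original function untouched.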
8.3.12 non-vectorized function in outer
The function given to outer needs to be vectorized (in the usual sense):
> outer(1:3, 4:1, max)
Error in dim(robj) <- c(dX, dY) :
  dims [product 12] do not match the length of object [1]
> outer(1:3, 4:1, pmax)
     [,1] [,2] [,3] [,4]
[1,]    4    3    2    1
[2,]    4    3    2    2
[3,]    4    3    3    3
> outer(1:3, 4:1, Vectorize(function(x, y) max(x, y)))
     [,1] [,2] [,3] [,4]
[1,]    4    3    2    1
[2,]    4    3    2    2
[3,]    4    3    3    3
The Vectorize function can be used to transform a function (by essentially adding a loop; it contains no magic to truly vectorize the function).
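To see that the loop is all there is, the Vectorize version can be compared with an explicit mapply call. This is a sketch; vmax, by_vectorize and by_loop are names invented for the comparison.

```r
# Vectorize wraps the scalar function in an element-by-element loop.
vmax <- Vectorize(function(x, y) max(x, y))
by_vectorize <- outer(1:3, 4:1, vmax)

# Writing the same loop by hand with mapply.
by_loop <- outer(1:3, 4:1, function(x, y) mapply(max, x, y))

# The two matrices agree element by element.
all(by_vectorize == by_loop)
```

When a truly vectorized equivalent such as pmax exists, it is the better choice for speed.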
8.3.13 ignoring errors
You have a loop in which some of the iterations may produce an error. You would like to ignore any errors and proceed with the loop. One solution is to use try.
The code:
ans <- vector('list', n)
for(i in seq(length.out=n))
{
    ans[[i]] <- rpois(round(rnorm(1, 5, 10)), 10)
}
will fail when the number of Poisson variates requested is negative. This can be modified to:
ans <- vector('list', n)
for(i in seq(length.out=n))
{
    this.ans <- try(rpois(round(rnorm(1, 5, 10)), 10))
    if(!inherits(this.ans, 'try-error'))
    {
        ans[[i]] <- this.ans
    }
}
Another approach is to use tryCatch rather than try:
ans <- vector('list', n)
for(i in seq(length.out=n))
{
    ans[[i]] <- tryCatch(rpois(round(rnorm(1, 5, 10)), 10),
        error=function(e) NaN)
}
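A self-contained variant of the loop is sketched below. The seed, the value of n and the bookkeeping vectors sizes and failed are all assumptions added here, so that it is possible to see afterwards which iterations were skipped.

```r
set.seed(42)   # arbitrary seed, only to make the run repeatable
n <- 20        # arbitrary number of iterations

ans <- vector("list", n)
sizes <- numeric(n)    # the number of variates requested in each iteration
failed <- logical(n)   # TRUE where the iteration raised an error

for (i in seq_len(n)) {
  sizes[i] <- round(rnorm(1, 5, 10))
  # Return the condition object itself instead of stopping the loop.
  result <- tryCatch(rpois(sizes[i], 10), error = function(e) e)
  if (inherits(result, "error")) {
    failed[i] <- TRUE
  } else {
    ans[[i]] <- result
  }
}

which(failed)   # the iterations that asked for a negative count
```

Recording the failures rather than silently dropping them makes it easier to tell an expected error from a real bug.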
8.3.14 accidentally global
It is possible for functions to work where they are created, but not to work in general. Objects within the function can be global accidentally.
> myfun4 <- function(x) x + y
> myfun4(30)
[1] 132
> rm(y)
> myfun4(30)
Error in myfun4(30) : Object "y" not found
The findGlobals function can highlight global objects:
> library(codetools)
> findGlobals(myfun4)
[1] "+" "y"
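One possible way to use this, sketched under the assumption that the codetools package is installed, is to ask findGlobals for variables only and then repair the function by passing the value in. myfun4b is a hypothetical corrected version, not part of the original.

```r
library(codetools)

myfun4 <- function(x) x + y

# merge = FALSE separates global functions (such as "+") from global variables.
globals <- findGlobals(myfun4, merge = FALSE)
globals$variables

# Hypothetical fix: make y an explicit argument.
myfun4b <- function(x, y) x + y
fixed <- findGlobals(myfun4b, merge = FALSE)
fixed$variables
```

An empty variables component is a reasonable sign that the function no longer leans on the workspace.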
8.3.15 handling ...
The '...' construct can be a slippery thing to get hold of until you know the trick. One way is to package it into a list:
function(x, ...)
{
    dots <- list(...)
    if('special.arg' %in% names(dots))
    {
        # rest of function
    }
}
Another way is to use match.call:
function(x, ...)
{
    extras <- match.call(expand.dots=FALSE)$...
    # rest of function
}
If your function processes the arguments, then you may need to use do.call:
function(x, ...)
{
    # ...
    dots <- list(...)
    ans <- do.call('my.other.fun', c(list(x=x),
        dots[names(dots) %in% spec]))
    # ...
}
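The fragments above can be made concrete along the following lines. This is a sketch: my.other.fun is given an invented body, and wrapper, show.dots and the argument names scale, shift and colour are all hypothetical.

```r
# Hypothetical helper that understands only 'scale' and 'shift'.
my.other.fun <- function(x, scale = 1, shift = 0) x * scale + shift

wrapper <- function(x, ...) {
  dots <- list(...)               # evaluates the extra arguments
  spec <- c("scale", "shift")     # the names that my.other.fun accepts
  # Pass along only the recognized arguments; the rest are dropped.
  do.call("my.other.fun", c(list(x = x), dots[names(dots) %in% spec]))
}

scaled <- wrapper(1:3, scale = 2, colour = "red")
scaled

# match.call keeps the extra arguments unevaluated.
show.dots <- function(x, ...) match.call(expand.dots = FALSE)$...
extras <- show.dots(1, a = 1 + 2)
deparse(extras$a)
```

The list(...) form is the one to use when the values are wanted; the match.call form is for when the expressions themselves matter.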
8.3.16 laziness
R uses lazy evaluation. That is, arguments to functions are not evaluated until they are required. This can save both time and memory if it turns out the argument is not required.
In extremely rare circumstances something is not evaluated that should be. You can use force to get around the laziness.
> xr <- lapply(11:14, function(i) function() i^2)
> sapply(1:4, function(j) xr[[j]]())
[1] 196 196 196 196
> xf <- lapply(11:14, function(i)
    {
        force(i); function() i^2
    }
  )
> sapply(1:4, function(j) xf[[j]]())
[1] 121 144 169 196
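Extra credit for understanding what is happening in the xr example. (Since R 3.2.0 the apply family forces the arguments of the functions it calls, so recent versions of R give 121 144 169 196 for xr as well.)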
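The same effect can be produced without lapply. In the sketch below, make_lazy and make_forced are invented function factories; the only difference between them is the call to force.

```r
# The argument p is not evaluated until the returned function first uses it.
make_lazy <- function(p) function(x) x^p

# force(p) evaluates the argument at the time the closure is created.
make_forced <- function(p) {
  force(p)
  function(x) x^p
}

p <- 2
lazy_square <- make_lazy(p)
forced_square <- make_forced(p)
p <- 3                          # change p before either closure is called

lazy_result <- lazy_square(2)     # 8: p was looked up only now
forced_result <- forced_square(2) # 4: p was fixed at creation
c(lazy = lazy_result, forced = forced_result)
```

Any function that returns a function is a candidate for force on the arguments that the inner function uses.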
8.3.17 lapply laziness
lapply does not evaluate the calls to its FUN argument. Mostly you don't care. But it can have an effect if the function is generic. It is safer to say:
lapply(xlist, function(x) summary(x))
than to say:
lapply(xlist, summary)
8.3.18 invisibility cloak
In rare circumstances the visibility of a result may not be as expected:
> myfun6 <- function(x) x
> myfun6(zz <- 7)
> .Last.value
[1] 7
> a6 <- myfun6(zz <- 9)
> a6
[1] 9
> myfun6(invisible(11))
> myfun7 <- function(x) 1 * x
> myfun7(invisible(11))
[1] 11
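Visibility can be inspected directly with withVisible, which returns the value together with a flag saying whether it would be printed. The sketch below sticks to cases that do not involve a user-defined function; the v_ names are invented here.

```r
# An assignment returns its value invisibly.
v_assign <- withVisible(zz <- 7)

# Wrapping the assignment in parentheses makes the value visible.
v_paren <- withVisible((zz <- 7))

# Arithmetic on an invisible value gives a visible result, as in myfun7.
v_arith <- withVisible(1 * invisible(11))

c(assign = v_assign$visible,
  paren  = v_paren$visible,
  arith  = v_arith$visible)
```

The same check, withVisible(myfun6(invisible(11))), can be run to see how a given version of R treats the examples above.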
8.3.19 evaluation of default arguments
Consider:
> myfun2 <- function(x, y=x) x + y
> x <- 100
> myfun2(2)
[1] 4
> myfun2(2, x)
[1] 102
Some people expect the result of the two calls above to be the same. They are not. The default value of an argument to a function is evaluated inside the function, not in the environment calling the function.
Thus writing a function like the following will not get you what you want.
> myfun3 <- function(x=x, y) x + y
> myfun3(y=3)
Error in myfun3(y = 3) : recursive default argument reference
(The actual error message you get may be different in your version of R.)
The most popular error to make in this regard is to try to imitate the default value of an argument. Something like:
> myfun5 <- function(x, n=xlen)
{
    xlen <- length(x); ...
}
> myfun5(myx, n=xlen-2)
xlen is defined inside myfun5 and is not available for you to use when calling myfun5.
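A sketch of what does and does not work follows. The function trim is a made-up stand-in with the same argument pattern as myfun5, and the code assumes that no object called xlen exists in the calling environment.

```r
# The default n = xlen is evaluated inside the function, where xlen exists
# by the time n is first used.
trim <- function(x, n = xlen) {
  xlen <- length(x)
  x[seq_len(n)]
}

myx <- letters[1:5]

all_of_it <- trim(myx)                       # the default sees the local xlen
first_three <- trim(myx, n = length(myx) - 2)  # the caller computes the length

# A supplied argument is evaluated in the calling environment, where
# (by assumption) there is no xlen, so this call fails.
attempt <- tryCatch(trim(myx, n = xlen - 2), error = function(e) "error")
attempt
```

The caller has to express the argument in terms of things the caller can see, such as length(myx).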