8.1. GHOSTS
CIRCLE 8. BELIEVING IT DOES AS INTENDED
by copies the tapply behavior:
> by(9, factor(1, levels=1:2), sum)
factor(1, levels = 1:2): 1
[1] 9
------------------------------------------------------------
factor(1, levels = 1:2): 2
[1] NA
aggregate drops the empty cell:
> aggregate(9, list(factor(1, levels=1:2)), sum)
  Group.1 x
1       1 9
You can get the “right” answer for the empty cell via split and sapply:
> sapply(split(9, factor(1, levels=1:2)), sum)
1 2
9 0
This behavior depends on the default value of drop=FALSE in split.
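As a small sketch of what drop changes, using the same toy data as above (the names f, kept and dropped are made up for this illustration):

```r
# One observation (9) falls in level 1; level 2 is empty.
f <- factor(1, levels = 1:2)

# Default drop = FALSE: the empty level is kept, and sum() of nothing is 0.
kept <- sapply(split(9, f), sum)

# drop = TRUE: the empty level is discarded before sapply ever sees it.
dropped <- sapply(split(9, f, drop = TRUE), sum)

print(kept)
print(dropped)
```

With drop=TRUE the result has only one element, so the empty cell vanishes just as it does with aggregate.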
8.1.49 arithmetic that mixes matrices and vectors
To do matrix multiplication between a matrix and a vector you do:
xmat %*% yvec
or
yvec %*% xmat
R is smart enough to orient the vector in the way that makes sense. There is
no need to coerce the vector to a matrix.
If you want to multiply each row of a matrix by the corresponding element
of a vector, then do:
xmat * yvec
or
yvec * xmat
This works because of the order in which the elements of the matrix are stored
in the underlying vector.
But what to do if you want to multiply each column by the corresponding
element of a vector? If you do:
xmat * yvec
R does not check that the length of yvec matches the number of columns of xmat
and then do the multiplication that you want. It does a multiplication that you
don’t want. There are a few ways to get your multiplication, among them:
xmat * rep(yvec, each=nrow(xmat))
and
sweep(xmat, 2, yvec, '*')
The sweep function is quite general; make friends with it. The scale function
can be useful for related problems.
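The difference is easiest to see on a tiny example. The sketch below uses made-up values for xmat and yvec (2 rows, 3 columns, one multiplier per column) and compares the plain product with the two column-wise forms:

```r
xmat <- matrix(1:6, nrow = 2)   # 2 rows, 3 columns, filled column by column
yvec <- c(10, 100, 1000)        # one multiplier per column

# Recycling runs down the columns of xmat, so yvec is NOT aligned with columns.
wrong <- xmat * yvec

# Two ways to multiply column j by yvec[j].
by_rep   <- xmat * rep(yvec, each = nrow(xmat))
by_sweep <- sweep(xmat, 2, yvec, '*')

print(wrong)
print(by_rep)
print(all.equal(by_rep, by_sweep))
```

Note that the plain product runs without complaint here even though it is not the multiplication that was wanted.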
8.1.50 single subscript of a data frame or array
Be careful of the number of commas when subscripting data frames and matrices.
It is perfectly acceptable to subscript with no commas; this treats the
object as its underlying vector rather than as a two-dimensional object. In the
case of a data frame, the underlying object is a list and the single subscript
refers to the columns of the data frame. For matrices the underlying object is a
vector with length equal to the number of rows times the number of columns.
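A short illustration of the comma counting, with a made-up data frame df and matrix mat:

```r
df  <- data.frame(a = 1:3, b = c(2.5, 3.5, 4.5))
mat <- matrix(1:6, nrow = 3)

df_col  <- df[2]     # no comma: a data frame holding only column "b"
df_row  <- df[2, ]   # one comma: the second row
mat_el  <- mat[2]    # no comma: second element of the underlying vector
mat_row <- mat[2, ]  # one comma: the second row

print(df_col)
print(df_row)
print(mat_el)
print(mat_row)
```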
8.1.51 non-numeric argument
> median(x)
Error in median.default(x) : need numeric data
If you get an error like this, it could well be because x is a factor.
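If the factor really holds numbers, one common remedy is to convert through character. The sketch below is an illustration with made-up data, not the only way to do it:

```r
x <- factor(c("3", "10", "7"))

# Capture whatever median() says about the factor instead of stopping.
result <- tryCatch(median(x), error = function(e) conditionMessage(e))
print(result)

# Convert via character to recover the values; as.numeric(x) on its own
# would return the internal level codes, not the original numbers.
x_num <- as.numeric(as.character(x))
print(median(x_num))
```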
8.1.52 round rounds to even
The round function rounds to even if it is rounding off an exact 5.
Some people are surprised by this. I’m surprised that they are surprised;
rounding to even is the sensible thing to do. If you want a function that rounds
up, write it yourself (possibly using the ceiling and floor functions, or by
slightly increasing the size of the numbers).
Sometimes there is the surprise that an exact 5 is not rounded to even. This
will be due to Circle 1: what is apparently an exact 5 probably isn’t.
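A quick check with values that are exactly representable, followed by one possible do-it-yourself alternative. The helper round_half_up is a hypothetical name, and it sends halves toward positive infinity, which may or may not be the rule you want for negative numbers:

```r
x <- c(0.5, 1.5, 2.5, 3.5)

# Exact halves go to the nearest even integer.
print(round(x))

# Hypothetical helper: exact halves always go upward.
round_half_up <- function(x) floor(x + 0.5)
print(round_half_up(x))
```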
8.1.53 creating empty lists
You create a numeric vector of length 500 with:
numeric(500)
So obviously you create a list of length 500 with:
list(500)
Right?
No. A touch of finesse is needed:
vector('list', 500)
Note that this command hints at the fact that lists are vectors in some sense.
When “vector” is used in the sense of an object that is not a list, it is really
shorthand for “atomic vector”.
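Checking the lengths makes the difference plain (wrong and right are names chosen for this sketch):

```r
wrong <- list(500)            # a list of length 1 whose component is 500
right <- vector('list', 500)  # a list of 500 components, each NULL

print(length(wrong))
print(length(right))
print(is.null(right[[1]]))
```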
8.1.54 list subscripting
my.list <- list('one', rep(2, 2))
There is a difference between
my.list[[1]]
and
my.list[1]
The first is likely what you want—the first component of the list. The second is
a list of length one whose component is the first component of the original list.
> my.list[[1]]
[1] "one"
> my.list[1]
[[1]]
[1] "one"
> is.list(my.list[[1]])
[1] FALSE
> is.list(my.list[1])
[1] TRUE
Here are some guidelines:
• single brackets always give you back the same type of object, a list in
  this case.
• double brackets need not give you the same type of object.
• double brackets always give you one item.
• single brackets can give you any number of items.
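The guidelines can be checked directly on the same my.list; the names sub2, sub0 and item are only for this sketch:

```r
my.list <- list('one', rep(2, 2))

sub2 <- my.list[1:2]   # single brackets: still a list, here with two items
sub0 <- my.list[0]     # single brackets: a list with no items at all
item <- my.list[[2]]   # double brackets: one item, and not a list

print(length(sub2))
print(length(sub0))
print(item)
```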
8.1.55 NULL or delete
If you have a list xl and you want component comp not to be there any more,
you have some options. If comp is the index of the component in question, then
the most transparent approach is:
xl <- xl[-comp]
In any case you can do:
xl[[comp]] <- NULL
or
xl[comp] <- NULL
The first two work in S+ as well, but the last one does not; it has no effect
in S+.
If you want the component to stay there but to be NULL, then do:
xl[comp] <- list(NULL)
Notice single brackets, not double brackets.
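A sketch of the two outcomes side by side, using a made-up three-component list:

```r
xl <- list(a = 1, b = 'two', c = 3)
comp <- 2

# Deleting: the component disappears and the list gets shorter.
removed <- xl
removed[[comp]] <- NULL

# Keeping the slot: the component stays, holding NULL.
kept <- xl
kept[comp] <- list(NULL)

print(names(removed))
print(names(kept))
print(length(kept))
```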
8.1.56 disappearing components
A for loop can drop components of a list that it is modifying.
> xl.in <- list(A=c(a=3, z=4), B=NULL, C=c(w=8), D=NULL)
> xl.out <- vector('list', 4); names(xl.out) <- names(xl.in)
> for(i in 1:4) xl.out[[i]] <- names(xl.in[[i]])
> xl.out # not right
$A
[1] "a" "z"
$C
NULL
$D
[1] "w"
> xl.out2 <- lapply(xl.in, names)
> xl.out2
$A
[1] "a" "z"
$B
NULL
$C
[1] "w"
$D
NULL
Note that the result from our for loop is MOST decidedly not what we want.
Possibly not even what we could have dreamed we could get.
Take care when NULL can be something that is assigned into a component of
a list. Using lapply can be a good alternative.
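If a loop is still preferred, one possible repair is to assign with single brackets and wrap the value in list(), so that a NULL result fills the slot instead of deleting it. This is a sketch on the same input; xl.out3 is a name invented here:

```r
xl.in <- list(A = c(a = 3, z = 4), B = NULL, C = c(w = 8), D = NULL)

xl.out3 <- vector('list', length(xl.in))
names(xl.out3) <- names(xl.in)

# Single brackets plus list(): a NULL value is stored, not treated as deletion.
for (i in seq_along(xl.in)) xl.out3[i] <- list(names(xl.in[[i]]))

print(xl.out3)
```

All four components survive, with B and D holding NULL.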
8.1.57 combining lists
Some people are pleasantly surprised that the c function works with lists. Then
they go on to abuse it.
> xlis <- list(A=1:4, B=c('a', 'x'))
> c(xlis, C=6:5)
$A
[1] 1 2 3 4
$B
[1] "a" "x"
$C1
[1] 6
$C2
[1] 5
Probably not what was intended. Try:
c(xlis, list(C=6:5))
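Comparing the names of the two results shows what changed (spread and wrapped are names chosen for this sketch):

```r
xlis <- list(A = 1:4, B = c('a', 'x'))

spread  <- c(xlis, C = 6:5)        # the vector is split into components C1, C2
wrapped <- c(xlis, list(C = 6:5))  # the vector arrives as one component C

print(names(spread))
print(names(wrapped))
print(wrapped$C)
```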
8.1.58 disappearing loop
Consider the loop:
for(i in 1:10) i
It is a common complaint that this loop doesn’t work, that it doesn’t do
anything. Actually it works perfectly well. The problem is that no real action
is involved in the loop. You probably meant something like:
for(i in 1:10) print(i)
Automatic printing of unassigned objects only happens at the top level.
8.1.59 limited iteration
One of my favorite tricks is to only give the top limit of iteration rather than
the sequence:
for(i in trials)
{
...
}
rather than
for(i in 1:trials)
{
...
}
Then I wonder why the results are so weird.
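Counting the passes shows why the results are weird. The counters below are made up for this sketch, and the seq_len variant is an addition that also copes with zero trials:

```r
trials <- 5

# Only the top limit: the loop body runs once, with i equal to 5.
count_limit <- 0
for (i in trials) count_limit <- count_limit + 1

# The full sequence: the loop body runs five times.
count_seq <- 0
for (i in 1:trials) count_seq <- count_seq + 1

# seq_len(0) is empty, whereas 1:0 would iterate over 1 and then 0.
count_zero <- 0
for (i in seq_len(0)) count_zero <- count_zero + 1

print(c(count_limit, count_seq, count_zero))
```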