8.1. GHOSTS

CIRCLE 8. BELIEVING IT DOES AS INTENDED

copies the tapply behavior:

by(9, factor(1, levels=1:2), sum)

factor(1, levels = 1:2): 1
[1] 9
------------------------------------------------------------
factor(1, levels = 1:2): 2
[1] NA

aggregate drops the empty cell:

aggregate(9, list(factor(1, levels=1:2)), sum)

  Group.1 x
1       1 9

You can get the “right” answer for the empty cell via split and sapply:

sapply(split(9, factor(1, levels=1:2)), sum)

1 2
9 0

This behavior depends on the default value of drop=FALSE in split.
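As a rough illustration of how the empty level is treated, and of what changes once drop is set to TRUE in split, here is a small sketch. The names f, res_tapply, res_split and res_drop are made up for this example.

```r
# One observation (9) in level 1; level 2 exists but is empty.
f <- factor(1, levels = 1:2)

res_tapply <- tapply(9, f, sum)                      # empty cell becomes NA
res_split  <- sapply(split(9, f), sum)               # empty cell is sum(numeric(0)), i.e. 0
res_drop   <- sapply(split(9, f, drop = TRUE), sum)  # empty level is discarded

print(res_tapply)
print(res_split)
print(res_drop)
```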

8.1.49 arithmetic that mixes matrices and vectors

To do matrix multiplication between a matrix and a vector you do:

xmat %*% yvec

or

yvec %*% xmat

R is smart enough to orient the vector in the way that makes sense. There is
no need to coerce the vector to a matrix.
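A minimal sketch of that orientation, using a made-up 2 by 3 matrix; the names right and left are chosen only for this example:

```r
xmat <- matrix(1:6, nrow = 2)   # 2 rows, 3 columns

right <- xmat %*% c(1, 1, 1)    # the vector is treated as a 3 x 1 column
left  <- c(1, 1) %*% xmat       # the vector is treated as a 1 x 2 row

print(dim(right))   # 2 1
print(dim(left))    # 1 3
```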

If you want to multiply each row of a matrix by the corresponding element of a vector, then do:

xmat * yvec

or

yvec * xmat

This works because of the order in which the elements of the matrix are stored
in the underlying vector.
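To see the storage order at work, consider this small sketch with made-up values (by_row is a name chosen for the example):

```r
xmat <- matrix(1:6, nrow = 2)
yvec <- c(10, 100)              # one multiplier per row

# Matrices are stored column by column, so yvec is recycled as
# 10, 100, 10, 100, 10, 100 and lines up with row 1, row 2, row 1, ...
by_row <- xmat * yvec
print(by_row)
```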

But what to do if you want to multiply each column by the corresponding element of a vector? If you do:

xmat * yvec

R does not check that the length of yvec matches the number of columns of xmat and then do the multiplication that you want. It does a multiplication that you don’t want: yvec is simply recycled down the columns of xmat. There are a few ways to get your multiplication, among them:

xmat * rep(yvec, each=nrow(xmat))

and

sweep(xmat, 2, yvec, '*')

The sweep function is quite general; make friends with it. The scale function can be useful for related problems.
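The following sketch compares the naive product with the two column-wise approaches on a made-up matrix. The names cvec, wrong, via_rep and via_sweep are invented for this example.

```r
xmat <- matrix(1:6, nrow = 2)     # 2 rows, 3 columns
cvec <- c(1, 10, 100)             # one multiplier per column

wrong     <- xmat * cvec                          # recycled down the columns
via_rep   <- xmat * rep(cvec, each = nrow(xmat))  # 1, 1, 10, 10, 100, 100
via_sweep <- sweep(xmat, 2, cvec, '*')            # sweep over margin 2 (columns)

print(wrong)
print(via_rep)
print(via_sweep)
```

In this sketch the last two results agree, while the first one mixes the multipliers across columns.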

8.1.50 single subscript of a data frame or array

Be careful of the number of commas when subscripting data frames and matrices. It is perfectly acceptable to subscript with no commas; this treats the object as its underlying vector rather than a two dimensional object. In the case of a data frame, the underlying object is a list and the single subscript refers to the columns of the data frame. For matrices the underlying object is a vector with length equal to the number of rows times the number of columns.
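A short sketch of the difference, with a data frame and a matrix invented for the purpose (df and mat are hypothetical names):

```r
df  <- data.frame(a = 1:3, b = c("x", "y", "z"))
mat <- matrix(11:16, nrow = 3)

print(df[2])      # no comma: a data frame holding only column b
print(df[2, ])    # one comma: the second row
print(mat[2])     # no comma: second element of the underlying vector, 12
print(mat[2, ])   # one comma: the second row, 12 15
print(mat[5])     # fifth element, i.e. row 2 of column 2, 15
```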

8.1.51 non-numeric argument

median(x)

Error in median.default(x) : need numeric data

If you get an error like this, it could well be because x is a factor.
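One way this can play out, sketched with a made-up factor whose labels happen to be numbers (x_num and msg are names chosen for the example):

```r
x <- factor(c("3", "10", "7"))

# median() refuses a factor; capture the error message instead of stopping
msg <- tryCatch(median(x), error = function(e) conditionMessage(e))
print(msg)

# Convert the labels, not the internal integer codes:
# as.numeric(x) alone would return the codes 2, 1, 3
x_num <- as.numeric(as.character(x))
print(median(x_num))   # 7
```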

8.1.52 round rounds to even

The round function rounds to even if it is rounding off an exact 5.

Some people are surprised by this. I’m surprised that they are surprised: rounding to even is the sensible thing to do. If you want a function that rounds up, write it yourself (possibly using the ceiling and floor functions, or by slightly increasing the size of the numbers).

Sometimes there is the surprise that an exact 5 is not rounded to even. This will be due to Circle 1: what is apparently an exact 5 probably isn’t.
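A brief sketch of both points. The helper round_half_up is hypothetical, not part of R, and is only one of several ways such a function might be written; the inputs are chosen to be exactly representable in binary.

```r
print(round(0.5))   # 0
print(round(1.5))   # 2
print(round(2.5))   # 2

# Hypothetical helper: ties go toward positive infinity
round_half_up <- function(x) floor(x + 0.5)

print(round_half_up(2.5))    # 3
print(round_half_up(-2.5))   # -2
```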

8.1.53 creating empty lists

You create a numeric vector of length 500 with:

numeric(500)

So obviously you create a list of length 500 with:

list(500)

Right?

No. A touch of finesse is needed:

vector('list', 500)

Note that this command hints at the fact that lists are vectors in some sense.
When “vector” is used in the sense of an object that is not a list, it is really
shorthand for “atomic vector”.
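The difference is easy to check at the prompt; a minimal sketch:

```r
print(length(list(500)))             # 1: a list holding the single number 500
print(length(vector('list', 500)))   # 500: every component is NULL

print(is.vector(list()))             # TRUE: a list is a vector in the broad sense
print(is.atomic(list()))             # FALSE: but it is not an atomic vector
print(is.atomic(numeric(500)))       # TRUE
```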

8.1.54 list subscripting

my.list <- list('one', rep(2, 2))

There is a difference between

my.list[[1]]

and

my.list[1]

The first is likely what you want—the first component of the list. The second is
a list of length one whose component is the first component of the original list.

> my.list[[1]]
[1] "one"
> my.list[1]
[[1]]
[1] "one"
> is.list(my.list[[1]])
[1] FALSE
> is.list(my.list[1])
[1] TRUE

Here are some guidelines:

• single brackets always give you back the same type of object (a list in this case).

• double brackets need not give you the same type of object.

• double brackets always give you one item.

• single brackets can give you any number of items.
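The guidelines above can be checked with a few more subscripts on the same kind of list; this is only a sketch, and the names both and second are made up for it:

```r
my.list <- list('one', rep(2, 2))

both   <- my.list[1:2]    # single brackets: still a list, here with two items
second <- my.list[[2]]    # double brackets: one item, the numeric vector 2 2

print(is.list(both))
print(length(both))
print(second)
print(second[1])          # subscripting the extracted vector itself
```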

8.1.55 NULL or delete

If you have a list xl and you want component comp not to be there any more, you have some options. If comp is the index of the component in question, then the most transparent approach is:

xl <- xl[-comp]

In any case you can do:

xl[[comp]] <- NULL

or

xl[comp] <- NULL

The first two work in S+ as well, but the last one does not: it has no effect in S+.

If you want the component to stay there but to be NULL, then do:

xl[comp] <- list(NULL)

Notice single brackets, not double brackets.
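A small sketch contrasting deletion with keeping a NULL component. The list and the names removed and kept are invented for this example, and the component is picked by name rather than by index.

```r
xl <- list(a = 1, b = 'two', c = 3)

removed <- xl
removed[['b']] <- NULL      # component b is gone

kept <- xl
kept['b'] <- list(NULL)     # component b stays, with value NULL

print(names(removed))   # "a" "c"
print(names(kept))      # "a" "b" "c"
```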

8.1.56 disappearing components

A for loop can drop components of a list that it is modifying.

> xl.in <- list(A=c(a=3, z=4), B=NULL, C=c(w=8), D=NULL)
> xl.out <- vector('list', 4); names(xl.out) <- names(xl.in)
> for(i in 1:4) xl.out[[i]] <- names(xl.in[[i]])
> xl.out # not right
$A
[1] "a" "z"
$C
NULL
$D
[1] "w"
> xl.out2 <- lapply(xl.in, names)
> xl.out2
$A
[1] "a" "z"
$B
NULL
$C
[1] "w"
$D
NULL

Note that the result from our for loop is MOST decidedly not what we want. Possibly not even what we could have dreamed we could get.

Take care when NULL can be something that is assigned into a component of a list. Using lapply can be a good alternative.
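If a loop is still preferred, one possible repair (a sketch, not the only option) is to assign with single brackets and wrap the value in list, so that a NULL result is stored rather than treated as a deletion. The name xl.fixed is made up here.

```r
xl.in <- list(A = c(a = 3, z = 4), B = NULL, C = c(w = 8), D = NULL)

xl.fixed <- vector('list', length(xl.in))
names(xl.fixed) <- names(xl.in)

# Single brackets plus list(): a NULL value is kept as a component
for (i in seq_along(xl.in)) xl.fixed[i] <- list(names(xl.in[[i]]))

print(names(xl.fixed))                             # "A" "B" "C" "D"
print(identical(xl.fixed, lapply(xl.in, names)))   # same result as lapply
```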

8.1.57 combining lists

Some people are pleasantly surprised that the c function works with lists. Then
they go on to abuse it.

xlis <- list(A=1:4, B=c('a', 'x'))

c(xlis, C=6:5)

$A
[1] 1 2 3 4
$B
[1] "a" "x"
$C1
[1] 6
$C2
[1] 5

Probably not what was intended. Try:

c(xlis, list(C=6:5))
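Comparing the two calls side by side makes the difference plain; spread and wrapped are names invented for this sketch:

```r
xlis <- list(A = 1:4, B = c('a', 'x'))

spread  <- c(xlis, C = 6:5)         # 6:5 is split into components C1 and C2
wrapped <- c(xlis, list(C = 6:5))   # 6:5 stays together as component C

print(names(spread))    # "A" "B" "C1" "C2"
print(names(wrapped))   # "A" "B" "C"
```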

8.1.58 disappearing loop

Consider the loop:

for(i in 1:10) i

It is a common complaint that this loop doesn’t work, that it doesn’t do anything. Actually it works perfectly well. The problem is that no real action is involved in the loop. You probably meant something like:

for(i in 1:10) print(i)

Automatic printing of unassigned objects only happens at the top level.
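One way to observe this without relying on the console is to capture whatever each loop prints. This is a sketch; quiet and loud are names chosen for the example.

```r
quiet <- capture.output(for (i in 1:3) i)          # nothing is printed
loud  <- capture.output(for (i in 1:3) print(i))   # one line per iteration

print(length(quiet))   # 0
print(length(loud))    # 3
```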

8.1.59 limited iteration

One of my favorite tricks is to only give the top limit of iteration rather than
the sequence:

for(i in trials) {
    ...
}

rather than

for(i in 1:trials) {