8.1. GHOSTS

CIRCLE 8. BELIEVING IT DOES AS INTENDED

8.1.60 too much iteration

for(i in 1:length(x))
{
    ...
}

is fine if x has a positive length. However, if its length is zero, then
1:length(x) is the vector c(1, 0) and R wants to do two iterations. A safer
idiom is:

for(i in seq_along(x))
{
    ...
}

or if you want to be compatible with S+:

for(i in seq(along=x))
{
    ...
}
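
A minimal sketch of the difference, assuming a zero-length numeric vector x and two counters (count_naive and count_safe are names made up for this illustration):

```r
x <- numeric(0)    # a vector of length zero

# 1:length(x) is 1:0, that is c(1, 0), so the body runs twice
count_naive <- 0
for(i in 1:length(x)) count_naive <- count_naive + 1

# seq_along(x) is integer(0), so the body never runs
count_safe <- 0
for(i in seq_along(x)) count_safe <- count_safe + 1

count_naive    # 2
count_safe     # 0
```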

8.1.61 wrong iterate

The for iterate can be from any vector. This makes looping much more general
than in most other languages, but can allow some users to become confused:

nums <- seq(-1, 1, by=.01)
ans <- NULL
for(i in nums) ans[i] <- i^2

This has two things wrong with it. You should recognize that we have tried
(but failed) to visit Circle 2 here, and the index on ans is not what
the user is expecting. Better would be:

nums <- seq(-1, 1, by=.01)
ans <- numeric(length(nums))
for(i in seq(along=nums)) ans[i] <- nums[i]^2

Even better, of course, would be to avoid a loop altogether. That is possible in
this case, perhaps not in a real application.
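
As a rough sketch of the loop-free version for this particular case (bad and ans2 are names invented for the illustration), squaring is already vectorized, so the whole vector can be handled at once:

```r
nums <- seq(-1, 1, by=.01)

# the faulty loop: the values of nums (negative, fractional, zero)
# are used as subscripts, so almost nothing is stored
bad <- NULL
for(i in nums) bad[i] <- i^2

# the corrected loop, iterating over positions
ans <- numeric(length(nums))
for(i in seq_along(nums)) ans[i] <- nums[i]^2

# no loop at all
ans2 <- nums^2

length(bad) < length(nums)    # TRUE
all.equal(ans, ans2)          # TRUE
```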

8.1.62 wrong iterate (encore)

A loop like:

for(i in 0:9)
{
    this.x <- x[i]
    ...
}

does not do as intended. While C and some other languages index from 0, R
indexes from 1. The unfortunate thing in this case is that an index of 0 is
allowed in R, it just doesn’t do what is wanted.
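
A small sketch of what the zero index does, assuming a three-element vector x and a running sum (total and total_ok are hypothetical names). Subscripting with 0 returns a zero-length vector, and arithmetic with a zero-length vector gives a zero-length result, so one bad iteration quietly ruins the sum:

```r
x <- c(10, 20, 30)

x[0]    # a numeric vector of length zero, not an error

# C-style indexing: the first iteration uses x[0]
total <- 0
for(i in 0:2) total <- total + x[i]
length(total)    # 0

# indexing from 1
total_ok <- 0
for(i in seq_along(x)) total_ok <- total_ok + x[i]
total_ok    # 60
```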

8.1.63 wrong iterate (yet again)

> nam <- c(4, 7)
> vec <- rep(0, length(nam))
> names(vec) <- nam
> for(i in nam) vec[i] <- 31
> vec
   0    0   NA   31   NA   NA   31
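
What seems to be going on: the iterate takes the numeric values 4 and 7, and a numeric subscript selects by position, not by name, so the vector is extended to length 7. A sketch of one possible fix, assuming the intent was to assign by name (bad and good are names made up here), is to iterate over character values:

```r
nam <- c(4, 7)
vec <- rep(0, length(nam))
names(vec) <- nam

# numeric subscripts: positions 4 and 7 are assigned, extending the vector
bad <- vec
for(i in nam) bad[i] <- 31
unname(bad)    # 0 0 NA 31 NA NA 31

# character subscripts: the elements named "4" and "7" are assigned
good <- vec
for(i in as.character(nam)) good[i] <- 31
unname(good)    # 31 31
```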

8.1.64 iterate is sacrosanct

In the following loop there are two uses of 'i'.

> for(i in 1:3)
+ {
+     cat("i is", i, "\n")
+     i <- rpois(1, lambda=100)
+     cat("end iteration", i, "\n")
+ }
i is 1
end iteration 93
i is 2
end iteration 91
i is 3
end iteration 101

The i that is created in the body of the loop is used during that iteration but
does not change the i that starts the next iteration. This is unlike a number of
other languages (including S+).

This is proof that R is hard to confuse, but such code will definitely confuse
humans. So avoid it.
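
A deterministic variant of the same experiment, offered only as a sketch (seen is a hypothetical name), records the value of i at the start of each iteration and then overwrites i with a constant instead of a random number:

```r
seen <- integer(3)
for(i in 1:3)
{
    seen[i] <- i    # value of the iterate at the start of this iteration
    i <- 100L       # overwriting i affects only the rest of this iteration
}

seen    # 1 2 3
i       # 100, the last value assigned in the body
```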

8.1.65 wrong sequence

> seq(0:10)
 [1]  1  2  3  4  5  6  7  8  9 10 11
> 0:10
 [1]  0  1  2  3  4  5  6  7  8  9 10
> seq(0, 10)
 [1]  0  1  2  3  4  5  6  7  8  9 10

What was meant was either the second or third command, but mixing them
together gets you the wrong result.
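
The reason for the first result, as far as the behaviour of seq goes: when seq is given a single vector of length greater than one, it returns the positions of that vector, just as seq_along does. A brief sketch (v is a name chosen for the illustration):

```r
v <- 0:10

seq(v)                             # 1 2 ... 11, the positions of v
identical(seq(v), seq_along(v))    # TRUE

# the intended sequence, written either way
identical(0:10, seq(0, 10))        # TRUE
```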

8.1.66 empty string

Do not confuse

character(0)

with

""

The first is a vector of length zero whose elements would be character if it had
any. The second is a vector of length one, and the element that it has is the
empty string.

The result of nchar on the first object is a numeric vector of length zero,
while the result of nchar on the second object is 0, that is, a vector of length
one whose first and only element is zero.

> nchar(character(0))
numeric(0)
> nchar("")
[1] 0
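
One place where the distinction tends to bite is a comparison used as a condition. The following sketch (empty_vec and empty_str are hypothetical names) shows that comparing a zero-length vector to "" yields a zero-length logical, which is not a usable condition for if:

```r
empty_vec <- character(0)
empty_str <- ""

length(empty_vec)    # 0
length(empty_str)    # 1

empty_str == ""      # TRUE
empty_vec == ""      # logical(0), neither TRUE nor FALSE

# a test that gives a single TRUE or FALSE for both objects
is_blank <- function(s) length(s) == 0 || all(nchar(s) == 0)
is_blank(empty_vec)  # TRUE
is_blank(empty_str)  # TRUE
```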

8.1.67 NA the string

There is a missing value for character data. In normal printing (with quotes
around strings) the missing value is printed as NA; but when quotes are not
used, it is printed as <NA>. This is to distinguish it from the string 'NA':

> cna <- c('missing value'=NA, 'real string'='NA')
> cna
missing value   real string
           NA          "NA"
> noquote(cna)
missing value   real string
         <NA>            NA

NA the string really does happen. It is Nabisco in finance, North America
in geography, and possibly sodium in chemistry. There are circumstances—
particularly when reading data into R—where NA the string becomes NA the
missing value. Having a name or dimname that is accidentally a missing value
can be an unpleasant experience.
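
As an illustration of the reading-data case, here is a sketch using read.csv on a made-up two-row table (csv_text, default_read and kept_read are hypothetical names). By default the field NA is taken to be a missing value; setting na.strings to something else is one way to keep it as text:

```r
csv_text <- "region,sales\nNA,10\nEU,20"

# default na.strings is "NA": the region NA becomes a missing value
default_read <- read.csv(text=csv_text, stringsAsFactors=FALSE)
is.na(default_read$region)    # TRUE FALSE

# with na.strings="" the text NA is kept as a string
kept_read <- read.csv(text=csv_text, stringsAsFactors=FALSE, na.strings="")
kept_read$region              # "NA" "EU"
```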

If you have missing values in a character vector, you may want to take some
evasive action when operating on the vector:

> people <- c('Alice', NA, 'Eve')
> paste('hello', people)
[1] "hello Alice" "hello NA"    "hello Eve"
> ifelse(is.na(people), people, paste('hello', people))
[1] "hello Alice" NA            "hello Eve"

8.1.68 capitalization

Some people have a hard time with the fact that R is case-sensitive. Being case-
sensitive is a good thing. The case of letters REALLy doEs MakE a diFFerencE.
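
A tiny sketch of the point, assuming no other object called totalcount happens to exist in the session (both names are invented for the example):

```r
totalCount <- 3

has_mixed <- exists("totalCount")    # TRUE
has_lower <- exists("totalcount")    # FALSE: a different name entirely
```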

8.1.69 scoping

Scoping problems are uncommon in R because R uses scoping rules that are
intuitive in almost all cases. An issue with scoping is most likely to arise when
moving S+ code into R.

Perhaps you want to know what “scoping” means. In the evaluator if at
some point an object of a certain name, z say, is needed, then we need to know
where to look for z. Scoping is the set of rules of where to look.

Here is a small example:

> z <- 'global'
> myTopFun
function ()
{
    subfun <- function()
    {
        paste('used:', z)
    }
    z <- 'inside myTopFun'
    subfun()
}
> myTopFun()
[1] "used: inside myTopFun"

The z that is used is the one inside the function. Let's think a bit about what is
not happening. At the point in time that subfun is defined, the only z about is
the one in the global environment. When the object is assigned is not important.
Where the object is assigned is important. Also important is the state of the
relevant environments when the function is evaluated.
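
To see that it is where a function is defined that matters, here is a sketch that sets the example next to a function defined at top level (outsideFun and callerFun are hypothetical additions, not part of the original example):

```r
z <- 'global'

# subfun is defined inside myTopFun, so it looks for z there first
myTopFun <- function()
{
    subfun <- function() paste('used:', z)
    z <- 'inside myTopFun'
    subfun()
}

# outsideFun is defined at top level, so it never sees the z in callerFun
outsideFun <- function() paste('used:', z)
callerFun <- function()
{
    z <- 'inside callerFun'
    outsideFun()
}

myTopFun()     # "used: inside myTopFun"
callerFun()    # "used: global"
```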

8.1.70 scoping (encore)

The most likely place to find a scoping problem is with the modeling functions.
Let's explore with some examples.

> scope1
function ()
{
    sub1 <- function(form) coef(lm(form))
    xx <- rnorm(12)
    yy <- rnorm(12, xx)
    form1 <- yy ~ xx
    sub1(form1)
}
> scope1()
(Intercept)          xx
-0.07609548  1.33319273
> scope2
function ()
{
    sub2 <- function()
    {
        form2 <- yy ~ xx
        coef(lm(form2))
    }
    xx <- rnorm(12)
    yy <- rnorm(12, xx)
    sub2()
}
> scope2()
(Intercept)          xx
 -0.1544372   0.2896239
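
A sketch of why both of these find the data: a formula remembers the environment in which it was created, and lm looks there when no data argument is given. In scope1 the formula is created in the frame that holds xx and yy; in scope2 it is created in sub2, whose enclosing environment is that frame. The version below is a condensed scope1 (form_env is a made-up name) that also returns the formula's environment so the claim can be checked:

```r
scope1 <- function()
{
    sub1 <- function(form) coef(lm(form))
    xx <- rnorm(12)
    yy <- rnorm(12, xx)
    form1 <- yy ~ xx                # created here, so it carries this frame
    list(coefs=sub1(form1),
         form_env=environment(form1),
         here=environment())
}

set.seed(1)    # only so that repeated runs give the same numbers
res <- scope1()
names(res$coefs)                     # "(Intercept)" "xx"
identical(res$form_env, res$here)    # TRUE
```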

The scope1 and scope2 functions are sort of doing the same thing. But scope3
is different: it is stepping outside of the natural nesting of environments.

> sub3
function ()
{
    form3 <- yy ~ xx
    coef(lm(form3))
}
> scope3
function ()
{
    xx <- rnorm(12)
    yy <- rnorm(12, xx)
    sub3()
}
> scope3()
Error in eval(expr, envir, enclos) : Object "yy" not found

One lesson here is that the environment of the calling function is not
(necessarily) searched. (In technical terms that would be dynamic scope rather
than the lexical scope that R uses.)
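
The failure can be reproduced without stopping a script by catching the error. This is only a sketch: it assumes that no objects called xx or yy exist in the global environment, and msg is a name made up here. The exact wording of the message differs between versions of R.

```r
sub3 <- function()
{
    form3 <- yy ~ xx       # created in sub3, whose enclosure is the global environment
    coef(lm(form3))
}

scope3 <- function()
{
    xx <- rnorm(12)        # local to scope3, invisible to sub3
    yy <- rnorm(12, xx)
    sub3()
}

# capture the error message instead of halting
msg <- tryCatch(scope3(), error=function(e) conditionMessage(e))
msg
```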

There are of course solutions to this problem. scope4 solves the problem by
saying where to look for the data to which the formula refers.

> sub4
function (data)
{
    form4 <- yy ~ xx
    coef(lm(form4, data=data))
}
> scope4
function ()
{