8.1. GHOSTS
CIRCLE 8. BELIEVING IT DOES AS INTENDED
8.1.60 too much iteration
for(i in 1:length(x))
{
...
}
is fine if x has a positive length. However, if its length is zero, then R wants to
do two iterations. A safer idiom is:
for(i in seq_along(x))
{
...
}
or if you want to be compatible with S+:
for(i in seq(along=x))
{
...
}
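The difference can be seen in a minimal sketch, assuming a zero-length x (the counter names are made up for illustration):

```r
x <- numeric(0)              # a vector of length zero

# 1:length(x) is 1:0, that is c(1, 0), so the body runs twice
count_colon <- 0
for (i in 1:length(x)) count_colon <- count_colon + 1

# seq_along(x) has length zero, so the body never runs
count_along <- 0
for (i in seq_along(x)) count_along <- count_along + 1
```

Here count_colon ends up as 2 and count_along stays at 0.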
8.1.61 wrong iterate
The for iterate can be from any vector. This makes looping much more general
than in most other languages, but can allow some users to become confused:
nums <- seq(-1, 1, by=.01)
ans <- NULL
for(i in nums) ans[i] <- i^2
This has two things wrong with it. You should recognize that we have tried
(but failed) to visit Circle 2 here, and the index on ans is not what the user
is expecting: i takes the values in nums (negative and fractional numbers),
not the positions 1, 2, 3 and so on. Better would be:
nums <- seq(-1, 1, by=.01)
ans <- numeric(length(nums))
for(i in seq(along=nums)) ans[i] <- nums[i]^2
Even better, of course, would be to avoid a loop altogether. That is possible in
this case, perhaps not in a real application.
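For this particular case, a loop-free version might look like the sketch below. The names ans_loop and ans_vec are introduced only so that the two results can be compared side by side.

```r
nums <- seq(-1, 1, by = .01)

# loop version from the text, indexing by position
ans_loop <- numeric(length(nums))
for (i in seq(along = nums)) ans_loop[i] <- nums[i]^2

# vectorized version: ^ operates on the whole vector at once
ans_vec <- nums^2
```

Both versions should give the same values; the vectorized one needs neither preallocation nor an index.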
8.1.62 wrong iterate (encore)
A loop like:
for(i in 0:9)
{
    this.x <- x[i]
    ...
}
does not do as intended. While C and some other languages index from 0, R
indexes from 1. The unfortunate thing in this case is that an index of 0 is
allowed in R, it just doesn't do what is wanted.
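A small sketch of what an index of 0 returns, and of the same loop written with 1-based indices. The data in x and the name picked are assumptions made for the example.

```r
x <- 101:110

this.x <- x[0]            # a zero-length vector: no error, but no element either

# the same ten elements visited with 1-based indices
picked <- numeric(10)
for (i in 1:10) picked[i] <- x[i]
```

With 0:9 the first iteration would silently work with a zero-length this.x, and the tenth element would never be visited.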
8.1.63 wrong iterate (yet again)
> nam <- c(4, 7)
> vec <- rep(0, length(nam))
> names(vec) <- nam
> for(i in nam) vec[i] <- 31
> vec
 4  7
 0  0 NA 31 NA NA 31
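In the transcript above the numbers in nam are used as positions, so elements 4 and 7 are created instead of the elements named "4" and "7" being changed. One possible fix, assuming the intent was to index by name, is sketched here:

```r
nam <- c(4, 7)
vec <- rep(0, length(nam))
names(vec) <- nam

# character subscripts are matched against names, not positions
for (i in as.character(nam)) vec[i] <- 31
```

After this loop vec still has length two, and both named elements hold 31.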
8.1.64 iterate is sacrosanct
In the following loop there are two uses of 'i'.
> for(i in 1:3) {
+     cat("i is", i, "\n")
+     i <- rpois(1, lambda=100)
+     cat("end iteration", i, "\n")
+ }
i is 1
end iteration 93
i is 2
end iteration 91
i is 3
end iteration 101
The i that is created in the body of the loop is used during that iteration but
does not change the i that starts the next iteration. This is unlike a number of
other languages (including S+).
This is proof that R is hard to confuse, but such code will definitely confuse
humans. So avoid it.
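The same behavior can be checked without random numbers. In this sketch the vector seen (a name made up for the example) records the value of i at the start of each iteration:

```r
seen <- integer(3)
for (i in 1:3) {
    seen[i] <- i      # value of i supplied by the loop
    i <- 100L         # overwritten in the body; the next iteration ignores this
}
```

When the loop finishes, seen holds 1, 2, 3, while i itself is left at 100 from the last assignment in the body.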
8.1.65 wrong sequence
> seq(0:10)
 [1]  1  2  3  4  5  6  7  8  9 10 11
> 0:10
 [1]  0  1  2  3  4  5  6  7  8  9 10
> seq(0, 10)
 [1]  0  1  2  3  4  5  6  7  8  9 10
What was meant was either the second or third command, but mixing them
together gets you the wrong result.
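The first result arises because seq, given a single vector of length greater than one, behaves like seq_along on that vector. A short sketch (the variable names are only for illustration):

```r
mixed    <- seq(0:10)     # one vector argument: positions 1 to 11
colon    <- 0:10          # the values 0 to 10
two_args <- seq(0, 10)    # the values 0 to 10

same_as_seq_along <- identical(mixed, seq_along(0:10))
```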
8.1.66 empty string
Do not confuse
character(0)
with
""
The first is a vector of length zero whose elements would be character if it had
any. The second is a vector of length one, and the element that it has is the
empty string.
The result of nchar on the first object is a numeric vector of length zero,
while the result of nchar on the second object is 0; that is, a vector of length
one whose first and only element is zero.
> nchar(character(0))
numeric(0)
> nchar("")
[1] 0
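In code the two cases need different tests. A sketch, with variable names chosen only for this example:

```r
none  <- character(0)
empty <- ""

is_zero_length  <- length(none) == 0     # test for "no elements at all"
is_empty_string <- !nzchar(empty)        # test for "one element, zero characters"

# comparing the zero-length vector to "" gives a zero-length logical,
# which cannot be used as the condition of an if statement
comparison <- none == ""
```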
8.1.67 NA the string
There is a missing value for character data. In normal printing (with quotes
around strings) the missing value is printed as NA; but when quotes are not
used, it is printed as <NA>. This is to distinguish it from the string 'NA':
> cna <- c('missing value'=NA, 'real string'='NA')
> cna
missing value   real string
           NA          "NA"
> noquote(cna)
missing value   real string
         <NA>            NA
NA the string really does happen. It is Nabisco in finance, North America
in geography, and possibly sodium in chemistry. There are circumstances—
particularly when reading data into R—where NA the string becomes NA the
missing value. Having a name or dimname that is accidentally a missing value
can be an unpleasant experience.
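One such circumstance is read.csv, whose default na.strings setting treats the text NA as missing. The sketch below uses made-up data (the ticker and price columns are assumptions for the example) and shows one way to keep the string by changing na.strings:

```r
csv <- "ticker,price\nNA,42\nIBM,130"

# default: the text NA in the ticker column becomes a missing value
default <- read.csv(text = csv, stringsAsFactors = FALSE)

# only empty fields are treated as missing, so the string "NA" survives
kept <- read.csv(text = csv, na.strings = "", stringsAsFactors = FALSE)
```

Whether an empty field is the right marker for missing data depends on the file at hand; the point is only that na.strings controls the conversion.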
If you have missing values in a character vector, you may want to take some
evasive action when operating on the vector:
> people <- c('Alice', NA, 'Eve')
> paste('hello', people)
[1] "hello Alice" "hello NA"    "hello Eve"
> ifelse(is.na(people), people, paste('hello', people))
[1] "hello Alice" NA            "hello Eve"
8.1.68 capitalization
Some people have a hard time with the fact that R is case-sensitive. Being
case-sensitive is a good thing. The case of letters REALLy doEs MakE a diFFerencE.
8.1.69 scoping
Scoping problems are uncommon in R because R uses scoping rules that are
intuitive in almost all cases. An issue with scoping is most likely to arise when
moving S+ code into R.
Perhaps you want to know what "scoping" means. If at some point the evaluator
needs an object of a certain name, z say, then it needs to know where to look
for z. Scoping is the set of rules of where to look.
Here is a small example:
> z <- 'global'
> myTopFun
function ()
{
    subfun <- function()
    {
        paste('used:', z)
    }
    z <- 'inside myTopFun'
    subfun()
}
> myTopFun()
[1] "used: inside myTopFun"
The z that is used is the one inside the function. Let's think a bit about what
is not happening. At the point in time that subfun is defined, the only z about
is the one in the global environment.
When the object is assigned is not important. Where the object is assigned is
important. Also important is the state of the relevant environments when the
function is evaluated.
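The last point can be illustrated with a variation on the example, sketched here with hypothetical names (make_reader and reader do not appear in the original). The inner function is called once before and once after the local z exists:

```r
z <- "global"

make_reader <- function() {
    reader <- function() z
    first <- reader()     # no local z yet, so the global one is found
    z <- "local"
    second <- reader()    # now the z inside make_reader is found
    c(first, second)
}

result <- make_reader()
```

The same function, defined in the same place, sees a different z on each call because the enclosing environment has changed in between.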
8.1.70 scoping (encore)
The most likely place to find a scoping problem is with the modeling functions.
Let’s explore with some examples.
> scope1
function ()
{
    sub1 <- function(form) coef(lm(form))
    xx <- rnorm(12)
    yy <- rnorm(12, xx)
    form1 <- yy ~ xx
    sub1(form1)
}
> scope1()
(Intercept)          xx
-0.07609548  1.33319273
> scope2
function ()
{
    sub2 <- function()
    {
        form2 <- yy ~ xx
        coef(lm(form2))
    }
    xx <- rnorm(12)
    yy <- rnorm(12, xx)
    sub2()
}
> scope2()
(Intercept)          xx
 -0.1544372   0.2896239
The scope1 and scope2 functions are sort of doing the same thing. But scope3
is different: it is stepping outside of the natural nesting of environments.
> sub3
function ()
{
    form3 <- yy ~ xx
    coef(lm(form3))
}
> scope3
function ()
{
    xx <- rnorm(12)
    yy <- rnorm(12, xx)
    sub3()
}
> scope3()
Error in eval(expr, envir, enclos) : Object "yy" not found
One lesson here is that the environment of the calling function is not
(necessarily) searched. (In technical terms that would be dynamic scope rather
than the lexical scope that R uses.)
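The same lesson can be reproduced without the modeling functions. In this sketch the names reader, caller and hidden_value are hypothetical, and hidden_value is assumed not to exist in the global environment:

```r
reader <- function() hidden_value     # defined at top level

caller <- function() {
    hidden_value <- 1                 # local to caller
    reader()                          # reader does not look in caller's frame
}

# the lookup fails, so the error handler supplies the result
result <- tryCatch(caller(), error = function(e) "error: not found")
```

Under dynamic scope the call would have returned 1; under R's lexical scope reader searches the environment where it was defined, and the lookup fails.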
There are of course solutions to this problem. scope4 solves the problem by
saying where to look for the data to which the formula refers.
> sub4
function (data)
{
    form4 <- yy ~ xx
    coef(lm(form4, data=data))
}
> scope4
function ()
{
    xx <- rnorm(12)