8.1. GHOSTS
CIRCLE 8. BELIEVING IT DOES AS INTENDED
There is also the possibility of using different locales. The locale can affect
the order in which strings are sorted.
The freedom of multiple string encodings and multiple locales gives you the
chance to spend hours confusing yourself by mixing them.
For more information, do:
> ?Encoding
> ?locales
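As a small illustration of the locale effect, the sketch below sorts the same strings twice. The vector words is made up for this example. With method = "radix", sort collates as in the C locale, so that result is fixed; the default result depends on the locale R happens to be running in.

```r
# Hypothetical data, not from the text.
words <- c("apple", "Banana", "cherry")

# Default sort: the order depends on the current collation locale.
locale_sorted <- sort(words)

# Radix sort collates as in the C locale, where upper-case ASCII
# letters come before lower-case ones.
c_sorted <- sort(words, method = "radix")
print(c_sorted)

# The collation locale in effect can be inspected with:
Sys.getlocale("LC_COLLATE")
```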
8.1.25 paths in Windows
Quite unfortunately Windows uses the backslash to separate directories in paths.
Consider the R command:
attach('C:\tmp\foo')
This is confusing the two faces of strings. What that string actually contains is:
C, colon, tab, m, p, formfeed, o, o. No backslashes at all. What should really
be said is:
attach('C:\\tmp\\foo')
However, in all (or at least virtually all) cases R allows you to use slashes in
place of backslashes in Windows paths—it does the translation under the hood:
attach('C:/tmp/foo')
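The difference between what is typed and what the string contains can be checked directly. This is only a sketch: the names bad, good and slashed are chosen for illustration, and nothing is attached.

```r
bad  <- 'C:\tmp\foo'     # \t becomes a tab and \f a formfeed
good <- 'C:\\tmp\\foo'   # each \\ is one literal backslash

nchar(bad)                        # 8 characters
grepl("\\", bad, fixed = TRUE)    # FALSE: no backslash in the string
nchar(good)                       # 10 characters

# Turning backslashes into forward slashes gives the portable form.
slashed <- chartr("\\", "/", good)
print(slashed)
```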
If you try to copy and paste a Windows path into R, you’ll get a string (which is
wrong) along with some number of warnings about unrecognized escapes. One
approach is to paste into a command like:
scan('', '', n=1)
8.1.26 quotes
There are three types of quote marks, and a cottage industry has developed in
creating R functions that include the string “quote”. Table 8.2 lists functions
that concern quoting in various ways. The bquote function is generally the
most useful; it is similar to substitute.
Double-quotes and single-quotes (essentially synonymous) are used to delimit
character strings. If the quote that is delimiting the string is inside the
string, then it needs to be escaped with a backslash.
> '"'
[1] "\""
A backquote (also called “backtick”) is used to delimit a name, often a name
that breaks the usual naming conventions of objects.
Table 8.2: Functions to do with quotes.

function   use
bquote     substitute items within .()
noquote    print strings without surrounding quotes
quote      language object of unevaluated argument
Quote      alias for quote
dQuote     add double left and right quotes
sQuote     add single left and right quotes
shQuote    quote for operating system shell
> '3469'
[1] "3469"
> `3469`
Error: Object "3469" not found
> `2` <- 2.5
> `2` + `2`
[1] 5
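A few of the functions in Table 8.2 can be tried out as follows. This is a minimal sketch: the variable n and the file name are invented for the example, and type = "sh" is given explicitly because the default quoting style of shQuote depends on the platform.

```r
n <- 3   # hypothetical value to splice into an expression

quote(x + n)              # the call x + n, left unevaluated
e <- bquote(x + .(n))     # .(n) is replaced by the value of n
print(e)

noquote("abc")            # prints without the surrounding quotes

# Quote a string so that a Unix-style shell sees it as one argument.
sh <- shQuote("my file.txt", type = "sh")
print(sh)
```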
8.1.27 backquotes
Backquotes are used for names of list components that are reserved words and
other “illegal” names. No need to panic.
> ll3 <- list(A=3, NA=4)
Error: unexpected '=' in "ll3 <- list(A=3, NA="
> ll3 <- list(A=3, 'NA'=4)
> ll3 <- list(A=3, 'NA'=4, 'for'=5)
> ll3
$A
[1] 3

$`NA`
[1] 4

$`for`
[1] 5

> ll3$'for'
[1] 5
Although the component names are printed using backquotes, you can access
the components using either of the usual quotes if you like. The initial attempt
to create the list fails because the NA was expected to be the data for the second
(nameless) component.
8.1.28 disappearing attributes
Most coercion functions strip the attributes from the object. For example, the
result of:
as.numeric(xmat)
will not be a matrix. A command that does the coercion but keeps the attributes
is:
storage.mode(xmat) <- 'numeric'
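The two commands can be compared on a small matrix. The contents of xmat below are an assumption made for the sake of the example.

```r
xmat <- matrix(1:6, nrow = 2)   # hypothetical integer matrix

stripped <- as.numeric(xmat)
dim(stripped)                   # NULL: the dim attribute is gone

kept <- xmat
storage.mode(kept) <- 'numeric'
dim(kept)                       # still 2 by 3
typeof(kept)                    # now stored as double
```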
8.1.29 disappearing attributes (reprise)
> x5 <- 1
> attr(x5, 'comment') <- 'this is x5'
> attributes(x5)
$comment
[1] "this is x5"
> attributes(x5[1])
NULL
Subscripting almost always strips almost all attributes.
If you want to keep attributes, then one solution is to create a class for your
object and write a method for that class for the `[` function.
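One possible shape for such a method is sketched below. The class name keeper is hypothetical, and the method only handles a single subscript and a single attribute, so treat it as a starting point rather than a complete solution.

```r
# "keeper" is a hypothetical class name chosen for this sketch.
x5 <- structure(c(10, 20, 30), comment = 'this is x5', class = 'keeper')

# Subscript method for the class: subset the bare vector, then put the
# comment attribute and the class back on the result.
'[.keeper' <- function(x, i) {
  out <- unclass(x)[i]
  attr(out, 'comment') <- attr(x, 'comment')
  class(out) <- class(x)
  out
}

sub <- x5[2:3]
attributes(sub)   # the comment and the class both survive
```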
8.1.30 when space matters
Spaces, or their lack, seldom make a difference in R commands. Except that
spaces can make it much easier for humans to read (recall Uwe’s Maxim).
There is an instance where space does matter to the R parser. Consider the
statement:
x<-3
This could be interpreted as either
x <- 3
or
x < -3
This should prompt you to use the spacebar on your keyboard. Most important
to make code legible to humans is to put spaces around the `<-` operator.
Unfortunately that does not solve the problem in this example; it is in comparisons
that the space is absolutely required.
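The two readings can be seen side by side in a short sketch. The starting value 5 and the names after_tight and is_less are arbitrary choices for the example.

```r
x <- 5
x<-3                 # no spaces: parsed as the assignment x <- 3
after_tight <- x     # the 5 has been overwritten

x <- 5
is_less <- x < -3    # with the space: a comparison, x is left alone

print(after_tight)
print(is_less)
```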
8.1.31 multiple comparisons
0 < x < 1
seems like a reasonable way to test if x is between 0 and 1. R doesn’t think so.
The command that R agrees with is:
0 < x & x < 1
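Both forms can be tried without stopping the session by handing the chained version to the parser as text. The values in x are invented for the example.

```r
x <- c(-0.5, 0.25, 0.75, 1.5)   # hypothetical values

# The chained form is rejected by the parser before anything is evaluated.
chained <- try(parse(text = '0 < x < 1'), silent = TRUE)
inherits(chained, 'try-error')

# The form that R agrees with, applied element by element.
between <- 0 < x & x < 1
print(between)
```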
8.1.32 name masking
By default T and F are assigned to TRUE and FALSE, respectively. However, they
can be used as object names (but in S+ they can not be). This leads to two
suggestions:
1. It is extremely good practice to use TRUE and FALSE rather than T and F.
2. It is good practice to avoid using T and F as object names in order not to
collide with those that failed to follow suggestion 1.
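The reason behind both suggestions can be shown in a few lines. This is a sketch with made-up variable names; it assigns to T and then removes the assignment again.

```r
T <- 0                  # allowed: T is an ordinary object name
masked <- isTRUE(T)     # T no longer means TRUE

rm(T)                   # remove the mask; the base definition is visible again
restored <- isTRUE(T)

# TRUE itself is a reserved word, so assigning to it fails.
attempt <- try(eval(parse(text = 'TRUE <- 0')), silent = TRUE)

print(c(masked = masked, restored = restored))
```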
It is also advisable to avoid using the names of common functions as object
names. Two favorites are c and t. And don’t call your matrix matrix, see:
fortune('dog')
Usually masking objects is merely confusing. However, if you mask a popular
function name with your own function, it can verge on suicidal.
> c <- function(x) x * 100
> par(mfrow=c(2, 2))
Error in c(2, 2) : unused argument(s) (2)
If you get an extraordinarily strange error, it may be due to masking. Evasive
action after the fact includes:
find('c')
if you know which function is the problem. To find the problem, you can try:
conflicts(detail=TRUE)
Another possibility for getting out of jail is to start R with --vanilla.
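A complete round trip of masking, diagnosing and repairing might look like the sketch below. The names broken, where and fixed are chosen for illustration, and the call that fails is wrapped in try so the script keeps running.

```r
c <- function(x) x * 100                 # masks the c from base
broken <- try(c(2, 2), silent = TRUE)    # fails: this c takes one argument

where <- find('c')   # every place on the search list that has a c
print(where)

rm(c)                # drop the mask
fixed <- c(2, 2)     # the base version is found again
```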
8.1.33 more sorting than sort
The order function is probably what you are looking for when sort doesn’t do
the sorting that you want. Uses of order include:
• sorting the rows of a matrix or data frame.
• sorting one vector based on values of another.
• breaking ties with additional variables.
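The three uses listed above can be sketched on a small data frame. The data frame df and its columns are invented for the example.

```r
# Hypothetical data frame for illustration.
df <- data.frame(name  = c('d', 'a', 'c', 'b'),
                 score = c(30, 10, 30, 20),
                 age   = c(5, 9, 2, 7),
                 stringsAsFactors = FALSE)

# Sort one vector based on the values of another.
names_by_score <- df$name[order(df$score)]

# Sort the rows of a data frame, breaking ties in score with age.
sorted <- df[order(df$score, df$age), ]
print(sorted)
```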
8.1.34 sort.list not for lists
Do not be thinking that sort.list is to sort lists. You silly fool.
In fact sorting doesn’t work on lists:
> sort(as.list(1:20))
Error in sort.int(x, na.last = na.last, ...) :
  'x' must be atomic
> sort.list(as.list(1:20))
Error in sort.list(as.list(1:20)) : 'x' must be atomic
Have you called 'sort' on a list?
If you have lists that you want sorted in some way, you’ll probably need to write
your own function to do it.
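One way such a function could look is sketched here. The helper sort_list_by and the people list are hypothetical; the idea is simply to compute one numeric key per component and let order do the work.

```r
# Hypothetical helper: order a list by a numeric key computed per component.
sort_list_by <- function(lst, key) {
  keys <- vapply(lst, key, numeric(1))
  lst[order(keys)]
}

people <- list(list(name = 'Ann', age = 41),
               list(name = 'Bob', age = 23),
               list(name = 'Cy',  age = 35))

by_age <- sort_list_by(people, function(p) p$age)
sorted_names <- vapply(by_age, function(p) p$name, character(1))
print(sorted_names)
```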
8.1.35 search list shuffle
attach and load are very similar in purpose, but different in effect. attach
creates a new item in the search list while load puts its contents into the global
environment (the first place in the search list).
Often attach is the better approach to keep groups of objects separate.
However, if you change directory into a location and want to have the existing
.RData, then load is probably what you want.
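The difference in effect can be observed with a throwaway file. In this sketch the file is a temporary one and the object a1 is invented; the global environment is passed to load explicitly, which matches what happens by default at the prompt.

```r
f <- tempfile(fileext = '.RData')        # hypothetical file of R objects
local({ a1 <- 1:3; save(a1, file = f) })

attach(f)                                # new item on the search list
entry <- search()[2]                     # its name starts with "file:"
in_global_before <- exists('a1', envir = globalenv(), inherits = FALSE)

load(f, envir = globalenv())             # contents go into the global environment
in_global_after <- exists('a1', envir = globalenv(), inherits = FALSE)

# Clean up what the sketch created.
detach(entry, character.only = TRUE)
rm(a1, envir = globalenv())
unlink(f)
```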
Here is a scenario (that you don’t want):
• There exists a .RData in directory project1.
• You start R in some other directory and then change directory to project1.
• The global environment is from the initial directory.
• You attach .RData (from project1).
• You do some work, exit and save the workspace.
• You have just wiped out the original .RData in project1, losing the data
that was there.
8.1.36 source versus attach or load
Both attach and load put R objects onto the search list. The source function
does that as well, but when the starting point is code to create objects rather
than actual objects.
There are conventions to try to keep straight which you should do. Files
of R code often have the extension “.R”. Other extensions for this include “.q”,
“.rt”, “.Rscript”.
Extensions for files of R objects include “.rda” and “.RData”.
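The distinction between a file of code and a file of objects can be sketched with two temporary files. The function sq and the environments e1 and e2 are made up for the example, and separate environments are used only to keep the demonstration away from the workspace.

```r
code_file <- tempfile(fileext = '.R')     # a file of R code
data_file <- tempfile(fileext = '.rda')   # a file of R objects

writeLines('sq <- function(x) x^2', code_file)

e1 <- new.env()
source(code_file, local = e1)             # runs the code, which creates sq
save(list = 'sq', envir = e1, file = data_file)

e2 <- new.env()
load(data_file, envir = e2)               # restores the saved object itself
print(e2$sq(3))

unlink(c(code_file, data_file))
```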