Section 3 Variables and vectors
We can assign a numerical value to a variable, and then use the variable within various R commands. For example,
x <- 3
defines a variable called x, which takes the value 3. You won’t see any output when you type this command, but if you type the variable name on its own, R will tell you its value:
x
## [1] 3
We can then use the variable in other commands, e.g.:
## [1] 6
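The command that produced this output is missing from the text. As an illustration only, one command that returns 6 when x holds the value 3 is 2 * x (an assumption; the original command may have differed):

```r
x <- 3   # the variable defined at the start of this section
2 * x    # assumed example command; evaluates to 6
```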
Everything in R is case sensitive: x is not the same as X.
3.1 Vectors
We can define a vector variable using the command c(), with a list of the elements of the vector, separated by commas, inside the brackets. For example, to create a vector of the numbers 2, 4, 6, 8, 10, and assign it to a variable y, type
y <- c(2, 4, 6, 8, 10)
We can do element-wise operations with two vectors. For example:
## [1] 5 9 13 17 21
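The two vectors used for this output are not shown. A minimal sketch, assuming a second vector x with the values 3, 5, 7, 9, 11 (a hypothetical choice, made so that the sum matches the output above) together with the vector y defined earlier:

```r
x <- c(3, 5, 7, 9, 11)   # assumed values, chosen for illustration
y <- c(2, 4, 6, 8, 10)   # the vector y from the text
x + y                    # adds element by element: 5 9 13 17 21
x * y                    # multiplication is also element-wise
```

Each element of the result combines the elements in the same position of x and y.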
3.2 Testing for equality and inequalities
Given a vector such as
## [1] 3 4 5 6 7 8 9 10
we can test to see if elements of this vector equal a particular value, e.g.
## [1] FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
which produces another vector, where the i-th element is TRUE if the i-th element of x is equal to 4, and FALSE otherwise. Similarly, we can test for an inequality, for example
## [1] TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
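The commands behind the outputs in this subsection are not shown. The sketch below reproduces them under two assumptions: that x was created with 3:10, and that the inequality tested was x < 5 (x <= 4 would give the same result).

```r
x <- 3:10   # assumed definition: the integers 3, 4, ..., 10
x           # 3 4 5 6 7 8 9 10
x == 4      # TRUE only in the second position
x < 5       # assumed inequality: TRUE in the first two positions
```

Note the double equals sign: == tests for equality, whereas a single = would perform an assignment.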
If we sum the result, each TRUE is counted as a 1, and each FALSE is counted as a 0, so we can find out how many elements of x satisfy the inequality (or equality):
## [1] 2
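A sketch of the counting step, assuming x holds the values 3 to 10 shown earlier and that the inequality is x < 5 (the original command is not shown):

```r
x <- 3:10     # assumed definition of x
sum(x < 5)    # two elements (3 and 4) are less than 5, so this gives 2
sum(x == 4)   # exactly one element equals 4, so this gives 1
```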
3.3 Subsetting vectors
Suppose we have first defined a vector x:
We use square brackets [] to extract elements of x. For example, to get the third element we do
## [1] 16
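Neither the definition of x nor the extraction command is shown. A minimal sketch, assuming x holds 12, 14, 16, 18, 20 (the second value is a guess; the others agree with the outputs in this subsection):

```r
x <- c(12, 14, 16, 18, 20)   # assumed definition; the value 14 is a guess
x[3]                         # third element: 16
```

R counts elements from 1, so x[3] is the third element.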
We can also replace elements of x, for example
## [1] 12 0 16 18 20
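The replacement command is not shown. Assuming x starts as 12, 14, 16, 18, 20 (the second value is a guess), assigning 0 to the second element reproduces the output above:

```r
x <- c(12, 14, 16, 18, 20)   # assumed starting values; 14 is a guess
x[2] <- 0                    # overwrite the second element
x                            # 12 0 16 18 20
```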
3.4 Character strings
We can make vectors whose elements are text (known as strings or character strings) rather than numbers.
## [1] "Monday" "Tuesday" "Wednesday"
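The command that created this vector is not shown. Given the output, and the variable name y used later in this subsection, it was presumably along these lines:

```r
y <- c("Monday", "Tuesday", "Wednesday")   # character vector; note the quote marks
y
class(y)                                   # "character"
```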
The quote marks " " are important here: if, for example, we tried the same command without the quote marks, we would get the message Error: object 'Monday' not found. R would attempt to find a variable with the name Monday, rather than assigning the string "Monday" to the variable y.
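To see this error without halting a script, the unquoted command can be wrapped in tryCatch(). This is only a sketch: the unquoted command is an assumed reconstruction, and it relies on no variable called Monday existing in your workspace.

```r
# Assumed reconstruction of the failing command, wrapped so the error is captured
msg <- tryCatch(
  {
    y <- c(Monday, Tuesday, Wednesday)   # no quotes: R looks for variables
    "no error"
  },
  error = function(e) conditionMessage(e)   # return the error message as text
)
msg   # the captured message says that object 'Monday' was not found
```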
3.5 Factors
In statistical modelling, we often work with categorical variables. For example, a patient’s symptoms might be recorded as one of “none”, “mild”, “moderate”, or “severe”. In R, we can have factor variables that are similar to strings, but which carry additional information about the possible levels. We create these with the factor() command. For example
## [1] mild mild none severe
## Levels: mild none severe
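The command that produced this output is not shown. Based on the factor() call that appears later in this section, it was presumably the same call without the levels argument:

```r
x <- factor(c("mild", "mild", "none", "severe"))   # levels inferred from the data
x
levels(x)   # "mild" "none" "severe": by default, sorted alphabetically
```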
Note that when we display our vector of factors x, we do not see quotes, and the levels are also displayed.
When defining a factor, it may be helpful to specify all the possible levels, even if some levels have not been observed. We specify these in the factor command:
x <- factor(c("mild", "mild", "none", "severe"),
            levels = c("none", "mild", "moderate", "severe"))
x
## [1] mild mild none severe
## Levels: none mild moderate severe
(Note that the first two lines in the input display are a single command: the line break after the first comma is ignored by R.)
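One way to see the effect of declaring unobserved levels is to tabulate the factor. The sketch below uses table() and nlevels(), which are not covered in the text above; the level moderate appears with a count of zero rather than being left out.

```r
x <- factor(c("mild", "mild", "none", "severe"),
            levels = c("none", "mild", "moderate", "severe"))
table(x)     # counts per level, in the declared order: 1, 2, 0, 1
nlevels(x)   # 4, although only three distinct values were observed
```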
3.6 The Environment window
In RStudio, you can see all the variables defined in your workspace in the Environment window. The Environment window will also list any data sets and functions that you have created; you can click on these for more details.
Exercise 3.2 Suppose we want to create a vector called responses with three elements: yes, no and no.
- Create the vector responses as a vector of character strings.
- How would you define responses, if you instead wanted it to be a factor, with levels yes, no and undecided?