Instructor: Edgar Franco

Outline

  1. Preliminaries

  2. R Basics: Setting the working directory

  3. Creating and Storing Basic Objects in R

3.1 Numeric variables

3.2 String variables

3.3 Vectors

1. PRELIMINARIES

Let’s start by exploring the RStudio windows.

R studio window

  1. The main window (upper left) is your script. This is what is actually stored and saved after you close RStudio. Here you write all your code and comments.

  2. The window in the lower left is the console. Here you’ll be able to see your results. You can also use it to run quick code that you don’t need to store.

  3. The window in the upper right displays all the objects that you create.

  4. Finally, the window in the lower right is a multifunction space where you can see plots, packages, get help, etc.

Now, we can start introducing code in your script:

COMMENTS. To insert comments, use “#”. Windows shortcut: Control + Shift + C (comments or uncomments the selected lines).

TIP: ALWAYS Comment your code

SHORTCUTS. To run a command directly from the script, place the cursor at the end of the command line and type:

  • Mac users: Command + Enter
  • Windows users: Control + R OR Control + Enter

Just to check if R is working

# Try it now with the following command
1 + 1
# Now, calculate your first statistics
mean(1:5)
# Let's start with an empty working directory
rm(list = ls())   ### Remove all objects in the working environment
                  ### Use with caution!!
                  ### Similar to "clear all" in Stata

gc()    ### Garbage collector. It can be useful to call gc() after a large
        ### object has been removed, as this may prompt R to return memory
        ### to the operating system.

How to get help in R:

?mean             #opens the help page for the mean function
?"-"              #opens the help page for subtraction
?"if"             #opens the help page for if
??summarizing     #searches for topics containing words like "summarizing"
??"least squares" #searches for topics containing phrases like this


### The functions help() and help.search() do the same thing as ? and ??
help("mean")
help.search("least squares")


#### The apropos function finds variables and functions that match its input
apropos("vector")

### You can use apropos with regular expressions
### Example: Every function ending in "z"
apropos("z$")
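A small sketch of how these help tools can be combined. The object name `hits` is made up for this illustration; `apropos()` and `example()` are standard R functions.

```r
# apropos() returns a character vector, so its result can be stored
hits <- apropos("^mean")   # every visible name that starts with "mean"
hits
"mean" %in% hits           # is the mean function among the matches?

# example() runs the examples listed at the bottom of a help page
example(mean)
```

Storing the result is handy when the list of matches is long and you want to search it further.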

2. R BASICS: SETTING THE WORKING DIRECTORY

The working directory is the place on your computer where R will be running the script. To set the working directory, you can use the drop-down menu. You can also change the working directory by typing setwd("Path").

# PLEASE CHANGE YOUR WORKING DIRECTORY NOW (uncomment and type your own path)

# setwd("Your directory")

We recommend setting the working directory from your script instead of using the drop down menu. This is particularly useful when working in teams or from a shared folder.

The working directory can be set in your computer, in your AFS space at Stanford, or in some other location that you can access through your computer.

For example, a working directory in the AFS space at Stanford looks like this:

“/afs/ir.stanford.edu/users/g/r/grobles/Documents”

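NOTE FOR WINDOWS USERS: in R, write paths with forward slashes (/) OR with doubled backslashes (\\). A single backslash is read as an escape character inside a string.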

You’ll know that this ran correctly if it doesn’t display an error.

# To display the current working directory use getwd()

getwd()

Once the working directory is set, it is relatively simple to access files in sub-folders and parent folders. We will learn how to do that in this session.
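As a rough sketch of what relative paths look like, the code below builds a throwaway folder inside R's temporary directory. The folder and file names (`workshop_demo`, `data`, `example.csv`, `other.csv`) are hypothetical and only serve the illustration.

```r
old_wd <- getwd()                                   # remember where we started

# Hypothetical project folder with a "data" sub-folder
project <- file.path(tempdir(), "workshop_demo")
dir.create(file.path(project, "data"), recursive = TRUE, showWarnings = FALSE)
setwd(project)

# Paths are written relative to the working directory
sub_path    <- file.path("data", "example.csv")     # a file in a sub-folder
parent_path <- file.path("..", "other.csv")         # ".." means the parent folder
sub_path
parent_path
dir.exists("data")                                  # the sub-folder is visible from here

setwd(old_wd)                                       # go back to the original directory
```

Using file.path() rather than pasting strings by hand keeps the separators consistent across operating systems.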

3. CREATING AND STORING BASIC OBJECTS IN R

In short, an object is a data structure having some attributes and methods which act on its attributes.

In this section you’ll learn the basic R objects: numeric variables, string variables, and vectors.

The command line prompt (>) in the console is an invitation to type commands or expressions. After you write a command, press Enter to execute it. However, it is more convenient to run the command directly from your script.

Remember: Mac users: Command + Enter. Windows users: Control + Enter OR Control + R.

# R can work as a calculator, for example,

2+3   #Addition
2*3   #Multiplication
2^3   #Exponentiation
2**3  #Exponentiation (same as 2^3)
2/3   #Division

Now, going back to the first command that we ran, mean(1:5), we can use “:” to create a sequence of integers with the syntax from:to

1:5 

6:10

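## What happens if we sum these?
1:5 + 6:10
#Equivalent to
c(1,2,3,4,5) + c(6,7,8,9,10)
## If the lengths differ, R reuses the shorter vector and warns when
## the longer length is not a multiple of the shorter one
1:5  + 6:12
## Warning in 1:5 + 6:12: longer object length is not a multiple of shorter
## object length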
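To make the element-by-element arithmetic concrete, here is a minimal sketch of what R does when the two vectors have different lengths (often called recycling). The names `long` and `short` are made up for this example.

```r
# Lengths 6 and 2: the shorter vector is reused three times, with no warning
long  <- 1:6
short <- c(10, 20)
long + short        # 11 22 13 24 15 26

# A single number is a vector of length 1, so it is reused for every element
long * 2            # 2 4 6 8 10 12
```

A warning only appears when the longer length is not a multiple of the shorter one.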

3.1. Numeric variables

Assigning Variables

In R you can create different objects and give them a name. To create/declare an object, use “<-”. The arrow ‘<-’ means “the values on the right are assigned to the name on the left”.

Unlike in some other languages, in R you don’t have to specify what type of variable you are creating.

# The simplest objects in R are scalars, for example
A <- 2

# An object called "A" that contains the number 2 is now in the workspace.
# You can call any object in the workspace by typing its name.
A

# This object "A" is now a global variable, which means that you can perform operations with this object by calling its name. For example,
A + 3
A * 3 
A * A

# A famous scalar already stored in R
pi

# NOTE: 
# Two different objects can't have the same name. R will overwrite the previous contents of the object with the new one.
A <- 3
A
A <- 2
A

# NOTE:
# There are alternative ways to declare objects in R
  ## You can reverse the direction of the arrow
4 -> A  
A
  ## Or you can use the equal sign '='
A = 5
A
  ## However, with the equal sign you cannot reverse the order:
  ## the name must be on the left-hand side
5 = A
## Error in 5 = A: invalid (do_set) left-hand side to assignment
# This will display an error

TIP: It can be better to use arrows ‘<-’ or ‘->’ rather than the equal sign. It’s easier to track the direction of the instruction, especially when creating new objects from old ones. For example, B <- A vs A <- B

Also, as we will see, the operators ‘==’ and ‘!=’ are used for comparisons, so using ‘=’ for assignment can create confusion and errors in the code.
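A short sketch of the difference between assigning and comparing. The name `value` is arbitrary.

```r
value <- 5        # assignment: store 5 under the name "value"
value == 5        # comparison: is value equal to 5?      TRUE
value != 5        # comparison: is value different from 5? FALSE
value             # still 5; the comparisons did not change it
```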

TIP: If you want to assign and print in one line you have two possibilities

# Use ;
k <- rnorm(5) ; k

# Use ()
(kk <- rnorm(5))

NOTE: Special numbers

Inf, -Inf, NaN and NA are special numeric values

c(Inf + 1, Inf-1, Inf-Inf)

c(sqrt(Inf), sin(Inf))
## Warning in sin(Inf): NaNs produced
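The special values can be detected with a few base R test functions. The sketch below uses a made-up vector called `special`.

```r
special <- c(1, NA, NaN, Inf, -Inf)

is.na(special)        # FALSE  TRUE  TRUE FALSE FALSE (NaN also counts as missing)
is.nan(special)       # FALSE FALSE  TRUE FALSE FALSE
is.finite(special)    #  TRUE FALSE FALSE FALSE FALSE
is.infinite(special)  # FALSE FALSE FALSE  TRUE  TRUE

# Keep only the ordinary numbers before computing a statistic
mean(special[is.finite(special)])   # 1
```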

We just saw how to create objects, but what if you want to delete them?

#To keep track of the objects you've created so far use the function ls()

#The function ls() "lists" all user-defined objects in the workspace.

ls()
## [1] "A"  "k"  "kk"
### Advanced search:

ls(pattern ="A")
## [1] "A"
# To remove an object from the working space, use 'remove()' or 'rm()'
# Syntax: rm(object)
# For example, let's remove object A
remove(A)
ls()
## [1] "k"  "kk"
# To remove all objects in the workspace, type 'rm(list=ls())'
# or choose "Clear Workspace" in the drop down menu
rm(list = ls())
ls()
## character(0)
# Let's bring our objects back
A <- 5

3.2. String Variables

Objects containing strings can also be created in R.

String variables are declared by using double quotes (" ") or single quotes (' ').

B <- "R Workshop"
B
B <- 'R Workshop, Summer 2016'
B

TIP: Don’t forget to use and close quotes for string variables, otherwise, you might mistakenly call other objects in your workspace.

B <- "A"
B
B <- A
B

Let’s keep this string for now

B <- "R Workshop"
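A brief sketch showing that R works out the type of an object from the value you assign, plus a few base functions that act on strings. The object names `course` and `year` are made up for this illustration.

```r
course <- "R Workshop"
year   <- 2016

class(course)          # "character"
class(year)            # "numeric"

nchar(course)          # number of characters, spaces included: 10
paste(course, year)    # glue values together: "R Workshop 2016"
toupper(course)        # "R WORKSHOP"
```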

3.3. Vectors

Vectors of numbers or strings are another type of object in R. To create a vector, we need to “concatenate” a series of elements using the function “c()”. “c()” is probably the most important and often-used function in R. It creates a vector from a series of elements.

C <- c(100,200,300,400,550)
C
C <- c("red","blue","black")
C

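NOTE: R vectors have no row or column orientation of their own, although functions such as cbind() and as.matrix() treat them as columns. Also note that R is case sensitive: “c()” is a function and “C” is a vector in the workspace.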

c
C

Remember: To create a series of numbers, use “:”. Syntax: “from:to”

1:10

This is vectorizable:

NOTE: This series is not an object in the workspace until you assign it to a name.

C
C <- c(1:10)
C

To create a vector of repeated numbers or strings, you can use the function rep(). Syntax: rep(value,times)

rep(2,10)
rep("index",5)
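rep() also accepts a whole vector as its first argument. The sketch below contrasts the `times` and `each` arguments; the name `letters_ab` is made up for the example.

```r
letters_ab <- c("a", "b")

rep(letters_ab, times = 2)   # repeat the whole vector:  "a" "b" "a" "b"
rep(letters_ab, each = 2)    # repeat each element:      "a" "a" "b" "b"
```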

To create other sequences, use the function ‘seq()’. Syntax: seq(from,to,by)

seq(1,10)
seq(1,10,2)
seq(2,10,2)
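Naming the arguments can make seq() calls easier to read. The sketch below also shows `length.out`, which fixes the number of elements instead of the step size. The names `by_step` and `by_count` are arbitrary.

```r
by_step  <- seq(from = 0, to = 1, by = 0.25)       # step of 0.25
by_count <- seq(from = 0, to = 1, length.out = 5)  # five evenly spaced values
by_step
by_count
all.equal(by_step, by_count)   # both give 0, 0.25, 0.5, 0.75, 1

seq_len(5)                     # shortcut for 1, 2, ..., 5
```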

To create random numbers, use the function ‘runif()’. It draws n values from a uniform distribution. Syntax: runif(n)

runif(10)
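Because the numbers are random, every run gives different values. A minimal sketch of how to make the draws repeatable with set.seed() and how to change the range with the `min` and `max` arguments. The seed 123 and the object names are arbitrary choices for this example.

```r
set.seed(123)                  # fix the starting point of the random generator
first <- runif(3)
set.seed(123)                  # same seed again
second <- runif(3)
identical(first, second)       # TRUE: the same seed gives the same draws

# By default runif() draws between 0 and 1; min and max change the range
draws <- runif(5, min = 10, max = 20)
draws
all(draws >= 10 & draws <= 20) # TRUE
```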

NOTE: You can combine different functions to create your vectors

c(1,2,rep(3,3)) 

NOTE: R keeps a count on the number of elements in a vector.

This will be useful for selecting cases and subsampling data.

c(1:500)
seq(2,1000,2)

Note that “[ ]” indicates the position of an element in a vector.

To call an element in a vector, use the following notation: vector[position]

#For example:
C
C[2]        # Second element of vector C
C[c(2:4)]   # Elements 2 to 4 of the vector C


# You can also ask R to hide some elements.
C[-2]     # All elements of C except the second one
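A few more indexing patterns, sketched on a made-up vector called `scores` so that the vector C above is left unchanged.

```r
scores <- c(10, 20, 30, 40, 50)

scores[c(1, 5)]          # first and fifth elements: 10 50
scores[-c(1, 2)]         # everything except the first two: 30 40 50
scores[length(scores)]   # last element: 50

# The same notation on the left of an assignment replaces an element
scores[2] <- 99
scores                   # 10 99 30 40 50
```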

NOTE: R starts indexing at 1, while many other languages start indexing at 0.

You can also explore and change some characteristics of the vector:

## length
length(C)
## [1] 10
## Add names

names(C) <- c("Stanford", "Harvard", "MIT", "Princeton", "Berkeley", "Columbia", "NYU", 
              "Oxford", "Cambridge", "Notre Dame")

C
##   Stanford    Harvard        MIT  Princeton   Berkeley   Columbia 
##          1          2          3          4          5          6 
##        NYU     Oxford  Cambridge Notre Dame 
##          7          8          9         10
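Once a vector has names, its elements can also be selected by name. A small sketch using a made-up vector `codes`; the numbers attached to each name are arbitrary.

```r
codes <- c(Stanford = 1, Harvard = 2, MIT = 3)   # names set at creation

codes["MIT"]                   # select one element by name
codes[c("Stanford", "MIT")]    # select several elements by name
names(codes)                   # "Stanford" "Harvard" "MIT"
unname(codes["MIT"])           # the value without its name: 3
```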

3.3.1 Logical Vectors

There are three vectorized logical operators in R:

  • ! is used for not
  • & is used for and
  • ‘|’ is used for or

x <- 1:10; x  
x >=5       # which numbers in x are greater than or equal to 5

### The %% operator means remainder after division
(y<- 1:10 %% 2)
y == 0

x >=5 & y==0  # Both are TRUE. Numbers that are greater than or equal
       # to 5 and have remainder zero


x >= 5| y ==0  # At least one needs to be TRUE
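Logical vectors are most useful for selecting elements. The sketch below repeats the definitions of x and y so that it runs on its own, then uses the three operators together with the base functions sum(), any() and all().

```r
x <- 1:10
y <- x %% 2

x[x >= 5 & y == 0]   # keep the elements where the condition is TRUE: 6 8 10
!(x >= 5)            # "not": TRUE for the numbers 1 to 4
x[x < 3 | x > 8]     # "or": 1 2 9 10

sum(x >= 5)          # TRUE counts as 1, so this counts the matches: 6
any(x > 10)          # is at least one element larger than 10? FALSE
all(x > 0)           # are all elements positive? TRUE
```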

EXERCISE 2 (Objects and basic operations)

Answer the following questions:

  1. Assign the numbers 1 to 1000 to a variable ‘x’. The reciprocal of a number is obtained by dividing 1 by the number (1/number). Define y as the reciprocal of x.

  2. Calculate the inverse tangent (that is, arctan) of y and assign the result to a variable w. Hint: take a look at the ?Trig help page to find a function that calculates the inverse tangent.

  3. Assign to a variable z the reciprocal of the tangent of w.

  4. Compare z and x using a logical statement. Compare the first element of x and the first element of z. Before running the command, think about what we should expect.

  5. Note that not all elements are equal, even though they seem to be. Now compare the elements using the function identical. Then, use the function all.equal. Again, first read about these functions using help.

  6. Some built-in functions expect their data as a single vector rather than as separate arguments. Try the following and compare the different results:

  • mean(1:5)

  • mean(1,2,3,4,5)

  • mean(c(1,2,3,4,5))

Solutions



Note: This script is based on the R Workshop created by Gustavo Robles. Some exercises are based on Cotton, R. (2013), Learning R, O’Reilly.