Differences

This shows you the differences between two versions of the page.

r_workshop1 [2018/10/06 21:27]
mariehbrice [Workshop 1: Introduction to R]
r_workshop1 [2019/09/03 21:49] (current)
mariehbrice [Workshop 1: Introduction to R]
Line 12: Line 12:
 **Summary:​** In this introductory R Workshop you will learn what R open-source statistical software is, why you should absolutely start using it, and all the first steps to help you get started in R. We will show you how R can act as a calculator, teach you about the various types of objects in R, show you how to use functions and load packages, and find all the resources you need to get help. If any of this sounds obscure, don’t worry! By the end of this workshop you’ll know what all these words mean! **Summary:​** In this introductory R Workshop you will learn what R open-source statistical software is, why you should absolutely start using it, and all the first steps to help you get started in R. We will show you how R can act as a calculator, teach you about the various types of objects in R, show you how to use functions and load packages, and find all the resources you need to get help. If any of this sounds obscure, don’t worry! By the end of this workshop you’ll know what all these words mean!
  
-**Link to new [[https://​qcbsrworkshops.github.io/Workshops/​workshop01/​workshop01-en/​workshop01-en.html|Rmarkdown presentation]]**+**Link to new [[https://​qcbsrworkshops.github.io/​workshop01/​workshop01-en/​workshop01-en.html|Rmarkdown presentation]]**
  
 //Please try it and tell the R workshop coordinators what you think!// //Please try it and tell the R workshop coordinators what you think!//
  
 Link to old [[http://​prezi.com/​ygatqgksbaso/​qcbs-r-workshop-1/​|Prezi presentation]] Link to old [[http://​prezi.com/​ygatqgksbaso/​qcbs-r-workshop-1/​|Prezi presentation]]
 +
 +Download the R [[https://​github.com/​QCBSRworkshops/​workshop01/​blob/​dev/​workshop01-en/​ReferenceScriptWorkshop1.R|script]] for this lesson. ​
 ===== Installing R and R Studio ===== ===== Installing R and R Studio =====
  
Line 41: Line 43:
  
 ==== Using R Studio ==== ==== Using R Studio ====
 +-----
 R Studio is an integrated development environment (IDE) for R.  Basically, it's a place where you can easily use the R language, visualize tables and figures and even run all your statistical analyses. We recommend using it instead of the traditional command line as it provides great visual aid and a number of useful tools that you will learn more about over the course of this workshop. R Studio is an integrated development environment (IDE) for R.  Basically, it's a place where you can easily use the R language, visualize tables and figures and even run all your statistical analyses. We recommend using it instead of the traditional command line as it provides great visual aid and a number of useful tools that you will learn more about over the course of this workshop.
 \\ \\
Line 52: Line 55:
 \\ \\
  
-Note for Windows users: If the restriction:"unable to write on disk" appears when you try to open R-Studio, right-click on your R-Studio icon and chose:"Execute as administrator" to open the program.+**Note for Windows users**: If the message "unable to write on disk" appears when you try to open R Studio, right-click on the R Studio icon and choose "Execute as administrator" to open the program.
  
 When you open R studio, the first thing that you see to the left of the screen is the "​console"​. This is where we will be working for the rest of this Introduction to R workshop. Text in the console typically looks like this: When you open R studio, the first thing that you see to the left of the screen is the "​console"​. This is where we will be working for the rest of this Introduction to R workshop. Text in the console typically looks like this:
 <file rsplus| Illustrating R console input and output> <file rsplus| Illustrating R console input and output>
-input +output 
-[1] output+[1] "This is the output"
 </​file>​ </​file>​
  
Line 70: Line 73:
  
 ==== R as a calculator ===== ==== R as a calculator =====
 +-----
 The first thing to know about the R console is that you can use it as a calculator. The first thing to know about the R console is that you can use it as a calculator.
 <file rsplus| Addition>​ <file rsplus| Addition>​
Line 82: Line 86:
  
 <file rsplus| Multiplication>​ <file rsplus| Multiplication>​
-> 2*2+> 2 * 2
 [1] 4 [1] 4
 </​file>​ </​file>​
  
 <file rsplus| Division>​ <file rsplus| Division>​
-> 8/2+> 8 / 2
 [1] 4 [1] 4
 </​file>​ </​file>​
Line 95: Line 99:
 [1] 8 [1] 8
 </​file> ​   </​file> ​  
 +\\ 
 +-----
 == CHALLENGE 2 == == CHALLENGE 2 ==
  
 Complete the following skill testing question in the R Studio console: 2+16x24-56 Complete the following skill testing question in the R Studio console: 2+16x24-56
  
-++Challenge 2: Solution| \\ ''>​ 2+16*24-56\\+++Challenge 2: Solution| \\ ''>​ 2 + 16 * 24 - 56\\
 [1] 330''​ ++ \\ [1] 330''​ ++ \\
 +-----
 == CHALLENGE 3 == == CHALLENGE 3 ==
  
Line 109: Line 114:
 2+16x24-56/​(2+1)-457 2+16x24-56/​(2+1)-457
  
-++Challenge 3: Solution| \\ ''>​ 2+16*24-56/​(2+1)-457\\+++Challenge 3: Solution| \\ ''>​ 2 + 16 * 24 - 56 / (2 + 1) - 457\\
 [1] -89.66667''​ ++ \\ [1] -89.66667''​ ++ \\
  
 Note that R //always// follows the order of priorities. Note that R //always// follows the order of priorities.
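As a small illustration of the order of priorities, the sketch below reuses the numbers from Challenge 2 and adds two extra lines (the ''-2^2'' comparison is our own example, not part of the original exercises). Parentheses are the way to override the default order.

```r
# Multiplication is evaluated before addition and subtraction.
2 + 16 * 24 - 56      # 330

# Parentheses force the addition to happen first.
(2 + 16) * 24 - 56    # 376

# The exponent is evaluated before the minus sign is applied.
-2^2                  # -4
(-2)^2                # 4
```

When in doubt, add parentheses: they cost nothing and make the intended order explicit.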
-\\ 
 ----- -----
 **R TIP**  ​ **R TIP**  ​
Line 123: Line 127:
 {{::​arrow_keys.png?​100|Use arrow keys to go back to previous commands.}} {{::​arrow_keys.png?​100|Use arrow keys to go back to previous commands.}}
 ----- -----
-\\ 
 == CHALLENGE 4 == == CHALLENGE 4 ==
  
 What is the area of a circle with a radius of 5 cm?  What is the area of a circle with a radius of 5 cm? 
  
-++Challenge 4: Solution| \\ Recall : $A_{circle} = \pi r^2$ \\ ''>​ 3.1416*5^2\\+++Challenge 4: Solution| \\ Recall : $A_{circle} = \pi r^2$ \\ ''>​ 3.1416 * 5^2\\
 [1] 78.54''​\\ [1] 78.54''​\\
  
 R also has many built in constants like pi.\\ R also has many built in constants like pi.\\
  
-''>​ pi*5^2\\+''>​ pi * 5^2\\
 [1] 78.53982'' [1] 78.53982''
 ++ \\ ++ \\
Line 144: Line 147:
  
 <file rsplus| Illustrating the concept of object> <file rsplus| Illustrating the concept of object>
-#Let's create an object called ​mean.x+# Let's create an object called ​mean_x
-#The # symbol is used in R to indicate comments. It is not processed by R. +# The # symbol is used in R to indicate comments. It is not processed by R. 
-#It is important to add comments to code so that it can be understood and used by other people. +# It is important to add comments to code so that it can be understood and used by other people. 
-mean.x ​<- (2+6)/2 +mean_x ​<- (2 + 6) / 2 
-#Typing its name will return its value. +# Typing its name will return its value. 
-mean.x +> mean_x 
-#! [1]  4+[1]  4
 </​file>​ </​file>​
  
-Here, ''​(2+6)/​2''​ is the value you want to save as an object. The identifier ''​mean.x''​ is assigned to this value. Typing ''​mean.x''​ returns the value of the calculation (//i.e.// 4). You have to be scrupulous when typing the identifier because R is case-sensitive:​ writing ''​mean.x''​ is not the same as writing ''​MEAN.X''​. You can see that the assignment operator ''​%%<​-%%''​ creates an explicit link between the value and the identifier. It always points from the value to the identifier. Note that it is also possible to use the equal sign ''​=''​ as the assignment operator but [[http://​stackoverflow.com/​questions/​1741820/​assignment-operators-in-r-and | it is preferable not]] to because it is also used for other operations in R, which can cause problems when using it for assignment. Finally, imagine that the operator ''​%%<​-%%''​ and ''​%%=%%''​ follow their own order of priorities.+Here, ''​(2 + 6) / 2''​ is the value you want to save as an object. The identifier ''​mean_x''​ is assigned to this value. Typing ''​mean_x''​ returns the value of the calculation (//i.e.// 4). You have to be scrupulous when typing the identifier because R is case-sensitive:​ writing ''​mean_x''​ is not the same as writing ''​MEAN_X''​. You can see that the assignment operator ''​%%<​-%%''​ creates an explicit link between the value and the identifier. It always points from the value to the identifier. Note that it is also possible to use the equal sign ''​=''​ as the assignment operator but [[http://​stackoverflow.com/​questions/​1741820/​assignment-operators-in-r-and | it is preferable not]] to because it is also used for other operations in R, which can cause problems when using it for assignment. Finally, imagine that the operator ''​%%<​-%%''​ and ''​%%=%%''​ follow their own order of priorities.
  
 <file rsplus| Order of priorities with assignment operator and equal sign > <file rsplus| Order of priorities with assignment operator and equal sign >
Line 165: Line 168:
  
 </​file>​ </​file>​
- +\\
 ----- -----
 **R TIP** **R TIP**
  
 Try choosing explicit names for your objects. It is good practice and allows you to understand quickly what the object represents. Naming an object ''​variable''​ or ''​data''​ isn't very informative! Try choosing explicit names for your objects. It is good practice and allows you to understand quickly what the object represents. Naming an object ''​variable''​ or ''​data''​ isn't very informative!
------ 
-\\ 
 ----- -----
 == CHALLENGE 5 == == CHALLENGE 5 ==
  
-> Create an object with a value of 1 + 1.718282 (Euler'​s number) and name it ''​euler.value''​+> Create an object with a value of 1 + 1.718282 (Euler'​s number) and name it ''​euler_value''​
 ++++ ++++
 Challenge 5: Solution| Challenge 5: Solution|
 <code rsplus> <code rsplus>
-euler.value ​<- 1 + 1.718282+> euler_value ​<- 1 + 1.718282
 </​code>​ </​code>​
 ++++ ++++
------ 
-\\ 
 ----- -----
 == CHALLENGE 6 == == CHALLENGE 6 ==
Line 194: Line 192:
 ''​unexpected symbol in %%"​%%your object name%%"​%%''​. ''​unexpected symbol in %%"​%%your object name%%"​%%''​.
 ++++ ++++
------ 
-\\ 
 ----- -----
 **R TIP** **R TIP**
  
-Using the Tab key allows auto-completion of names. It speeds up command entering and avoids spelling errors. For example, if you type ''​eu''​ and then press tab, you will see a list of objects or functions beginning with ''​eu''​. Select ''​euler.value''​ (the object we just created) and press enter. The ''​euler.value''​ identifier now appears at the command line.+Using the Tab key allows auto-completion of names. It speeds up command entering and avoids spelling errors. For example, if you type ''​eu''​ and then press tab, you will see a list of objects or functions beginning with ''​eu''​. Select ''​euler_value''​ (the object we just created) and press enter. The ''​euler_value''​ identifier now appears at the command line. 
 + 
 +==== Types of data structures in R ====
 ----- -----
-\\ 
-==== Types of data structures in R ==== 
- 
 Using R to analyse your data is an important aspect of this software. Data comes in different forms and can be grouped in distinct categories. Depending on the nature of the values enclosed inside your data or object, R classifies them accordingly. The following figure illustrates common objects found in R. Using R to analyse your data is an important aspect of this software. Data comes in different forms and can be grouped in distinct categories. Depending on the nature of the values enclosed inside your data or object, R classifies them accordingly. The following figure illustrates common objects found in R.
  
Line 211: Line 206:
  
 Before we look at how to create different types of vectors, let's have a look at the generic method of creating vectors. If you recall what you have just learned, you will first have to identify some value you want to put in a vector and then link it to an identifier with the assignment operator (//i.e.// create an object). When you have more than one value in a vector, you need a way to tell R to group all these values to create a vector. The trick here is to use the ''​c''​ function. Don't worry, you will learn about functions pretty soon in one of the following sections. For now, just remember to put your values ​ between parentheses next to letter ''​c''​ in this format: ''​vector.name %%<-%% c(value1, value2, value3, ...)''​. The function ''​c()''​ means combine or concatenate. It is a quick and easy function so remember it! Before we look at how to create different types of vectors, let's have a look at the generic method of creating vectors. If you recall what you have just learned, you will first have to identify some value you want to put in a vector and then link it to an identifier with the assignment operator (//i.e.// create an object). When you have more than one value in a vector, you need a way to tell R to group all these values to create a vector. The trick here is to use the ''​c''​ function. Don't worry, you will learn about functions pretty soon in one of the following sections. For now, just remember to put your values ​ between parentheses next to letter ''​c''​ in this format: ''​vector.name %%<-%% c(value1, value2, value3, ...)''​. The function ''​c()''​ means combine or concatenate. It is a quick and easy function so remember it!
 +\\
 +Now that you know the generic method to create a vector in R, let's have a look at how to create different types of vectors.
  
 +<file rsplus| Creating vectors in R>
 +# Create a numeric vector with the c (which means combine or concatenate) function.
 +# We will learn about functions soon!
 +> num_vector <- c(1, 4, 3, 98, 32, -76, -4)
 +
 +# Create a character vector. Always use ""​ to delimit text strings!
 +> char_vector <- c("​blue",​ "​red",​ "​green"​)
 +
 +# Create a logical or boolean vector. Don't use ""​ or R will consider this as text strings.
 +> bool_vector <- c(TRUE, TRUE, FALSE)
 +
 +#It is also possible to use abbreviations for logical vectors.
 +> bool_vector2 <- c(T, T, F)
 +</​file>​
 +\\
 ----- -----
 == CHALLENGE 7 == == CHALLENGE 7 ==
  
-> Create a vector containing the first five odd numbers (starting from 1) and name it odd.n.+> Create a vector containing the first five odd numbers (starting from 1) and name it odd_n.
 ++++Challenge 7: Solution| ++++Challenge 7: Solution|
 <code rsplus> <code rsplus>
-odd.n <- c(1,​3,​5,​7,​9)+> odd_n <- c(1, 3, 5, 7, 9)
 </​code>​ </​code>​
 ++++ ++++
 ----- -----
-\\ +**TIP**  ​
-Now that you know the generic method to create a vector in R, let's have a look at how to create different types of vectors. +
- +
-<file rsplus| Creating vectors in R> +
-#Create a numeric vector with the c (which means combine or concatenate) function. +
-#We will learn about functions soon! +
-num.vector<​-c(1,​2,​5,​3,​6,​-2,​4) +
-#Create a character vector. Always use ""​ to delimit text strings! +
-col.vector<​-c("​blue","​red","​green"​) +
-#Create a logical vector. Don't use ""​ or R will consider this as text strings. +
-logic.vector<​-c(TRUE,​TRUE,​FALSE) +
-#It is also possible to use abbreviations for logical vectors. +
-logic.vector2<​-c(T,​T,​F) +
-</​file>​+
  
-\\ +Use the ''dput()'' function to obtain the reverse, //i.e.// the content of an object formatted as a vector. For example: 
------ +
-**Truc R**   +
- +
-Use ''​dput''​ function to obtain the reverse, //i.e.// the content of an object formatted as a vector. //e.g.// : +
 <code rsplus> <code rsplus>
-odd <- c(1, 3, 5, 7, 9) +odd_n <- c(1, 3, 5, 7, 9) 
-odd+odd_n
 [1] 1 3 5 7 9 [1] 1 3 5 7 9
  
-> dput(odd)+> dput(odd_n)
 c(1, 3, 5, 7, 9) c(1, 3, 5, 7, 9)
 +
 +# The output can be copied-pasted to create a new object by using the structure() function
 +> structure(c(1,​ 3, 5, 7, 9))
 </​code>​ </​code>​
  
-This demonstration might not be that convincing, but keep in mind that it can be very useful when you're manipulating data. The result returned by R with ''%%dput%%'' can be copied-pasted to create a new object, since it's already formatted for R. On the contrary, the answer that R gives when typing ''odd'' is not directly usable since it's not in ''c()'' function and that the numbers are not separated by commas. +This demonstration might not be that convincing, but keep in mind that it can be very useful when you're manipulating data. These functions are really useful for providing a reproducible example, for instance in a question on Stack Overflow (see one more application in the part about data frames)!
 ----- -----
  
- +What you have learned previously ​about calculations ​is also valid for vectors: vectors can be used for calculations. The only difference is that when a vector has more than 1 element, the operation is applied on all elements of the vector. The following example clarifies this.
-What you have learned previously is also valid for vectors: vectors can be used for calculations. The only difference is that when a vector has more than 1 element, the operation is applied on all elements of the vector. The following example clarifies this.+
  
 <file rsplus| Calculations with vectors> <file rsplus| Calculations with vectors>
-#Create two numeric vectors. +# Create two numeric vectors. 
-x <- 1:5 +x <- 1:5 
-#An equivalent form is: x <- c(1:5). + 
-y <- 6 +# An equivalent form is: x <- c(1:5). 
-#Remember that the : symbol, when used with numbers, is the sequence operator. +y <- 6 
-#It tells R to create a series of numbers increasing by 1. +# Remember that the : symbol, when used with numbers, is the sequence operator. 
-#Equivalent to this is x <- c(1,​2,​3,​4,​5) +# It tells R to create a series of numbers increasing by 1. 
-#Let's sum both vectors. +# Equivalent to this is x <- c(1,​2,​3,​4,​5) 
-#6 is added to all elements of the x vector. + 
-x + y +# Let's sum both vectors. 
-#! [1]  7 8 9 10 11 +# 6 is added to all elements of the x vector. 
-#Let's multiply x by itself. +x + y 
-x * x +[1]  7 8 9 10 11 
-#! [1]  1 4 9 16 25 + 
-#It is the same thing as using exponents:​ +# Let's multiply x by itself. 
-x^2 +x * x 
-#! [1]  1 4 9 16 25+[1]  1 4 9 16 25 
 + 
 +# It is the same thing as using exponents:​ 
 +x^2 
 +[1]  1 4 9 16 25
 </​file>​ </​file>​
  
 Another important type of object you will use regularly is the data frame. A data frame is a group of vectors of the same length (//i.e.// the same number of elements). Columns are always variables and rows are observations,​ cases, sites or replicates. Different modes can be saved in different columns (but always the same mode in a column). It is in this format that ecological data are usually stored. The following example shows a fictitious dataset representing 4 sites where soil pH and number of plant species were recorded. There is also a "​Treatment"​ variable (fertilised or not). Let's have a look at the creation of a data frame. Another important type of object you will use regularly is the data frame. A data frame is a group of vectors of the same length (//i.e.// the same number of elements). Columns are always variables and rows are observations,​ cases, sites or replicates. Different modes can be saved in different columns (but always the same mode in a column). It is in this format that ecological data are usually stored. The following example shows a fictitious dataset representing 4 sites where soil pH and number of plant species were recorded. There is also a "​Treatment"​ variable (fertilised or not). Let's have a look at the creation of a data frame.
  
-^ Site_ID ^ soil.pH ^ num.sp ^ Treatment ^+^ ID of the site ^ soil pH ^ # of species ^ treatment ^
 | A1.01 | 5.6 | 17 | Fertilised | | A1.01 | 5.6 | 17 | Fertilised |
 | A1.02 | 7.3 | 23 | Fertilised | | A1.02 | 7.3 | 23 | Fertilised |
Line 286: Line 287:
  
 <file rsplus| Creating a data frame> ​ <file rsplus| Creating a data frame> ​
-#We first start by creating vectors. +# We first start by creating vectors. 
-Site_ID<​-c("​A1.01","​A1.02","​B1.01","​B1.02"​) +> site_id ​<- c("​A1.01",​ "​A1.02",​ "​B1.01",​ "​B1.02"​) 
-soil.pH<​-c(5.6,​7.3,​4.1,​6.0) +> soil_pH ​<- c(5.6, 7.3, 4.1, 6.0) 
-num.sp<​-c(17,​23,​15,​7) +> num_sp ​<- c(17, 23, 15, 7) 
-Treatment<​-c("​Fert","​Fert","​No.Fert","​No.Fert") +> treatment ​<- c("​Fert",​ "​Fert",​ "No_fert", "No_fert") 
-#We then combine them to create a data frame with the data.frame function. + 
-my.first.df<-data.frame(Site_ID,soil.pH,num.sp,Treatment) +# We then combine them to create a data frame with the data.frame function. 
-#Visualise it! +my_df <- data.frame(site_id, soil_pH, num_sp, treatment) 
-my.first.df+ 
 +# Visualise it! 
 +my_df
 </​file>​ </​file>​
 +\\
 +-----
 +**R TIP**
 
 +Here is the ''dput()'' function in another example.
 +<code rsplus>
 +> dput(my_df)
 +structure(list(site_id = structure(1:​4,​ .Label = c("​A1.01",​ "​A1.02",​ "​B1.01",​ "​B1.02"​),​ class = "​factor"​),​ soil_pH = c(5.6, 7.3, 4.1, 6), num_sp = c(17, 23, 15, 7), treatment = structure(c(1L,​ 1L, 2L, 2L), .Label = c("​Fert",​ "​No_fert"​),​ class = "​factor"​)),​ class = "​data.frame",​ row.names = c(NA, -4L))
 + 
 +# It's possible to rebuild the initial data frame (with some associated metadata as the class of variables) by copying and pasting the previous output:
 +> structure(list(site_id = structure(1:​4,​ .Label = c("​A1.01",​ "​A1.02",​ "​B1.01",​ "​B1.02"​),​ class = "​factor"​),​
 +                 ​soil_pH = c(5.6, 7.3, 4.1, 6),
 +                 ​num_sp = c(17, 23, 15, 7),
 +                 ​treatment = structure(c(1L,​ 1L, 2L, 2L), .Label = c("​Fert",​ "​No_fert"​),​ class = "​factor"​)),​
 +class = "​data.frame",​ row.names = c(NA, -4L))
 +</​code>​
 +-----
 +\\
 Other types of objects include matrices, arrays and lists. A matrix is similar to a data frame except that all cells in the matrix must be the same mode. An array is similar to a matrix but can have more than two dimensions. Arrays are usually used for advanced computation like numerical simulations and permutation tests. A list is an aggregation of various types of objects. For example, a list could include a vector, a data frame and a matrix in the same object. Other types of objects include matrices, arrays and lists. A matrix is similar to a data frame except that all cells in the matrix must be the same mode. An array is similar to a matrix but can have more than two dimensions. Arrays are usually used for advanced computation like numerical simulations and permutation tests. A list is an aggregation of various types of objects. For example, a list could include a vector, a data frame and a matrix in the same object.
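To make these three object types more concrete, here is a minimal sketch. The object names (''my_matrix'', ''my_array'', ''my_list'') and the values are hypothetical and only serve as an illustration; ''matrix()'', ''array()'' and ''list()'' are the base R functions used to build them.

```r
# A matrix: two dimensions, and every cell has the same mode (numeric here).
my_matrix <- matrix(1:6, nrow = 2, ncol = 3)

# An array: like a matrix, but it can have more than two dimensions.
my_array <- array(1:24, dim = c(2, 3, 4))

# A list: its elements can be objects of different types.
my_list <- list(numbers = c(1, 3, 5),
                colours = c("blue", "red"),
                table   = my_matrix)

dim(my_matrix)          # 2 3
dim(my_array)           # 2 3 4
length(my_list)         # 3
class(my_list$colours)  # "character"
```

Note that the elements of a list keep their own type, which is what distinguishes a list from a vector or a matrix.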
 ===== Indexing objects in R ===== ===== Indexing objects in R =====
Line 303: Line 323:
  
 <file rsplus| Indexing a vector> <file rsplus| Indexing a vector>
-#Let's first create a numeric and a character vector. +# Let's first create a numeric and a character vector. 
-#There is no need to do this again if you already did it in the previous exercise! +# There is no need to do this again if you already did it in the previous exercise! 
-num.vector<-c(1,2,5,3,6,-2,4) +> odd_n <- c(1, 3, 5, 7, 9) 
-col.vector<-c("blue","red","green") + 
-#Extract the third element of the numeric vector. +# Extract the second element of the numeric vector. 
-num.vector[3] +> odd_n[2] 
-#! [1]  5 +[1] 3 
-#Extract all but the third element of the numeric vector. + 
-num.vector[-3] +# Extract the second and fourth elements of the numeric vector. 
-#! [1]  1  2  3  6  -2  4 +> odd_n[c(2, 4)] 
-#Extract the first and third elements of the character vector. +[1] 3 7 
-col.vector[c(1,3)] + 
-#! [1]  "blue"  "green" +# Extract all but the first two elements of the numeric vector. 
-#Extract the first and fourth elements of the character vector. +> odd_n[c(-1, -2)] 
-#There is no fourth value in this vector so R returns a null value (i.e. NA) +[1] 5 7 9 
-#NA stands for 'Not available'. + 
-col.vector[c(1,4)] +# If you select a position that is not in the numeric vector 
-#! [1]  "blue"  NA +> odd_n[c(1, 6)] 
-#Extract all values from the numeric vector greater than 5. +[1]  1 NA 
-num.vector[num.vector>5] +# There is no sixth value in this vector so R returns a missing value (i.e. NA) 
-#! [1]  6 +# NA stands for 'Not available'. 
-#Extract all elements of the character vector corresponding exactly to "blue". + 
-#Note the use of the double equal sign ==. +# You can use a logical statement to select values. 
-col.vector[col.vector=="blue"] +> odd_n[odd_n > 4] 
-#! [1]  "blue" +[1] 5 7 9 
 + 
 +# Extract all elements of the character vector corresponding exactly to "blue". 
 +> char_vector[char_vector == "blue"] 
 +[1] "blue" 
 +# Note the use of the double equal sign ==. 
 </​file>​ </​file>​
 \\ \\
Line 333: Line 359:
 == CHALLENGE 8 == == CHALLENGE 8 ==
  
-> a) Extract the 4th value of the ''​num.vector''​ vector. +> a) Extract the 4th value of the ''​num_vector''​ vector. 
-> b) Extract the 1st and 3rd values of the ''​num.vector''​ vector. +> b) Extract the 1st and 3rd values of the ''​num_vector''​ vector. 
-> c) Extract all values of the ''​num.vector''​ vector excluding the 2nd and 4th values.+> c) Extract all values of the ''​num_vector''​ vector excluding the 2nd and 4th values.
 > >
 ++++Challenge 8a: Indexing vectors| ++++Challenge 8a: Indexing vectors|
 <code rsplus> <code rsplus>
-num.vector[4] +> num_vector[4] 
-#! [1] 3+[1] 3
 </​code>​ </​code>​
 ++++ ++++
 ++++Challenge 8b: Indexing vectors| ++++Challenge 8b: Indexing vectors|
 <code rsplus> <code rsplus>
-num.vector[c(1,3)] +> num_vector[c(1, 3)] 
-#! [1] 1  5+[1] 1  5
 </​code>​ </​code>​
 ++++ ++++
 ++++Challenge 8c: Indexing vectors| ++++Challenge 8c: Indexing vectors|
 <code rsplus> <code rsplus>
-num.vector[c(-2,​-4)] +> num_vector[c(-2, -4)] 
-#! [1] 1  5  6  -2  4+[1] 1  5  6  -2  4
 </​code>​ </​code>​
 ++++ ++++
------ 
-\\ 
 ----- -----
 == CHALLENGE 9 == == CHALLENGE 9 ==
Line 362: Line 386:
 >Explore the difference between these 2 lines of code: >Explore the difference between these 2 lines of code:
 <file rsplus| Differences between codes> <file rsplus| Differences between codes>
-col.vector ​== "​blue"​ +> char_vector ​== "​blue"​ 
-col.vector[col.vector == "blue"]+> char_vector[char_vector == "blue"]
 </​file>​ </​file>​
 ++++Challenge 9: Differences between codes| ++++Challenge 9: Differences between codes|
-In the first line of code you test a logical statement. For each entry in the "col.vector" vector, R checks whether the entry is exactly equal to "​blue"​ or not and returns a TRUE/FALSE answer. The next subsection introduces you to logical statements. In the second line of code you ask R to extract all values within the "col.vector" vector that are exactly equal to "​blue"​. It is also possible to extract the "​blue"​ value by assigning a logical value to each element of the vector. Of course, you have to know the position of the "​blue"​ value inside the vector.+In the first line of code you test a logical statement. For each entry in the "char_vector" vector, R checks whether the entry is exactly equal to "​blue"​ or not and returns a TRUE/FALSE answer. The next subsection introduces you to logical statements. In the second line of code you ask R to extract all values within the "char_vector" vector that are exactly equal to "​blue"​. It is also possible to extract the "​blue"​ value by assigning a logical value to each element of the vector. Of course, you have to know the position of the "​blue"​ value inside the vector.
 <file rsplus| Extracting values with logical indexing>​ <file rsplus| Extracting values with logical indexing>​
-col.vector[c(TRUE, FALSE, FALSE)] +> char_vector[c(TRUE, FALSE, FALSE)] 
-#! [1] "​blue"​ +[1] "​blue"​ 
-#We specify which value is true, +# We specify which value is true, 
-#i.e. the value we want R to return (the first one) +# i.e. the value we want R to return (the first one) 
-#which corresponds to "​blue"​.+# which corresponds to "​blue"​.
 </​file>​ </​file>​
 ++++ ++++
 ----- -----
 \\ \\
-For data frames, the concept of indexation is similar, but we usually have to specify two dimensions: the row and column numbers. The R syntax is\\ ''​dataframe[row number, column number]''​. Here are a few examples of data frame indexation. Note that the first four operations are also valid for indexing matrices.+For data frames, the concept of indexation is similar, but we usually have to specify two dimensions: the row and column numbers. The R syntax is\\ ''​dataframe[row number, column number]''​. ​\\Here are a few examples of data frame indexation. Note that the first four operations are also valid for indexing matrices.
  
 <file rsplus| Indexing a data frame> <file rsplus| Indexing a data frame>
-#Let's reuse the data frame we created earlier (my.first.df+# Let's reuse the data frame we created earlier (my_df
-#Extract the 1st row of the data frame +# Extract the 1st row of the data frame 
-my.first.df[1,] +> my_df[1, ] 
-#Extract the 3rd column + 
-my.first.df[,3] +# Extract the 3rd column 
-#Extract the 2nd element of the 4th column +> my_df[, 3] 
-my.first.df[2,4] + 
-#Extract lines 2 to 4 +# Extract the 2nd element of the 4th column 
-my.first.df[c(2:4),] +> my_df[2, 4] 
-#Extract the "Site ID" column by referring directly to its name + 
-#The dollar sign ($) allows such an operation! +# Extract lines 2 to 4 
-my.first.df$Site_ID +> my_df[c(2:4), ] 
-#Extract the "Site ID" and "Soil pH" variables + 
-my.first.df[,c("Site_ID","​soil.pH")]+# Extract the "site_id" column by referring directly to its name 
 +# The dollar sign ($) allows such an operation! 
 +> my_df$site_id 
 + 
 +# Extract the "site_id" and "soil_pH" variables 
 +> my_df[, c("site_id","​soil_pH")]
 </​file>​ </​file>​
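Row and column indexing can also be combined with a logical statement, in the same way as for vectors. The sketch below rebuilds the data frame so that it runs on its own; the thresholds chosen (soil pH above 5, fertilised sites) are arbitrary examples.

```r
# Rebuild the example data frame so this block is self-contained.
my_df <- data.frame(site_id   = c("A1.01", "A1.02", "B1.01", "B1.02"),
                    soil_pH   = c(5.6, 7.3, 4.1, 6.0),
                    num_sp    = c(17, 23, 15, 7),
                    treatment = c("Fert", "Fert", "No_fert", "No_fert"))

# Keep only the rows where soil pH is above 5 (all columns are returned).
high_pH <- my_df[my_df$soil_pH > 5, ]
nrow(high_pH)   # 3

# Extract the number of species for the fertilised sites only.
my_df$num_sp[my_df$treatment == "Fert"]   # 17 23
```

The logical statement goes in the row position, before the comma; leaving the column position empty keeps every column.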
-\\ 
------ 
-== CHALLENGE 10 == 
-> a) Extract the ''​num.sp''​ column from ''​my.first.df''​ and multiply its values by the first four values of the ''​num.vec''​ vector.\\ 
-> b) After that, write a statement that checks if the values you obtained are greater than 25. Refer to challenge 9 to complete this challenge.\\ 
-> 
-++++ Challenge 10a: Indexing and multiplying| 
-<code rsplus> 
-my.first.df$num.sp * num.vector[c(1:​4)] 
-#! [1] 17 46 75 21 
-</​code>​ 
-++++ 
  
-++++ Challenge 10b: Logical statement| +==== A quick note on logical statements ====
-<code rsplus>​ +
-(my.first.df$num.sp * num.vector[c(1:​4)]) > 25 +
-#! [1] FALSE TRUE TRUE FALSE +
-</​code>​ +
-+++++
 ----- -----
-\\ +R gives you the possibility to test logical statements, //i.e.// to evaluate whether a statement is true or false. You can compare objects with the following logical operators:
-==== A quick note on logical statements ==== +
- +
-Challenge 9 and 10 briefly introduced ​R'​s ​possibility to test logical statements, //i.e.// to evaluate whether a statement is true or false. You can compare objects with the following logical operators:+
  
 ^ Operator ^ Description ^ ^ Operator ^ Description ^
Line 433: Line 442:
  
 <file rsplus| Testing logical statements>​ <file rsplus| Testing logical statements>​
-#First, let's create two vectors for comparison. +# First, let's create two vectors for comparison. 
-x2 <- c(1:5) +x2 <- c(1:5) 
-y2 <- c(1,​2,​-7,​4,​5) +y2 <- c(1, 2, -7, 4, 5)
-#Let's verify if the elements in x2 are greater or equal to 3. + 
-#R returns a TRUE/FALSE value for each element (in order). +# Let's verify if the elements in x2 are greater or equal to 3. 
-x2 >= 3 +# R returns a TRUE/FALSE value for each element (in order). 
-#! [1] FALSE FALSE TRUE TRUE TRUE +x2 >= 3 
-#Let's see if the elements of x2 are exactly equal to those of y2. +[1] FALSE FALSE TRUE TRUE TRUE 
-x2 == y2 + 
-#! [1] TRUE TRUE FALSE TRUE TRUE +# Let's see if the elements of x2 are exactly equal to those of y2. 
-#Is 3 not equal to 4? Of course! +x2 == y2 
-3 != 4 +[1] TRUE TRUE FALSE TRUE TRUE 
-#! [1] TRUE + 
-#Let's see which values in x2 are greater than 2 but smaller than 5. +# Is 3 not equal to 4? Of course! 
-#You have to write x2 twice. +3 != 4 
-#If you write x2 > 2 & < 5, you will get an error message. +[1] TRUE 
-x2 > 2 & x2 < 5 + 
-#! [1] FALSE FALSE TRUE TRUE FALSE+# Let's see which values in x2 are greater than 2 but smaller than 5. 
 +# You have to write x2 twice. 
 +# If you write x2 > 2 & < 5, you will get an error message. 
 +x2 > 2 & x2 < 5 
 +[1] FALSE FALSE TRUE TRUE FALSE
 </​file>​ </​file>​
 +\\
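Logical statements are especially handy for indexing. As a minimal sketch (the object names ''is_large'' and ''x2_large'' are only illustrative, and the ''|'' "or" operator is assumed to be among the operators you want to use), a logical vector placed inside square brackets keeps only the elements for which the statement is TRUE:

<code rsplus>
# Recreate the vector so this example runs on its own.
x2 <- c(1:5)

# Store the result of a logical statement in an object.
is_large <- x2 >= 3

# Use the logical vector to keep only the elements that are TRUE.
x2_large <- x2[is_large]
x2_large
# [1] 3 4 5

# The | operator means "or": values smaller than 2 or greater than 4.
x2[x2 < 2 | x2 > 4]
# [1] 1 5
</code>

Storing the logical vector first is optional; ''x2[x2 >= 3]'' gives the same result in one step.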
 +-----
 +== CHALLENGE 10 ==
 +> a) Extract the ''​num_sp''​ column from ''​my_df''​ and multiply its values by the first four values of the ''​num_vec''​ vector.\\
 +> b) After that, write a statement that checks if the values you obtained are greater than 25. Refer to challenge 9 to complete this challenge.\\
 +>
 +++++ Challenge 10a: Indexing and multiplying|
 +<code rsplus>
 +> my_df$num_sp * num_vec[c(1:4)]
 +[1]  17  92  45 686
 +# or
 +> my_df[, 3] * num_vec[c(1:4)]
 +[1]  17  92  45 686
 +
 +</​code>​
 +++++
 +
 +++++ Challenge 10b: Logical statement|
 +<code rsplus>
 +> (my_df$num_sp * num_vec[c(1:4)]) > 25
 +[1] FALSE TRUE TRUE TRUE
 +</​code>​
 +++++
  
 ===== Functions ===== ===== Functions =====
Line 463: Line 499:
  
 To perform the function call you will need input values called **arguments** (or sometimes parameters). To perform the function call you will need input values called **arguments** (or sometimes parameters).
-After performing its operations, the function will then give you a return ​**value**.+After performing its operations, the function will then give you a **return ​value**.
 The command also must be structured properly, following the "​grammar rules" of the R language (syntax). The command also must be structured properly, following the "​grammar rules" of the R language (syntax).
  
Line 470: Line 506:
 ''//​function_name//​**(**arg1**,​** arg2**, ...)**''​ ''//​function_name//​**(**arg1**,​** arg2**, ...)**''​
  
-Ex: +Here is an example:
 <file rsplus| Function syntax> <file rsplus| Function syntax>
-sum(1, 2)+sum(1, 2)
 </​file>​ </​file>​
  
Line 480: Line 516:
  
 <file rsplus| Objects as arguments>​ <file rsplus| Objects as arguments>​
-a <- 3 +a <- 3 
-b <- 4 +b <- 5 
-sum(a, b) +sum(a, b) 
-#! [1] 7+[1] 8
 </​file>​ </​file>​
  
-On the last line, the output that appears is the **return value** of the function. In this case, it is the sum of ''​a''​ and ''​b'', ​7.\\+On the last line, the output that appears is the **return value** of the function. In this case, it is the sum of ''​a''​ and ''​b'', ​8.\\
 \\ \\
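A return value does not have to be printed right away. The short sketch below (the object name ''total'' is just an illustration) stores the return value of ''sum()'' in an object and then reuses it in another function call:

<code rsplus>
a <- 3
b <- 5

# Store the return value of sum() in a new object.
total <- sum(a, b)
total
# [1] 8

# The stored value can be used as an argument in another function call.
sqrt(total + 1)
# [1] 3
</code>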
- 
 ----- -----
 == CHALLENGE 11 == == CHALLENGE 11 ==
Line 502: Line 537:
 ++++Challenge 11a: Calling functions| ++++Challenge 11a: Calling functions|
 <file rsplus Challenge11a>​ <file rsplus Challenge11a>​
-a <- 1:5 +a <- 1:5 
-b <- 2 +b <- 2 
-result_add <- a + b +result_add <- a + b 
-result_sum <- sum(a, b)+result_sum <- sum(a, b)
 </​file>​ </​file>​
 <code rsplus> <code rsplus>
-result_add +result_add 
-#! [1]  3 4 5 6 7+[1]  3 4 5 6 7
 </​code>​ </​code>​
 The operation on the vector adds 2 to each element. The result is a **vector**. \\ The operation on the vector adds 2 to each element. The result is a **vector**. \\
  
 <code rsplus> <code rsplus>
-result_sum +result_sum 
-#! [1] 17+[1] 17
 </​code>​ </​code>​
 The function ''​sum()''​ adds all values of ''​a''​ and ''​b''​. It is the same as doing 1 + 2 + 3 + 4 + 5 + 2. The result is a **number**.\\ The function ''​sum()''​ adds all values of ''​a''​ and ''​b''​. It is the same as doing 1 + 2 + 3 + 4 + 5 + 2. The result is a **number**.\\
Line 522: Line 557:
 ++++Challenge 11b: Calling functions| ++++Challenge 11b: Calling functions|
 <code rsplus> <code rsplus>
-sum(result_sum,​ 5) +sum(result_sum,​ 5) 
-#! [1] 22+[1] 22
 </​code>​ </​code>​
 ++++ ++++
Line 536: Line 571:
  
 <file rsplus| Argument name> <file rsplus| Argument name>
-log(8, base=2)+log(8, base = 2)
 </​file>  ​ </​file>  ​
 \\ \\
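To see why argument names matter, here is a small sketch using ''log()'', whose two arguments are named ''x'' and ''base''. When the names are given, the order is free; when they are left out, R matches the values by position:

<code rsplus>
# Arguments matched by position: x comes first, base second.
log(8, 2)
# [1] 3

# Arguments matched by name: the order no longer matters.
log(base = 2, x = 8)
# [1] 3

# Swapping unnamed arguments changes the result (log of 2 in base 8).
log(2, 8)
# [1] 0.3333333
</code>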
- 
 ----- -----
 == CHALLENGE 12 == == CHALLENGE 12 ==
Line 546: Line 580:
  
 <file rsplus| Challenge 12> <file rsplus| Challenge 12>
-a <- 1:100 +a <- 1:100 
-b <- a^2 +b <- a^2 
-plot(a, b) +plot(a, b) 
-plot(b, a) +plot(b, a) 
-plot(x=a, y=b) +plot(x = a, y = b) 
-plot(y=b, x=a)+plot(y = b, x = a)
 </​file>​ </​file>​
  
Line 564: Line 598:
 Plots the graph of ''a'' as a function of ''b''. Because the argument names are not provided, the order of the arguments matters. Plots the graph of ''a'' as a function of ''b''. Because the argument names are not provided, the order of the arguments matters.
  
-<code rsplus>​plot(x=a,​ y=b)</​code>​+<code rsplus>​plot(x = a, y = b)</​code>​
 {{:​pltxayb.png?​200|}}\\ {{:​pltxayb.png?​200|}}\\
 Plots the graph of ''​b''​ as a function of ''​a'',​ same as plot(a, b). Plots the graph of ''​b''​ as a function of ''​a'',​ same as plot(a, b).
  
-<code rsplus>​plot(y=b,​ x=a)</​code>​+<code rsplus>​plot(y = b, x = a)</​code>​
 {{:​plotybxa.png?​200|}}\\ {{:​plotybxa.png?​200|}}\\
 Plots the graph of ''​b''​ as a function of ''​a''​. The argument names are provided, the order of the arguments does not matter. Plots the graph of ''​b''​ as a function of ''​a''​. The argument names are provided, the order of the arguments does not matter.
Line 574: Line 608:
 ----- -----
 \\ \\
- 
 As a reference, here is a list of some of the most common R functions: As a reference, here is a list of some of the most common R functions:
  
Line 583: Line 616:
 help (or ?), help.search (or ??), help.start help (or ?), help.search (or ??), help.start
 </​code>​ </​code>​
-\\ 
  
 ==== Packages ==== ==== Packages ====
- +-----
 Packages are groupings of functions and/or datasets that share a similar theme, e.g. statistics, spatial analysis, or plotting. Packages are groupings of functions and/or datasets that share a similar theme, e.g. statistics, spatial analysis, or plotting.
  
Line 599: Line 630:
  
 <code rsplus> <code rsplus>
-install.packages("​ggplot2"​)+install.packages("​ggplot2"​)
 </​code>​ </​code>​
  
Line 606: Line 637:
  
 <code rsplus> <code rsplus>
-qplot(1:10, 1:10)+qplot(1:10, 1:10)
 </​code>​ </​code>​
  
Line 616: Line 647:
  
 <code rsplus> <code rsplus>
-library("​ggplot2"​) +library("​ggplot2"​) 
-qplot(1:10, 1:10)+qplot(1:10, 1:10)
 </​code>​ </​code>​
  
Line 625: Line 656:
 It is good practice to unload packages once we are done with them because they might conflict with other packages. Unloading a package is done with the ''detach()'' function and by specifying that it is a package: It is good practice to unload packages once we are done with them because they might conflict with other packages. Unloading a package is done with the ''detach()'' function and by specifying that it is a package:
 <file rsplus| Unloading a package> <file rsplus| Unloading a package>
-detach(package:​ggplot2)+detach(package:​ggplot2)
 </​file>​ </​file>​
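If you want to check whether a package is currently loaded, one possible approach is to look at the output of ''search()''. The sketch below uses ''splines'' only because it ships with R and needs no installation; it assumes ''splines'' is not already loaded in your session and that no other loaded package depends on it:

<code rsplus>
# splines comes with R, so no install.packages() call is needed.
library(splines)

# search() lists attached packages; the entry is named "package:splines".
"package:splines" %in% search()
# [1] TRUE

# After detaching, the package no longer appears in the search path.
detach(package:splines)
"package:splines" %in% search()
# [1] FALSE
</code>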
 ===== Getting help and additional resources ===== ===== Getting help and additional resources =====
  
 ==== Getting help with functions ==== ==== Getting help with functions ====
- +-----
 We've seen so far that R is really great and offers us a lot of functions to work with. Among all these functions, there are probably some that can do what we want. We've seen so far that R is really great and offers us a lot of functions to work with. Among all these functions, there are probably some that can do what we want.
  
Line 640: Line 670:
  
 <code rsplus Searching for a function>​ <code rsplus Searching for a function>​
-??sequence+??sequence
 </​code>​ </​code>​
  
Line 664: Line 694:
  
 <code rsplus Finding help> <code rsplus Finding help>
-?seq+?seq
 </​code>​ </​code>​
  
Line 681: Line 711:
   * **See Also**: Related functions that can sometimes be of use, especially when searching for the correct function for our needs.   * **See Also**: Related functions that can sometimes be of use, especially when searching for the correct function for our needs.
   * **Examples**:​ Some examples on how to use the function(s)   * **Examples**:​ Some examples on how to use the function(s)
- 
 \\ \\
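Besides reading the help page, a few functions let you explore a function directly from the console. This is only a sketch of some options; ''sort'' and ''seq'' are used here as examples:

<code rsplus>
# args() prints the argument names and default values of a function.
args(sort)

# formals() returns the same information as a list that can be inspected.
# For sort(), the names are x, decreasing and ...
names(formals(sort))

# example() runs the code from the Examples section of a help page.
example(seq)
</code>

Running the examples is often the quickest way to see what the arguments described in the help page actually do.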
 ----- -----
Line 695: Line 724:
 ++++ Challenge 13a | ++++ Challenge 13a |
 <code rsplus> <code rsplus>
-seq(from=0, to=10, by=2) +seq(from = 0, to = 10, by = 2) 
-#! [1] 0 2 4 6 8 10+[1] 0 2 4 6 8 10
 </​code>​ </​code>​
  
Line 702: Line 731:
  
 <code rsplus> <code rsplus>
-seq(0, 10, 2) +seq(0, 10, 2) 
-#! [1] 0 2 4 6 8 10+[1] 0 2 4 6 8 10
 </​code>​ </​code>​
 ++++ ++++
Line 709: Line 738:
 ++++ Challenge 13b | ++++ Challenge 13b |
 <code rsplus> <code rsplus>
-numbers <- c(4, 55, 6, 22, 3) +numbers <- c(2, 4, 22, 6, 26)
-sort(numbers, decreasing=TRUE) +sort(numbers, decreasing = TRUE)
-#! [1] 55 22 6 4 3 +[1] 26 22 6 4 2
 </​code>​ </​code>​
 ++++ ++++
------ 
-\\ 
  
 ==== Getting help on the Web ==== ==== Getting help on the Web ====
 +-----
 Usually, your best source of information will be your favorite search engine (Google, Bing, Yahoo, etc.) Usually, your best source of information will be your favorite search engine (Google, Bing, Yahoo, etc.)
  
Line 729: Line 756:
  
 \\ \\
- 
 ----- -----
 == Challenge 14 == == Challenge 14 ==
Line 745: Line 771:
 d) ''​ls''​ \\ d) ''​ls''​ \\
 ++++ ++++
 +==== Some useful books on R ====
 ----- -----
-\\ 
-==== Some useful books on R ==== 
 Dalgaard, P. - Introductory Statistics with R.\\ Dalgaard, P. - Introductory Statistics with R.\\
 Zuur, A.F., Ieno, E.N. & Meesters, E. - A Beginner'​s Guide to R.\\ Zuur, A.F., Ieno, E.N. & Meesters, E. - A Beginner'​s Guide to R.\\
Line 755: Line 780:
  
 ==== Some useful websites ==== ==== Some useful websites ====
 +-----
 http://​stats.stackexchange.com/​ \\ http://​stats.stackexchange.com/​ \\
 https://​www.zoology.ubc.ca/​~schluter/​R/​ \\ https://​www.zoology.ubc.ca/​~schluter/​R/​ \\
Line 763: Line 789:
  
 ==== R script reference ====  ==== R script reference ==== 
 +-----
 Want to revise/​practice the material seen here at home? Want to revise/​practice the material seen here at home?
  
 [[{}{ :​referencescriptworkshop1.r }|Download R script]] [[{}{ :​referencescriptworkshop1.r }|Download R script]]