Differences

This shows you the differences between two versions of the page.

r_workshop5 [2018/11/07 21:02]
katherinehebert [Iteration]
r_workshop5 [2019/08/08 17:55] (current)
mariehbrice [Workshop 5: Programming in R]
Line 13: Line 13:
 **Summary:** This workshop focuses on basic programming in R. In this workshop, you will learn how to use control flow (for loops, if, while) methods to prevent code repetition, facilitate organization and run simulations. In addition, you will learn to write your own functions, and tips to program efficiently. The last part of the workshop will discuss packages that will not be covered elsewhere in this workshop series, but that may be of interest to participants. **Summary:** This workshop focuses on basic programming in R. In this workshop, you will learn how to use control flow (for loops, if, while) methods to prevent code repetition, facilitate organization and run simulations. In addition, you will learn to write your own functions, and tips to program efficiently. The last part of the workshop will discuss packages that will not be covered elsewhere in this workshop series, but that may be of interest to participants.
  
-**Link to new [[https://​qcbsrworkshops.github.io/Workshops/​workshop05/​workshop05-en/​workshop05-en.html|Rmarkdown presentation]]**+**Link to new [[https://​qcbsrworkshops.github.io/​workshop05/​workshop05-en/​workshop05-en.html|Rmarkdown presentation]]**
  
 Link to old [[https://​prezi.com/​xuu2rphp5wg4/​|Prezi presentation]] Link to old [[https://​prezi.com/​xuu2rphp5wg4/​|Prezi presentation]]
Line 289: Line 289:
 </​code>​ </​code>​
  
-Tip1. To loop over the number of rows of a data frame, we can use the function ''​nrow()''​+**Tip 1.** To loop over the number of rows of a data frame, we can use the function ''​nrow()''​
  
 <code rsplus> <code rsplus>
Line 297: Line 297:
 </​code>​ </​code>​
  
-Tip2. If we want to perform operations on the elements of one column, we can directly iterate over it+**Tip 2.** If we want to perform operations on the elements of one column, we can directly iterate over it
  
 <code rsplus> <code rsplus>
Line 305: Line 305:
 </​code>​ </​code>​
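As a minimal sketch of Tips 1 and 2, assuming the built-in ''CO2'' dataset used elsewhere in this workshop (the names ''first_rows'' and ''total'' are illustrative only and do not come from the workshop material):

<code rsplus>
data(CO2)
first_rows <- CO2[1:3, ]  # small illustrative subset of the CO2 dataset

# Tip 1: loop over the row indices of a data frame.
# seq_len(nrow(x)) gives 1:nrow(x) here, and is also safe when x has zero rows.
for (i in seq_len(nrow(first_rows))) {
  print(first_rows$conc[i])
}

# Tip 2: iterate directly over the elements of one column.
total <- 0
for (value in first_rows$conc) {
  total <- total + value
}
print(total)  # sum of the three concentrations printed above
</code>

Both loops visit the same values; the second form avoids indexing when the row number itself is not needed.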
  
-The expression within the loop can be almost anything and is usually a compound statement containing many commands. +**Tip 3.** The expression within the loop can be almost anything and is usually a compound statement containing many commands.
- +
-<code rsplus>​ +
-for (i in 4:5) { # for i in 4 to 5 +
-  print(colnames(CO2)[i]) ​  +
-  print(mean(CO2[,​i])) # print the mean of that column from the CO2 dataset +
-+
-</​code>​ +
- +
-Output:+
  
 <code rsplus> <code rsplus>
Line 334: Line 325:
 } }
 </​code>​ </​code>​
- 
-<code rsplus> 
-# Output 
- 
-for (i in 1:3) { 
-  for (n in 1:3) { 
-    print (i*n) 
-  } 
-} 
-</​code>​ 
- 
  
 ==== Getting good: using the ''​apply()''​ family ====  ==== Getting good: using the ''​apply()''​ family ==== 
Line 357: Line 337:
                  nrow = 5,                   nrow = 5, 
                  ncol = 4))                  ncol = 4))
- 
 } }
- 
 apply(X = height, ​ apply(X = height, ​
       MARGIN = 1,        MARGIN = 1, 
-      FUN = mean) +      FUN = mean)     
-      ​ +?​apply ​      ​
-?​apply ​  +
-      ​+
 </​code>​ </​code>​
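To make the role of ''MARGIN'' concrete, here is a small sketch on a hypothetical 3 x 2 matrix (the object ''m'' is an assumption for illustration, not part of the workshop data):

<code rsplus>
# Hypothetical matrix, filled column by column: rows are (1, 4), (2, 5), (3, 6)
m <- matrix(1:6, nrow = 3, ncol = 2)

# MARGIN = 1 applies the function to each row: one result per row
row_means <- apply(X = m, MARGIN = 1, FUN = mean)

# MARGIN = 2 applies the function to each column: one result per column
col_means <- apply(X = m, MARGIN = 2, FUN = mean)

print(row_means)
print(col_means)
</code>

For means specifically, base R also provides ''rowMeans()'' and ''colMeans()'', which should give the same values.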
  
 ==== apply() ====  ==== apply() ==== 
  
-<code rsplus> + 
-lapply() applies a function to every element of a list.+''​lapply()'' ​applies a function to every element of a **list**.
  
 It may be used for other objects like dataframes, lists or vectors. It may be used for other objects like dataframes, lists or vectors.
  
-The output returned is a list (explaining the “l” in lapply) and has the same number of elements as the object passed to it. +The output returned is a **list** (explaining the “l” in lapply) and has the same number of elements as the object passed to it.
  
 +<code rsplus>
 SimulatedData <- list( SimulatedData <- list(
   SimpleSequence = 1:4,    SimpleSequence = 1:4, 
Line 412: Line 388:
  
 # Apply mean to each element of the list  # Apply mean to each element of the list 
-sapply(SimulatedData,​ mean) 
-<​code/>​ 
- 
-<code rsplus> 
-SimulatedData <- list(SimpleSequence = 1:4,  
-             ​Norm10 = rnorm(10), ​ 
-             ​Norm20 = rnorm(20, 1),  
-             ​Norm100 = rnorm(100, 5)) 
- 
-# Output 
 sapply(SimulatedData,​ mean) sapply(SimulatedData,​ mean)
 </​code>​ </​code>​
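The difference between the two calls is mostly in the shape of the result. The following sketch uses a shortened, hypothetical version of the list above (the name ''sim'' and the seed are assumptions) to compare them:

<code rsplus>
set.seed(1)  # arbitrary seed so the random draws are repeatable
sim <- list(SimpleSequence = 1:4,
            Norm10 = rnorm(10))

means_list <- lapply(sim, mean)  # always returns a list
means_vec  <- sapply(sim, mean)  # simplifies to a named vector when it can

class(means_list)
class(means_vec)
names(means_vec)
</code>

''sapply()'' is convenient interactively; ''lapply()'' is more predictable when the results have differing lengths, because its output is always a list.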
Line 439: Line 405:
 </​code> ​ </​code> ​
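''mapply()'' is not limited to built-in functions such as ''sum''. As a hedged sketch, reusing the ''lilySeeds'' and ''poppySeeds'' vectors with a hypothetical anonymous function (''lily_share'' is an illustrative name):

<code rsplus>
lilySeeds <- c(80, 65, 89, 23, 21)
poppySeeds <- c(20, 35, 11, 77, 79)

# The function receives the 1st elements of both vectors, then the 2nd, etc.
lily_share <- mapply(function(lily, poppy) lily / (lily + poppy),
                     lilySeeds, poppySeeds)
print(lily_share)
</code>

Because division and addition are already vectorised in R, ''lilySeeds / (lilySeeds + poppySeeds)'' should give the same result; ''mapply()'' becomes useful when the function itself is not vectorised.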
  
-<code rsplus> ​ 
-lilySeeds <- c(80, 65, 89, 23, 21) 
-poppySeeds <- c(20, 35, 11, 77, 79) 
- 
-# Output 
-mapply(sum, lilySeeds, poppySeeds) 
-</​code> ​ 
  
 ==== tapply() ====  ==== tapply() ==== 
Line 534: Line 493:
 print(count) # The count and print command were performed 42 times. print(count) # The count and print command were performed 42 times.
 </​code>​ </​code>​
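One way to check a loop that uses ''next'' is to compare its count with a vectorised expression. A minimal sketch, assuming the built-in ''CO2'' dataset (''count_vectorised'' is an illustrative name):

<code rsplus>
data(CO2)

# Loop version: skip the nonchilled rows, count the rest
count <- 0
for (i in seq_len(nrow(CO2))) {
  if (CO2$Treatment[i] == "nonchilled") next
  count <- count + 1
}

# Vectorised version: TRUE counts as 1 when summed
count_vectorised <- sum(CO2$Treatment != "nonchilled")

print(count)
print(count_vectorised)
</code>

The two values should agree; if they do not, the condition in front of ''next'' is the first place to look.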
- 
-count <- 0 
-  
-<code rsplus> 
-for (i in 1:​nrow(CO2)) { 
-  if (CO2$Treatment[i] == "​nonchilled"​) next  
-  # Skip to next iteration if treatment is nonchilled 
-  count <- count + 1 
-} 
-print(count) # The count and print command were performed 42 times. 
-</​code> ​ 
  
 <code rsplus> ​ <code rsplus> ​
Line 571: Line 519:
 This could also be written using a ''​while''​ loop: This could also be written using a ''​while''​ loop:
 <code rsplus> ​ <code rsplus> ​
- 
 i <- 0 i <- 0
 count <- 0 count <- 0
Line 647: Line 594:
 </​code> ​ </​code> ​
  
------------ 
-<code rsplus> 
-plot(x=CO2$conc,​ y=CO2$uptake,​ type="​n",​ cex.lab=1.4,​ xlab="​CO2 Concentration",​ ylab="​CO2 Uptake"​) # Type "​n"​ tells R to not actually plot the points. 
-  
-for (i in 1:​length(CO2[,​1])) { 
-  if (CO2$Type[i] == "​Quebec"​ & CO2$Treatment[i] == "​nonchilled"​) { 
-    points(CO2$conc[i],​ CO2$uptake[i],​ col = "​red"​) 
-  } 
-  if (CO2$Type[i] == "​Quebec"​ & CO2$Treatment[i] == "​chilled"​) { 
-    points(CO2$conc[i],​ CO2$uptake[i],​ col = "​blue"​) 
-  } 
-  if (CO2$Type[i] == "​Mississippi"​ & CO2$Treatment[i] == "​nonchilled"​) { 
-    points(CO2$conc[i],​ CO2$uptake[i],​ col = "​orange"​) 
-  } 
-  if (CO2$Type[i] == "​Mississippi"​ & CO2$Treatment[i] == "​chilled"​) { 
-    points(CO2$conc[i],​ CO2$uptake[i],​ col = "​green"​) 
-  } 
-} 
-</​code> ​ 
  
 ==== Challenge 4 ====  ==== Challenge 4 ==== 
Line 989: Line 917:
 Proper indentation and spacing are the first step towards easy-to-read code: Proper indentation and spacing are the first step towards easy-to-read code:
  
-Use spaces between and after you operators; +  * Use spaces between and after your operators
-Use consistentely the same assignation operator. <- is often preferred. = is OK, but do not switch all the time between the two; +  * Consistently use the same assignment operator. <- is often preferred. = is OK, but do not switch all the time between the two
-Use brackets when using flow control statements:​ +  ​* ​Use brackets when using flow control statements:​ 
- +    ​* ​Inside brackets, indent by *at least* two spaces; 
-Inside brackets, indent by *at least* two spaces; ​Put closing brackets on a separate line, except when preceding an `else` statement.  +    * Put closing brackets on a separate line, except when preceding an `else` statement.  
-Define each variable on its own line.+  ​* ​Define each variable on its own line.
  
-code is not spaced. All brackets are in the same line, and it looks "​messy"​. ​+This code is not spaced, and therefore hard to read. All brackets are badly aligned, and it looks "​messy"​. ​
 <code rsplus> <code rsplus>
 a<-4;b=3 a<-4;b=3
Line 1004: Line 932:
 </​code> ​ </​code> ​
  
-it looks more organized, no?+This looks more organized, no?
  
 <code rsplus> <code rsplus>