Differences

r_workshop4 [2018/09/26 16:17]
shaun.turney [1.4 Test statistics and p-values]
r_workshop4 [2019/08/08 17:52] (current)
mariehbrice [Workshop 4: Linear models]
Line 9: Line 9:
 ====== Workshop 4: Linear models ======
 
-Developed by: Catherine Baltazar, Bérenger Bourgeois, Zofia Taranu, Shaun Turney, William Vieira
+Developed by: Catherine Baltazar, Bérenger Bourgeois, Zofia Taranu, Shaun Turney, Willian Vieira
 
 **Summary:** In this workshop, you will learn how to implement basic linear models commonly used in ecology in R such as simple regression, analysis of variance (ANOVA), analysis of covariance (ANCOVA), and multiple regression. After verifying visually and statistically the assumptions of these models and transforming your data when necessary, the interpretation of model outputs and the plotting of your final model will no longer keep secrets from you!
  
-Link to associated Prezi: [[https://prezi.com/qk2xegtlj44b/|Prezi]]
+**Link to new [[https://qcbsrworkshops.github.io/workshop04/workshop04-en/workshop04-en.html|Rmarkdown presentation]]**
+
+Link to old [[https://prezi.com/qk2xegtlj44b/|Prezi presentation]]
 
 Download the R script and data for this lesson:
Line 103: Line 105:
 Below we will explore several kinds of linear models. The way you create and interpret each model will differ in the specifics, but the principles behind them and the general work flow will remain the same. For each model we will work through the following steps:
 
-  - Plot the data
+  - Visualize the data (data visualization could also come later in your work flow)
   - Create a model
   - Test the model assumptions
Line 160: Line 162:
 ^ AvgAbund | The average abundance across all sites\\ where found in NA|Continuous/ numeric|
 ^ Mass     | The body size in grams| Continuous/ numeric|
-^ Diet     | Type of food consumed| Discrete – 5 levels (Plant; PlantInsect;\\ Insect; InserctVert; Vertebrate)|
+^ Diet     | Type of food consumed| Discrete – 5 levels (Plant; PlantInsect;\\ Insect; InsectVert; Vertebrate)|
 ^ Passerine| Is it a songbird/ perching bird| Boolean (0/1)|
 ^ Aquatic  | Is it a bird that primarily lives in/ on/ next to the water| Boolean (0/1)|
Line 215: Line 217:
  
 <code rsplus | Testing Normality: hist() function>
-# Plot Y ~ X and the regression line
 # Plot Y ~ X and the regression line
 plot(bird$MaxAbund ~ bird$Mass, pch=19, col="coral", ylab="Maximum Abundance",
Line 689: Line 690:
 ==== 3.6 Complementary test ====
 
-Importantly, ANOVA cannot identify which treatment is different from the others in terms of response variable. To determine this, post-hoc tests that compare the levels of the explanatory variables (i.e. the treatments) two by two, must be performed. While several post-hoc tests exist (e.g. Fischer’s least significant difference, Duncan’s new multiple range test, Newman-Keuls method, Dunnett’s test, etc.), the Tukey’s range test is used in this example using the function ''TukeyHSD'' as follows:
+Importantly, ANOVA cannot identify which treatment is different from the others in terms of response variable. It can only identify that a difference is present. To determine the location of the difference(s), post-hoc tests that compare the levels of the explanatory variables (i.e. the treatments) two by two, must be performed. While several post-hoc tests exist (e.g. Fischer’s least significant difference, Duncan’s new multiple range test, Newman-Keuls method, Dunnett’s test, etc.), the Tukey’s range test is used in this example using the function ''TukeyHSD'' as follows:
 
 <code rsplus| Post-hoc Tukey Test>
Line 1022: Line 1023:
 ==== 6.1 Assumptions ====
 
-As with models seen above, to be valid ANCOVA models must meet the statistical assumptions of linear models that can be verified using diagnostic plots, i.e.
-  - Normal distribution of the model residuals
-  - Homoscedasticty of the residual variance
-  - Independence of the residuals
-  - Equal variance between different levels of a given factor
-In addition, ANOVA models must have:
+As with models seen above, to be valid ANCOVA models must meet the statistical assumptions of linear models that can be verified using diagnostic plots. In addition, ANCOVA models must have:
   - The same value range for all covariates
   - Variables that are //fixed//
Line 1284: Line 1280:
 ----
 
-CHALLENGE 7
+**CHALLENGE 7**
+
+Compare the different polynomial models in the previous example, and determine which model is the most appropriate. Extract the adjusted R squared, the regression coefficients, and the p-values of this chosen model.
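
The extraction step this challenge asks for could look roughly like the sketch below. It is not the challenge solution: it uses R's built-in ''cars'' data set instead of the workshop data, the object names (''lm_linear'', ''lm_quad'', ''lm_cubic'') are hypothetical, and the quadratic model is picked only to illustrate the calls.

<code rsplus | Sketch: comparing polynomial models and extracting summary values (assumed example data)>
# Assumption: built-in 'cars' data (speed, dist), not the workshop's data set
lm_linear <- lm(dist ~ speed, data = cars)
lm_quad   <- lm(dist ~ speed + I(speed^2), data = cars)
lm_cubic  <- lm(dist ~ speed + I(speed^2) + I(speed^3), data = cars)

# Nested models can be compared with sequential F-tests
model_comparison <- anova(lm_linear, lm_quad, lm_cubic)
print(model_comparison)

# Extract values from the summary of one candidate model
quad_summary <- summary(lm_quad)
adj_r2    <- quad_summary$adj.r.squared   # adjusted R squared
coef_tab  <- coef(quad_summary)           # Estimate, Std. Error, t value, Pr(>|t|)
estimates <- coef_tab[, "Estimate"]       # regression coefficients
p_values  <- coef_tab[, "Pr(>|t|)"]       # p-value of each coefficient
</code>

The same ''summary()'' and ''coef()'' calls should apply to any ''lm'' object, so the pattern can be reused once a model has been chosen.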
  
 ++++ Challenge 7: Solution|