r_symposium [CSBQ-QCBS Wiki]

In 2017, the QCBS student network will hold, for the first time, a two-day symposium/retreat on the use of the R computing language in biodiversity science. The objective of this symposium is to present several advanced methods that are not covered in the regular series of QCBS R workshops. The symposium will take place at McGill University's Gault Nature Reserve in Mont-Saint-Hilaire, Montérégie. This event will allow the community of R users from all QCBS geopoles to interact in a casual setting while learning several current quantitative and computational methods in ecology. Housing in cottages (dorms), catered meals, and transport from Montreal will be provided for all participants. The number of participants is limited to 32, the housing capacity of the cottages at the reserve. Although the symposium is free, we ask for a $40 deposit, refundable at the event or if you cancel at least two weeks before the event. The program for the current edition of the symposium is detailed below. Please note that the workshops will be held mainly in English.

Questions? Contact Vincent Fugère, Dalal Hanna, or Krista Oke.

Registration: http://qcbs.ca/training/qcbs-2017-r-symposium-registration/

April 24th

10h00: Departure from McGill
11h00: Arrival at Gault
12h00: Lunch
13h00: Intro to Bayesian inference (Marc-Olivier Beausoleil & Max Farrell)
15h00: Coffee break
15h20: Intro to gene expression analysis (Sébastien Renaut)
17h30: Dinner
18h30: Open Science, GitHub & R Markdown (Monica Granados)

April 25th

8h00: Breakfast
9h00: Predicting species geographical distributions (Julia Nordlund & Pedro Henrique Pereira Braga)
10h30: Break
10h45: Joint modelling (Guillaume Blanchet)
12h45: Lunch
13h30: Making an R package (Tyler Moulton)
15h45: Departure from Gault

Abstracts

The Bayesian Biologist: You are probably more Bayesian than you think, by Max Farrell & Marc-Olivier Beausoleil. Jump with us into the world of probabilities with a workshop on Bayesian inference. We will explore this statistical framework with simple and meaningful examples for biologists. We plan to guide you through some theory, history, and applications, and to get your hands dirty with some code. By the end of the workshop, you’ll be convinced that Bayesian statistics is a powerful framework for interpreting the world, and you will get a taste of the ways you might implement it in your own research. Do you have topics you would like us to discuss? Fill out this survey: https://goo.gl/forms/By3aMFtNaxLJ2ICB2 or email Max or Marco; we would like to hear your ideas!

Intro to gene expression analysis in R, by Sébastien Renaut. Next-generation sequencing has promised cheap DNA sequences to the masses. While this may be true, the bottleneck has now shifted from generating data to analyzing it. Here, I will use transcriptome sequencing data (RNA-seq) to quantify gene expression. I will introduce data formats commonly used in genomics (e.g. .fastq, .bam, .sam) and use the R programming language to identify differentially expressed genes (e.g. with the DESeq2 and edgeR packages), cluster samples based on gene expression, detect gene ontology categories that are over- or under-represented (goseq), and present various graphics to illustrate the results.

Open Science and Reproducibility in R, by Monica Granados. Imagine if every paper you publish from now on could be reproduced by anyone around the world. Or imagine a platform that gives you the power to integrate new data seamlessly into a manuscript complete with text and figures. In this workshop, we will cover how to work in the open using R, R Markdown and GitHub. These three open platforms allow us to host data, analyze it, visualize it and produce a manuscript in one reproducible workflow. You will learn how to set up a repository on GitHub and manage branches, draw data from GitHub into R, write an R Markdown script for your manuscript, and upload the R Markdown script to GitHub for reproducibility. The advantages of open, reproducible science are many. When working collaboratively, reproducible workflows allow collaborators to contribute simultaneously to the project, with version control preserving the different iterations of the project. Working in the open also allows you to share your research more widely, facilitating collaborative opportunities. At the end of the workshop, we will also discuss the wider open science movement, how it is helping to break down economic barriers in science and education, and how you can contribute.

Predicting species geographical distribution using R, by Pedro Henrique P. Braga and Julia Nordlund. Species distribution models (SDMs) have been widely applied to address many questions in biology, in domains such as ecology, evolution, biogeography and conservation. Applications are numerous and include projecting potential impacts of climate change, predicting species invasions, conservation planning, addressing questions of ecological niche evolution, and estimating potential disease spread. As species distribution models have grown in popularity, many methods and tools have been developed over the last decades. Most of these tools are now available as R packages. This course introduces the fundamental concepts underpinning species distribution models, describes some of the most prominent methods currently in use, and discusses the strengths and limitations of these models for different applications.

Joint modelling, by Guillaume Blanchet. Natural systems are complex and understanding them is a challenging task. In recent years, there has been an explosion in the amount of data gathered and made available that can potentially increase our knowledge of why and how species are distributed as they are. It is now possible to obtain highly precise environmental and habitat characteristics for large areas of the world, traits are available for a wealth of species, and high-quality phylogenies can be obtained for large groups of species. But how can we link these data together to better understand and predict the distribution of multiple species in a single model? In recent years, joint species distribution models (JSDMs) have emerged as an attractive way to approach such questions. In this workshop, I will show you how to construct JSDMs using Bayesian hierarchical models. I will also briefly discuss the concept behind hierarchical models and how they can be used in a community ecology context.

R package development using ‘roxygen2’ and ‘devtools’, by Tyler Moulton. The QCBS community is rife with brilliant R programmers. Currently, most of the QCBS workshop series is devoted to teaching attendees how to use R and popular R packages. Many QCBS members, however, go further and develop their own custom functions tailored to their projects. Aggregating these functions into packages can be extremely useful. Packages can be easily shared with other researchers who wish to conduct similar analyses. They also improve methodological transparency and repeatability. Finally, publishing an R package on CRAN and/or GitHub is a nice accomplishment to have on your CV. I would like to give a workshop on R package development using two packages that greatly simplify the process: ‘roxygen2’ and ‘devtools’. During the workshop, I will walk participants through the development of a simple package that conforms to CRAN submission standards, including proper documentation, package imports/dependencies, DESCRIPTION files, example data, version control with git, and how to check and submit your package to CRAN. I have spent the past few months struggling through teaching myself package development, and would love to give this workshop to spare others (some of) the headaches associated with this process.
