Introduction to Open Source Geographic information systems with QGIS and GRASS

Quebec Centre for Biodiversity Science, https://qcbs.ca

Guillaume Larocque, Research Professional. (guillaume.larocque@mcgill.ca)

November 12 and 13, 2019. McGill University

QGIS 3 and GRASS 7 will be used for this workshop. Please make sure that you have the appropriate versions installed prior to the workshop.

Install QGIS and GRASS following these instructions.

GRASS

The Geographic Resources Analysis Support System (GRASS) is one of the oldest Geographic Information Systems still in existence today. It was originally developed in the early 1980s by the US Army as a software management tool for military applications. What was originally a command-line software running on Unix systems has now evolved into a multi-platform open source system with capabilities for raster and vector display and analysis, image processing, remote sensing, database management, and more. In recent years, GRASS became part of the Open Source Geospatial Foundation (OSGeo), a not-for-profit organization whose mission is to support and promote the collaborative development of open geospatial technologies and data. This foundation also supports multiple other projects for web mapping and desktop GIS, as well as various geospatial libraries for spatial data conversion, storage and interoperability. One of the projects supported by OSGeo is Quantum GIS (QGIS), a user-friendly, multi-platform, raster and vector desktop GIS which provides a point-and-click interface with functionalities for displaying, editing and analysing geospatial data. These functionalities can be greatly expanded through plugins that are either part of QGIS or that are supplied by third-parties. An interesting aspect of QGIS is the integration with GRASS, allowing the user to perform a number of GRASS operations from within QGIS. The combination of varied plugins and GRASS support turn QGIS into a full-fledge GIS with a strong analytical toolbox.

List of other Open Source or free GIS desktop programs of potential interest:

• uDIG - An interesting and user friendly GIS written in Java for data access, editing, and viewing. It is under active development at the moment.
• SAGA GIS - (System for Automated Geoscientific Analyses) A powerful GIS mostly aimed at data analysis and spatial algorithms.
• Whitebox GAT - Powerful software for geo-spatial analysis, modeling and remote sensing.
• ORFEO toolbox - Open source platform for remote sensing.
• ILWIS Open - A remote sensing and GIS software which integrates image, vector and thematic data in one package on the desktop.
• gvSIG - A GIS currently in development having a user-friendly interface, being able to access the most common formats, and featuring a wide range of tools for query, layout creation, geoprocessing, networks, etc.
• OpenJUMP GIS - The current version of OpenJUMP can read and write shapefiles other simple files, has limited support for the display of images and good support for showing data retrieved from web-services. It can be used as a GIS Data Viewer, or for editing geometry and attribute data.

Note: You can compare the possibilities offered by modern day GIS with the first ever GIS, created in 1965 for the Canada Land Inventory. See these videos on Youtube: Data for decision Part I, Part II et Part III.

Click here to download a ZIP file containing all the files needed for the exercises below.

Exercise 1 - Digitizing and map display

Ojective: Create a map of the two McGill forests in the Montreal West Island: The Morgan Arboretum and the Molson Reserve. Draft a plan for the expansion of these reserves and create nice maps outlining your plan.

Step 1: Open a text editor (for example, Notepad in Windows or TextEdit on Mac) and copy the following latitude , longitude coordinates in decimal degrees to it.

ID,Name,Long,Lat
1,MorganArboretum,-73.9541,45.4303
2,MolsonReserve,-73.9763,45.3943

Then save as a text file with a .csv extension and add this as a vector layer to the QGIS map canvas with the >Layer>'Add Delimited Text Layer'. Use Longitude as the X column and Latitude as the Y column. Specify >Geographic coordinate systems>WGS 84 as the CRS. Now, you should see two points on your screen indicating the center of each reserve.

Step 2: Add a Google Hybrid layer by clicking on plugins and by installing the QuickMapServices plugin. Then go in web>QuickMapServices>Settings>More services and click on Add contributed pack. Then, add the Google Hybrid layer to the canvas from the web>QuickMapServices menu.

Step 2 alternative: Add a Google Satellite layer by clicking on the Browser>XYZ tiles, right-click and select New connection. For the name, specify Google Satellite. For the URL, put

http://mt0.google.com/vt/lyrs=y&hl=en&x={x}&y={y}&z={z}&s=Ga

Note that the CRS of the canvas is now (WGS84/ Pseudo Mercator). Move this layer to the bottom in the Layers list. You can find other sources of tiles here (https://leaflet-extras.github.io/leaflet-providers/preview/)

Step 3: Click on the CRS modification button on the bottom right of your screen and change the CRS to NAD83/UTM Zone 18N.

Step 5: When done, exit Edit mode (click on pencil icon) and choose to save the layer.

Step 6: From the Layer>Properties>Symbology menu, change the colors of the Two reserves, add the names of the reserves as labels, and put appropriate labels for the extensions of the reserves that you are proposing.

Step 7: Assign appropriate colors for each layer in the Style tab of the Layer Properties menu. From the zip file downloaded above, add the roads (Routes.shp), other forested areas (region_boise.shp) and water bodies (Region_hydrique.shp) to the map canvas. Then, using the Print layout dialog (>Project>New Print Layout), generate a map that shows the reserves you have digitized, complete with a title, a North Arrow, Labels, and a legend. Note: When you open the Print Layout, you are presented with an empty page. To add the map, you need to click on the “Add new map to the layout” button and click-drag the area on the map where you want the map to appear.

CHALLENGE: Add the pipelines and powerlines to the map, that you will download from Open Street Maps using the Quick OSM plugin.

Exercise 2 - Buffers and basic analyses

Objective: Find out the proportion of voters living within 500 m of a forested area larger than 10 hectares in the Jacques-Cartier provincial electoral district in the Montreal West Island.

Step 1: Start a new project and open the files named 'sections_vote_31h5.shp' and 'regions_boise.shp'. Convert these files to the NAD83/UTM 18N CRS by saving them as new layers in a new Geopackage (right click on the layer name in the left menu Export>Save feature as).

Step 2: Start a new project again and set the project CRS to NAD83/UTM 18N. Add the two layers you just saved.

Step 3: Open the Regions boisées attribute table (right click… Open attribute table) and add a column containing the area in hectares of each forest. To do this, toggle the layer editing (pencil icon) and use the Field Calculator (small calculator icon) to create the new column (use the Geometry>$area operator). Note that the UTM coordinate system is in meters and 1 hectare = 10,000 square meters. Important: Specify 'Decimal Number (real)', put the output width at 15 and increase the precision to 2. Step 4: Exit Edit mode and create a Filter (>Layer>Filter) to isolate the forested areas larger than 10 hectares. Note that you can't perform a query while you are in Edit mode. Step 5: Create a new shapefile with a 500 m buffer surrounding the forested areas (>Vector>Geoprocessing tools>Buffer). Step 6: Add an area column to the sections_vote_31h5 attribute table. Step 7: Create a Filter to select only the Jacques-Cartier electoral district (TRI_CEP column). Step 8: Run the 'Basic Statistics for fields' function under Vector>Analysis tools to calculate the total number of voters in Jacques-Cartier (2008_12_08 column). Write this number down. Step 9: Use the Clip function (>Vector>Geoprocessing tools>Clip) to generate a vector containing only the electoral zone sections (input layer) of the Jacques-Cartier district within the buffer around the forests (clip layer). You will notice that the size of some districts has significantly decreased. Step 10: To estimate the number of voters within the buffer, we will assume that the number of voters is proportional to the area of each voting zone and calculate the proportion of each zone located within the buffer. Open the attribute table of the Shapefile created in Step 9. Create a column containing the new area ($area) of each disctrict after the clip. Now, create a column with the number of voters (2008_12_08 column) multiplied by the proportion of the area of each zone falling within the buffer:

2008_12_08*(area after clip)/(total area previously created in step 6).

Step 11: Use the 'Basic Statistics for fields' (Vector>Analysis tools) function to find the sum of the 2008_12_08 column. Calculate the proportion by hand using the answer from Step 8.

CHALLENGE : Find which of the electoral districts shown on the map contains the largest cumulative length of roads. Use the Dissolve and Sum line lengths (in Vectors menu) functions.

Exercice 3 - Interpolation and raster manipulation with QGIS

Objective: create two maps to compare the spatial distribution of ovenbird populations in Quebec from 1980-1994 and from 1995-2010, using the data from the Breeding bird survey (BBS).

Step 1: Start a new project.

Step 2: Add the province.shp to the canvas and select the province of Quebec with the “Select feature” icon. Now save it as a new layer (ex. qc_outline) in a new Geopackage (ex. quebec) with the NAD 83/Quebec Lambert CRS, by right-clicking on the layer name and choosing “Export>Save Feature as..” and then click on “save only selected features”.

Step 3: Add the Comma separated value file named 'BBS_Routes_QC.csv' to the map canvas using the function Add layer>'Add delimited text layer'. Choose Point coordinates under Geometry definition and Latitude and Longitude as the Y and X fields. Specify the CRS WGS84 (Geographic).

Step 4: Save this layer as a new layer 'BBS_Routes_QC' in the geopackage above with the CRS: NAD83 / Quebec Lambert.

Step 5: Remove the shapefile layers and just keep the new layers from steps 2 and 4 to the canvas. Make sur the CRS of the canvas is NAD83 / Quebec Lambert.

Step 6: Add the (non-spatial) table 'BBS_Ovenbird_QC.csv' to the canvas using 'Layers>Add Layer>Add Vector Layer' and by choosing All files in the file types (yes, this is counterintuitive…)

Step 8: From the menu 'Layer properties' of the layer 'BBS_Routes_QC', find the Joins tab, click on the (+) symbol and join the table BBS_Routes and BBS_Ovenbird_QC using the shared column 'Route' (select Route as both the source and the target columns). Disable the option to “Cache join layer in virtual memory”. You should now see the number of ovenbird observations per year in the attribute table of the 'BBS_Routes_QC' file.

Step 9: Activate the editing mode and use the 'Field Calculator' to add a new column (real, precision 2) to the table BBS_Routes containing the sum of the number of bird observations between 1980-1994 and 1995-2010. Each field has to be selected one by one from the Fields and Values list (sorry). Column names should have less than 10 characters, should start with a letter and should contain no spaces or special characters.

Step 10: Save the changes and exit editing mode. Note that only the new columns that were created are really part of the BBS_Routes_QC table. Now remove the join (from the Properties menu) to remove the yearly observations. If you see a series of NULL values in the newly computed columns, close the attribute table and open it again.

Step 11: You will now create a continuous interpolated surface showing the distribution of ovenbirds for each period. Make sure that the map canvas uses the same CRS as that of the BBS_Routes_QC layer. Find IDW Interpolation under the Processing Toolbox menu. Specify BBS_Routes_QC as the input vector layer and choose the column corresponding to the earliest period as the interpolation attribute field. You need to define the number of columns, number of rows and Extent to obtain an output raster with a resolution of 2km x 2km *exactly* so that you can include most of the points (you can exclude points in Northern Quebec), but without extending too much outside of the extent covered by the points. Don't hesitate to use a calculator and the image below:

Step 12: Use the Clip Raster by Mask Layer (under Raster>Extraction) to remove the part of the Interpolated raster map you just created that falls outside of the boundaries of the province. Choose province.shp as the mask layer (make sure it's in the right CRS).

Repeat Step 11-12 for the later (1995-2012) time period.

Step 13: In the properties of the first layer, develop a new color palette appropriate for the values of the layer (Properties>Symbology>Render Type>Single band pseudocolor). Choose a linear color interpolation, click on the “+” symbol and define three relevant number values associated with each color (you can click on the raster with the 'Identify Features' tool to see the range of values). Choose three associated colors that give a good contrast. Choose to Save the style (bottom right). For the other raster covering the later period access the Style properties and Load the style you just saved. Now the same palette should be assigned to the two periods. Do you notice any difference in the spatial distribution of the ovenbirds over time?

CHALLENGE 1: Add a colum to the BBS Routes file showing the mean annual temperature (from the layer Quebec_mat_tenths.txt in degrees C x 10) in a radius of 20 km around each route. You will need the function 'Zonal Statistics' which can be found in the Processing Toolbox.

CHALLENGE 2: Regenerate your interpolated raster maps for each time periods using the Multi-level B-Spline Interpolation tool in Saga. Use the Processing toolbox for this. Compare those maps with the ones you obtained with Inverse Distance Weighting.

For this exercise, you will need to extract the mean elevation and the land cover at occurence sites of the Global Biodiversity Information Facility (GBIF) falling within the Wemindji aboriginal territory, in the James Bay area of Quebec. You will want to work with the UTM Zone 17N / NAD83 reference system. Note that to reproject a raster, the preferred way is to use Raster>Projections>Warp. To achieve the objective, you will need to complete the following steps:

• On the Canada Open Government website, download raster elevation files for zones 33D and 33E at the 1:250,000 scale.
• Download this ZIP package containing a tif file with a recent land cover classification of the area and the associated style/colormap in qml format.
• On the MERN website (https://mern.gouv.qc.ca/territoire/portrait/portrait-donnees-mille.jsp), download the “Découpages administratifs”, “Municipalités, TNO et territoires autochtones” dataset as a shapefile. You will want the polygon layer.
• Download this file containing the GBIF species occurences in the Wemindji region. Note that this file is TAB delimited and the coordinates are in Latitude, longitude (WGS84). You can open it in a text editor to explore it's content.
• Using the latitude, longitude coordinates, add the GBIF occurences to the map canvas (it is TAB delimited).
• Merge the elevation raster layers (.tif) into one using Raster>Miscellaneous>Merge. Save the output as a .tif file.
• Use a filter to isolate the Wemindji territory (MUS_MN_MUN column)from the municipalities layer.
• Clip the occurrences to obtain only those within the Wemindji territory.
• Use the 'point sampling tool' plugin to extract the name of each species in latin, the land cover and the elevation at each occurrence location. Note that some locations contain a large number of occurrences.

CHALLENGE: Use the Kernel Density Estimation tool in SAGA (Processing toolbox) to obtain a raster showing the density of all bird occurrences in the GBIF occurrence data provided.

Exercise 5 - Manipulation of rasters and intro to satellite imagery

Objective: create an RGB false color composite image and an NDVI (Normalized Difference Vegetation Index) map, associate it with a color palette and extract the mean NDVI for parcs and fields of the region. Find which park is least vegetated

Step 1: We will first create a false-color composite image to better visualize the contrasts. To do so, find the merge function under Raster>Miscellaneous. Choose Landsat images 4,3 and 2 (you might need to click on them in that order, depending on your operating system) as the input files, specify Landsat_432.tif as the Output, and select “Put each input file into a separate band” option. In the Symbologie properties of the new layer, the Landsat band 4 should become band 1 (Red), Landsat band 3 should become band 2 (Green) and Landsat band 2 should become band 3 (Blue). Yes, this is confusing.

Step 2: In the Symbology section, you'll notice that “Multiband color” is used as the Render type. You can play with the brightness, contrast and saturation to improve the visual appearance of the image. Notice how this image can easily allow you to distinguish the vegetation, suburbs, cities and rivers. You can also create a 5-4-3 composite and compare the two images.

We will now create a Normalized Difference Vegetation Index image.

Step 3: Find Raster calculator function in the Processing toolbox. Specify 'NDVI' as the name of the output layer (GeoTiff). You then need to find bands 3 and 4 in the left menu, and come up with this formula in the Expression box:

(band4-band3)/(band3+band4)

Select to add the NDVI image to the canvas.

Step 4: In the properties menu of that layer, select Single band pseudo color as the Render type, choose the RdYlGn color map and click on Classify. On this map, dark green areas represent areas that are densely vegetated while reddish areas are not vegetated.

Step 5: Add the layer containing the parcs and fields (parcs_terrains_sports.shp) to the canvas and save it as a new file with the CRS NAD83 / UTM 18N. Add the new file to the canvas and remove the old one.

Step 6: Find the Zonal statistics menu in the Procesing Toobox. Choose the parcs et terrains as the vector layer and the NDVI image as the raster layer.

Step 7: By looking at the attribute table of the parcs et terrains layer, you will see that columns were added with the statistics of the NDVI cells for each field and park. Which park has the the lowest Mean NDVI (click on the column name to sort)? Where is it located (Click on the zoom to selection icon)?

CHALLENGE: Go to the USGS EarthExplorer website http://earthexplorer.usgs.gov/ and create an account by clicking on register. Then, choose a region anywhere in the world about the size of the Montreal region that you think could have seen a lot of land use change in the last 30 years. Search for a Landsat 4-5TM (Collection 1 Level 1) image dating anywhere between 1980-1990 for that region and taken in the summer months that contains less than 20% clouds. Download that file as a Level 1 product. Note the path/row information of this image and search for an equivalent image for 2010-2015 (same months, row and path). Bring them to QGIS, and calculate an NDVI value for each time period. Substract the later NDVI from the earlier NDVI and view the file with this palette (right-click the link…save as) to see the difference in vegetation cover between the two time periods.

CHALLENGE 2 Use the Cluster Analysis for Grids function in Processing Toolbox>SAGA to perform an unsupervised classification of your images using bands 3,4,5 and 7.

Exercise 6 - Using GRASS with QGIS Processing toolbox

Objective: Isolate the largest contiguous patch of land that is not covered by water and that is at least 1km from roads.

Step 1: Save the file routes.shp as a new layer with the CRS NAD83 / UTM 18N. Do the same for the Region_Hydrique shapefile.

Step 2: Convert the routes and the Region_Hydrique files to raster format using the v.to.rast function. Choose the “Source for raster values” as “va”, and put 1 in the “Raster value (use=val)”. Do the same for the regions hydriques.

Step 3: Use the r.grow.distance function to create a continuous raster map in which each pixel is assigned the distance from the closest route. Use the routes layer in raster format (Step 6) as the input.

Step 4: Use the r.null.to function to replace the value of NULL with 0 in the Region Hydrique raster file. Do the same for the r.grow.distance file.

Step 5: Use the r.mapcalc.simple function to remove the areas covered with water form the distance map you just created. Set layer A as the Region Hydrique raster layer from step 4 and B as the distance file created in Step 4.

r.mapcalc expression="((A-1)*-1)*B"

Step 6: To exclude areas closer than 1km from roads, run r.reclass on the distance map from step 5 with the following reclass rule:

0 thru 1000 = 0
1000 thru 99999 = 1

Step 7: Use the r.clump function to give every isolated contiguous area a unique identifier.

Step 8: Use the r.stats function to obtain the area of each contiguous area, or patch. Unclick One cell (range) per line and click “Print area totals” and “Print category labels”. Choose “desc” for Sort output statistics by cell count. Then, click on the html file in the Results viewer and identify the patch ID with the second largest area (largest is the background).

Step 9: Use r.reclass on the output from Step 8, with the following rule

ID = 1
* = NULLL

where you replace ID with the ID of the largest patch. This will have created a raster map named Largest_patch in which the largest patch is isolated with a value of 1. You can now display this raster and change the Symbology to view the isolated patch.

Step 10: Convert this file to vector by using the r.to.vect function and add it to the canvas.

GRASS Terminology

• DATABASE (Géodatabase): Main GRASS directory.
• LOCATION (Secteur): Working directory. SCR unique.
• MAPSET (Jeu de données): Sub-directory under the LOCATION.
• REGION (Région): Définie par une étendue et une résolution spécifiques.
prefix Description Example
d.* display (graphical output) d.rast: views raster map, d.vect: views vector map
db.* database management db.select: selects value(s) from table
g.* general file operations g.rename: renames map
r.* raster data processing r.buffer: buffer around raster features, r.mapcalc: map algebra
i.* image processing i.smap: image classifier
v.* vector data processing v.overlay: vector map intersections
ps.* postscript map creation format ps.map: map creation in Postscript,
r3.* raster voxel data processing r3.mapcalc: volume map algebra

The full list of GRASS commands is here.