This is an old revision of the document!


Introduction to Open Source Geographic information systems with QGIS and GRASS

Quebec Centre for Biodiversity Science, https://qcbs.ca

Guillaume Larocque, Research Professional. (guillaume.larocque@mcgill.ca)

October 23 and 24, 2018. McGill University

QGIS 3 and GRASS 7 will be used for this workshop. Please make sure that you have the appropriate versions installed prior to the workshop.

Install QGIS and GRASS following these instructions.

General resources

QGIS

GRASS

Portals for downloading data

The Geographic Resources Analysis Support System (GRASS) is one of the oldest Geographic Information Systems still in existence today. It was originally developed in the early 1980s by the US Army as a software management tool for military applications. What was originally a command-line software running on Unix systems has now evolved into a multi-platform open source system with capabilities for raster and vector display and analysis, image processing, remote sensing, database management, and more. In recent years, GRASS became part of the Open Source Geospatial Foundation (OSGeo), a not-for-profit organization whose mission is to support and promote the collaborative development of open geospatial technologies and data. This foundation also supports multiple other projects for web mapping and desktop GIS, as well as various geospatial libraries for spatial data conversion, storage and interoperability. One of the projects supported by OSGeo is Quantum GIS (QGIS), a user-friendly, multi-platform, raster and vector desktop GIS which provides a point-and-click interface with functionalities for displaying, editing and analysing geospatial data. These functionalities can be greatly expanded through plugins that are either part of QGIS or that are supplied by third-parties. An interesting aspect of QGIS is the integration with GRASS, allowing the user to perform a number of GRASS operations from within QGIS. The combination of varied plugins and GRASS support turn QGIS into a full-fledge GIS with a strong analytical toolbox.

List of other Open Source or free GIS desktop programs of potential interest:

  • uDIG - An interesting and user friendly GIS written in Java for data access, editing, and viewing. It is under active development at the moment.
  • SAGA GIS - (System for Automated Geoscientific Analyses) A powerful GIS mostly aimed at data analysis and spatial algorithms.
  • Whitebox - Powerful software for geo-spatial analysis, modeling and remote sensing.
  • ILWIS Open - A remote sensing and GIS software which integrates image, vector and thematic data in one package on the desktop.
  • gvSIG - A GIS currently in development having a user-friendly interface, being able to access the most common formats, and featuring a wide range of tools for query, layout creation, geoprocessing, networks, etc.
  • OpenJUMP GIS - The current version of OpenJUMP can read and write shapefiles other simple files, has limited support for the display of images and good support for showing data retrieved from web-services. It can be used as a GIS Data Viewer, or for editing geometry and attribute data.

Note: You can compare the possibilities offered by modern day GIS with the first ever GIS, created in 1965 for the Canada Land Inventory. See these videos on Youtube: Data for decision Part I, Part II et Part III.

  • Open QGIS - Click on Plugins­ > Manage Plugins, and make sur fTools and GdalTools plugins are activated.
  • Now you will see a dropdown menu in QGIS with options for vector analysis (fTools), raster analysis (GdalTools) and multiple other plugins (under plugins).
  • Go to Settings > Options ­­> CRS, and click on 'Enable on the fly reprojection by default'.
  • Start a New Project or close and reopen QGIS.

Layers in different reference systems will then be reprojected on the fly.

Click here to download a ZIP file containing all the files needed for the exercises below.

5 - Exercises

Exercise 1 - Digitizing and map display

Ojective: Create a map of the two McGill forests in the Montreal West Island: The Morgan Arboretum and the Molson Reserve. Draft a plan for the expansion of these reserves and create nice maps outlining your plan.

Step 1: Open a text editor (for example, Notepad in Windows or TextEdit on Mac) and copy the following latitude , longitude coordinates in decimal degrees to it.

ID,Name,Long,Lat
1,MorganArboretum,-73.9541,45.4303
2,MolsonReserve,-73.9763,45.3943

Then save as a text file with a .csv extension and add this as a vector layer to the QGIS map canvas with the >Layer>'Add Delimited Text Layer' function (if this isn't showing up for you, activate it in the plugins). Use Longitude as the X column and Latitude as the Y column. Specify >Geographic coordinate systems>WGS 84 as the CRS. Now, you should see two points on your screen indicating the center of each reserve.

Step 2: Add a Google Satellite layer by clicking on the Browser>XYZ tiles. Right-click and select Ne connection. For the name, specify Google Satellite. For the URL, put http://mt0.google.com/vt/lyrs=y&hl=en&x={x}&y={y}&z={z}&s=Ga . Note that the CRS of the canvas is now (WGS84/ Pseudo Mercator). Move this layer to the bottom in the Layers list. You can find other sources of tiles here (https://leaflet-extras.github.io/leaflet-providers/preview/)

Step 3: Click on the CRS modification button on the bottom right of your screen and change the CRS to NAD83/UTM Zone 18N.

Step 4: To digitize the two reserves, you need to add a New empty polygon layer (>Layer>Create Layer>Shapefile layer>Polygon). Specify the CRS as NAD 83/UTM zone 18N. A column/attribute for ID (integer) is already specified, add one for Name (Text data) by specifying a Name in the 'New field' section and Clicking Add to fields list. Give the shapefile layer an appropriate file name (ex. McGill_Reserves.shp) and make sure it's in a appropriate folder on your computer by clicking on the … menu. Click OK and Enable Edit mode by left-clicking on the layer name in the left menu and selecting 'Toggle editing'. Digitize the McGill Molson Reserve and the Morgan Arboretum from the satellite image. To help you locate those areas, use the points which you imported from the CSV file. For the Molson Reserve, digitize the forested area located between the residential street (Blvd. Perrot) and highway 20 by clicking on the digitize icon: . For the Arboretum, digitize the contiguous forested area surrounding the point. The Arboretum is approximately 250 hectares. Now add polygons adjacent to those polygons showing what you propose as extensions to the existing reserves. For two adjacent polygons to properly share a common boundary, you need to digitize the second polygon so as to make it overlap slightly the first one. For this to work, it is important to select the 'Enable snapping on intersections.' for that layer in the > Settings ­­­­> Snapping Options dialog.

Step 5: When done, exit Edit mode and save the layer.

Step 6: From the Layers properties menu, change the colors of the Two reserves, add the names of the reserves as labels, and put appropriate labels for the extensions of the reserves that you are proposing.

Step 7: Assign appropriate colors for each layer in the Style tab of the Layer Properties menu. Add the roads (Routes.shp), other forested areas (region_boise.shp) and water bodies (Region_hydrique.shp) to the map canvas. Then, using the Print layout dialog (>Project>New Print Layour), generate a map that shows the reserves you have digitized, complete with a title, a North Arrow, Labels, and a legend. Note: When you open the Print Layout, you are presented with an empty page. To add the map, you need to click on the “Adds new map to the layout” button and click-drag the area on the map where you want the map to appear.

CHALLENGE: Add the pipelines and powerlines to the map, that you will download from Open Street Maps using the Quick OSM plugin.

Exercise 2 - Buffers and basic analyses

Objective: Find out the proportion of voters living within 500 m of a forested area larger than 10 hectares in the Jacques-Cartier provincial electoral district in the Montreal West Island.

Step 1: Start a new project and open the files named 'sections_vote_31h5.shp' and 'regions_boise.shp'. Convert these files to the NAD83/UTM 18N CRS by saving them as new files (right click on the layer name in the left menu>Save as).

Step 2: Start a new project again and set the project CRS to NAD83/UTM 18N. Add the two files you just saved.

Step 3: Open the Regions boisées attribute table and add a column containing the area in hectares of each forest. To do this, toggle the layer editing and use the Field Calculator (small calculator icon at the bottom) to create the new column (use the Geometry>$area operator). Note that the UTM coordinate system is in meters and 1 hectare = 10,000 square meters. Important: Specify 'Decimal Number (real)', put the output width at 15 and increase the precision to 2.

Step 4: Exit Edit mode and create a Filter (>Layer>Filter) to isolate the forested areas larger than 10 hectares. Note that you can't perform a query while you are in Edit mode.

Step 5: Create a new shapefile with a 500 m buffer surrounding the forested areas (>Vector>Geoprocessing tools>Buffer).

Step 6: Add an area column to the sections_vote_31h5 attribute table.

Step 7: Create a Filter to select only the Jacques-Cartier electoral district (TRI_CEP column).

Step 8: Run the 'Basic Statistics for fields' function under Vector>Analysis tools to calculate the total number of voters in Jacques-Cartier (2008_12_08 column). Write this number down.

Step 9: Use the Clip function (>Vector>Geoprocessing tools>Clip) to generate a vector containing only the electoral zone sections (input layer) of the Jacques-Cartier district within the buffer around the forests (clip layer). You will notice that the size of some districts has significantly decreased.

Step 10: To estimate the number of voters within the buffer, we will assume that the number of voters is proportional to the area of each voting zone and calculate the proportion of each zone located within the buffer. Open the attribute table of the Shapefile created in Step 9. Create a column containing the new area ($area) of each disctrict after the clip. Now, create a column with the number of voters (2008_12_08 column) multiplied by the proportion of the area of each zone falling within the buffer:

2008_12_08*(area after clip)/(total area previously created in step 6). 

Step 11: Use the 'Basic Statistics for fields' (Vector>Analysis tools) function to find the sum of the 2008_12_08 column. Calculate the proportion by hand using the answer from Step 8.

Answer

CHALLENGE : Find which of the electoral districts shown on the map contains the largest cumulative length of roads. Use the Dissolve and Sum line lengths (in Vectors menu) functions.

Answer

Exercice 3 - Interpolation and raster manipulation with QGIS

Objective: create two maps to compare the spatial distribution of ovenbird populations in Quebec from 1980-1994 and from 1995-2010, using the data from the Breeding bird survey (BBS).

Step 1: Start a new project.

Step 2: Add the province.shp to the canvas and select the province of Quebec with the “Select feature” icon. Now save it as a new shapefile with the NAD 83/Quebec Lambert CRS, by right-clicking on the layer name and choosing “Save as..” and then click on “save only selected features”.

Step 3: Add the Comma separated value file named 'BBS_Routes_QC.csv' to the map canvas using the function Add layer>'Add delimited text layer' (Choose 'Selected delimiters'→comma) and specify the CRS WGS84 (Geographic).

Step 4: Save this layer as a shapefile named 'BBS_Routes_QC.shp' with the CRS: NAD83 / Quebec Lambert.

Step 5: Start a new project and specify the CRS NAD83 / Quebec Lambert. Add the files from steps 2 and 3 to the canvas.

Step 6: Add the (non-spatial) table 'BBS_Ovenbird_QC.csv' to the canvas using 'Layers>Add Layer>Add Vector Layer' and by listing all the file types (yes, this is counterintuitive…)

Step 8: From the menu 'Layer properties' of the table 'BBS_Routes_QC.shp', find the Join tab and join the table BBS_Routes and BBS_Ovenbird_QC using the shared column 'Route' (select Route as both the source and the target columns). Disable the option to “Cache join layer in virtual memory”. You should now see the number of ovenbird observations per year in the attribute table of the 'BBS_Routes_QC' file.

Step 9: Activate the editing mode and use the 'Field Calculator' to add a new column (real, precision 2) to the table BBS_Routes containing the sum of the number of bird observations between 1980-1994 and 1995-2010. Each field has to be selected one by one from the Fields and Values list (sorry). Column names should have less than 10 characters, should start with a letter and should contain no spaces or special characters.

Step 10: Save the changes and exit editing mode. Note that only the new columns that were created are really part of the BBS_Routes_QC table. Now remove the join (from the Properties menu) to remove the yearly observations. If you see a series of NULL values in the newly computed columns, close the attribute table and open it again.

Step 11: You will now create a continuous interpolated surface showing the distribution of ovenbirds for each period. Make sure that the map canvas uses the same CRS as that of the BBS_Routes_QC file. Find >Grid (Interpolation) under >Raster>Analysis. Specify BBS_Routes_QC as the input and choose the column corresponding to the earliest period as the Z field. Use 'Inverse distance to a power' as the Algorithm. You need to define the Width (number of columns), Height (number of rows) and Extent to obtain an output raster with a resolution of 2km x 2km *exactly* so that you can include most of the points (you can exclude points in Northern Quebec), but without extending too much outside of the extent covered by the points. Don't hesitate to use a calculator and the image below:

Step 12: Use the Clipper function (under Raster>Extraction) to remove the part of the Interpolated raster map you just created that falls outside of the boundaries of the province. Choose province.shp as the mask layer.

Repeat Step 11-12 for the later (1995-2012) time period.

Step 13: In the properties of the first layer, develop a new color palette appropriate for the values of the layer (Properties>Tab Style>Render Type>Single band pseudocolor). Choose a linear color interpolation, click on the “+” symbol and define three relevant number values associated with each color (you can click on the raster with the 'Identify Features' tool to see the range of values). Choose three associated colors that give a good contrast. Choose to Save the style (bottom right). For the other raster covering the later period access the Style properties and Load the style you just saved. Now the same palette should be assigned to the two periods. Do you notice any difference in the spatial distribution of the ovenbirds over time?

CHALLENGE 1: Add a colum to the BBS Routes file showing the mean annual temperature (from the layer Quebec_mat_tenths.txt in degrees C x 10) in a radius of 20 km around each route. You will need the extension 'Zonal Statistics' which can be found in '>Raster>Zonal Statistics' once installed.

CHALLENGE 2: Regenerate your interpolated raster maps for each time periods using the Multi-level B-Spline Interpolation tool in Saga. Use the Processing toolbox for this. Compare those maps with the ones you obtained with Inverse Distance Weighting.

Exercice 4 - Downloading files, recap and challenge!

For this exercise, you will need to extract the mean elevation and the land cover at occurence sites of the Global Biodiversity Information Facility (GBIF) falling within the Wemindji aboriginal territory, in the James Bay area of Quebec. You will want to work with the UTM Zone 17N / NAD83 reference system. Note that to reproject a raster, the preferred way is to use Raster>Projections>Warp.To achieve the objective, you will need to complete the following steps:

  • On the Canada Open Government website, download raster elevation files for zones 33D and 33E at the 1:250,000 scale.
  • Download this ZIP package containing a tif file with a recent land cover classification of the area and the associated style/colormap in qml format.
  • On Open Government website, download the shapefile of Aboriginal Lands of Canada Legislative Boundaries. The shapefile of interest is the one that ends with _MODIFIED
  • Download this file containing the GBIF species occurences in the Wemindji region. Note that this file is TAB delimited and the coordinates are in Latitude, longitude (WGS84). You can open it in a text editor to explore it's content.
  • Merge the elevation raster layers (.tif) into one using Raster>Miscellaneous>Merge. Save the output as a .tif file.
  • Using the latitude, longitude coordinates, add the GBIF occurences to the map canvas.
  • Clip the occurrences to obtain only those within the Wemindji territory.
  • Use the 'point sampling tool' plugin to extract the name of each species in latin, the land cover and the elevation at each occurrence location. Note that some locations contain a large number of occurrences.

CHALLENGE: Use the Kernel Density Estimation tool in SAGA (Processing toolbox) to obtain a raster showing the density of all bird occurrences in the GBIF occurrence data provided.

Exercise 5 - Manipulation of rasters and intro to satellite imagery

Objective: create an RGB false color composite image and an NDVI (Normalized Difference Vegetation Index) map, associate it with a color palette and extract the mean NDVI for parcs and fields of the region. Find which park is least vegetated

Step 1: We will first create a false-color composite image to better visualize the contrasts. To do so, find the merge function under Raster>Miscellaneous. Choose Landsat images 4,3 and 2 (you might need to click on them in that order, depending on your operating system) as the input files, specify Landsat_432.tif as the Output, and select “Put each input file into a separate band” option. If you have selected the files in the correct order, the Landsat band 4 should become band 1 (Red), Landsat band 3 should become band 2 (Green) and Landsat band 2 should become band 3 (Blue). Yes, this is confusing.

Step 2: In the Style properties of this new Layer, you'll notice that “Multiband color” is used as the Render type. You can play with the brightness, contrast and saturation to improve the visual appearance of the image. Notice how this image can easily allow you to distinguish the vegetation, suburbs, cities and rivers. You can also create a 5-4-3 composite and compare the two images.

We will now create a Normalized Difference Vegetation Index image.

Step 3: Find Raster calculator function. Specify 'NDVI' as the name of the output layer (GeoTiff). You then need to find bands 3 and 4 in the left menu, and come up with this formula in the Expression box:

(band4-band3)/(band3+band4)

Select to add the NDVI image to the canvas.

Step 4: In the properties menu of that layer, select Single band pseudo color as the Render type, choose the RdYlGn color map and click on Classify. On this map, dark green areas represent areas that are densely vegetated while reddish areas are not vegetated.

Step 5: Add the layer containing the parcs and fields (parcs_terrains_sports.shp) to the canvas and save it as a new file with the CRS NAD83 / UTM 18N. Add the new file to the canvas and remove the old one.

Step 6: Install and enable the “Zonal statistics” plugin.

Step 7: Find the Zonal statistics menu under Raster. Choose the parcs et terrains as the vector layer and the ndvi image as the raster layer.

Step 8: By looking at the attribute table of the parcs et terrains layer, you will see that columns were added with the statistics of the NDVI cells for each field and park. Which park has the the lowest Mean NDVI? Where is it located?

Answer

CHALLENGE: Go to the USGS EarthExplorer website http://earthexplorer.usgs.gov/ and create an account by clicking on register. Then, choose a region anywhere in the world about the size of the Montreal region that you think could have seen a lot of land use change in the last 30 years. Search for a Landsat 4-5TM (Collection 1 Level 1) image dating anywhere between 1980-1990 for that region and taken in the summer months that contains less than 20% clouds. Download that file as a Level 1 product. Note the path/row information of this image and search for an equivalent image for 2010-2015 (same months, row and path). Bring them to QGIS, and calculate an NDVI value for each time period. Substract the later NDVI from the earlier NDVI and view the file with this palette (right-click the link…save as) to see the difference in vegetation cover between the two time periods.

CHALLENGE 2 Use the Cluster Analysis for Grids function in Processing Toolbox>SAGA to perform an unsupervised classification of your images using bands 3,4,5 and 7.

Exercise 6 - Using GRASS within QGIS

Objective: Isolate the largest contiguous patch of land that is not covered by water and that is at least 1km from roads.

Step 1: Start QGIS by clicking on “QGIS with GRASS support”. Make sure that the GRASS extension is activated, that the GRASS icons bar is selected in >View>Toolbar, and that it is not hidden somewhere in your icons bar. Click on the Plugins>GRASS> New mapset and define a new GRASS Database, a new Location (name it workshop2), choose the CRS NAD83 / UTM 18 N, specify the default GRASS Region by using the current QGIS extent and specify a name for the Mapset.

Step 2: Save the file routes.shp as a new layer with the CRS NAD83 / UTM 18N. Do the same for the Region_Hydrique shapefile.

Step 3: You now need to import the files into GRASS. Click on 'Open GRASS tools' (if the tab is not already active), click on the 'Module tree'>'File management'>'Import into GRASS'>'Import a vector into GRASS'>'v.in.ogr.qgis'. Select the route file you created in step 2 and specify a name for the output file. Repeat this step for the Region hydrique file. Add the newly created GRASS files to the map canvas. Using the QGIS Browser, you can add GRASS files by finding your GRASS Database folder, opening the desired mapset and location, and dragging and dropping files from the Browser to the Layers Panel.

Step 4: Find the function called r.in.gdal and import the 31h05dem.tif file to GRASS. Add this new GRASS layer to the QGIS map canvas.

Step 5: Find the function g.region.multiple.raster and use the file imported in Step 4 to define the current region (specify the exact name of the raster layer in the box).

Step 6: Convert the routes file to raster format using the v.to.rast.constant function. Do the same for the regions hydriques.

Step 7: Use the r.grow.distance function to create a continuous raster map in which each pixel is assigned the distance from the closest route. Use the routes layer in raster format (Step 6) as the input.

Step 8: Use the r.null.to function to replace the value of NULL with 0 in the Region Hydrique raster file.

Step 9: Use the r.mapcalculator function to remove the areas covered with water form the distance map you just created. Set layer A as the Region Hydrique raster layer and B as the File created in Step 7. Click on the “Use region of this map” icon beside layer A. In the formula box, type

((A-1)*-1)*B

and give a name to the output file.

Step 10: To exclude areas closer than 1km from roads, create a text file (in text edit or Notepad) with the following content:

0 thru 1000 = 0
1000 thru 99999 = 1

Step 11: Use the r.reclass function to reclassify the raster you created in Step 9 and using the reclass rules file (Step 10).

Step 12: Use the r.clump function to give every isolated contiguous area a unique identifier.

Step 13: Use the r.stats function to obtain the area of each contiguous area, or patch. You need to click on Advanced Options, select “Print area totals” and put “-” as the Name for output file. You could also specify file name with a .txt extension if you want the output to be save in a text file.

Step 14: Identify the patch with the largest area. Find the GRASS Shell in the list of functions and write the following command:

r.mapcalc "Largest_patch=if(Step_12  == ID, 1, null())" 

where you replace ID with the ID of the largest patch, and Step_12 with the name that you gave to the raster in Step 12. This will have created a raster map named Largest_patch in which the largest patch is isolated with a value of 1. You can now display this raster and change the Symbology to view the isolated patch.

Step 15: Convert this file to vector by using the r.to.vect.area function.

GRASS Terminology

  • DATABASE (Géodatabase): Main GRASS directory.
  • LOCATION (Secteur): Working directory. SCR unique.
  • MAPSET (Jeu de données): Sub-directory under the LOCATION.
  • REGION (Région): Définie par une étendue et une résolution spécifiques.
prefix Description Example
d.* display (graphical output) d.rast: views raster map, d.vect: views vector map
db.* database management db.select: selects value(s) from table
g.* general file operations g.rename: renames map
r.* raster data processing r.buffer: buffer around raster features, r.mapcalc: map algebra
i.* image processing i.smap: image classifier
v.* vector data processing v.overlay: vector map intersections
ps.* postscript map creation format ps.map: map creation in Postscript,
r3.* raster voxel data processing r3.mapcalc: volume map algebra

The full list of GRASS commands is here.