title: QGIS for Transport Research: an introduction
author: Robin Lovelace and Malcolm Morgan
date: 2019-03-11
papersize: a4
fontsize: 12pt
site: bookdown::bookdown_site
description: A book on using open source Geographic Information System for transport planning.
github-repo: ITSLeeds/QGIS-intro
graphics: true
bibliography: book.bib, packages.bib

Preface {-}

Transport is inherently geographic. All transport involves movement from one place on the Earth to another. This means that Geographic Information Systems (GIS) have great potential to assist transport researchers. QGIS is a powerful GIS package that has become popular worldwide due to its wide range of features (including hundreds of community-contributed plugins), intuitive interface, and accessibility. QGIS is free and open source, meaning that anyone can download, install and even contribute to it.

The growing popularity, power and features of QGIS mean that learning it now could save much time and money long into the future: QGIS is futureproof. A more mundane reason for writing this tutorial was the installation of QGIS 3 across the University of Leeds. Suddenly, existing material developed for QGIS 2 no longer worked. Instead of incrementally updating those materials we decided to start from a blank slate, allowing the inclusion of new datasets and links to excellent online content that already exists for QGIS.

The aim is to get you up-to-speed with the most important concepts and techniques in QGIS for Transport Planning. We refer to Transport Planning in the broad sense of providing evidence-based guidance to support sustainable transport behaviours. With air pollution, the obesity crisis, and growing levels of economic inequality in cities worldwide, this inevitably means investment in walking, cycling and public transport. For this reason, the data used in this booklet focusses on these modes.

Pre-requisites {-}

This course assumes that you have access to a working installation of QGIS 3 and an internet connection. We will explain how to download the data needed for the practicals in Chapter @ref(data).

\mainmatter

Introduction

What is GIS?

GIS stands for Geographic Information Systems. The term was first used in the 1960s and has since become a common way of referring to the analysis of geographic data.

That raises another question: what do we mean by 'geographic data'? The defining feature of geographic data is that it has coordinates that allow the position of records in the dataset to be located on Earth. There are two main types of geographic data:

Vector data, which typically represent points, lines and polygons. An example is a series of points representing a road.
Raster data, which typically represent a continuously changing surface, such as height. An example is a satellite image of a road.
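
The distinction between the two data types can be sketched in a few lines of Python. This is a minimal illustration, not how QGIS stores data: the road coordinates, the grid size and the `rasterise` helper are all hypothetical. The vector form keeps exact coordinates, while the raster form only records which grid cells the road touches.

```python
# A hypothetical road as vector data: an ordered series of (x, y) points.
road_vector = [(0.5, 0.5), (2.5, 1.5), (4.5, 3.5)]


def rasterise(points, n_cols, n_rows, cell_size=1.0):
    """Return a grid (list of rows) with 1 in every cell that contains a point.

    Only the vertices are marked, not the line between them, so this is a
    deliberately crude raster representation.
    """
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, y in points:
        col = int(x // cell_size)
        row = int(y // cell_size)
        if 0 <= col < n_cols and 0 <= row < n_rows:
            grid[row][col] = 1
    return grid


# The same road as raster data: a 5 x 4 grid of cells.
road_raster = rasterise(road_vector, n_cols=5, n_rows=4)
for row in reversed(road_raster):  # print with y increasing upwards
    print(row)
```

Making `cell_size` smaller gives a finer raster, at the cost of more cells to store.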

The difference between vector and raster data types is illustrated in the figure below, which shows the same road (Woodhouse Lane in Leeds) represented in vector and raster geographic data forms.

Since graphical user interfaces (GUIs) for GIS became popular in the early 2000s, the word GIS has become almost synonymous with 'dedicated GIS software'. Examples of dedicated GIS software include the open source GIS software QGIS (we will come onto what QGIS is in a subsequent section) and proprietary software such as ArcMap and TransCAD. It is also possible, and increasingly popular, to do GIS using a programming language with a command-line interface (CLI) such as Python or R (see @ref(next-steps)).

What is QGIS?

QGIS is an open source dedicated GIS software program. It has become the most commonly used open source program for working with geographic data. Search engine data from Google shows that QGIS is becoming increasingly popular. QGIS is ideal not only for learning GIS techniques but also for using them in practice, where software licensing and cost may be important. Figure @ref(fig:trends) shows how QGIS's popularity over time compares with the GIS product ESRI ArcMap and the transport planning product PTV Visum.^[ QGIS does not have dedicated transport planning functionality like Visum ]


(\#fig:trends)The growing popularity of QGIS, as measured by internet search traffic. Source: [trends.google.com](https://trends.google.com/trends/explore?date=all&q=qgis,arcmap,visum).

Working with QGIS

Before importing data (covered in Chapter @ref(data)) it is worth getting to know QGIS, in terms of its main components, how to get help, and how it helps you organise your work into projects. This chapter describes some of QGIS's key elements.

Opening QGIS

Probably the quickest way to open QGIS on your computer is to press the Windows key on your keyboard and type 'qgis' (see Figure @ref(fig:qgis-start)).

(\#fig:qgis-start)Starting QGIS

Select 'QGIS Desktop' from the list. If you have multiple versions, choose the latest version. You should see a new window appear that contains the main features of the QGIS program (see Figure @ref(fig:qgis-window)).

Key QGIS components

The QGIS window includes the following main components, numbered 1 to 5 in Figure @ref(fig:qgis-window) and listed in the same order below (source: the QGIS Manual):

Menu Bar: like most GUI-based programs you can control key aspects of QGIS and execute key commands, like saving your project and loading new datasets, by clicking Project or Layer. Note: shortcuts to access these menus from the keyboard are Alt+J and Alt+L, respectively.
Toolbars: these are small icons located towards the top and left-hand side of Figure @ref(fig:qgis-window). In addition to options available from the Menu Bar, these icons provide tools for interacting with the map such as Pan (the hand symbol) and Zoom (the + and - signs).
Panels: Panels are interactive elements that show information on particular aspects of the project. A view of files in the Browser Panel and the Layers Panel are shown in Figure @ref(fig:qgis-window).
Map View: this is where the geographic data is displayed as an interactive map.
Status Bar: this small but important element at the bottom of QGIS shows details about the current status of the Map View, such as the scale and the Coordinate Reference System (CRS), which in this case is EPSG:3857.

(\#fig:qgis-window)QGIS main user interface features

Plugins

An important aspect of open source software is the wider community, which often supports a diverse range of extensions that add features to the original program. QGIS is no exception: it has a thriving community of programmers who, together, have created hundreds of 'plugins', which enhance QGIS's capabilities in countless ways. At the time of writing, there are 200+ plugins available for QGIS 3, and this number is continuously rising.

To install a plugin, click on Plugins in the Menu Bar (or press Alt+P). You should see the following options. Select the second option (Manage and Install Plugins, see Figure @ref(fig:plugins) (left)). You should see a new window, like the one displayed in Figure @ref(fig:plugins) (right).

(\#fig:plugins)Plugins menu (left) and the resulting window (right).

To install a plugin, click on the 'Install plugin' button in the bottom right corner of Figure @ref(fig:plugins) (right). Use the search bar to explore the available plugins. What happens if you search for 'trans', for example (short for transport)?

Exercise: install the QuickMapServices plugin and use it to put a basemap showing Leeds in the Map View. Hint: after you have installed the plugin navigate to the Web menu (or press Alt+W). After completing the exercise your QGIS session should look something like that displayed in Figure @ref(fig:webmap).

Loading basemaps

You can add a basemap by clicking on the 'XYZ Tiles' option in the Browser Panel. Options should include Google and OpenStreetMap. Clicking on one of these will add a new layer, which can be seen in the Layers Panel.

Another way to add basemaps, with a wide range of default options, is via a plugin. We learned how to install plugins in the previous section so this alternative way is best learned as an exercise (see exercises section at the end of this chapter, and Figure @ref(fig:webmap)).

(\#fig:webmap)QGIS with a basemap from QuickMapServices

Panning and Zooming

You can navigate around the map using the mouse or the buttons in the toolbar (see Figure @ref(fig:navigation)).

(\#fig:navigation)Navigation buttons: Pan Map, Pan to Selection, Zoom In, Zoom Out, Zoom to Native Resolution, Zoom Full, Zoom to Selection

Projects and Files

QGIS allows you to open multiple files at once and overlay them as layers on the same map. You can also customise the visualisation of the layers. We recommend saving your work in a project file, stored in a folder dedicated to your project. For a dissertation project, for example, you could create a project called 'dissertation' inside the folder containing the dissertation. The project file would be called 'dissertation.qgz'.

Projects don't contain any data themselves but do contain information such as the current map view, links to data, and instructions on how data should be presented. Projects are an easy way to keep your work organised and allow you to stop and come back to your work at a later date.

You can create, load, save and 'save as' projects using the buttons shown in Figure @ref(fig:projectbuttons) or using the options in the Project menu.

(\#fig:projectbuttons)New Project, Open Project, Save Project, Save Project As buttons

Note that because the Project does not contain any data itself you cannot simply move a project file from one computer to another. It would be necessary to move both the project and any associated files. You may also need to redirect the project to the new file locations on the new computer.
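
To see which files a project links to, you can look inside the project file itself. The sketch below rests on two assumptions that are not stated elsewhere in this tutorial: that a .qgz file is a zip archive containing an XML .qgs file, and that each layer's file location is stored in a `datasource` element. The function name `list_project_datasources` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET


def list_project_datasources(qgz_path):
    """Return the data source strings referenced by a .qgz project file.

    Assumes the .qgz is a zip archive holding an XML .qgs file in which
    each layer stores its location in a <datasource> element.
    """
    with zipfile.ZipFile(qgz_path) as archive:
        qgs_names = [name for name in archive.namelist() if name.endswith(".qgs")]
        if not qgs_names:
            return []
        root = ET.fromstring(archive.read(qgs_names[0]))
    # Collect the text of every non-empty <datasource> element.
    return [element.text for element in root.iter("datasource") if element.text]
```

Calling `list_project_datasources("dissertation.qgz")` on a saved project should list the file paths that would need to be moved, or redirected, along with the project.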

Projections and Coordinate Reference Systems


(\#fig:projections)Map Projections. For more projections, see the source at [XKCD](https://xkcd.com/977/) or [Wikipedia](https://en.wikipedia.org/wiki/Map_projection).

When plotting a map you need X and Y coordinates to specify where objects should appear. While this is simple on a flat surface, spatial data must fit onto the curved surface of the Earth. You may know that it is impossible to unwrap a sphere into a single flat surface without distorting (stretching, twisting, cutting) the surface in some way. The process of making a flat map from a curved Earth is known as projection, and there are many valid ways to project a map.

Cartographers can argue intensely about their preferred projections, as the XKCD comic in Figure @ref(fig:projections) alludes to. Coordinate Reference Systems (CRS) refer to different ways of defining the X and Y coordinates used in different projections. Largely they fall into two categories:

Geographical Coordinate Systems: use latitude and longitude to represent any place on the Earth
Projected Coordinate Systems: use distances from an origin point to represent a small part of the Earth, e.g. a country. The advantage of a projected CRS is that it is easier to calculate properties such as distance and area, as coordinates are typically in metres.

You can find a catalogue of different CRSs at http://spatialreference.org/

CRSs are often referred to by their EPSG number. The European Petroleum Survey Group publishes a database of different coordinate systems. Two useful CRSs to commit to memory are:

4326 - the World Geodetic System 1984 which is a widely used geographical coordinate system, used in GPS datasets and the .geojson file format, for example.
27700 - the British National Grid
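
To make the difference between geographic and projected coordinates concrete, the sketch below converts longitude and latitude (EPSG:4326) into the Web Mercator coordinates (EPSG:3857) used by most web basemaps. It uses the standard spherical Mercator formulas. The function name and the approximate coordinates for Leeds are illustrative assumptions, and in practice QGIS does this conversion for you.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, the sphere radius used by EPSG:3857


def wgs84_to_web_mercator(lon_deg, lat_deg):
    """Convert longitude/latitude in degrees to Web Mercator x/y in metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# Approximate location of Leeds, for illustration only.
leeds_lon, leeds_lat = -1.55, 53.80
x, y = wgs84_to_web_mercator(leeds_lon, leeds_lat)
print(round(x), round(y))
```

Note that Web Mercator stretches distances increasingly towards the poles, so for measuring distances and areas in Great Britain a national CRS such as the British National Grid is a better choice.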

Summary

This section introduced QGIS and its main components. Before moving on to the next section, in which we will import data into QGIS, ensure that you have:

Opened QGIS and created and named a project, saving it in an appropriate place on your computer.
Installed and tested the QuickMapServices plugin, and identified basemaps that are appropriate for use in transport planning

Bonus exercise: Find three plugins that may be useful in transport planning. Which has the best rating?

Downloading and loading data {#data}

A vital skill for using GIS to solve real-world problems is finding, downloading and importing data.

Often, the first stage in the data downloading/loading process is to find the data online. In this case, we will access data from the following site, which contains data we prepared earlier for the course: github.com/ITSLeeds/QGIS-intro/releases.

Download and unzip the data.zip file. This file contains the data that you will use for the rest of the tutorial.

Before opening data files, you should first have created a QGIS project, covered in the previous chapter.

Importing spatial data

To load a spatial data file, click on the Data Source Manager button in the top left corner of QGIS (see Figure @ref(fig:data-source-manager)).

(\#fig:data-source-manager)The Data Source Manager icon

We will use the leeds_lsoa.shp example file to plot the boundaries of the Lower Layer Super Output Areas in Leeds.

Open the Data Source Manager and select “Vector” then specify the “File Name” and location, or use the … to navigate to the file.

Click “Add” then “Close”. Note that the Data Source Manager does not close automatically after adding a file to your project. This allows you to add multiple files at once.

(\#fig:data-source-manager-shp)The Data Source Manager - Vector

The result should look something like the map displayed in Figure @ref(fig:imported).

(\#fig:imported)The map after the file 'leeds_lsoa.shp' has been imported.

Importing a CSV or text file

CSV with spatial data

Sometimes you have data that contains spatial information but is not in a spatial data format. A common example is a CSV file with latitude and longitude columns. CSV is a common format for storing data and can be opened by lots of different software, including Microsoft Excel. We will use the stats19.csv example file to plot the location of vehicle collisions in Leeds.

Open the Data Source Manager and select “Delimited Text” then specify the “File Name” and location, or use the … to navigate to the file. Under the “Geometry definition” select “Point coordinates” and set the “X field” to “longitude” and the “Y field” to “latitude”. Set the “Geometry CRS” to “EPSG: 4326 – WGS 84”.

Click “Add” then “Close”.

(\#fig:data-source-manager-csv)The Data Source Manager - Delimited Text

After importing the 'stats19' data, using the menus shown in Figure @ref(fig:data-source-manager-csv), you should see points on the map, representing where crashes in Leeds took place in 2017. If you do not see points on the map, re-read this section. If you do, congratulations!
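
What the Delimited Text import does can be sketched in plain Python: read each row, take the X and Y fields as coordinates, and keep the remaining columns as attributes. The sample rows below are made up for illustration (they are not from stats19.csv), and the `read_points` helper is hypothetical.

```python
import csv
import io

# Hypothetical sample rows with the same column names used in this section.
sample_csv = io.StringIO(
    "accident_severity,longitude,latitude\n"
    "Slight,-1.5491,53.8008\n"
    "Serious,-1.5600,53.8100\n"
    "Slight,,\n"  # a record with missing coordinates
)


def read_points(handle, x_field="longitude", y_field="latitude"):
    """Read a CSV and return one point per row that has valid coordinates."""
    points = []
    for row in csv.DictReader(handle):
        try:
            x, y = float(row[x_field]), float(row[y_field])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows that cannot be placed on the map
        points.append({"x": x, "y": y, "attributes": row})
    return points


points = read_points(sample_csv)
print(len(points), "points read")
```

Rows without usable coordinates are skipped here, which is one reason the number of points on a map can be lower than the number of rows in the file.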

CSV without spatial data

QGIS will also allow you to import non-spatial data. Import the population.csv example file for later use using the same process as above, but selecting “No geometry (attribute only table)”. When you have added this layer, and the others, the items in the Layers Panel on the left of QGIS should look like this:

(\#fig:layers)The imported data layers shown in the Layer window.

Summary

Before moving on to the next chapter make sure you have:

Downloaded the example data
Imported the three files to QGIS

Bonus Exercises

Read about the disadvantages of the shapefile format and some of the alternatives at http://switchfromshapefile.org/

Style and select features

Identify Features

The 'Identify features button' allows you to interrogate features on the map by clicking on them (see Figure @ref(fig:identify)).

(\#fig:identify)Identify Features Button

Clicking on a map feature brings up a panel with more information about that feature. You can use the mode menu at the bottom of the panel to select which layers will be included (see Figure @ref(fig:identify-panel)).

(\#fig:identify-panel)Identify Features Panel

Symbology

In QGIS you can change the appearance of layers to communicate additional information. This is called symbology.

In the Layers Panel, right-click on "stats19" and select "Properties". In the menu on the left, select "Symbology".

(\#fig:symbology-menu)Symbology Menu

For the type of Symbology select "Categorised" and for the column select "accident_severity", then choose a colour ramp from the drop-down menu. Then click "Classify". This will add all the different possible values from the data and assign each a unique colour. Click OK to return to the map.

(\#fig:symbology-results)Road collisions categorised by severity
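
The logic behind the "Classify" button can be sketched as follows: find the unique values in the chosen column and give each one a colour from the ramp. The severity values, the colour codes and the `classify` helper are illustrative assumptions rather than QGIS internals.

```python
def classify(values, colour_ramp):
    """Assign one colour from the ramp to each unique value (sorted alphabetically)."""
    categories = sorted(set(values))
    # Cycle through the ramp if there are more categories than colours.
    return {
        category: colour_ramp[i % len(colour_ramp)]
        for i, category in enumerate(categories)
    }


# Hypothetical attribute values and a three-colour ramp (red, orange, green).
severities = ["Slight", "Serious", "Slight", "Fatal", "Slight"]
ramp = ["#d7191c", "#fdae61", "#1a9641"]

legend = classify(severities, ramp)
for category, colour in legend.items():
    print(category, colour)
```

Categorised symbology suits columns with a small number of distinct values. A column with hundreds of unique values would produce a legend too long to read.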

Summary

Before moving on to the next chapter make sure you have:

Styled the stats19 points data

Bonus Exercises

Experiment using different columns and types of symbology

Hint: Think about how different types of symbology are appropriate to different types of data (categorical, discrete, continuous). Can you discover any interesting patterns in the data? How does your choice of breakpoints affect the presentation of the data?

Can you work out how to use transparency to overlay multiple layers clearly?

Processing data

We saw in Chapter ... how to style and select features of interest from layers loaded into QGIS. In this section, we will learn how to process data. That means creating new data from existing data.

Reprojecting Data

Our two spatial datasets have different coordinate systems. This can make it difficult to make connections between the datasets. So we will reproject the stats19 data to the British National Grid.

In the Vector menu, select Data Management Tools, then Reproject Layer.

(\#fig:reproject-menu)Reproject Menu

(\#fig:crs-menu)Reproject Menu: Coordinate Reference System

Joining Data

Joining data allows you to link two separate datasets together by something they have in common. There are two types of join. Attribute joins (often just called joins) link two datasets by a common attribute such as an ID number. Spatial joins link datasets by a shared location.

In this next section, we will use a series of joins to find the area of Leeds with the highest rate of road collisions.

Attribute Joins

We will join the population data onto the LSOA boundaries.

Find “leeds_lsoa” in the Layers panel and right-click to bring up the context menu

Click on Properties

Select Joins from the options on the left

Click the green + button to open the Add Vector Join window

Select the following options:

Join layer: population

Join field: area

Target field: lsoa11cd

Click OK

Click OK again

(\#fig:joins)The Add Vector Join window

Use the “Identify features” tool to see that each LSOA now has a population value.
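
Conceptually, an attribute join is a lookup: for each feature in the target layer, find the row in the join layer whose join field matches the target field. The sketch below uses the field names from the steps above (`lsoa11cd` and `area`). The zone codes, the `population` column name and the field prefix are assumptions made for illustration.

```python
def attribute_join(target_rows, join_rows, target_field, join_field, prefix="population_"):
    """Copy the fields of the matching join row onto each target row."""
    lookup = {row[join_field]: row for row in join_rows}
    joined = []
    for row in target_rows:
        match = lookup.get(row[target_field], {})
        # Prefix the joined field names so they cannot clash with existing ones.
        extra = {prefix + key: value for key, value in match.items() if key != join_field}
        joined.append({**row, **extra})
    return joined


# Hypothetical LSOA features and population records.
lsoa_features = [{"lsoa11cd": "A001"}, {"lsoa11cd": "A002"}, {"lsoa11cd": "A003"}]
population_rows = [
    {"area": "A001", "population": 1500},
    {"area": "A002", "population": 1720},
]

joined = attribute_join(lsoa_features, population_rows, "lsoa11cd", "area")
for feature in joined:
    print(feature)
```

Features without a match (here "A003") keep their original attributes but gain no new ones, so it is worth checking that the ID columns in both datasets use exactly the same codes.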

Spatial Joins

We will assign an LSOA ID number to each road collision by doing a spatial join.
In the “Vector” menu, select “Data Management Tools”, then “Join Attributes by Location”.

(\#fig:joins-menu)Join Attributes by Location

Selecting Join Attributes by Location opens the window shown in Figure @ref(fig:joins-window).

(\#fig:joins-window)Join Attributes by Location Window

For the Input Layer select "stats19" and for the Join Layer select "leeds_lsoa". For the join type select "Create separate feature for each located feature". Then click "Run".

(\#fig:joins-result)Join Results

Once the process has completed, a new layer called "Joined layer" will have been added to the map. You can use the “Identify features” tool to see that each point in the stats19 data now has the ID of the LSOA in which it is located.
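
Under the hood, a spatial join between points and polygons relies on a point-in-polygon test. The sketch below uses the classic ray casting approach with made-up square zones and points. It is a simplified illustration rather than the algorithm QGIS uses, and the names `point_in_polygon` and `spatial_join` are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting test: count how many polygon edges a ray to the right crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # the edge spans the point's y coordinate
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def spatial_join(points, zones):
    """Attach the ID of the first zone containing each point (None if no zone does)."""
    joined = []
    for point in points:
        zone_id = None
        for candidate_id, polygon in zones.items():
            if point_in_polygon(point["x"], point["y"], polygon):
                zone_id = candidate_id
                break
        joined.append({**point, "lsoa11cd": zone_id})
    return joined


# Hypothetical zones (two adjacent squares) and collision points.
zones = {
    "A001": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "A002": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
collisions = [{"x": 1, "y": 1}, {"x": 3, "y": 1}, {"x": 5, "y": 5}]

for row in spatial_join(collisions, zones):
    print(row)
```

This only gives sensible answers when the points and polygons share the same CRS, which is why the stats19 data were reprojected first.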

Points in Polygons

The final step for this chapter will be to count the number of road crash casualties in each LSOA.

In the Vector Menu, select Analysis Tools, Count Points in Polygons.

(\#fig:points-polygons)Points in Polygons Tool

For the polygons choose your LSOA areas and for the points choose the stats19 data. Use the number of casualties as a weighting field, and give the "count field name" an appropriate name.

A new layer will be created with the number of casualties for each LSOA. Use Symbology to visualise the most dangerous areas of Leeds.

(\#fig:final-plot)Number of road crash casualties in Leeds
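
The weighted count can be sketched in Python as a sum of the weighting field per zone. Dividing by population then gives a rate that can be compared between LSOAs. The records, zone codes, the `number_of_casualties` column name and the population figures below are all hypothetical.

```python
from collections import defaultdict


def count_points_in_zones(joined_points, zone_field="lsoa11cd", weight_field=None):
    """Sum the weight of the points in each zone (each point counts as 1 if unweighted)."""
    totals = defaultdict(float)
    for point in joined_points:
        zone = point.get(zone_field)
        if zone is None:
            continue  # the point fell outside every zone
        weight = float(point[weight_field]) if weight_field else 1.0
        totals[zone] += weight
    return dict(totals)


# Hypothetical collisions that already carry the ID of the zone they fall in.
collisions = [
    {"lsoa11cd": "A001", "number_of_casualties": 1},
    {"lsoa11cd": "A001", "number_of_casualties": 3},
    {"lsoa11cd": "A002", "number_of_casualties": 2},
    {"lsoa11cd": None, "number_of_casualties": 1},
]
casualties = count_points_in_zones(collisions, weight_field="number_of_casualties")

# Hypothetical populations, used to turn counts into casualties per 1000 residents.
population = {"A001": 2000, "A002": 500}
rate_per_1000 = {
    zone: 1000 * casualties.get(zone, 0) / residents
    for zone, residents in population.items()
}
worst = max(rate_per_1000, key=rate_per_1000.get)
print(casualties, rate_per_1000, worst)
```

In this made-up example the zone with the most casualties is not the zone with the highest rate, which illustrates why joining the population data matters.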

Summary

Before moving on to the next chapter make sure you have:

Reprojected the stats19 data to the British National Grid
Done an attribute join of the population data to the LSOA areas
Done a spatial join of the LSOA areas to the stats19 points
Counted the number of casualties in each LSOA

Bonus Exercises

Can you use symbology to show the population of each LSOA?
Can you use symbology to show the number of casualties in each LSOA?

Making maps

Creating a print layout

While it is useful to view maps within QGIS it is usually necessary to export the maps for printing or to be included in a report.

From the Project menu select "New Print Layout". You will be asked to give your new layout a name. You will then see a blank page.

(\#fig:print-layout)The print layout window, highlighting the add new map button

Use the "Add new map" button to draw a box where you wish your map to be on the page. Your map will appear within the box automatically.

(\#fig:print-withmap)The print layout window, with map added

You can pan and zoom within the map using the "Move Item Content" button.

(\#fig:move-item-conten)Move Item content button

You can add several different features to the print layout, such as titles, legends and scale bars, using the buttons on the left side of the window. When an object is selected you can edit its properties using the "Item Properties" tab in the right side panel.

(\#fig:print-custom)A customised map

Saving & Exporting your map

You can save a map layout for later use using the save button on the main toolbar or using "Save project" in the "layout" menu. You can export your image in one of three formats.

Export as Image - A range of different formats including JPEG, PNG, GIF, TIF etc
Export as PDF - A PDF document
Export as SVG - A Scalable Vector Graphics image

All of these options and the option to print your map directly are in the "layout" menu.

Summary

This chapter has briefly introduced the print layout window.

Before moving on to the next chapter make sure you have:

Produced your own print layout including a map, title, scale bar, and legend.

Bonus Exercises

Experiment using different tools in the print layout window. Can you find out how to:

Pan and zoom the map to a specific area of interest
Customise the legend to only show some of the layers
Change the units on the scale bar
Customise the title font and appearance
Add the ITS logo to your print layout
Add an arrow pointing to your location


Raster Data

So far we have only used vector data with QGIS. This chapter will introduce raster data.

What is the difference between raster and vector data?

Vector data is made up of points, lines, and polygons with attributes. This makes it well suited to many GIS purposes. For example, we have already seen that the boundaries of an LSOA can be recorded as a polygon and that each polygon can have attributes like the area name, population etc.

Raster data is very different. It is essentially an image where each pixel has a value. Rasters are always rectangular and have a fixed resolution (so they become pixelated as you zoom in). A common use of raster data is satellite and aerial photography. These images are made from three overlapping rasters (often called a raster stack or raster brick). The three rasters represent the Red, Green and Blue colour bands, which together make up a full-colour image. Rasters can have more than three bands. For example, they may be used to represent changes over time or colours beyond human perception such as infrared or ultraviolet.

Download Sample Data

Download the sample raster data from here:

https://github.com/ITSLeeds/QGIS-intro/releases/download/0.01/leeds_cir_compress.tif

Adding raster data to the map

Adding raster data to QGIS is done using the same Data Source Manager as for vector data, except that you must use the Raster tab. You will notice that the raster contains an aerial photograph of Leeds, but the colours appear to be wrong.

(\#fig:raster-import)The sample raster data added to the map

The colour difference is due to this raster being a colour infrared image. Rather than the usual Red, Green, Blue bands this image has Near Infrared (NI), Red, Green.

Normalized difference vegetation index

You may have noticed that the trees and grass in the raster appear bright red, but most other features appear grey. We shall use the raster to calculate the Normalized difference vegetation index (NDVI), a measure of how much vegetation is within each raster cell.

Within the "Raster" menu select "Raster Calculator". The formula for calculating the NDVI is:

$$\frac{NI - Red}{NI + Red}$$

Enter the formula into the raster calculator. Notice the use of @ to designate the different bands of the raster layer. Remember to specify where you want the results to be saved.

(\#fig:raster-calc)The raster calculator with the NDVI formula

Once the raster calculator is complete you should have a new raster layer. It will be in greyscale with values between -1 (least vegetated) and 1 (most vegetated). You can adjust the symbology to make the vegetated areas clearer. In Figure @ref(fig:ndvi), three colours are defined: 0 (white), 0.2 (light green) and 1 (dark green). These colours approximately make trees dark green, grass light green, and all non-vegetation white.

(\#fig:ndvi)The NDVI raster with a pseudo-colour scheme applied
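
The NDVI calculation is applied independently to every cell of the raster. The sketch below shows the arithmetic on two tiny made-up bands, stored as lists of rows. The pixel values and the `ndvi` function are illustrative assumptions. Setting cells where both bands are zero to 0.0 is a choice made here to avoid division by zero, not necessarily what the raster calculator does.

```python
def ndvi(nir_band, red_band):
    """Compute (NI - Red) / (NI + Red) for every cell of two equally sized bands."""
    result = []
    for nir_row, red_row in zip(nir_band, red_band):
        out_row = []
        for nir, red in zip(nir_row, red_row):
            total = nir + red
            # Assumption: cells with no signal in either band are set to 0.0.
            out_row.append((nir - red) / total if total != 0 else 0.0)
        result.append(out_row)
    return result


# Hypothetical 2 x 2 bands: the top-left cell is strongly vegetated.
nir = [[200, 50], [120, 0]]
red = [[50, 50], [120, 0]]

for row in ndvi(nir, red):
    print(row)
```

Cells where near infrared reflectance is much higher than red reflectance get values close to 1, which is why healthy vegetation stands out.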

Linking Raster and vector data

Finally, we will link the NDVI raster back to the LSOA boundaries so that we can have an average vegetation score for each LSOA.

In the Processing menu select "Toolbox". This opens the Processing Toolbox panel on the right side. Use the search bar to find the "Zonal Statistics" tool.

(\#fig:zone-stats1)The Zonal Statistics tool in the processing toolbox

Complete the form to get statistics from the NDVI raster for each LSOA. When specifying the statistics to calculate select the mean.

(\#fig:zone-stats2)The Zonal Statistics tool

Zonal statistics may take several minutes to run. Once completed the mean NDVI value will be appended to the attribute table of the LSOA polygons.

Adjust the symbology of your LSOA layer to reflect the NDVI scores.

(\#fig:lsoa-ndvi)LSOA areas with average NDVI scores
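
The idea behind zonal statistics can be sketched by pairing each raster cell with the zone it falls in and averaging per zone. In this simplified illustration the zones are given as a second grid of zone IDs, whereas QGIS works out which cells fall inside each polygon for you. All values and the `zonal_mean` helper are hypothetical.

```python
def zonal_mean(values, zones):
    """Return the mean raster value for each zone ID (cells with zone None are ignored)."""
    sums, counts = {}, {}
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            if zone is None:
                continue
            sums[zone] = sums.get(zone, 0.0) + value
            counts[zone] = counts.get(zone, 0) + 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


# Hypothetical NDVI values and the zone each cell belongs to.
ndvi_grid = [[0.6, 0.2, 0.0], [0.4, 0.0, 0.0]]
zone_grid = [["A001", "A001", "A002"], ["A001", "A002", None]]

mean_ndvi = zonal_mean(ndvi_grid, zone_grid)
print(mean_ndvi)
```

The number of cells grows quickly with raster resolution, which helps explain why the tool can take several minutes on a detailed aerial image.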

Summary

This chapter has introduced raster data, the raster calculator, and zonal statistics.

Bonus Exercises

Consider how you could use the NDVI values to measure access to green spaces. How might you exclude small areas of green space such as gardens, but include large areas such as parks?

Accessibility analysis

Accessibility is a key concept in transport planning. One of the key aims of many transport policies is to make places more accessible to people, whether that's shops, schools or hospitals. But what does that actually mean?

In this example, we will use QGIS to explore accessibility to cycle infrastructure. Although this is an unusual measure of accessibility, the methods can be modified and extended to explore other types of accessibility, for example accessibility to schools before and after 'school agglomeration', the topic of an academic paper based on a case study in Sao Paulo, Brazil [@moreno-monroy_public_2017].

Input data

Building on previous chapters, we will use a case study of Leeds. There are two additional datasets we'll use for this, saved in the file data_accessibility.zip in the releases section of the tutorial.

The input datasets were taken from 2 places:

The cycle infrastructure data was taken from the Cycling Infrastructure Prioritisation Tool (CyIPT)
The desire lines data were taken from the propensity to cycle tool (see pct.bike/m/?r=west-yorkshire) [@lovelace_propensity_2017].

Following the guidance in previous chapters, we pre-processed these datasets in 2 ways:

Reprojected the datasets so they are in the British National Grid CRS (EPSG:27700)
Subset the desire lines so that we only have the top 200 across West Yorkshire

Download and unzip the data_accessibility.zip file and load the .gpkg files. As a taster of what we will do, and to check you have the right input data, the results should look something like the map shown in Figure @ref(fig:access-overview).

(\#fig:access-overview)Overview of input data for this chapter.

Identify areas in Leeds with some cycle infrastructure

A simple but effective way of comparing 2 geographic layers, as a first step towards an accessibility indicator, is a spatial selection. We can do this to identify zones that have access to at least some cycle infrastructure:

Use the Select by Location tool to select only those LSOAs that contain at least some cycleways
Do this with the following menu options: Vector > Research Tools > Select by Location
Save the result as a new layer called, for example, leeds_lsoa_cycleway

The result should look like that shown in Figure @ref(fig:select-by-location).

(\#fig:select-by-location)Select by Location: results.

What proportion of zones have at least some cycle infrastructure, according to this measure?
What are the limitations of this approach?
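
The selection and the proportion in the first question can be sketched as follows. This is a deliberately crude illustration: zones are made-up rectangles given as (xmin, ymin, xmax, ymax), and a zone is selected if any vertex of a cycleway lies inside it. The Select by Location tool tests the full geometries instead, so it would also catch a line that crosses a zone without having a vertex inside. All names and coordinates are hypothetical.

```python
def zone_has_cycleway(bbox, cycleways):
    """Return True if any cycleway vertex lies inside the rectangular zone."""
    xmin, ymin, xmax, ymax = bbox
    return any(
        xmin <= x <= xmax and ymin <= y <= ymax
        for line in cycleways
        for x, y in line
    )


# Hypothetical rectangular zones and two short cycleways.
zones = {
    "A001": (0, 0, 2, 2),
    "A002": (2, 0, 4, 2),
    "A003": (4, 0, 6, 2),
}
cycleways = [
    [(0.5, 0.5), (1.5, 1.0)],
    [(2.5, 1.5), (3.5, 1.5)],
]

selected = [zone for zone, bbox in zones.items() if zone_has_cycleway(bbox, cycleways)]
proportion = len(selected) / len(zones)
print(selected, round(proportion, 2))
```

A yes/no measure like this treats a zone with a few metres of cycleway the same as one with an extensive network, which is one of the limitations to consider.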
