Set eval=TRUE to run this code when knitting:
knitr::opts_chunk$set(eval = FALSE)
The goal of the pct package is to increase the accessibility and
reproducibility of the outputs from the Propensity to Cycle Tool (PCT),
a research project and web application hosted at www.pct.bike. The tool
is one of the few government websites exempt from the requirement to
transition to the .gov.uk domain name, and is a recommended source of
evidence in the preparation of Local Cycling and Walking Infrastructure
Plans (LCWIPs), as outlined in guidance [1] supporting the Cycling and
Walking Investment Strategy (CWIS), an amendment of the Infrastructure
Act 2015. For an overview of the data provided by the PCT, visiting
www.pct.bike and trying the tool out is a great place to start. An
academic paper on the PCT provides detail on the motivations and
methods underlying the project.
Since work on the package began in 2015, including at the ODI Leeds Hack my Route hackathon (see an early prototype of the tool here), the features of the PCT, and the demand for it, have evolved substantially. In early 2019, for example, the School travel layer was added to the main PCT site to provide nationwide evidence on the potential benefits of cycling uptake scenarios and on where safe routes to school should be prioritised [3]. In fact, a major aim of the PCT was to enable people to extend the tool [2]:
We envision stakeholders in local government modifying scenarios for their own purposes, and that academics in relevant fields may add new features and develop new use cases of the PCT.
Motivated by this vision of adaptable transport planning tools, this introductory vignette demonstrates how the package works with an example from the Isle of Wight, an island just off the southern coast of Britain, with a population of ~140,000 people. Before demonstrating some of the package’s key functions, it’s worth providing a little context.
The Propensity to Cycle Tool was commissioned by the UK’s Department for Transport to help planners and others prioritise investment and policies to get people cycling, as outlined in the Government report National propensity to cycle: full report with annexes [4]. However, the academic team leading the project had a wider aim: making transport evidence more accessible, encouraging evidence-based transport policies, and encouraging a more democratic transport planning process. That means open transport data and open source transport modelling tools [2].
The code base underlying the PCT is publicly available (see github.com/npct). However, the code hosted there is not easy to run or reproduce, which is where this package comes in: it provides quick access to the data underlying the PCT and enables some of the key results to be reproduced quickly. It was developed primarily for educational purposes (including for upcoming PCT training courses), but it may also help people build on the methods, for example to create a scenario of cycling uptake in their town, city or region.
In summary, if you want to know how the PCT works, reproduce some of its results, and build scenarios of cycling uptake to inform transport policies that enable cycling in cities worldwide, this package is for you!
You can install the development version of the package as follows:
remotes::install_github("ITSLeeds/pct")
Load the package as follows:
library(pct)
We will also use the following packages in this tutorial:
library(sf)
library(dplyr)
library(stplanr)
library(leaflet)
library(ggplot2)
library(pbapply)
Feedback from users suggests that access to the data is critical in decision making. One area where the package can therefore be useful is in making the data easy to obtain and process.
To download the data within www.pct.bike, we have added a suite of get_pct_*() functions, including get_pct_centroids(), get_pct_zones() and get_pct_lines(), which are used below.
There are other get_*() functions that get official data underlying
the PCT, as we will see in a later section. For now, let’s see how the
get_pct_*() functions work. To get the centroids and zones for the Isle
of Wight at the lower-resolution (smaller files) MSOA level (LSOA-level
data is returned by default, or by replacing "msoa" with "lsoa" in the
code below) you would run:
wight_centroids = get_pct_centroids(region = "isle-of-wight", geography = "msoa")
wight_zones = get_pct_zones(region = "isle-of-wight", geography = "msoa")
Let’s verify that the data gave us what we would expect to see:
plot(wight_centroids[, "bicycle"])
plot(wight_zones[, "bicycle"])
The results are indeed as we would expect, with the centroid data showing points and the zone data showing zones. The zones with higher cycling levels are in the more densely populated south of the island, as we would expect. Likewise, the following command downloads the desire lines for the Isle of Wight:
wight_lines_pct = get_pct_lines(region = "isle-of-wight", geography = "msoa")
The rest of the get_pct_*() functions are similar to the above
examples and download data from www.pct.bike. They are all built on
get_pct(), which takes the following arguments:
base_url = "https://github.com/npct/pct-outputs-regional-R/raw/master": the base URL, in case you want to download the data from a similar server
purpose = "commute": soon there will be “schools” and maybe other purposes, but currently commute is the only option.
geography = "msoa": MSOA or LSOA
region = NULL: regions within
layer = NULL: one of
rq(routes quiet) or
extension = ".Rds": PCT data is available in various formats; this package uses “Rds” as the default.
To compare the downloaded data with data in the PCT web app, we will
take a subset of the wight_lines_pct dataset. The following code chunk
takes the top 30 desire lines by number of commuters who cycle as their
main mode. The reason for selecting the top 30 will become apparent
(the resulting wight_lines_30 dataset is provided in the pct package):
wight_lines_30 = wight_lines_pct %>%
  top_n(30, bicycle)
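To make the subsetting step concrete, here is a minimal base R sketch of the same idea on a small made-up data frame. The `flows` data and the `top_rows()` helper are hypothetical and not part of the package; like dplyr's top_n(), the helper keeps ties, so it can return more than n rows.

```r
# Hypothetical toy data: five desire lines with made-up cycling counts
flows = data.frame(
  id = c("a", "b", "c", "d", "e"),
  bicycle = c(12, 40, 7, 40, 3),
  stringsAsFactors = FALSE
)

# Keep rows whose value is at least the n-th largest value (ties are kept)
top_rows = function(x, n, column) {
  threshold = sort(x[[column]], decreasing = TRUE)[min(n, nrow(x))]
  x[x[[column]] >= threshold, ]
}

top3 = top_rows(flows, 3, "bicycle")
top3$id
```

Because ties are kept, asking for the single top row of this toy data returns two rows (both lines with 40 cyclists), which is worth bearing in mind when a "top 30" subset turns out to have more than 30 rows.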
Both the wight_lines_pct and wight_lines_30 datasets are available in
the package. We’ll use the smaller one for speed. Note: these contain
many variables, three of which (the number of people cycling, driving
and walking along the desire lines from the 2011 Census) are shown
below for the Isle of Wight:
lwd = wight_lines_30$all / mean(wight_lines_30$all) * 5
plot(wight_lines_30[c("bicycle", "car_driver", "foot")], lwd = lwd)
To provide another view of the data, focusing on cycling, let’s create a leaflet map:
pal = colorNumeric(palette = "RdYlBu", domain = wight_lines_30$bicycle)
leaflet(data = wight_lines_30) %>%
  addTiles() %>%
  addPolylines(weight = lwd, color = ~ pal(bicycle)) %>%
  addLegend(pal = pal, values = ~bicycle)
There was a reason for selecting the top 30 lines: it mirrors the view of the desire lines available from the PCT web application for the island, available at www.pct.bike/m/?r=isle-of-wight (note that Straight Lines is selected from the Cycling Flows dropdown menu in the image below, and by default shows the top 30 flows by number of bicycle trips).
The previous section showed how the get_pct_*() functions download the
results generated by the PCT. However, it did not show how to reproduce
those results starting from first principles and publicly available,
official data. Underlying the PCT is origin-destination data from the
2011 Census. The MSOA-level data is open access, so we only provide
access to this dataset. The following command gets the
origin-destination data for the Isle of Wight:
wight_od_all = get_od(region = "wight")
summary(wight_od_all$geo_code1 %in% wight_centroids$geo_code)
summary(wight_od_all$geo_code2 %in% wight_centroids$geo_code)
Note that all the origin codes match the Isle of Wight centroid
codes, but most of the destination zones do not. This is because many
people on the island work outside the island. By default, get_od()
returns only OD pairs in which the commute trips originate from the
region.
To make the dataset smaller and simpler, let’s subset it so it only
contains OD pairs in which both the origin and the destination are on
the island (the resulting wight_od data is provided in the package):
wight_od = wight_od_all %>%
  filter(geo_code2 %in% wight_centroids$geo_code)
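The membership check and filter used above can be tried out on a tiny made-up example. The sketch below uses only base R; the zone codes and flows are hypothetical and chosen so that every origin is "on the island" while some destinations are not.

```r
# Hypothetical toy data: zone codes on the "island" and OD pairs starting there
island_codes = c("Z1", "Z2", "Z3")
od = data.frame(
  geo_code1 = c("Z1", "Z1", "Z2", "Z3"),
  geo_code2 = c("Z2", "X9", "Z2", "X7"),
  all = c(10, 5, 20, 2),
  stringsAsFactors = FALSE
)

# All origins are on the island, but some destinations are not
summary(od$geo_code1 %in% island_codes)
summary(od$geo_code2 %in% island_codes)

# Keep only pairs whose destination is also on the island
od_within = od[od$geo_code2 %in% island_codes, ]
nrow(od_within)
```

Note that the intra-zonal pair (Z2 to Z2) survives this filter, just as intra-zonal flows remain in the island-only OD data until they are explicitly removed.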
To convert the results to geographic desire lines, we can use the
od2line() function from the stplanr package:
wight_lines = od2line(wight_od, wight_centroids)
nrow(wight_lines)
sum(wight_lines$all)
The previous code chunks download and process 324 origin-destination
pairs, representing commuting trips made by 42,139 people on the island
(population: 140,000). By default, get_od() includes intra-zonal flows,
but these can be omitted as follows (an argument of get_od() does the
same thing):
wight_lines_census = wight_lines %>%
  filter(geo_code1 != geo_code2)
nrow(wight_lines_census)
sum(wight_lines_census$all)
Another OD data processing step developed for the PCT was converting one-way (directional) lines into two-way lines. This can be done as follows:
wight_lines_census1 = od_oneway(
  wight_lines_census,
  attrib = c("all", "bicycle")
)
nrow(wight_lines_census1) / nrow(wight_lines_census)
sum(wight_lines_census1$all) / sum(wight_lines_census$all)
Note that the resulting lines contain 50% of the number of lines, but the same number of trips: this is because 2 separate lines between the same zones have been converted into 1 line representing the combined number of trips in both directions, for each OD pair. This step is not essential but it has a couple of advantages: it was used in the PCT to make the routing more computationally efficient (less work computing the same route twice); and it makes visualising the lines and routes simpler.
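The idea behind this step can be sketched in base R on a made-up OD table. This is not how od_oneway() is implemented; it is a simplified illustration, with hypothetical zone codes and counts, of combining the two directions of each pair into a single record.

```r
# Hypothetical toy data: directional flows between three zones
od = data.frame(
  geo_code1 = c("A", "B", "A", "C", "B", "C"),
  geo_code2 = c("B", "A", "C", "A", "C", "B"),
  all       = c(10, 4, 6, 1, 3, 2),
  bicycle   = c(2, 1, 0, 0, 1, 1),
  stringsAsFactors = FALSE
)

# Build a direction-independent key by putting the two codes in sorted order
od$key = paste(pmin(od$geo_code1, od$geo_code2),
               pmax(od$geo_code1, od$geo_code2))

# Sum the attributes over both directions of each pair
od_oneway_sketch = aggregate(cbind(all, bicycle) ~ key, data = od, FUN = sum)

nrow(od_oneway_sketch) / nrow(od)        # share of rows that remain
sum(od_oneway_sketch$all) / sum(od$all)  # share of trips that remain
```

In this toy case every pair appears in both directions, so the number of rows halves while the total number of trips is unchanged, mirroring the ratios computed for the Isle of Wight above.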
Now that the lines data contains data on two-way trips between zones, we can estimate routes (note: the results on the PCT website contain estimated uptake levels from intrazonal flows). Visually, this involves converting the straight desire lines shown in the previous map into routes that can be cycled, as shown in the next code chunk. Note: this code does not run dynamically, because you need a CycleStreets.net API key for this, and it takes some time:
wight_routes_fast = route(
  l = wight_lines_census1,
  route_fun = cyclestreets::journey,
  plan = "fastest"
)
You can download these routes as follows:
u = "https://github.com/ITSLeeds/pct/releases/download/0.5.0/wight_routes_fast.Rds"
wight_routes_fast = readRDS(url(u))
A sample of these is provided in the package as
wight_routes_30_cs, which was generated as follows:
wight_lines_census_30 = wight_lines_census1 %>%
  top_n(30, bicycle)
wight_routes_30_cs = wight_routes_fast %>%
  group_by(geo_code1, geo_code2) %>%
  summarise(
    all = mean(all),
    bicycle = mean(bicycle),
    av_incline = weighted.mean(gradient_smooth, w = distances),
    length = sum(distances),
    time = sum(time)
  ) %>%
  ungroup() %>%
  top_n(30, bicycle)
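To see what the summarise() step is doing, the base R sketch below aggregates made-up route segments into one record per route. The `segments` data frame is hypothetical; only the column names distances and gradient_smooth are borrowed from the code above.

```r
# Hypothetical toy data: two routes, each split into segments
segments = data.frame(
  route = c("r1", "r1", "r1", "r2", "r2"),
  distances = c(100, 300, 600, 200, 800),           # segment lengths in metres
  gradient_smooth = c(0.05, 0.01, 0.02, 0.00, 0.04),
  stringsAsFactors = FALSE
)

# Split by route, then compute total length and length-weighted mean gradient
by_route = lapply(split(segments, segments$route), function(s) {
  data.frame(
    route = s$route[1],
    length = sum(s$distances),
    av_incline = weighted.mean(s$gradient_smooth, w = s$distances),
    stringsAsFactors = FALSE
  )
})
routes = do.call(rbind, by_route)
routes
```

Weighting by segment length matters: a short steep segment should count for less than a long gentle one, so the weighted mean for route r1 is lower than the simple mean of its three gradients.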
A simple verification that we have the right desire lines matched to the routes involves plotting the Euclidean vs Route distance, e.g. as follows:
d = as.numeric(st_length(wight_lines_census_30)) / 1000
plot(d, wight_routes_30_cs$length / 1000, xlim = c(0, 10))
abline(a = 0, b = 1)
How well does that match the route distance data downloaded from the PCT?
Almost perfectly for most of the routes. Differences can be explained by changes in infrastructure since the PCT results were first generated (these will be updated in the Propensity to Cycle Tool on-line data later in 2019).
We now have everything needed to estimate cycling uptake for each desire line on the Isle of Wight (we’ll do the calculation on the top 30 by current cycling levels).
Functions named uptake_*() estimate cycling uptake:
uptake_pct_godutch(): generates the “Go Dutch” scenario level of cycling based on a particular route’s hilliness (percentage gradient) and length.
uptake_pct_govtarget(): generates the “Government Target” scenario level of cycling, again based on the hilliness and length parameters.
We will estimate cycling potential with uptake_pct_govtarget(), using
the length and av_incline from the wight_routes_30_cs data:
pcycle_govtarget = uptake_pct_govtarget(
  distance = wight_routes_30_cs$length,
  gradient = wight_routes_30_cs$av_incline * 100
)
In terms of cycling uptake, the results are shown below:
wight_routes_30_cs$govtarget = wight_lines_census_30$bicycle +
  pcycle_govtarget * wight_lines_census_30$all
wight_routes_30_cs$govtarget_pct = wight_lines_30$govtarget_slc
ggplot(wight_routes_30_cs) +
  geom_point(aes(length, govtarget), colour = "red") +
  geom_point(aes(length, govtarget_pct), colour = "blue")
cor(wight_routes_30_cs$govtarget, wight_routes_30_cs$govtarget_pct)
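The arithmetic in the first line of that chunk is easy to check by hand. The sketch below repeats it with made-up numbers: the `pcycle` values are hypothetical proportions standing in for the output of uptake_pct_govtarget(), and are not real model estimates.

```r
# Hypothetical toy values for three desire lines
all = c(200, 150, 400)        # all commuters on each line
bicycle = c(10, 3, 20)        # commuters currently cycling
pcycle = c(0.03, 0.05, 0.01)  # made-up uptake proportions

# Scenario cycling level = current cyclists + uptake applied to all commuters
govtarget = bicycle + pcycle * all
govtarget
```

For the first line, 3% of 200 commuters is 6 additional cyclists, which added to the 10 existing cyclists gives 16 under the scenario.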
The final computational stage is also one of the most important from
a policy perspective: estimating cycling potential down to the street
level, to help prioritise investment where it is most needed. This work
is done by the overline() function, as follows:
wight_routes_30_ls = sf::st_cast(wight_routes_30_cs, "LINESTRING")
rnet = overline(wight_routes_30_ls, "govtarget")
plot(rnet)
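Conceptually, a route network adds up the flows of all routes that share a stretch of road. The base R sketch below illustrates that idea only: it represents each route as a sequence of hypothetical street segment ids, whereas overline() works on the line geometries themselves and finds the overlapping parts for you.

```r
# Hypothetical toy data: each route is a sequence of street segment ids
routes = list(
  r1 = c("s1", "s2", "s3"),
  r2 = c("s2", "s3", "s4"),
  r3 = c("s3", "s5")
)
govtarget = c(r1 = 10, r2 = 5, r3 = 2)  # made-up scenario flow on each route

# One row per (route, segment) combination, carrying the route's flow
long = data.frame(
  segment = unlist(routes, use.names = FALSE),
  flow = rep(unname(govtarget[names(routes)]), times = lengths(routes)),
  stringsAsFactors = FALSE
)

# Sum flows over all routes that use the same segment
rnet_sketch = aggregate(flow ~ segment, data = long, FUN = sum)
rnet_sketch
```

Segment s3 is used by all three toy routes, so it carries the combined flow of all of them; it is this kind of concentration of flows on shared streets that the route network map makes visible.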
Running the same function for all routes in wight_routes_fast
generates the packaged data object wight_rnet, which was created as
follows:
wight_routes_fast_gt = wight_routes_fast %>%
  group_by(geo_code1, geo_code2) %>%
  mutate(
    govtarget = uptake_pct_govtarget(
      sum(distances),
      mean(gradient_smooth)
    ) * sum(all) + sum(bicycle)
  )
wight_routes_fast_gt = sf::st_cast(wight_routes_fast_gt, "LINESTRING")
wight_rnet = overline(wight_routes_fast_gt, "govtarget")
pal = colorNumeric(palette = "RdYlBu", domain = wight_rnet$govtarget)
leaflet(data = wight_rnet) %>%
  addTiles() %>%
  addPolylines(color = ~ pal(govtarget)) %>%
  addLegend(pal = pal, values = ~govtarget)