This tutorial introduces geospatial data visualization in R.
This tutorial is aimed at beginners and intermediate users of R with the aim of showcasing how to visualize geospatial data, i.e. how to generate maps in R, and how to prepare data for geospatial visualizations using R. The aim is not to provide a fully-fledged analysis but rather to show and exemplify selected useful methods associated with generating maps. Very recommendable and detailed resources for geospatial data visualization using R can be found here, here, or here. If you are interested in cartography, here is a cheat sheet for cartography with R.
The entire R Notebook for the tutorial can be downloaded here.
If you want to render the R Notebook on your machine, i.e. knitting the
document to html or a pdf, you need to make sure that you have R and
RStudio installed and you also need to download the bibliography
file and store it in the same folder where you store the
Rmd file.
Preparation and session set up
This tutorial is based on R. If you have not installed R or are new to it, you will find an introduction to and more information how to use R here. For this tutorials, we need to install certain packages from an R library so that the scripts shown below are executed without errors. Before turning to the code below, please install the packages by running the code below this paragraph. If you have already installed the packages mentioned below, then you can skip ahead and ignore this section. To install the necessary packages, simply run the following code - it may take some time (between 1 and 5 minutes to install all of the libraries so you do not need to worry if it takes some time).
NOTE
The installation of the packages will be relatively
data intensive due to some of the required packages containing shp-files
(shape files) - which renders these packages to be larger big in
comparison to other R packages. It is thus recommendable to be logged
into an institutional network that has a decent connectivity and
download rate (e.g., a university network).
install.packages("sf")
install.packages("raster")
install.packages("dplyr")
install.packages("spData")
install.packages("tmap")
install.packages("leaflet")
install.packages("ggplot2")
install.packages("spDataLarge",
repos = "https://nowosad.github.io/drat/", type = "source")
install.packages("ggspatial")
install.packages("rnaturalearth")
install.packages("rnaturalearthdata")
install.packages("ggmap")
install.packages("leaflet")
install.packages("scales")
install.packages("maps")
install.packages("here")
install.packages("oz")
# install klippy for copy-to-clipboard button in code chunks
install.packages("remotes")
remotes::install_github("rlesur/klippy")
devtools::install_github("cran/maptools")
devtools::install_github("cran/rgeos")
devtools::install_github("freysimon/TigR")
#install.packages("rgdal", repos="https://cloud.r-project.org/")
Now that we have installed the packages, we activate them as shown below.
library(sf)
library(raster)
library(dplyr)
library(spData)
library(spDataLarge)
library(tmap)
library(ggplot2)
library(ggspatial)
library(rnaturalearth)
library(ggmap)
library(leaflet)
library(maptools)
library(scales)
library(rgeos)
library(oz)
library(TigR)
# activate klippy for copy-to-clipboard button
klippy::klippy()
Once you have installed R and RStudio and also initiated the session by executing the code shown above, you are good to go.
We will start by generating maps of the world using a in-build
world
data set that is part of the spData
package and the plot
function for generating the map.
# Load the world data package
data(world)
plot(world)
We see that the world
data set contains information on
various factors, such as information about regions (e.g.,
continent
or subregion
), country names
(name_long
), population size (pop
) or life
expectancy (lifeExp
). We can use this information to show a
specific map as shown below.
plot(world["lifeExp"])
We can use the world
data set and filter for specific
features, e.g., we can visualize only a single continent by filtering
for the continent we are interested in. In addition, we define the
x-axis and y-axis limits so that we zoom in on the region of
interest.
# extract europe (exclude russia and iceland)
world_eur <- world %>%
dplyr::filter(continent == "Europe",
name_long != "Russian Federation",
name_long != "Iceland") %>%
dplyr::select(name_long, geom)
# plot
plot(world_eur,
xlim = c(5, 10),
ylim = c(30, 70),
main = "")
We can also overlay information such as population size over continents and countries as shown below.
# plot world map
plot(world["continent"], reset = FALSE)
# define size bases on population size
cex <- sqrt(world$pop) / 10000
# center world map
world_cents <- sf::st_centroid(world, of_largest = TRUE)
# plot
plot(sf::st_geometry(world_cents),
add = TRUE,
cex = cex)
Overlaying is interesting because it allows us to highlight certain regions or countries as shown below.
# extract map of europe
world_eur <- world %>%
dplyr::filter(continent == "Europe")
# extract germany
ger <- world %>%
dplyr::filter(name_long == "Germany")
# plot germany
plot(sf::st_geometry(ger), expandBB = c(.2, .2, .2, .2), col = "gray", lwd = 3)
# plot europe
plot(world_eur[0], add = TRUE)
We can also add information to the world
data set and
use the added information to generate customized plots.
# countries I have been to
countries <- c("United States", "Norway", "France", "United Arab Emirates",
"Qatar", "Sweden", "Poland", "Austria", "Hungary", "Romania",
"Germany", "Bulgaria", "Greece", "Turkey", "Croatia",
"Switzerland", "Belgium", "Netherlands", "Spain", "Ireland",
"Australia", "China", "Italy", "Denmark", "United Kingdom",
"Slovenia", "Finland", "Slovakia", "Czech Republic", "Japan",
"Saudi Arabia", "Serbia")
# data frame with countries I have visited
visited <- world %>%
dplyr::filter(name_long %in% countries)
# plot world
plot(world[0], col = "lightgray")
# overlay countries I have visited in orange
plot(sf::st_geometry(visited), add = TRUE, col = "orange")
If you want to plot Australia, you can also simply use the
oz
package.
So far, we have used the base plot
function to generate
maps. However, it is also possible to use ggplot2
to
generate maps and the easiest way is to use borders
to draw
a map.
# plot map
ggplot() +
borders()
Another option is to add geom_sf
to a
ggplot2
object as shown below. A nice feature is that we
can add perspective and projection.
# plot map
ggplot(data = world) +
geom_sf(fill = "white") +
coord_sf(crs = "+proj=laea +lat_1=-28 +lat_2=-36 +lat_0=-32 +lon_0=135 +x_0=1000000 +y_0=2000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs")
Or have a map that shows the world in a bit of an unusual perspective due to the Labert projection.
# plot map
ggplot(data = world) +
geom_sf(fill = "beige") +
coord_sf(crs = "+proj=lcc +lat_1=-28 +lat_2=-36 +lat_0=-32 +lon_0=135 +x_0=1000000 +y_0=2000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs") +
theme(panel.grid.major = element_line(color = "gray50",
size = 0.25),
panel.background = element_rect(fill = "aliceblue"))
The nice thing about ggplot2
is, of course, that it is
very easy to add layers and create very pretty visualizations.
# plot map
ggplot(data = world) +
geom_sf() +
theme_bw() +
# adding axes title
labs(x = "Longitude", y = "Latitude") +
# adding title and subtitle
ggtitle("A map of Australia") +
# defining coordinates
coord_sf(xlim = c(100.00, 160.00),
ylim = c(-45.00, -10.00),
expand = T) +
# add distance measure
annotation_scale(location = "bl", width_hint = 0.5) +
# add compass
annotation_north_arrow(location = "br")
Again, we can customize the map according to what we want. In
addition, we load a map with a higher resolution using the
ne_countries
function from the rnaturalearth
package.
# load data
world <- rnaturalearth::ne_countries(returnclass = "sf")
# add to prevent errors
sf::sf_use_s2(FALSE)
# extract locations
world_points<- st_centroid(world)
# extract labels
world_points <- cbind(world, sf::st_coordinates(sf::st_centroid(world$geometry)))
# generate annotated world map
ggplot(data = world) +
# land is gray
geom_sf(fill= "gray90") +
# axes labels
labs(x = "Longitude", y = "Latitude") +
# define zoom
coord_sf(xlim = c(100.00, 180.00),
ylim = c(-45.00, -10.00), expand = T) +
# add scale bar
annotation_scale(location = "bl", width_hint = 0.5) +
# add compass
annotation_north_arrow(location = "br", which_north = "true",
style = north_arrow_fancy_orienteering) +
# define theme (add grid lines)
theme(panel.grid.major = element_line(color = "gray60",
linetype = "dashed",
size = 0.25),
# define background color
panel.background = element_rect(fill = "aliceblue")) +
# add text
geom_text(data= world_points,aes(x=X, y=Y, label=name),
color = "gray20", fontface = "italic", check_overlap = T, size = 3)
We can explore other designs and maps and show different regions of the world.
# load data
europe <- ne_countries(scale = "medium", continent='europe', returnclass = "sf")
# plot map
ggplot(data = europe) +
# add map and define filling
geom_sf(mapping = aes(fill = ifelse(name_long == "Germany", "0", "1"))) +
# simply black and white background
theme_bw() +
# adding axes title
labs(x = "Longitude", y = "Latitude") +
# adding title and subtitle
ggtitle("A map of central Europe") +
# defining coordinates
coord_sf(xlim = c(-10, 30),
ylim = c(40, 60)) +
# add distance measure
annotation_scale(location = "bl", width_hint = 0.5) +
# add compass
annotation_north_arrow(location = "br",
# make compass fancy
style = north_arrow_fancy_orienteering) +
theme(legend.position = "none",
# add background color
panel.background = element_rect(fill = "lightblue"),
# remove grid lines
panel.grid.major = element_blank(),
panel.grid.minor = element_blank()) +
# define fill colors
scale_fill_manual(name = "Country", values = c("darkgray", "beige")) +
# add text
geom_sf_text(aes(label = name_long), size=2.5, color = "gray20")
While being very useful, displaying basic maps is usually less relevant because, typically, we want to add different layers to a map. In order to add layers to a map, we need to combine the existing data with some data that we would like to lay over the existing map.
We will now lad external information which contains the locations of Australian airports.
# load data
airports <- base::readRDS(url("https://slcladal.github.io/data/apd.rda", "rb")) %>%
dplyr::mutate(ID = as.character(ID)) %>%
dplyr::filter(Country == "Australia")
# inspect
head(airports)
## ID Name City Country Latitude
## 1 3317 Brisbane Archerfield Airport Brisbane Australia -27.5703
## 2 3318 Northern Peninsula Airport Amberley Australia -10.9508
## 3 3319 Alice Springs Airport Alice Springs Australia -23.8067
## 4 3320 Brisbane International Airport Brisbane Australia -27.3842
## 5 3321 Gold Coast Airport Coolangatta Australia -28.1644
## 6 3322 Cairns International Airport Cairns Australia -16.8858
## Longitude
## 1 153.008
## 2 142.459
## 3 133.902
## 4 153.117
## 5 153.505
## 6 145.755
Next, we load an additional data set about the route volume of Australian airports (how many routes go in and out of these airports).
# read in routes data
routes <- base::readRDS(url("https://slcladal.github.io/data/ard.rda", "rb")) %>%
dplyr::rename(ID = destinationAirportID) %>%
dplyr::group_by(ID) %>%
dplyr::summarise(flights = n())
# inspect
head(routes)
## # A tibble: 6 × 2
## ID flights
## <chr> <int>
## 1 1 5
## 2 10 2
## 3 100 44
## 4 1001 4
## 5 1004 7
## 6 1005 31
We can now merge the airports and the route volume data sets to combine the information about the location with the information about the number of routes that end at each airport.
# combine tables
arrivals <- dplyr::left_join(airports, routes, by = "ID") %>%
na.omit()
# inspect
head(arrivals)
## ID Name City Country Latitude
## 2 3318 Northern Peninsula Airport Amberley Australia -10.9508
## 3 3319 Alice Springs Airport Alice Springs Australia -23.8067
## 4 3320 Brisbane International Airport Brisbane Australia -27.3842
## 5 3321 Gold Coast Airport Coolangatta Australia -28.1644
## 6 3322 Cairns International Airport Cairns Australia -16.8858
## 7 3323 Charleville Airport Charlieville Australia -26.4133
## Longitude flights
## 2 142.459 1
## 3 133.902 13
## 4 153.117 144
## 5 153.505 28
## 6 145.755 54
## 7 146.262 4
Now that we have that data (which contains geolocations (longitudes and latitudes), we can visualize the location of the airports on a map and add information about the route volume of the airports in the form of, e.g., points that we plot over the airport - the bigger the point, the higher the route volume. In addition, we add the locations as texts and also make these labels correspond to the route volume.
# create a layer of borders
ggplot(arrivals, aes(x=Longitude, y= Latitude)) +
borders("world", colour="gray20", fill="wheat1") +
geom_point(color="blue", alpha = .3, size = log(arrivals$flights)) +
scale_x_continuous(name="Longitude", limits=c(110, 160)) +
scale_y_continuous(name="Latitude", limits=c(-45, -10)) +
theme(panel.background = element_rect(fill = "azure1", colour = "azure1")) +
geom_text(aes(x=Longitude, y= Latitude, label=City),
color = "gray20", check_overlap = T, size = log(arrivals$flights))
The leaflet
package offers very easy-to use options for
generating interactive maps (here
is the link to the leaflet cheat sheet provided by RStudio). The
interactivity is achieved by the leaflet
function from the
leaflet
package which creates a leaflet-map with
html-widgets which can be used, e.g., in html rendered R Notebooks or
Shiny applications. The advantage of using this function lies in the
fact that it offers very detailed maps which enable zooming in on
specific locations.
# generate basic leaflet map
m <- leaflet() %>%
leaflet::setView(lng = 153.05, lat = -27.45, zoom = 12)%>%
leaflet::addTiles()
# show map
m
Another option for interactive geospatial visualizations is provided
by the maptools
package which comes with a
SpatialPolygonsDataFrame
of the world and the population by
country (in 2005). To make the visualization a bit more appealing, we
will calculate the population density, add this variable to the data
which underlies the visualization, and then display the information
interactively. In this case, this means that you can use
mouse-over or hoover effects so that you see the
population density in each country if you put the cursor on that country
(given the information is available for that country).
We start by loading the required package from the library, adding population density to the data, and removing data points without meaningful information (e.g. we set values like Inf to NA).
# load data
data(wrld_simpl)
# calculate population density and add it to the data
wrld_simpl@data$PopulationDensity <- round(wrld_simpl@data$POP2005/wrld_simpl@data$AREA,2)
wrld_simpl@data$PopulationDensity <- ifelse(wrld_simpl@data$PopulationDensity == "Inf", NA, wrld_simpl@data$PopulationDensity)
wrld_simpl@data$PopulationDensity <- ifelse(wrld_simpl@data$PopulationDensity == "NaN", NA, wrld_simpl@data$PopulationDensity)
# inspect
head(wrld_simpl@data, 10)
## FIPS ISO2 ISO3 UN NAME AREA POP2005 REGION SUBREGION
## ATG AC AG ATG 28 Antigua and Barbuda 44 83039 19 29
## DZA AG DZ DZA 12 Algeria 238174 32854159 2 15
## AZE AJ AZ AZE 31 Azerbaijan 8260 8352021 142 145
## ALB AL AL ALB 8 Albania 2740 3153731 150 39
## ARM AM AM ARM 51 Armenia 2820 3017661 142 145
## AGO AO AO AGO 24 Angola 124670 16095214 2 17
## ASM AQ AS ASM 16 American Samoa 20 64051 9 61
## ARG AR AR ARG 32 Argentina 273669 38747148 19 5
## AUS AS AU AUS 36 Australia 768230 20310208 9 53
## BHR BA BH BHR 48 Bahrain 71 724788 142 145
## LON LAT PopulationDensity
## ATG -61.783 17.078 1887.25
## DZA 2.632 28.163 137.94
## AZE 47.395 40.430 1011.14
## ALB 20.068 41.143 1151.00
## ARM 44.563 40.534 1070.09
## AGO 17.544 -12.296 129.10
## ASM -170.730 -14.318 3202.55
## ARG -65.167 -35.377 141.58
## AUS 136.189 -24.973 26.44
## BHR 50.562 26.019 10208.28
We can now display the data and use color coding to indicate the different population densities.
# define colors
qpal <- colorQuantile(rev(viridis::viridis(10)),
wrld_simpl$PopulationDensity, n=10)
# generate visualization
l <- leaflet(wrld_simpl, options =
leafletOptions(attributionControl = FALSE, minzoom=1.5)) %>%
addPolygons(
label=~stringr::str_c(
NAME, ' ',
formatC(PopulationDensity, big.mark = ',', format='d')),
labelOptions= labelOptions(direction = 'auto'),
weight=1, color='#333333', opacity=1,
fillColor = ~qpal(PopulationDensity), fillOpacity = 1,
highlightOptions = highlightOptions(
color='#000000', weight = 2,
bringToFront = TRUE, sendToBack = TRUE)
) %>%
addLegend(
"topright", pal = qpal, values = ~PopulationDensity,
title = htmltools::HTML("Population density <br> (2005)"),
opacity = 1 )
# display visualization
l
In order to work with shape files, you need to download the shape file(s) first. You can download the shape file we will be using from the shapes directory in the data folder of the LADAL GitHub repository.
I stored the shape file in my data
folder which then
allows me to access the shape file and use it for further
visualizations.
australia <- read_sf(here::here("data/shapes", "AshmoreAndCartierIslands.shp"),
layer='AshmoreAndCartierIslands')
We can now generate a first map based on the shape file we downloaded.
ggplot(australia) +
geom_sf(colour = "black", fill = "gray90") +
theme_void()
We will end this introduction here but if you want to want to learn more, check out the detailed resources for geospatial data visualization using R can be found here or here.
Schweinberger, Martin. 2024. Introduction to Geospatial Data Visualization with R. Brisbane: The University of Queensland. url: https://slcladal.github.io/gviz.html (Version 2023.02.09).
@manual{schweinberger2023gviz,
author = {Schweinberger, Martin},
title = {Introduction to Geospatial Data Visualization with R},
note = {https://ladal.edu.au/gviz.html},
year = {2023},
organization = "The University of Queensland, Australia. School of Languages and Cultures},
address = {Brisbane},
edition = {2023.02.09}
}
sessionInfo()
## R version 4.4.1 (2024-06-14 ucrt)
## Platform: x86_64-w64-mingw32/x64
## Running under: Windows 11 x64 (build 22631)
##
## Matrix products: default
##
##
## locale:
## [1] LC_COLLATE=English_Australia.utf8 LC_CTYPE=English_Australia.utf8
## [3] LC_MONETARY=English_Australia.utf8 LC_NUMERIC=C
## [5] LC_TIME=English_Australia.utf8
##
## time zone: Australia/Brisbane
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] TigR_0.4.0.12 oz_1.0-22 rgeos_0.6-4
## [4] scales_1.3.0 maptools_1.1-8 leaflet_2.2.2
## [7] ggmap_4.0.0 rnaturalearth_1.0.1 ggspatial_1.1.9
## [10] ggplot2_3.5.1 tmap_3.3-4 spDataLarge_2.1.2
## [13] spData_2.3.3 dplyr_1.1.4 raster_3.6-30
## [16] sp_2.1-4 sf_1.0-18
##
## loaded via a namespace (and not attached):
## [1] DBI_1.2.3 bitops_1.0-9 pbapply_1.7-2
## [4] gridExtra_2.3 tmaptools_3.1-1 s2_1.1.7
## [7] rlang_1.1.4 magrittr_2.0.3 e1071_1.7-16
## [10] compiler_4.4.1 maps_3.4.2 png_0.1-8
## [13] vctrs_0.6.5 stringr_1.5.1 wk_0.9.4
## [16] pkgconfig_2.0.3 fastmap_1.2.0 labeling_0.4.3
## [19] lwgeom_0.2-14 leafem_0.2.3 utf8_1.2.4
## [22] rmarkdown_2.28 purrr_1.0.2 xfun_0.49
## [25] cachem_1.1.0 jsonlite_1.8.9 highr_0.11
## [28] jpeg_0.1-10 terra_1.7-83 parallel_4.4.1
## [31] R6_2.5.1 dygraphs_1.1.1.6 bslib_0.8.0
## [34] stringi_1.8.4 RColorBrewer_1.1-3 parallelly_1.38.0
## [37] jquerylib_0.1.4 stars_0.6-6 assertthat_0.2.1
## [40] Rcpp_1.0.13-1 knitr_1.48 future.apply_1.11.3
## [43] klippy_0.0.0.9500 zoo_1.8-12 base64enc_0.1-3
## [46] R.utils_2.12.3 leaflet.providers_2.0.0 tidyselect_1.2.1
## [49] viridis_0.6.5 rstudioapi_0.17.1 dichromat_2.0-0.1
## [52] abind_1.4-8 yaml_2.3.10 codetools_0.2-20
## [55] listenv_0.9.1 lattice_0.22-6 tibble_3.2.1
## [58] leafsync_0.1.0 plyr_1.8.9 withr_3.0.2
## [61] evaluate_1.0.1 hydroTSM_0.7-0.1 foreign_0.8-87
## [64] future_1.34.0 rnaturalearthdata_1.0.0 units_0.8-5
## [67] proxy_0.4-27 xts_0.14.1 pillar_1.9.0
## [70] KernSmooth_2.23-24 generics_0.1.3 rprojroot_2.0.4
## [73] munsell_0.5.1 globals_0.16.3 class_7.3-22
## [76] glue_1.8.0 tools_4.4.1 data.table_1.16.2
## [79] XML_3.99-0.17 grid_4.4.1 tidyr_1.3.1
## [82] crosstalk_1.2.1 hydroGOF_0.6-0.1 colorspace_2.1-1
## [85] cli_3.6.3 fansi_1.0.6 viridisLite_0.4.2
## [88] gtable_0.3.5 R.methodsS3_1.8.2 sass_0.4.9
## [91] digest_0.6.37 classInt_0.4-10 farver_2.1.2
## [94] htmlwidgets_1.6.4 htmltools_0.5.8.1 R.oo_1.27.0
## [97] lifecycle_1.0.4 here_1.0.1 httr_1.4.7