sfhotspot
sfhotspot provides functions to identify and understand clusters of points (typically representing the locations of places or events). All the functions in the package work on and produce simple features (SF) objects, which means they can be used as part of modern spatial analysis in R.
Installation
You can install the development version of sfhotspot from GitHub with:
# install.packages("remotes")
remotes::install_github("mpjashby/sfhotspot")
Functions
sfhotspot has the following functions. Each can be used by supplying only an SF object containing points, or configured using that function's optional arguments.
The results produced by hotspot_count(), hotspot_change(), hotspot_kde(), hotspot_dual_kde() and hotspot_classify() can be easily plotted using the included methods for autoplot() and autolayer().
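As a rough illustration of the plotting methods, the sketch below counts robberies in grid cells and plots the result. It assumes the package is installed together with sf and ggplot2, and that the count result can be passed straight to autoplot() and autolayer() as described above; the object names robberies_utm, robbery_counts and count_plot are made up for this example.

```r
library(sf)
library(sfhotspot)
library(ggplot2)

# Project the January robberies to UTM zone 15N so distances are in metres
robberies_utm <- st_transform(memphis_robberies_jan, 32615)

# Count points in each cell of a grid, leaving every optional argument
# at its default so the cell size is chosen automatically
robbery_counts <- hotspot_count(robberies_utm)

# autoplot() builds a complete plot from the result
count_plot <- autoplot(robbery_counts)
count_plot

# autolayer() returns a layer that can be added to an existing plot
ggplot() +
  autolayer(robbery_counts) +
  theme_void()
```

autoplot() is convenient for a quick look, while autolayer() is likely the better fit when the hotspot layer has to be combined with other layers such as a base map.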
There are also included datasets:
memphis_robberies, containing records of 2,245 robberies in Memphis, TN, in 2019.
memphis_robberies_jan, containing the same data but only for the 206 robberies recorded in January 2019.
memphis_population, containing population counts for the centroids of 10,393 census blocks in Memphis, TN, in 2020.
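A quick way to get a feel for these datasets is to inspect them after loading the package. This is only a sketch using general-purpose functions from base R and sf; it assumes the datasets become available as soon as sfhotspot is attached.

```r
library(sf)
library(sfhotspot)

# Check what kind of object the robbery data is
class(memphis_robberies)

# Compare the number of rows in the full-year and January-only data
nrow(memphis_robberies)
nrow(memphis_robberies_jan)

# Look at the co-ordinate reference system the points are stored in
st_crs(memphis_robberies)

# Preview the first few rows of the population data
head(memphis_population)
```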
Example
We can use the hotspot_gistar() function to identify cells in a regular grid in which there are more/fewer points than would be expected if the points were distributed randomly. In this example, the points represent the locations of personal robberies in Memphis, which is a dataset included with the package.
# Load packages
library(sf)
#> Linking to GEOS 3.10.2, GDAL 3.4.2, PROJ 8.2.1; sf_use_s2() is TRUE
library(sfhotspot)
library(tidyverse)
#> ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
#> ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
#> ✔ tibble 3.1.8 ✔ dplyr 1.0.9
#> ✔ tidyr 1.2.0 ✔ stringr 1.4.0
#> ✔ readr 2.1.2 ✔ forcats 0.5.1
#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag() masks stats::lag()
# Transform data to UTM zone 15N so that we can think in metres, not decimal
# degrees
memphis_robberies_utm <- st_transform(memphis_robberies, 32615)
# Identify hotspots, letting the function set all the parameters automatically
# by not specifying cell size, bandwidth, etc.
memphis_robberies_hotspots <- hotspot_gistar(memphis_robberies_utm)
#> Cell size set to 500 metres automatically
#> Bandwidth set to 5,592 metres automatically based on rule of thumb
# Visualise the hotspots by showing only those cells that have significantly
# more points than expected by chance. For those cells, show the estimated
# density of robberies.
memphis_robberies_hotspots %>%
  filter(gistar > 0, pvalue < 0.05) %>%
  ggplot(aes(colour = kde, fill = kde)) +
  geom_sf() +
  scale_colour_continuous(aesthetics = c("colour", "fill")) +
  labs(title = "Density of robberies in Memphis, 2019") +
  theme_void()
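The example above lets the function choose the cell size and bandwidth. A minimal sketch of setting them by hand follows. It assumes the optional arguments are named cell_size and bandwidth and are given in the units of the projected data (metres here); the values 1000 and 2500 are arbitrary illustrations rather than recommendations, and the object names are made up for this example.

```r
library(sf)
library(sfhotspot)

# Project to UTM zone 15N so that cell size and bandwidth are in metres
robberies_utm <- st_transform(memphis_robberies, 32615)

# Identify hotspots with a manually chosen grid cell size and bandwidth
# (argument names assumed; values chosen only for illustration)
hotspots_manual <- hotspot_gistar(
  robberies_utm,
  cell_size = 1000,
  bandwidth = 2500
)

# Count the cells with significantly more points than expected by chance
sum(hotspots_manual$gistar > 0 & hotspots_manual$pvalue < 0.05, na.rm = TRUE)
```

Larger cells and bandwidths smooth over local variation, so it is worth comparing a few settings against the automatic defaults before settling on one.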