In this guide, we cover how to create and use point proximity buffers in R. Buffer analysis identifies the areas surrounding geographic features by generating a circle with radius r around each existing feature. It is a form of distance analysis: you relate other features to the buffered feature based on whether they fall inside or outside the buffer boundary. We will extend the San Francisco break-ins data from Lab 6 to detect whether car break-ins tend to cluster near certain buildings.

Setting up the data


We’ll need to load the required packages. You should have already installed these packages in prior labs, so there is no need to run the install.packages() function.

#Load necessary packages
library(sf)
library(tidyverse)
library(tmap)

We’ll then bring in four data files: (1) a shapefile of San Francisco Census tracts, (2) a csv containing car break-ins, (3) a csv of San Francisco elementary schools, and (4) a list of active businesses in San Francisco. All of the files are uploaded on GitHub. The code for bringing in the files and cleaning them is below. We won’t go through each line of code in detail because we’ve covered all of these operations and functions in Lab 6. We’ve embedded comments within the code briefly explaining what each chunk is doing, but go back to Lab 6 if you need further help.

#Download zip file containing San Francisco city tracts shapefile
setwd("insert your pathway here")
download.file(url = "https://raw.githubusercontent.com/crd150/data/master/sf_tracts.zip", destfile = "sf_tracts.zip")
unzip(zipfile = "sf_tracts.zip")

#Read in the tracts shapefile and reproject into UTM Zone 10
sf.tracts <- st_read("sf_tracts.shp")
sf.tracts.utm <- st_transform(sf.tracts, crs = "+proj=utm +zone=10 +datum=NAD83 +ellps=GRS80")
#Download car break-ins
break.df <- read_csv("https://raw.githubusercontent.com/crd150/data/master/Car_Break-ins.csv")
#Make into sf object using San Francisco tracts CRS.
break.sf <- st_as_sf(break.df, coords = c("X", "Y"), crs = st_crs(sf.tracts))
#Reproject into UTM Zone 10
break.sf.utm <- st_transform(break.sf, crs = "+proj=utm +zone=10 +datum=NAD83 +ellps=GRS80")
#Download elementary schools
elem.df <- read_csv("https://raw.githubusercontent.com/crd150/data/master/elementary_schools.csv")
#Make into sf object using San Francisco tracts CRS.
elem.sf <- st_as_sf(elem.df, coords = c("longitude", "latitude"), crs = st_crs(sf.tracts))
#Reproject into UTM Zone 10
elem.sf.utm <- st_transform(elem.sf, crs = "+proj=utm +zone=10 +datum=NAD83 +ellps=GRS80")
#Download businesses
bus.df <- read_csv("https://raw.githubusercontent.com/crd150/data/master/sf_businesses_2018.csv")
#Make into sf object using San Francisco tracts CRS.
bus.sf <- st_as_sf(bus.df, coords = c("lon", "lat"), crs = st_crs(sf.tracts))
#Reproject into UTM Zone 10
bus.sf.utm <- st_transform(bus.sf, crs = "+proj=utm +zone=10 +datum=NAD83 +ellps=GRS80")
#Keep car break-ins within San Francisco city tract boundaries
subset.int <- st_intersects(break.sf.utm, sf.tracts.utm)
subset.int.log <- lengths(subset.int) > 0
break.sf.utm <- filter(break.sf.utm, subset.int.log)

#Keep businesses within San Francisco city tract boundaries
subset.int <- st_intersects(bus.sf.utm, sf.tracts.utm)
subset.int.log <- lengths(subset.int) > 0
bus.sf.utm <- filter(bus.sf.utm, subset.int.log)
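
Before moving on, it can help to confirm that a layer really is in a projected CRS, since the distances we work with in this guide are in meters. The sketch below is not part of the lab data: it builds one made-up point near downtown San Francisco (the coordinates and the object names pt.ll and pt.utm are assumptions for illustration) and uses EPSG code 26910 (NAD83 / UTM Zone 10N) as a stand-in for the proj4 string used above.

```r
library(sf)

# Hypothetical point in longitude/latitude (WGS84, EPSG:4326)
pt.ll <- st_sfc(st_point(c(-122.4194, 37.7749)), crs = 4326)

# Reproject into NAD83 / UTM Zone 10N (EPSG:26910), whose units are meters
pt.utm <- st_transform(pt.ll, crs = 26910)

# st_is_longlat() is TRUE for geographic coordinates, FALSE for projected ones
st_is_longlat(pt.ll)
st_is_longlat(pt.utm)

# Coordinates are now an easting and a northing in meters
st_coordinates(pt.utm)
```

Running st_is_longlat() on break.sf.utm, elem.sf.utm, and bus.sf.utm should likewise return FALSE if the reprojection steps above worked.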

Creating buffers


The first task is to visually inspect car break-ins around elementary schools. That is, do we see clustering of points around schools? We’ll answer this question by creating buffers around schools.

The size of the buffer depends on the city you are examining and the context of your question. In this case, let’s use 150 meter buffers, which is the average size of a San Francisco city block.

We use the sf function st_buffer() to create buffers. The required arguments are your sf object and the distance.

elem.buff <- st_buffer(elem.sf.utm, dist = 150)
elem.buff
## Simple feature collection with 51 features and 15 fields
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: 543716.5 ymin: 4173681 xmax: 554695.3 ymax: 4184167
## epsg (SRID):    26910
## proj4string:    +proj=utm +zone=10 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs
## # A tibble: 51 x 16
##    `Campus Name` `CCSF Entity` `Lower Grade` `Upper Grade` `Grade Range`
##    <chr>         <chr>                 <int>         <int> <chr>        
##  1 Alamo Elemen… SFUSD                     0             5 K-5          
##  2 Alvarado Ele… SFUSD                     0             5 K-5          
##  3 Argonne Elem… SFUSD                     0             5 K-5          
##  4 Carver, Dr. … SFUSD                     0             5 K-5          
##  5 Chin, John Y… SFUSD                     0             5 K-5          
##  6 Chinese Educ… SFUSD                     0             5 K-5          
##  7 Clarendon El… SFUSD                     0             5 K-5          
##  8 Cleveland El… SFUSD                     0             5 K-5          
##  9 El Dorado El… SFUSD                     0             5 K-5          
## 10 Feinstein, D… SFUSD                     0             5 K-5          
## # ... with 41 more rows, and 11 more variables: Category <chr>, `Map
## #   Label` <chr>, `Lower Age` <int>, `Upper Age` <int>, `General
## #   Type` <chr>, `CDS Code` <dbl>, `Campus Address` <chr>, `Supervisor
## #   District` <int>, `County FIPS` <int>, `County Name` <chr>,
## #   geometry <POLYGON [m]>
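
To get a feel for what st_buffer() returns, here is a minimal sketch on a single made-up point (the coordinates and object names are assumptions, not lab data). Because the CRS is in meters, a distance of 150 yields a polygon that approximates a circle with a 150 meter radius, so its area should be very close to pi times 150 squared.

```r
library(sf)

# One hypothetical point in NAD83 / UTM Zone 10N (EPSG:26910); units are meters
pt <- st_sfc(st_point(c(550000, 4180000)), crs = 26910)

# Buffer it by 150 meters
pt.buff <- st_buffer(pt, dist = 150)

# The result is a polygon, not a point
as.character(st_geometry_type(pt.buff))

# Compare the polygon's area with the area of a true circle of radius 150
buff.area <- as.numeric(st_area(pt.buff))
circle.area <- pi * 150^2
buff.area / circle.area
```

The ratio should come out just under 1, because the circle is drawn as a many-sided polygon rather than a perfect curve.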

You’ll see that elem.buff is a polygon object, like a census tract. To make clear what a buffer is, let’s extract one school, El Dorado Elementary School (ID “PS028”), and its 150 meter buffer.

ex1 <- filter(elem.buff, `Map Label` == "PS028")
ex2 <- filter(elem.sf.utm, `Map Label` == "PS028")

And let’s map it onto tracts with break-ins.

ex.map <- tm_shape(sf.tracts.utm) +
            tm_polygons() +
          tm_shape(ex1) +
            tm_borders(col="red") +
          tm_shape(ex2) +
            tm_dots(col = "red") +
          tm_shape(break.sf.utm) +
            tm_dots()
tmap_mode("view")
## tmap mode set to interactive viewing
ex.map 

The school we are mapping is located in the bottom right of the city. If you zoom in (Figure 1), you’ll see that the school is right in the middle of the buffer. The radius of this buffer is 150 meters, and we find that there are four break-ins located within a block of the school.

Figure 1: 150 meter buffer


One operation we can do using these buffers is to keep just the break-ins located within them. We do this with the st_intersects() function, which we went through in Lab 6.

subset.int <- st_intersects(break.sf.utm, elem.buff)
subset.int.log <- lengths(subset.int) > 0
elem.buff.break <- filter(break.sf.utm, subset.int.log)
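
If the three lines above feel opaque, the following sketch walks through the same logic on made-up data: one hypothetical school and three hypothetical break-ins at known distances from it (100, 140, and 200 meters). All coordinates and object names here are assumptions for illustration. It also shows st_is_within_distance(), an sf function that answers the same question without building the buffer first.

```r
library(sf)

# Hypothetical school and break-in locations in EPSG:26910 (meters)
school <- st_sfc(st_point(c(550000, 4180000)), crs = 26910)
pts <- st_sfc(
  st_point(c(550100, 4180000)),  # 100 m east of the school
  st_point(c(550000, 4180140)),  # 140 m north of the school
  st_point(c(550200, 4180000)),  # 200 m east of the school
  crs = 26910
)
school.buff <- st_buffer(school, dist = 150)

# For each point, st_intersects() lists the buffers it falls in
hits <- st_intersects(pts, school.buff)

# A point with at least one hit lies inside a buffer
inside <- lengths(hits) > 0
inside  # TRUE TRUE FALSE

# The same question asked directly in terms of distance
near <- lengths(st_is_within_distance(pts, school, dist = 150)) > 0
near    # TRUE TRUE FALSE
```

The logical vector is what filter() uses to keep only the rows for points inside a buffer.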

The object elem.buff.break is a point sf object containing all break-ins located within 150 meters of an elementary school. One thing we can do with these data is map them.

tm_shape(sf.tracts.utm) +
  tm_polygons() +
tm_shape(break.sf.utm) +
  tm_dots(col = "black") +
tm_shape(elem.buff.break) +
  tm_dots(col="red") 

You can also make a static map and present it to the San Francisco Police Department to visually show how break-ins map onto elementary schools.

tmap_mode("plot")
tm_shape(sf.tracts.utm, unit = "mi") +
  tm_polygons() +
tm_shape(break.sf.utm) +
  tm_dots(col = "black") +
tm_shape(elem.buff.break) +
  tm_dots(col="red") +
tm_shape(elem.buff) +
  tm_polygons(col="red", alpha = 0.2) +
tm_scale_bar(breaks = c(0, 1, 2), size = 0.5) +
tm_compass(type = "4star", position = c("right", "top")) + 
tm_layout(main.title = "150 Meter Buffers around Elementary Schools, San Francisco,\nCar Break-ins, 2017",  main.title.size = 0.75, frame = FALSE)  

Do break-ins cluster around certain businesses?


A map is fine, but we’ll want to add some numerical summaries to get a better idea of where break-ins cluster. In this section, we’ll answer the question: do car break-ins cluster around certain types of businesses? We can do this by creating buffers of a certain distance around each business and counting the number of break-ins inside the buffer. The key variable in this data set is naicscodedescription, which provides a broad description of the business.

Let’s create 150 meter buffers around each business location.

bus.buff <- st_buffer(bus.sf.utm, dist = 150)

We will count the number of break-ins located within each buffer. We’ll do this for all businesses in our data set. Remember that bus.buff is a polygon. Counting the number of break-ins inside a buffer is a points-in-polygons operation, which we already did in Lab 6 using the aggregate() function. We’ll have to join the object we get from aggregate() back to the buffer.

temp <- aggregate(break.sf.utm["IncidntNum"], bus.buff, length) %>%
  replace(is.na(.), 0)

bus.buff <- bus.buff %>% 
    st_join(temp, join=st_equals, left=FALSE) %>%
    rename(breakins = IncidntNum)
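
As a cross-check on the counts, the same points-in-polygons tally can be sketched with st_intersects() alone: for each buffer it lists the break-ins that fall inside, and lengths() turns that list into a count. This is an alternative to the aggregate() and st_join() steps above, not the method the lab prescribes, and the two shops and four incidents below are made up for illustration.

```r
library(sf)

# Two hypothetical businesses, 1 km apart, in EPSG:26910 (meters)
shops <- st_sf(
  shop = c("A", "B"),
  geometry = st_sfc(st_point(c(550000, 4180000)),
                    st_point(c(551000, 4180000)),
                    crs = 26910)
)

# Four hypothetical break-ins
incidents <- st_sfc(
  st_point(c(550050, 4180000)),  # 50 m from shop A
  st_point(c(550000, 4180100)),  # 100 m from shop A
  st_point(c(551010, 4180000)),  # 10 m from shop B
  st_point(c(552000, 4180000)),  # far from both shops
  crs = 26910
)

shop.buff <- st_buffer(shops, dist = 150)

# Count the incidents inside each buffer; buffers with none get 0
shop.buff$breakins <- lengths(st_intersects(shop.buff, incidents))
shop.buff$breakins  # 2 1
```

One reason you might prefer this form: a join on st_equals may return extra rows when two businesses share the same coordinates and therefore have identical buffers, so it is worth comparing nrow() before and after the join.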

We can then use summarize() to find the mean number of car break-ins that occur within 150 meters of each business type.

bus.buff %>%
  group_by(naicscodedescription) %>%
  summarize(mean = mean(breakins, na.rm=TRUE))
## Simple feature collection with 11 features and 2 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 542690 ymin: 4173435 xmax: 555993 ymax: 4185028
## epsg (SRID):    26910
## proj4string:    +proj=utm +zone=10 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs
## # A tibble: 11 x 3
##    naicscodedescription     mean                                  geometry
##    <chr>                   <dbl>                        <MULTIPOLYGON [m]>
##  1 Accommodations           54.6 (((550272.8 4173685, 550272.6 4173677, 5…
##  2 Arts, Entertainment, a…  43.1 (((551854.1 4173684, 551853.9 4173676, 5…
##  3 Construction             25.5 (((545044.4 4174087, 545044.2 4174079, 5…
##  4 Financial Services       30.9 (((545044.4 4174087, 545044.2 4174079, 5…
##  5 Food Services            59.6 (((544379.4 4173834, 544379.2 4173826, 5…
##  6 Manufacturing            36.4 (((547184.2 4173820, 547184 4173812, 547…
##  7 Multiple                 44.9 (((553757.3 4173947, 553757.1 4173939, 5…
##  8 Retail Trade             59.0 (((554383.3 4173786, 554383.1 4173778, 5…
##  9 Transportation and War…  34.8 (((553974.2 4173947, 553974 4173939, 553…
## 10 Utilities                33.2 (((549072.5 4173789, 549072.3 4173781, 5…
## 11 Wholesale Trade          43.2 (((545218.2 4173894, 545218 4173887, 545…
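
Notice that the output above carries a MULTIPOLYGON geometry for each business type, because summarize() on an sf object also combines the geometries within each group. If only the table of means is needed, one option is to drop the geometry first with st_drop_geometry() and then sort the result. The sketch below uses a tiny made-up data set (the values and the object names toy and toy.means are assumptions) rather than bus.buff.

```r
library(sf)
library(dplyr)

# Three hypothetical buffered businesses, represented here by points for brevity
toy <- st_sf(
  naicscodedescription = c("Food Services", "Food Services", "Construction"),
  breakins = c(10, 20, 6),
  geometry = st_sfc(st_point(c(550000, 4180000)),
                    st_point(c(550500, 4180000)),
                    st_point(c(551000, 4180000)),
                    crs = 26910)
)

# Drop the geometry, average by business type, and sort from high to low
toy.means <- toy %>%
  st_drop_geometry() %>%
  group_by(naicscodedescription) %>%
  summarize(mean = mean(breakins, na.rm = TRUE)) %>%
  arrange(desc(mean))
toy.means
```

Here Food Services comes first with a mean of 15, followed by Construction with 6, and the result is a plain table with no geometry column.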

It looks like food services (e.g. restaurants, bars) and retail trade (e.g. beauty parlors, gift shops, nail salons) lead the way, with an average of 59.6 and 59.0 car break-ins, respectively, occurring within 150 meters in 2017. Construction has the fewest, with an average of 25.5.


Website created and maintained by Noli Brazil