Mapping Environmental Exposures

Mapping Environmental Exposures

This is a step-by-step code lab that allows you to create maps using publicly available data as well as data about your own subjects, when you know their census tracts (although the concept could be extended to any geographic areas, really, as long as you can find a geoJSON file that describes the geographic boundaries).

The goal is to give us better insights about how geographic location, and the things that come along with location (like neighborhood violence, proximity to air pollution, exposure to pathogen-carrying ticks, etc.) affects health. Below, I take real data from the City of Philadelphia and map it along with some fabricated research data, to show you how this might look for your own research.

Want to run this yourself? Feel free to download the complete code

Purposes

Our research subjects and patients don’t exist in a vacuum – they have environmental exposures to things that promote health (like safe, walkable neighborhoods) and reduce health (like lead exposure or gun violence). There are lots of free, publicly available datasets about environmental exposures in various geographic locations (like Philadelphia specifically, the nation as a whole, etc.) In this document, we’re going to demonstrate how to create a map that shows the number of shootings in Philadelphia in the last few years by census tract, and combine that data with information about CHOP’s patients.

Why might you find this useful? Because perhaps you want to look at how your patients are responding to an exercise intervention for obesity and see if there’s a difference in compliance and weight loss success that could be related to violence near home that make it unsafe to play outside. Or you might want to map lead exposure and rates of ADHD or ER admissions from impulsive behavior. Perhaps there’s a difference in surgical complications related to poverty or food insecurity. Whether you choose to display your findings in a map or simply make inferences based on the data, understanding geographical data like pollution measurements, rates of poverty, areas of high traffic, etc., will help you understand your patients or subjects in an aggregated way, even if you don’t have specific data on their individual experiences.

If you want to run this code, everything you need is right here. You don’t have to use your own data, because I’m going to create fake research data to show you how to work with this kind of data. If you’re new to R, this may seem a bit overwhelming, but if you’ve had a bit of practice with data reshaping and the use of ggplot, you’ll be able to understand each step pretty well.

Preliminaries

You’ll want to have a few packages installed. You may already have these, but if not, run the following:

install.packages("dplyr")
install.packages("leaflet")
install.packages("jsonlite")
install.packages("ggplot2")
install.packages("maptools")
install.packages("sp")
install.packages("rgdal")
install.packages("scales")

Obtain Geographic Data

The City of Philadelphia supplies information about shootings (including officer-involved shootings) which includes data about the shooting victim and the location. Here, we’re really interested in the location of shootings over the past few years, to understand what parts of Philadelphia are more prone to this specific kind of violence.

To see more information about this dataset, please visit https://www.opendataphilly.org/dataset/shooting-victims/resource/a6240077-cbc7-46fb-b554-39417be606ee?inner_span=True.

For our purposes, we’re going to get the bare minimum of information: latitude, longitude, and shooting ID. The API endpoint is described in the link above and uses a SQL query to select only the data we care about. Because our query has spaces and other special characters, we need to “encode” it for request.

The data will come in as json (a file format that has any number of potentially nested key-value pairs).

library(jsonlite)
url <- URLencode('https://www.opendataphilly.org/api/action/datastore_search_sql?sql=SELECT _id, lat, lng from "a6240077-cbc7-46fb-b554-39417be606ee"')
shooting_data <- fromJSON(url)

We can examine the shooting data by using R’s str (structure) command:

str(shooting_data)
## List of 3
##  $ help   : chr "Execute SQL queries on the DataStore.\n\n    The datastore_search_sql action allows a user to search data in a "| __truncated__
##  $ success: logi TRUE
##  $ result :List of 3
##   ..$ records:'data.frame':  2500 obs. of  3 variables:
##   .. ..$ lat: chr [1:2500] "39.922968633" "40.0296637640001" "40.037804839" "39.9682945860001" ...
##   .. ..$ lng: chr [1:2500] "-75.18870837" "-75.1205859089999" "-75.139286339" "-75.223324033" ...
##   .. ..$ _id: int [1:2500] 1 2 3 4 5 6 7 8 9 10 ...
##   ..$ fields :'data.frame':  3 obs. of  2 variables:
##   .. ..$ type: chr [1:3] "int4" "numeric" "numeric"
##   .. ..$ id  : chr [1:3] "_id" "lat" "lng"
##   ..$ sql    : chr "SELECT _id, lat, lng from \"a6240077-cbc7-46fb-b554-39417be606ee\""

Here we see that we have a nested data frame, accessible at shooting_data$result$records:

head(shooting_data$result$records, 6)
##                lat               lng _id
## 1     39.922968633      -75.18870837   1
## 2 40.0296637640001 -75.1205859089999   2
## 3     40.037804839     -75.139286339   3
## 4 39.9682945860001     -75.223324033   4
## 5     39.995426509     -75.126076954   5
## 6 40.0429494850001 -75.1617552229999   6

Mapping Points

If we wanted to, we could easily create a map of these shootings, just based on latitude and longitude. Since latitude and longitude are currently in “chr” (character) format, we’ll make them numeric so that we can do math on them. We’ll create a map that’s centered on the mean latitude and longitude of all our shootings, and which is appropriately zoomed in (you might have to experiment with the zoom factor).

The result is a clickable, zoomable map. Leaflet is a great way to do quick visualization of geographic data. We’ll come back to Leaflet again at the end, but for now we’ll just take a quick look.

library(leaflet)
library(dplyr)

shootings <- shooting_data$result$records
shootings$lat <- as.numeric(shootings$lat)
shootings$lng <- as.numeric(shootings$lng)

shootings %>% 
  leaflet() %>% 
  addTiles() %>%
  setView(lng = mean(as.numeric(shootings$lng), na.rm=TRUE), 
          lat = mean(as.numeric(shootings$lat), na.rm=TRUE), zoom = 10) %>%
  addMarkers(clusterOptions = markerClusterOptions())

Go ahead, click on the clusters to zoom in!

Mapping Polygons

What’s more likely, however, is that we want to use polygon data to create a map that shows how much a particular area is affected. This is because we want to create combined data – we want to put information about our patients or research subjects along with the level of violence they are exposed to. Instead of using latitude and longitude, we’ll gather the number of shootings per Census tract, which we can then use as a proxy for violence exposure for the patients and subjects who live in that Census tract. It’s a sort of “binning”, but using the existing “bins” of Census tracts.

How do we obtain the Census tract for each shooting, using latitude and longitude? We can use a Census tract “shapefile” or a “geojson” file. These are both standard data formats that describe a map by listing polygons, each vertex of which is a lat/long coordinate. Using one of these kinds of files, we can map our locations (in our case, shootings) to the correct polygon.

We get the geojson file from Philadelphia’s Open Data website. This file not only contains the polygon definitions, but also some information about each polygon, like the Census tract name, what state the tract belongs to, several identifiers for that tract, the geographic center of the tract, and so on.

library(rgdal)
philadelphiaCensusTracts <- readOGR("http://data.phl.opendata.arcgis.com/datasets/8bc0786524a4486bb3cf0f9862ad0fbf_0.geojson")
## OGR data source with driver: GeoJSON 
## Source: "http://data.phl.opendata.arcgis.com/datasets/8bc0786524a4486bb3cf0f9862ad0fbf_0.geojson", layer: "OGRGeoJSON"
## with 384 features
## It has 14 fields
philadelphiaCensusBlocks <- readOGR("http://data.phl.opendata.arcgis.com/datasets/e9e2e152bc1644e2af84927a8f4c3c06_0.geojson")
## OGR data source with driver: GeoJSON 
## Source: "http://data.phl.opendata.arcgis.com/datasets/e9e2e152bc1644e2af84927a8f4c3c06_0.geojson", layer: "OGRGeoJSON"
## with 18872 features
## It has 18 fields

Let’s take a peek at the data that comes along with our geojson!

head(philadelphiaCensusTracts@data)
##   OBJECTID STATEFP10 COUNTYFP10 TRACTCE10     GEOID10 NAME10
## 0        1        42        101    009400 42101009400     94
## 1        2        42        101    009500 42101009500     95
## 2        3        42        101    009600 42101009600     96
## 3        4        42        101    013800 42101013800    138
## 4        5        42        101    013900 42101013900    139
## 5        6        42        101    014000 42101014000    140
##         NAMELSAD10 MTFCC10 FUNCSTAT10 ALAND10 AWATER10  INTPTLAT10
## 0  Census Tract 94   G5020          S  366717        0 +39.9632709
## 1  Census Tract 95   G5020          S  319070        0 +39.9658709
## 2  Census Tract 96   G5020          S  405273        0 +39.9655396
## 3 Census Tract 138   G5020          S  341256        0 +39.9764504
## 4 Census Tract 139   G5020          S  562934        0 +39.9750563
## 5 Census Tract 140   G5020          S  439802        0 +39.9735358
##     INTPTLON10 LOGRECNO
## 0 -075.2322437    10429
## 1 -075.2379140    10430
## 2 -075.2435075    10431
## 3 -075.1771771    10468
## 4 -075.1711846    10469
## 5 -075.1630966    10470

Option 1: Static Maps

We can specify Census tract by various methods – by the full code number GEOID10, by the short code NAME10, or by the human-friendly name NAMELSAD10. For now, I’ll use the Census short code.

I’ll get that complex data type, then make a long data frame of its geography which I’ll call philadelphiaCensusTracts_fortified, which will set a “group” (region) equal to the NAME10 aspect. This allows me to easily join data to it later on.

library(broom)
philadelphiaCensusTracts_fortified <- tidy(philadelphiaCensusTracts, region = "NAME10")

What does this geojson data “look” like, when we plot the various polygons it includes? Each polygon represents a Census tract from the 2010 Census.

library(ggplot2)
library(ggmap)
philly_plain <- ggplot() + 
  geom_polygon(data=philadelphiaCensusTracts_fortified, 
               aes(x=long, y=lat, group=group, fill=NA), 
               color = "black", fill=NA, size=0.1) +
  coord_map() + 
  theme_nothing()
print(philly_plain)

We could visually overlay shootings on top of the census tracts, which at least helps us see which census tracts might be the most affected. This doesn’t help us calculate any statistics, but at least gives us an intuition about our data:

philly_enhanced <- ggplot() + 
  geom_polygon(data=philadelphiaCensusTracts_fortified, 
               aes(x=long, y=lat, group=group, fill=NA), 
               color = "black", fill=NA, size=0.1) +
  geom_point(data=shootings, aes(x=lng, y=lat, color="red", shape=".", alpha=0.5)) + 
  coord_map() + 
  theme_nothing()
print(philly_enhanced)

From this map, we see that some tracts have no shootings at all, and others have many. We need to remember that not every Census tract has shooting data, and we presume that means that there were 0 shootings, so we will need to add that in, if we want to show data for all Census tracts.

Mapping Point Data to Polygons

Now what we’d like to do is get the shootings-per-tract data, which we can then combine with our research or clinical data to see if violence near home has any effect on our outcomes. To do this, we take the latitude and longitude of our shootings and transform them slightly so that they are understood as spatial coordinates, not just pairs of numbers. We’ll use the same map projection used in our original philadelphiaCensusTracts.

library(sp)
coordinates <- SpatialPoints(shootings[c("lng", "lat")])
proj4string(coordinates) <- proj4string(philadelphiaCensusTracts)

Let’s now apply what we know about our polygons (from philadelphiaCensusTracts) and apply that to our points. We’ll end up with a table that has one row for each shooting coordinate. Essentially, what we’re doing is taking each point, lining it up with a matching polygon, and then getting the data about that polygon, which came along with the geoJSON file we downloaded. We don’t want our NAME10 to be a factor variable, but a character field (who knows if next Census we might have something like 176.02.A, which is why we don’t use numeric).

shooting_tract_data <- over(coordinates, philadelphiaCensusTracts)
shooting_tract_data$NAME10 <- shooting_tract_data$NAME10
head(shooting_tract_data)
##   OBJECTID STATEFP10 COUNTYFP10 TRACTCE10     GEOID10 NAME10
## 1      198        42        101    003600 42101003600     36
## 2      112        42        101    029000 42101029000    290
## 3      356        42        101    028200 42101028200    282
## 4       57        42        101    010300 42101010300    103
## 5      141        42        101    017602 42101017602 176.02
## 6       98        42        101    024700 42101024700    247
##            NAMELSAD10 MTFCC10 FUNCSTAT10 ALAND10 AWATER10  INTPTLAT10
## 1     Census Tract 36   G5020          S  964539        0 +39.9279255
## 2    Census Tract 290   G5020          S  818133    20001 +40.0296096
## 3    Census Tract 282   G5020          S  855812        0 +40.0348704
## 4    Census Tract 103   G5020          S  281357        0 +39.9678322
## 5 Census Tract 176.02   G5020          S  400458        0 +39.9967182
## 6    Census Tract 247   G5020          S  941266        0 +40.0417034
##     INTPTLON10 LOGRECNO
## 1 -075.1920206    10377
## 2 -075.1166058    10605
## 3 -075.1403327    10596
## 4 -075.2254383    10437
## 5 -075.1305528    10507
## 6 -075.1645311    10560

We see the first few lines of the Census data for each of our shootings. For example, the first shooting in our shooting data corresponds to Census tract 36, which is in State 42 (Pennsylvania) and County 101 (Philadelphia County). We can use this to find out how many shootings take place in each Census tract.

shootings_by_census_shortname <- shooting_tract_data %>% 
                                 group_by(NAME10) %>% 
                                 summarise(num_shootings = n()) %>% 
                                 ungroup() 
head(shootings_by_census_shortname)
## # A tibble: 6 x 2
##   NAME10 num_shootings
##   <fct>          <int>
## 1 1                  2
## 2 10.02              1
## 3 100                3
## 4 101                9
## 5 102                9
## 6 103               14

Handling Empty Data

Don’t forget that there are some Census tracts that aren’t represented at all in our shooting_tract_data data frame, so let’s make sure we enrich it with all the tracts that aren’t included in the shooting data. We can get those by taking the data frame of our tract data, selecting the list of all the Census tracts in Philadelphia, and making sure that if they weren’t mentioned above, we add them, but with num_shootings equal to 0.

non_shooting_tracts <- philadelphiaCensusTracts@data %>% 
                       select(NAME10) %>%
                       filter(!NAME10 %in% shootings_by_census_shortname$NAME10) %>%
                       mutate(num_shootings = 0)
head(non_shooting_tracts)
##   NAME10 num_shootings
## 1    143             0
## 2    206             0
## 3     14             0
## 4     19             0
## 5     29             0
## 6     50             0

We can now combine the tracts-with-shootings and the tracts-with-no-shootings to get an overall picture of violence by census tract:

shooting_by_tract <- rbind(shootings_by_census_shortname, non_shooting_tracts)

Aside: Data Fabrication

Let’s say we have some data about how much exercise children get per day. Note that I am going to completely fabricate this data below and fabricate a finding in order to have an interesting data product. You will have your own data to include, obviously, which isn’t made up. So if you don’t understand the next code chunk, don’t worry – I’m just making fake data.

set.seed(123)
shootings_q1 <- subset(shooting_by_tract, 
                       num_shootings <= quantile(num_shootings, 0.25))
shootings_q2 <- subset(shooting_by_tract, 
                       num_shootings > quantile(num_shootings, 0.25) & 
                         num_shootings <= quantile(num_shootings, 0.5))
shootings_q3 <- subset(shooting_by_tract, 
                       num_shootings > quantile(num_shootings, 0.5) & 
                         num_shootings <= quantile(num_shootings, 0.75))
shootings_q4 <- subset(shooting_by_tract, 
                       num_shootings > quantile(num_shootings, 0.75))

fake_mrns <- sample(c(1234567:9999999), 500, replace=FALSE)
fake_census_tract <- sample(shooting_by_tract$NAME10, 500, replace=TRUE)  
# this isn't really true, we don't have equal representation among our patients in each tract!

fake_exercise_data <- data.frame(mrn = fake_mrns, census_tract = fake_census_tract, 
                                 daily_exercise_minutes = NA_integer_, stringsAsFactors = FALSE)
fake_exercise_data$daily_exercise_minutes[which(fake_exercise_data$census_tract %in% shootings_q1$NAME10)] <- 
  sample(c(45:120), nrow(shootings_q1), replace = TRUE)
fake_exercise_data$daily_exercise_minutes[which(fake_exercise_data$census_tract %in% shootings_q2$NAME10)] <- 
  sample(c(35:90), nrow(shootings_q2), replace = TRUE)
fake_exercise_data$daily_exercise_minutes[which(fake_exercise_data$census_tract %in% shootings_q3$NAME10)] <- 
  sample(c(25:60), nrow(shootings_q3), replace = TRUE)
fake_exercise_data$daily_exercise_minutes[which(fake_exercise_data$census_tract %in% shootings_q4$NAME10)] <- 
  sample(c(15:30), nrow(shootings_q4), replace = TRUE)
fake_exercise_data$daily_exercise_minutes[which(is.na(fake_exercise_data$daily_exercise_minutes))] <- 
  sample(c(15:150), nrow(shootings_q4), replace = TRUE)
# add some noise!
fake_exercise_data$daily_exercise_minutes <- 
  round(fake_exercise_data$daily_exercise_minutes*sample(seq(.7,1.3, by=0.1), 
                                                         nrow(fake_exercise_data), replace = TRUE))

Let’s take a peek at our fake data:

head(fake_exercise_data)
##       mrn census_tract daily_exercise_minutes
## 1 3755308       274.02                     17
## 2 8144402       279.01                     25
## 3 4819425          247                     32
## 4 8974594          141                     32
## 5 9478166       279.01                     73
## 6 1633889       177.01                     31

We now have two data frames that interest us:

  • One has a row per subject, and includes their Census tract and exercise amount
  • One has a row per Census tract, and includes the number of shootings there

Adding Clinical/Research Data

Let’s combine them and take a quick look:

exercise_and_violence <- merge(x=fake_exercise_data, y = shooting_by_tract, 
                               by.x = "census_tract", by.y = "NAME10", all = TRUE)
head(exercise_and_violence)
##   census_tract     mrn daily_exercise_minutes num_shootings
## 1            1 8276686                     27             2
## 2        10.01      NA                     NA             0
## 3        10.02 4012634                     41             1
## 4        10.02 4819049                     86             1
## 5        10.02 6644697                     77             1
## 6          100 4318947                     40             3

Do we see any trends? Let’s do a quick plot.

ggplot(exercise_and_violence, aes(x=num_shootings, y=daily_exercise_minutes)) +
  geom_point()

Wow, it looks like there’s a trend here! (Which makes sense, since I baked it into my fake data…)

At this point, statistically, we could do a two-sample test on the top and bottom quartile of our subjects to see if near-home violence leads to decreased exercise, or create a predictive model using regression.

We could also create a map – something that is easily understandable for policymakers and philanathropists to understand, or something you could put into a publication or on your website. Since in this case you’re talking about statisics per Census tract, you’ll want to simplify your data and find something like the mean or median amount of exercise per tract. Let’s do that now:

Aggregating Clinical/Research Data

exercise_per_tract <- exercise_and_violence %>% 
                      group_by(census_tract) %>%
                      summarise(mean_exercise = mean(daily_exercise_minutes)) %>%
                      ungroup()
head(exercise_per_tract)
## # A tibble: 6 x 2
##   census_tract mean_exercise
##   <fct>                <dbl>
## 1 1                     27.0
## 2 10.01                 NA  
## 3 10.02                 68.0
## 4 100                   29.5
## 5 101                   66.0
## 6 102                   NA

And we can combine our exercise by tract with our shootings by tract, for an almost-map-ready dataset:

exercise_shootings_per_tract <- merge(x=exercise_per_tract, y=shooting_by_tract, 
                                      by.x="census_tract", by.y="NAME10",
                                      all = TRUE)
head(exercise_shootings_per_tract)
##   census_tract mean_exercise num_shootings
## 1            1          27.0             2
## 2        10.01            NA             0
## 3        10.02          68.0             1
## 4          100          29.5             3
## 5          101          66.0             9
## 6          102            NA             9

Create Static Maps

Why do I say almost-map-ready? Because we have to combine our exercise and shooting data back into the data that includes our polygons. We can again do this using merge!

final_data <- merge(x = philadelphiaCensusTracts_fortified, y = exercise_shootings_per_tract, by.x = "id", by.y="census_tract", all=TRUE)

You’ll have to be creative about how to display the data in a map, because you want to show two different things: exercise amount (which we might not have for every single tract), and number of shootings (which we do have for each tract). If you’re using static maps (non-interactive images for a poster or publication), you could consider doing side-by-side maps like this:

library(scales)
philly_violence_map <- ggplot() +  
  geom_polygon(data = final_data, aes(x=long, y=lat, group=group, fill=num_shootings),
               color= "black", size = 0.1)  + 
  coord_map() +
  scale_fill_distiller(type="seq", trans="reverse", palette = "Reds", breaks=pretty_breaks(n=5))  +
  theme_nothing(legend=TRUE) +
  labs(title="Shootings in Philadelphia, 2015-2018", fill="")
philly_violence_map


philly_exercise_map <- ggplot() +  
  geom_polygon(data = final_data, aes(x=long, y=lat, group=group, fill=mean_exercise),
               color= "black", size = 0.1)  + 
  coord_map() +
  scale_fill_distiller(type="seq", trans="reverse", palette = "Blues", breaks=pretty_breaks(n=5))  +
  theme_nothing(legend=TRUE) +
  labs(title="Exercise in Average Minutes Per Day by Subjects", fill="")
philly_exercise_map

Looking at the two side by side is a bit hard, right? There’s lots of grey in our exercise map, after all… it’s hard to tell what’s going on and if any pattern exists.

What if we used fill to show exercise amount and then had the border color different for high-violence sectors? We’ll map the areas with 10 or more shootings as a separate layer on top of the rest.

high_shooting_areas <- subset(final_data, num_shootings >=10)

philly_combined_map <- ggplot() +  
  geom_polygon(data = final_data, aes(x=long, y=lat, group=group, fill=mean_exercise),
               color = "black", size=0.2 )  + 
  coord_map() +
  geom_polygon(data = high_shooting_areas, aes(x=long, y=lat, fill=mean_exercise, group=group),
               color = "red", size=0.4 )  + 
  scale_fill_distiller(type="seq", trans="reverse", palette = "Blues", breaks=pretty_breaks(n=5))  +

  theme_nothing(legend=TRUE) +
  labs(title="Exercise in Average Minutes Per Day by Subjects,\nAreas with 10+ Shootings Outlined in Red", fill="")
philly_combined_map

This way we can see the effect of shootings on exercise at a glance: the areas with more shootings have a lighter color of blue, and those areas are often lined with red. Still, we can’t see the total number of shootings – the gradations of color to do that would be hard to detect, so we opted to use a cutoff for high violence areas.

Alternatively, we could create an interactive map. This would be more appropriate for a web page or presentation, as it provides data as-needed, when you move your mouse over an area.

Option 2: Interactive Maps

There are many ways to create interactive maps. D3, a json library, allows you do do some powerful data visualizations like my map of Philadelphia overdue property tax on the part of real estate investors. Still, to do this, you need to be able to write some sophisticated javascript, which can be a steep learning curve. D3 is worth learning if you are interested in interactive data visualizations and have the time – just check out this gallery! For our purposes here, however, let’s stick to something a bit simpler – using Leaflet again in R.

Regardless of which approach you take, you’ll need to enrich your geojson object. For ggplot, we simply took the base information out of the geojson object, enriched it outside of that object, and used an enriched version of the data without adding it back in to the geojson. Let’s rectify that, since we want the geojson object to now hold all of our information – Census tracts (which is what it already has), shooting data, and the exercise data from our study. We’ll go back a bit, so that you won’t have had to do the static maps in order to get to the interactive map.

First, we’ll combine exercise_shootings_per_tract with the data found in philadelphiaCensusTracts@data:

census_tracts <- philadelphiaCensusTracts@data %>% mutate(NAME10 = as.character(NAME10))
census_tracts <- merge(x=census_tracts, y=exercise_shootings_per_tract, by.x="NAME10", by.y="census_tract")

Then we’ll add our enriched data back to the geojson data, so that in addition to the fields it came with, it will now contain the exercise and shooting data we gathered. It’s important to order this data by the OBJECTID so that the correct polygon is associated with the correct data!

philadelphiaCensusTracts@data <- census_tracts[order(census_tracts$OBJECTID),]

Now, let’s create an interactive map! We’ll color the polygons by exercise amount to begin with.

exercise_palette <- colorBin("Blues", domain = philadelphiaCensusTracts$mean_exercise, bins = 5, na.color = "#808080")

interactive_map <- leaflet(philadelphiaCensusTracts) %>%
  setView(lng = mean(as.numeric(shootings$lng), na.rm=TRUE), 
          lat = mean(as.numeric(shootings$lat), na.rm=TRUE), zoom = 11) %>%
  addPolygons(
    data = philadelphiaCensusTracts,
    fillColor = ~exercise_palette(philadelphiaCensusTracts@data$mean_exercise),
    weight = 1,  # border thickness
    opacity = 0.5, # border opacity
    color = "white", # border color
    fillOpacity = 1)
interactive_map

Now, let’s add some labels. We’ll do variable interpolation to create labels that tell what each Census tract is and the exercise and shooting data for that tract:

labels <- sprintf(
  "<strong>%s</strong><br/>
  Exercise in Minutes: %g <br/>
  Number of Shootings: %g",
  philadelphiaCensusTracts$NAMELSAD10, 
  philadelphiaCensusTracts$mean_exercise,
  philadelphiaCensusTracts$num_shootings
) %>% lapply(htmltools::HTML)

Then we’ll create the map again, but with labels. This allows the viewer to see at a glance the violence and exercise metrics for each tract!

interactive_map <- leaflet(philadelphiaCensusTracts) %>%
  setView(lng = mean(as.numeric(shootings$lng), na.rm=TRUE), 
          lat = mean(as.numeric(shootings$lat), na.rm=TRUE), zoom = 11) %>%
  addPolygons(
    fillColor = ~exercise_palette(mean_exercise),
    weight = 1,  # border thickness
    opacity = 0.5, # border opacity
    color = "white", # border color
    fillOpacity = 1,
  label = labels,
  labelOptions = labelOptions(
    style = list("font-weight" = "normal", padding = "3px 8px"),
    textsize = "15px",
    direction = "auto"))
interactive_map

What if we also want to show some color values for shootings, like we did above in our static maps? We’ll layer: our first layer will be a map that has all white borders and our second layer will have red borders, but they’ll only be visible for the shapes for which shootings are more than 10.

border_opacity <- as.numeric(philadelphiaCensusTracts$num_shootings >= 10)
interactive_map <- leaflet(philadelphiaCensusTracts) %>%
  setView(lng = mean(as.numeric(shootings$lng), na.rm=TRUE), 
          lat = mean(as.numeric(shootings$lat), na.rm=TRUE), zoom = 11) %>%
  addPolygons(
    fillColor = ~exercise_palette(mean_exercise),
    weight = 1,  # border thickness
    color = "white",
    fillOpacity = 1)  %>%
  addPolygons(
    fillOpacity = 0,
    color = "red",
    opacity = border_opacity,
    weight = 2,
    label = labels,
  labelOptions = labelOptions(
    style = list("font-weight" = "normal", padding = "3px 8px"),
    textsize = "15px",
    direction = "auto")
  )
interactive_map