R and Leaflet to present a particularly nasty example of gerrymandering: our beloved District Four in Illinois.

" />
Article Image
read

I’ll use R and Leaflet to present a particularly nasty example of gerrymandering: our beloved District Four in Illinois.

According to Wikipedia, “it was created to pack two majority Hispanic parts of Chicago into one district”, so let’s use the Census data to check how if the district is really following the ethnic borders.

If you’re unsure of what gerrymandering is or not convinced that it’s a big issue for democracy, then please read “the best explanation of gerrymandering you will ever see”

To produce a map, we need to prepare a geo dataset following these steps

  1. Get the census tract polygons
  2. Get the Census Data
  3. Merge census data with the polygons
  4. Get the polygons for District Four
  5. Use leaflet for R to plot
  6. (optional) Export to .geojson for Leaflet

Get the census tract polygons

Thanks to the tigris package, this is incredibly straightforward:

library(sp)
library(tigris)
# Counties that overlap with district 4
county_list = c("Cook", "DuPage", "Will County")
ill_tracts <- tracts(state = 'IL', 
                     county = county_list)

By default, these polygons will be from 2015.

Get the Census Data

I’ll use the percentage of hispanics in a Census tract. We need, by tract, a measure of the total population and one for the hispanic population.

R has a very convenient package called acs to download specific variables from different geographies. In this case I will download all tracts from our county_list

(You will need to get your API key from the Census, but it takes about 40 seconds to do it!)

library(acs)
# This only need to be done once!
api.key.install("your_key_here")

pop_hisp = acs.fetch(geo=geo.make(state="IL",
                county=county_list, tract="*"),
                endyear = 2010, dataset="sf1",
                variable=c("P0010001", "P0040003"))

sf1 and endyear=2010 means we are downloading the Census data for 2010. P0010001 contains the total population in a census tract, while P0040003 has hispanic population in that tract.

pop_hisp is an acs package object, which contains the estimates for our variables, plus the name of the State, County and Census tract (under geography).

Click here for the definition of all variables of the short form of the Census

Merge Census data with the polygons

We need to prepare pop_hisp a bit for the merge

col1 = pop_hisp@geography$NAME
col2 = (pop_hisp@estimate[,"P0040003"] 
           / pop_hisp@estimate[,"P0010001"])
hispanic_df <- data.frame(col1, col2)
colnames(hispanic_df) <- c("NAMELSAD", "perc_hispanic")

hispanic_df is a data.frame with a percentage hispanic column and a NAMELSAD column that contains the Census Tract number. We’ll use this key to merge with the polygons contained in ill_tracts.

hispanic_merged = geo_join(ill_tracts, hispanic_df,
                           "NAMELSAD", "NAMELSAD")

hispanic_merge is a SpatialPolygonsDataFrame that is pretty much a data.frame augmented with a geometry of polygons. I haven’t used this package much, but the idea seems very similar to geopandas, which has taken extended pandas dataframes and extended (subclassing) them to include geometry.

Get the polygons for District Four

Once again, tigris comes to the rescue

cds = congressional_districts()

#Keep District Four from Illinois
ill_code = "17"
distr_four = cds[(cds@data$STATEFP == ill_code) 
                 & (cds@data$NAMELSAD =="Congressional District 4") , ]

Leaflet Time

If you haven’t heard of it, Leaflet is a javascript library that makes it easy to produce stunning interactive maps.

R has a library by the same name to interact with the javascript library without leaving RStudio. This doesn’t expose all of Leaflet, but from the little I’ve seen it provides a lot of the basic functionality. Of course, if you want more flexibility you’ll have to use the javascript library directly.

Let’s make a choropleth plot with the leaflet R package to see if we’re on the right track

library(leaflet)

pal <- colorQuantile("Greens", NULL, n=5)

popup <- paste0("<span style='color:#7f0000'><strong>
                Percentage Hispanic: </strong></span>", 
                as.character(round(hispanic_merged$perc_hispanic,
                                   digit=3)))

leaflet() %>%
    addProviderTiles("CartoDB.Positron") %>%
    addPolygons(data = hispanic_merged, 
                fillColor = ~pal(hispanic_merged$perc_hispanic), 
                fillOpacity = 0.7, 
                weight = 0.2, 
                smoothFactor = 0.2, 
                popup = popup) %>%
    addPolygons(data = distr_four, 
                fillOpacity = 0.,
                stroke=TRUE, opacity=1,
                weight = 1.5, color="black",
                smoothFactor = 0.5, group="distr_four") %>%
    addLegend(pal = pal, 
              values = hispanic_merged$perc_hispanic, 
              position = "bottomright", 
              title = "Percentage Hispanic") %>%
    addLayersControl(
        overlayGroups = c("distr_four"),
        options = layersControlOptions(collapsed = FALSE)
    )

Static Version of the map

Click here for the interactive map

As you can see, the odd shape manages to merge two very large groups of hispanics and give them just one representative.

Export everything to GeoJson

You might want to work directly with the Leaflet javascript library, since it provides with more options and possibilities.

Leaflet read .geojson files, so we need to export the SpatialPolygonsDataFrame to .geojson.

This is pretty straightforward, but I ran into some issues with writeOGR complaining the file already existed, so I had to add a if exists: delete line.

library("rgdal")

# Export hispanic_merged
geojson_file = "hispanic_merged.geojson"
if (file.exists(geojson_file)) file.remove(geojson_file)

writeOGR(hispanic_merged, geojson_file,
        "hispanic_merged", driver="GeoJSON",
        check_exists=FALSE, overwrite_layer=TRUE)
        
# Export district four
distr_four_file = "distr_four.geojson"
if (file.exists(district_four_file)) file.remove(district_four_file)

writeOGR(dist_four, distr_four_file,
        "distr_four", driver="GeoJSON",
        check_exists=FALSE, overwrite_layer=TRUE)
Blog Logo

Cristian Dagnino J.


Published

Image

Incentive Compatible

A journal about my adventures on the internet and the amazing things I found there -- Random comments on economics, data analysis and graphs with nice colors.

Back to Overview