Hi. I’m Sharon Machlis at IDG Communications, here with Episode 53 of Do More With R: Make an election map showing winners, losers, and margins of victory.
If you’re mapping election results of, say, the US presidential election by state, it can make sense to just show one color of red for states won by Republicans, and one color of blue for states won by Democrats. That’s because it doesn’t matter whether a candidate wins by a thousand votes or 3 million: It’s “winner take all.”
But when I map results of a state election by county, or a city-wide election by precinct, the margin does matter. It’s the overall total that decides the winner. Winning “Atlanta” itself isn’t all you need to know when looking at Georgia statewide results for governor. You’d want to know how many votes the Democrat won by, and compare that to other areas. You can see the difference.
That’s why I like to create maps color coded by winner and with intensity of color showing margin of victory. That tells you what areas contributed more and less to the overall result.
Let’s take a look. I’ll use Pennsylvania 2016 presidential results as an example.
I first load some packages: dplyr, glue, scales, htmltools, sf, and leaflet. I’ll also use rio to import the data CSV file. If you want to follow along, you can download the files at this video’s accompanying InfoWorld article.
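Here's a rough sketch of that setup step. The tiny CSV written to a temporary file is only a stand-in for the real results file, and its column names (County, Trump, Clinton) and numbers are assumptions made up for illustration.

```r
library(dplyr)
library(glue)
library(scales)
library(htmltools)
library(sf)
library(leaflet)

# Stand-in for the real results file: made-up numbers, hypothetical column names
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "County,Trump,Clinton",
  "ALPHA,1200,800",
  "BETA,500,2500"
), csv_path)

# rio::import() chooses a reader based on the file extension
results <- rio::import(csv_path)
str(results)
```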
Next, I use sf’s st_read() function to import a shapefile of Pennsylvania counties. I don’t like that COUNTY_NAM column name, so I’ll change it to “County”.
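As a sketch of that import-and-rename step, the code below reads the North Carolina shapefile that ships with the sf package, since the Pennsylvania file isn't bundled here. That file's county-name column is NAME, so NAME plays the role of COUNTY_NAM; the object name counties_geo is hypothetical.

```r
library(sf)
library(dplyr)

# Stand-in shapefile: the North Carolina counties file bundled with sf
shp_path <- system.file("shape/nc.shp", package = "sf")

counties_geo <- st_read(shp_path, quiet = TRUE) %>%
  # With the Pennsylvania file this would be rename(County = COUNTY_NAM)
  rename(County = NAME)

names(counties_geo)
```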
Before I merge my data with my geography, I want to make sure that the county names are the same in both files. dplyr’s anti_join() function compares two data sets and returns the rows in the first that don’t have a match in the second. Ah, McKean has a small c here. (I happen to know McKean County is all caps in the other file.) I’ll change McKean to be all caps and check again. No problem rows now.
The next line of code merges the data with the geography. Finally, I’m going to make sure that my new geography-and-data object uses the same projection as my leaflet tiles do. Projection is a pretty complex GIS topic. For now, just know that I need WGS84.
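The name check can be sketched with two tiny made-up data frames. Which file holds the mixed-case spelling is an assumption here, and the object names and Margin values are invented for illustration.

```r
library(dplyr)

# Hypothetical stand-ins for the two files; only the County column matters
geo_names <- data.frame(County = c("ERIE", "McKEAN", "YORK"))
results <- data.frame(
  County = c("ERIE", "MCKEAN", "YORK"),
  Margin = c(100, 200, 300)  # made-up numbers
)

# Rows of geo_names that have no matching County in results
problems <- anti_join(geo_names, results, by = "County")
print(problems)

# Fix the one mismatched spelling, then check again
geo_names$County[geo_names$County == "McKEAN"] <- "MCKEAN"
problems_after <- anti_join(geo_names, results, by = "County")
nrow(problems_after)
```

Because anti_join() only reports unmatched rows from its first argument, running it a second time with the arguments swapped would catch names that appear only in the results file.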
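A minimal sketch of the merge and reprojection, using two made-up square "counties" instead of the real shapefile. The starting coordinate system (EPSG:3857), the object names, and the numbers are all assumptions; EPSG:4326 is the usual code for WGS84 longitude and latitude.

```r
library(sf)

# Helper that builds a made-up square polygon, 1,000 units on a side
square <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1000, 0), c(x0 + 1000, 1000), c(x0, 1000), c(x0, 0)
  )))
}

# Hypothetical geography in a projected coordinate system
geo <- st_sf(
  County = c("ALPHA", "BETA"),
  geometry = st_sfc(square(0), square(1000), crs = 3857)
)
results <- data.frame(County = c("ALPHA", "BETA"), Margin = c(400, 2000))

# merge() on an sf object keeps the geometry column
map_data <- merge(geo, results, by = "County")

# Reproject to WGS84 longitude/latitude
map_data <- st_transform(map_data, 4326)
st_is_longlat(map_data)
```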
Now that my data is in the shape I need, I have three more tasks: Create color palettes for each candidate, create pop-ups for the map, and then code the map itself. I’ll start with the palettes.
I’m going to map raw vote differences here, but you might want to use percentage differences instead. This first line of code uses base R’s range() function to get the smallest and largest vote differences in the Margin column. I’ll want the lightest color to be the smallest number, and the darkest to be for the biggest number. As you can see, there’s a pretty big range.
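For illustration, here's what that range() call might look like on a made-up Margin column. The data frame and the name min_max are hypothetical.

```r
# Made-up margins standing in for the real Margin column
results <- data.frame(
  County = c("ALPHA", "BETA", "GAMMA"),
  Margin = c(400, 2000, 150000)
)

# range() returns a length-2 vector: the minimum, then the maximum
min_max <- range(results$Margin, na.rm = TRUE)
min_max
```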
Next I create two palettes, using the conventional red for Republicans and blue for Democrats. I’m using the same intensity scale for both palettes: lightest for the lowest margin, regardless of candidate; and darkest for the highest margin, regardless of candidate. This will give me an idea of where each candidate was strongest on a single intensity scale. I use leaflet’s colorNumeric() function, with a palette color of reds or blues. The domain argument gives the minimum and maximum values for the color scale.
The next code group creates two different data frames: One for each candidate, with only the places that candidate won in their data frame. Having two data sets helps me get fine control over the pop-ups and colors. I can even use different pop-up text for each.
Next task is those pop-ups. I’m generating some HTML code with the strong tag for bold text and br tags for line breaks. If you’re not familiar with glue, whatever is inside the braces is evaluated as R code, so a variable name gets replaced by its value. So, I’ve got the county name followed by the word COUNTY, then a line break. Then, I lead each pop-up with the winning candidate’s name, followed by their vote total; a line break and the other candidate’s name and vote total; and then the number of votes they won the county by. scales::comma() adds a comma to numeric vote totals of a thousand or more, and the accuracy = 1 makes sure it’s a whole number with no decimal places.
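The two palettes could be sketched like this. The min_max values are made up, and the palette object names are hypothetical; "Reds" and "Blues" are ColorBrewer palette names that colorNumeric() accepts.

```r
library(leaflet)

# Made-up smallest and largest margins (normally the output of range())
min_max <- c(400, 150000)

# Both palettes share one domain, so intensity is comparable across candidates
trump_palette <- colorNumeric(palette = "Reds", domain = min_max)
clinton_palette <- colorNumeric(palette = "Blues", domain = min_max)

# Each palette is a function that turns numbers into hex color strings
trump_palette(c(400, 150000))
clinton_palette(c(400, 150000))
```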
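One way to split the data by winner, sketched on made-up numbers. The Winner column and the way it's computed here are assumptions; the real data may already include it.

```r
library(dplyr)

# Made-up results with hypothetical column names
results <- data.frame(
  County = c("ALPHA", "BETA", "GAMMA"),
  Trump = c(1200, 500, 900),
  Clinton = c(800, 2500, 700)
) %>%
  mutate(
    Margin = abs(Trump - Clinton),
    Winner = if_else(Trump > Clinton, "Trump", "Clinton")
  )

# One data frame per candidate, holding only the places that candidate won
trump_df <- filter(results, Winner == "Trump")
clinton_df <- filter(results, Winner == "Clinton")
```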
I pipe that glue() text string into htmltools’ HTML() function, which leaflet needs to display the popup text properly.
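Here's a sketch of the pop-up text for one candidate, on made-up numbers. The exact wording and layout of the HTML are an approximation of what's described above, and trump_df and trump_popup are hypothetical names. Because glue() returns one string per row, lapply() applies HTML() to each string in turn.

```r
library(dplyr)      # loaded here for the %>% pipe
library(glue)
library(htmltools)

# Made-up rows for places the candidate won
trump_df <- data.frame(
  County = c("ALPHA", "GAMMA"),
  Trump = c(1200, 90000),
  Clinton = c(800, 70000),
  Margin = c(400, 20000)
)

# glue() joins the pieces and evaluates whatever is inside the braces
trump_popup <- glue(
  "<strong>{trump_df$County} COUNTY</strong><br />",
  "Trump: {scales::comma(trump_df$Trump, accuracy = 1)}<br />",
  "Clinton: {scales::comma(trump_df$Clinton, accuracy = 1)}<br />",
  "Margin: {scales::comma(trump_df$Margin, accuracy = 1)}"
) %>%
  lapply(htmltools::HTML)

trump_popup[[1]]
```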
At last, the map. The code starts with creating a basic leaflet object without adding data as an argument in the main object. That’s because I’m going to be using two different data sets. The next line of code sets the background tiles to CartoDB Positron. (That’s optional. You can use the default, but I like that style.) I then use leaflet’s addPolygons() function two times, one for each candidate’s data frame overlaid on the same map layer.
I set the data for each addPolygons() to be that candidate’s data frame. The fillColor argument takes each candidate’s palette and applies it to their margin of victory. The popup – actually a rollover label – will be that candidate’s HTML which I created before.
The rest is standard design. Stroke sets a border line around each polygon, smoothFactor is how much to simplify the polygon outline display (I copied the value from an RStudio demo map I liked), fill opacity is what you’d expect. Color is the color of the polygon border line, not the polygon itself (the polygon color was set with fillColor). Weight is the thickness of the polygon border line in pixels.
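Putting the pieces together, a map along these lines might look like the sketch below. Everything in it is a stand-in: two made-up square "counties", invented margins, simplified labels, and placeholder values for smoothFactor, fillOpacity, color, and weight, which aren't the values used in the video.

```r
library(sf)
library(leaflet)
library(htmltools)

# Helper that builds a made-up half-degree square in longitude/latitude
square <- function(lon0) {
  st_polygon(list(rbind(
    c(lon0, 40), c(lon0 + 0.5, 40), c(lon0 + 0.5, 40.5),
    c(lon0, 40.5), c(lon0, 40)
  )))
}

# One hypothetical county per candidate, already in WGS84 (EPSG:4326)
trump_df <- st_sf(County = "ALPHA", Margin = 400,
                  geometry = st_sfc(square(-78), crs = 4326))
clinton_df <- st_sf(County = "BETA", Margin = 2000,
                    geometry = st_sfc(square(-77.5), crs = 4326))

# Shared intensity scale for both palettes
min_max <- range(c(trump_df$Margin, clinton_df$Margin))
trump_palette <- colorNumeric("Reds", domain = min_max)
clinton_palette <- colorNumeric("Blues", domain = min_max)

# Simplified rollover labels
trump_popup <- lapply(paste0("<strong>", trump_df$County, " COUNTY</strong>"), HTML)
clinton_popup <- lapply(paste0("<strong>", clinton_df$County, " COUNTY</strong>"), HTML)

election_map <- leaflet() %>%                # no data in the main object
  addProviderTiles("CartoDB.Positron") %>%   # optional background tiles
  addPolygons(
    data = trump_df,
    fillColor = ~trump_palette(Margin),      # fill by margin of victory
    label = trump_popup,                     # rollover label
    stroke = TRUE,                           # draw a border line
    smoothFactor = 0.2,                      # placeholder value
    fillOpacity = 0.8,                       # placeholder value
    color = "#666666",                       # border color, not the fill
    weight = 1                               # border thickness in pixels
  ) %>%
  addPolygons(
    data = clinton_df,
    fillColor = ~clinton_palette(Margin),
    label = clinton_popup,
    stroke = TRUE,
    smoothFactor = 0.2,
    fillOpacity = 0.8,
    color = "#666666",
    weight = 1
  )

election_map  # prints the interactive map in an R session
```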
And there’s the map! Philadelphia is at the bottom right. You can see just how important it is, population-wise, compared to all other areas of Pennsylvania that are large on the map but have way fewer people.
What might be interesting to map this year is the difference in raw vote margins between 2016 and 2020. That map would show where patterns shifted the most.
That’s it for this episode, thanks for watching! For more R tips, head to the Do More With R page at bit-dot-l-y slash do more with R, all lowercase except for the R.
You can also find the Do More With R playlist on the YouTube IDG Tech Talk channel — where you can subscribe so you never miss an episode. Hope to see you next time. Stay healthy and safe, everyone!