# Mapping
# Introduction to Map Making in R
- Discussion of the basics of creating maps in R.
- R as a powerful map-making tool:
  - Future courses will explore more sophisticated map-making techniques.
- Focus on two main map-making packages:
  - maps
  - mapdata
- Incorporating tidyverse tools:
  - Primarily ggplot2 for displaying maps.
- Two main topics of discussion:
  - Basics of using R’s built-in map data.
  - Creating choropleth maps to visualize how data vary across geographic units.
# Drawing a Map of the United States
- Initial goal: draw a map of the United States.
- Steps taken:
  - Load the necessary libraries (install them first if needed).
- **Code:**
```R
library(tidyverse)  # ggplot2 and dplyr, used throughout
library(maps)
library(mapdata)
```
- Understanding maps:
  - Definition: A map consists of connected points that delineate the boundaries of geographic entities (counties, states, countries).
  - Representation: Boundaries are represented by sets of longitude and latitude coordinates.
  - Requirement: Drawing the U.S. map requires the latitude and longitude of points along its coastline and borders.
- Data retrieval for the U.S.:
- **Code:**
```R
usa <- map_data("usa")
head(usa)
```
- Variables in the dataset, as shown by `head(usa)`:
  - `long`: longitude of a boundary point.
  - `lat`: latitude of a boundary point.
  - `group`: grouping variable that tells ggplot how to connect points (points in the same group are connected; points in different groups are not).
  - `order`: the order in which the points are drawn.
  - `region`: region or category of the data (e.g., state).
  - `subregion`: sub-region or category of the data (e.g., county).
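To see what `group` does in practice, it can help to count how many boundary points fall into each group. The sketch below assumes only the `usa` data from `map_data("usa")`; the name `group_sizes` is introduced here purely for illustration.

```R
library(ggplot2)  # provides map_data()
library(maps)     # provides the "usa" boundary database
library(dplyr)

usa <- map_data("usa")

# Each group is one closed polygon: the mainland plus several islands.
group_sizes <- usa %>%
  count(group, region, name = "n_points") %>%
  arrange(desc(n_points))

head(group_sizes)
```

The largest group should be the mainland outline; the smaller groups are separate pieces that must not be connected to it when drawing.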
# Plotting the Map
- Next step: create the actual plot.
- **Code to draw the map:**
```R
ggplot() +
  geom_polygon(data = usa, aes(x = long, y = lat, group = group)) +
  coord_quickmap()
```
- Explanation of components:
  - `ggplot()` sets up the plotting environment.
  - `geom_polygon()` connects the boundary points within each group into a closed polygon, one per geographic shape.
  - `coord_quickmap()` sets the plot's aspect ratio so that the map approximates a Mercator projection.
- Explanation of projections:
  - Necessary because the Earth is round and maps are flat.
  - The Mercator projection is commonly used.
  - Alternatives: `coord_map()`, which relies on the **mapproj** package, supports other projections.
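As an illustration of switching projections, the sketch below draws the same outline with an Albers equal-area conic projection through `coord_map()`. It assumes the mapproj package is installed; the object name `p_albers` and the choice of projection are illustrative, not something the notes prescribe.

```R
library(ggplot2)
library(maps)
# coord_map() additionally needs the mapproj package to be installed.

usa <- map_data("usa")

# Albers equal-area conic projection; the two standard parallels
# (29.5 and 45.5 degrees north) are a common choice for the lower 48 states.
p_albers <- ggplot() +
  geom_polygon(data = usa, aes(x = long, y = lat, group = group)) +
  coord_map("albers", parameters = c(29.5, 45.5))

p_albers
```

Compared with `coord_quickmap()`, the northern border should appear curved rather than straight, since a conic projection bends lines of latitude.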
# Customization Options
- By default, `geom_polygon()` fills polygons with a dark gray that looks nearly black and draws no outline.
- **Code to change fill to transparent:**
```R
ggplot() +
  geom_polygon(data = usa, aes(x = long, y = lat, group = group), fill = NA, color = "black") +
  coord_quickmap()
```
- Blank map scenario explanation:
  - If only `fill = NA` is used without setting `color`, the polygons are unfilled and no boundary line is drawn, so the map appears blank.
  - Setting `color` is what makes the boundaries visible.
# Modifying the Background
- **Removing background lines:** The background (gray panel, latitude/longitude grid lines, and axes) can be removed using `theme_void()`:
```R
ggplot() +
  geom_polygon(data = usa, aes(x = long, y = lat, group = group)) +
  coord_quickmap() +
  theme_void()
```
- Importance of aesthetic choices in map-making.
- Additional features can be added to adjust titles or other stylistic elements.
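One way to adjust titles and styling is sketched below. The title and caption text, the fill color, and the object name `p_titled` are all example choices made here, not part of the original notes.

```R
library(ggplot2)
library(maps)

usa <- map_data("usa")

p_titled <- ggplot() +
  geom_polygon(data = usa, aes(x = long, y = lat, group = group),
               fill = "gray85", color = "black") +
  coord_quickmap() +
  labs(title = "Outline of the Contiguous United States",  # example title text
       caption = "Boundary data: maps package") +
  theme_void() +
  theme(plot.title = element_text(hjust = 0.5))  # center the title

p_titled
```

Because `theme_void()` is a complete theme, the extra `theme()` call is placed after it so the title adjustment is not overwritten.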
# Exploring Other Maps
- Besides the U.S., maps of various other countries can be made using the maps package.
- Example: Drawing a map of France.
- **Code to create a map of France:**
```R
France <- map_data("france")
ggplot() +
  geom_polygon(data = France, aes(x = long, y = lat, group = group)) +
  coord_quickmap()
```
- To obtain a list of the countries and regions available in the world map:
```R
options(max.print = 2000)  # raise the print limit so the full list is shown
maps::map("world", namesonly = TRUE, plot = FALSE)
```
# Drawing Maps of US States and Counties
- The package also supports state and county maps.
- **Code to draw a state map:**
```R
states <- map_data("state")
ggplot(data = states) +
  geom_polygon(aes(x = long, y = lat, group = group)) +
  coord_quickmap()
```
- Drawing state borders in white so that adjacent states can be told apart:
```R
ggplot(data = states) +
  geom_polygon(aes(x = long, y = lat, group = group), col = "white", lwd = 0.15) +
  coord_quickmap()
```
- **Aesthetic adjustments:**
  - `col = "white"` emphasizes state boundaries for improved visualization.
  - `lwd` adjusts the line width.
- Alaska and Hawaii are not part of the `state` map data, so including them in the visualization takes extra work.
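A quick check of which regions the `state` data contains makes the Alaska and Hawaii issue concrete. This is a minimal sketch; `regions` is a name introduced here for illustration.

```R
library(ggplot2)
library(maps)

states <- map_data("state")

# Region names are lower-case state names (plus the District of Columbia).
regions <- unique(states$region)
length(regions)

# Check whether Alaska and Hawaii appear among the regions.
c("alaska", "hawaii") %in% regions
```

Both checks should come back `FALSE`, which is why a map built from `map_data("state")` shows only the contiguous states.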
# Subset Maps
- Drawing maps for a specific subset of states (e.g., West Coast states).
- **Code for Pacific Coast states:**
```R
west_coast <- filter(states, region %in% c("california", "oregon", "washington"))
ggplot(data = west_coast) +
  geom_polygon(aes(x = long, y = lat, group = group)) +
  coord_quickmap()
```
- Uses `filter()` from dplyr to subset states by their `region` variable.
# County-Level Mapping
- Code to draw a map of counties:
```R
counties <- map_data("county")
ggplot(data = counties) +
  geom_polygon(aes(x = long, y = lat, group = group)) +
  coord_quickmap()
```
- To visualize state boundaries alongside counties:
```R
ggplot() +
  geom_polygon(data = counties, aes(x = long, y = lat, group = group)) +
  geom_polygon(data = states, aes(x = long, y = lat, group = group),
               fill = NA, col = "white", lwd = 0.15) +
  coord_quickmap()
```
- Maps can be customized to improve readability and the aesthetic of the visualizations.
# Practical Example: Pennsylvania County Map
- **Test Yourself:**
- Create a county-level map of Pennsylvania.
- **Answer code:**
```R
pa_counties <- filter(counties,region=="pennsylvania")
ggplot() +
geom_polygon(data=pa_counties, aes(x=long,y=lat,group=group)) +
coord_quickmap()
```
# Enhancing Maps with Data
- Maps become more informative when combined with additional data.
- Source of rich data: the U.S. Census provides county-level information (e.g., median income, poverty percentages).
- Dataset example: `census_poverty.csv`, containing socio-economic data.
- The key to merging census data with map data is the FIPS code:
  - **FIPS (Federal Information Processing Standards) code:** a numeric code that uniquely identifies each county.
- The mapping process requires:
  1. Inspecting the datasets and merging them via FIPS codes.
  2. Using `mutate` to create a `polyname` variable ("region,subregion") so the map data can be matched to the `county.fips` lookup table in maps.
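The inspection step can be sketched as follows. It assumes the `county.fips` lookup table that ships with the maps package; `example_key` is a name introduced here for illustration.

```R
library(ggplot2)  # map_data()
library(maps)     # county boundaries and the county.fips lookup table

data(county.fips)
counties <- map_data("county")

# Map side: state and county names sit in separate columns.
head(counties[, c("region", "subregion")], 3)

# Lookup side: one row per county, keyed by a combined "state,county" string.
head(county.fips, 3)

# The two key formats only line up once region and subregion are pasted together.
example_key <- paste(counties$region[1], counties$subregion[1], sep = ",")
example_key %in% county.fips$polyname
```

Seeing the two formats side by side shows why a combined `polyname` column has to be built on the map side before any join can work.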
# Joining Datasets Example
- **Code showing creation of `polyname` and merging:**
```R
county_with_fips <- counties %>%
  mutate(polyname = paste(region, subregion, sep = ",")) %>%
  left_join(county.fips, by = "polyname")
```
- Verifying the join: `head(county_with_fips)` shows the combined variables:
  - **long, lat, group, order, region, subregion, polyname, fips**.
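Beyond looking at the first few rows, it may be worth checking whether any county failed to pick up a FIPS code. The sketch below repeats the join so that it runs on its own. The name `unmatched` and the `as.character()` guard are additions made here. Depending on the version of the maps package, a handful of `county.fips` keys carry a suffix (for example `:main`) and therefore do not match.

```R
library(ggplot2)
library(maps)
library(dplyr)

data(county.fips)
counties <- map_data("county")

county_with_fips <- counties %>%
  mutate(polyname = paste(region, subregion, sep = ",")) %>%
  # as.character() guards against polyname being stored as a factor
  left_join(mutate(county.fips, polyname = as.character(polyname)),
            by = "polyname")

# Counties whose key found no FIPS code
unmatched <- county_with_fips %>%
  filter(is.na(fips)) %>%
  distinct(polyname)

nrow(unmatched)
unmatched
```

Any counties listed in `unmatched` would be dropped by a later `inner_join()` on `fips`, so they would show up as holes in the final map.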
# Read and Merge Census Data
- Steps to read in and merge Census data:
- **Read data:**
```R
setwd("~/Dropbox/DATA101")
census_income <- read_csv(file="Data/Processed/Census_Poverty.csv")
```
- **Merge datasets:**
```R
county_with_income <- inner_join(county_with_fips,census_income, by=c("fips"="fips_code"))
```
- Result: Combined dataset ready for mapping visualizations.
# Choropleth Mapping
- **Basic choropleth map creation:**
  - Use the `fill` aesthetic to show how the data vary:
```R
ggplot(data = county_with_income) +
  geom_polygon(aes(x = long, y = lat, group = group, fill = pct_pvty)) +  # fill by poverty rate
  coord_quickmap()
```
- Explanation of visualization results:
  - The map shows how poverty is distributed geographically across counties.
- **Choropleth Map Definition:** A map that shows how a variable varies across geographic areas (from the Greek words for "area" and "multitude").
# Additional Mapping Capabilities
- The same syntax produces state-level or county-level maps; only the data and the `fill` variable change.
- Suggested task: use `median_income` for a different map:
```R
ggplot(data = county_with_income) +
  geom_polygon(aes(x = long, y = lat, group = group, fill = median_income)) +
  coord_quickmap()
```
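For a state-level version that runs without the census file, the sketch below substitutes R's built-in `USArrests` data (which has one row per state) as a stand-in. The dataset, the `UrbanPop` fill variable, and the names `arrests`, `states_with_arrests`, and `p_states` are choices made here for illustration only.

```R
library(ggplot2)
library(maps)
library(dplyr)

states <- map_data("state")

# USArrests ships with R; state names are stored as row names.
arrests <- USArrests %>%
  mutate(region = tolower(rownames(USArrests)))

# Keep the map data on the left so the drawing order of points is preserved.
states_with_arrests <- left_join(states, arrests, by = "region")

p_states <- ggplot(data = states_with_arrests) +
  geom_polygon(aes(x = long, y = lat, group = group, fill = UrbanPop),
               color = "white") +
  coord_quickmap()

p_states
```

Regions without a matching row in the stand-in data (such as the District of Columbia) get `NA` for the fill variable and are drawn in the default missing-value color.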
# Aesthetic Adjustments in Maps
- Three common aesthetic choices:
  1. Visibility of boundaries
  2. Color schemes
  3. Representation of value differences
- Adjusting the boundary color by also mapping `color` to the variable:
```R
ggplot(data = county_with_income) +
  geom_polygon(aes(x = long, y = lat, group = group, fill = pct_pvty, color = pct_pvty)) +
  coord_quickmap()
```
- Mapping `color` to the same variable as `fill` draws each county's boundary in the same color as its interior.
# Further Customizations
- Changing the fill gradient color:
```R
ggplot(data = county_with_income) +
  geom_polygon(aes(x = long, y = lat, group = group, fill = pct_pvty)) +
  scale_fill_gradient(low = "lightyellow", high = "darkgreen") +
  coord_quickmap()
```
- The `colours()` command in R lists the available named colors.
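A short sketch of browsing the named colors follows; it uses only base R, and `all_colors` is a name introduced here.

```R
all_colors <- colours()   # colours() and colors() are the same function
length(all_colors)
head(all_colors)

# Look up named colors containing "green"
grep("green", all_colors, value = TRUE)

# Confirm that the names used in the gradient above are valid
c("lightyellow", "darkgreen") %in% all_colors
```

Searching with `grep()` is usually quicker than scanning the full list when looking for a shade of a particular hue.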
# Discrete vs Continuous Variables
- Binning a continuous variable into discrete categories can make geographic patterns easier to read.
- Example: the `medinc_quart` variable divides median income into quartiles to highlight income variations.
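The notes do not show how `medinc_quart` was built. One possible approach, sketched here with base R's `quantile()` and `cut()`, uses the built-in `state.x77` income column as a stand-in for the county data. The names `income`, `breaks`, and `income_quart` and the bin labels are illustrative assumptions.

```R
# Per-capita income by state (1974) from the built-in state.x77 matrix,
# used here only as a stand-in for the county median-income column.
income <- state.x77[, "Income"]

# Quartile cut points: minimum, 25th, 50th, and 75th percentiles, maximum.
breaks <- quantile(income, probs = seq(0, 1, by = 0.25))

# cut() assigns each value to one of four ordered bins.
income_quart <- cut(income, breaks = breaks, include.lowest = TRUE,
                    labels = c("Q1 (lowest)", "Q2", "Q3", "Q4 (highest)"))

table(income_quart)
```

`include.lowest = TRUE` matters here: without it, the observation equal to the minimum falls outside the first bin and becomes `NA`.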
# Color Palette Selection for Visualization
- Choosing an appropriate color scheme matters; Cynthia Brewer's ColorBrewer palettes are a widely used resource.
- Accessing Brewer’s palettes using RColorBrewer:
```R
library(RColorBrewer)
display.brewer.all()
```
- Types of color palettes:
  1. **Sequential Palettes**: For ordered data running from low to high.
  2. **Qualitative Palettes**: For nominal/categorical data.
  3. **Diverging Palettes**: For data where both the low and high extremes should stand out against a middle value.
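The three palette types can also be explored programmatically. The sketch below assumes only the RColorBrewer package; the palette name `"RdYlGn"` and the class count of four are example choices.

```R
library(RColorBrewer)

# brewer.pal.info lists every palette with its type:
# "seq" (sequential), "qual" (qualitative), or "div" (diverging).
table(brewer.pal.info$category)

# Show only the diverging palettes.
display.brewer.all(type = "div")

# Hex codes for a four-class version of the RdYlGn diverging palette.
brewer.pal(4, "RdYlGn")
```

Requesting exactly as many colors as there are categories (four for quartiles) keeps the classes as visually distinct as the palette allows.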
# Using Diverging Color Scheme
- **Diverging example for income map:**
```R
ggplot(data = county_with_income) +
  geom_polygon(aes(x = long, y = lat, group = group, fill = as.factor(medinc_quart))) +
  coord_quickmap() +
  scale_fill_brewer(palette = "RdYlGn", name = "County Median Income",
                    labels = c("Below $42,275", "$42,275 - $48,885", "$48,885 - $56,696", "Above $56,696"))
```
- Visual assessment notes:
  - Distinct colors for each quartile make the geographic dispersion of income clear.
# Summary and Conclusion
- Final observations comparing the diverging-palette maps with the original maps.
- Potential for further analysis using the maps and additional R packages (ggmap, sf, tmap).
# Appendix
- All code used is available for reference.
- Understanding is reinforced through practical application of the concepts discussed.
- Closing thoughts: Happy map-making!