Advanced Raster Analysis and Spatial Statistics in ArcGIS

Spatial Analyst and Raster Data Models Review

Raster Data Model Recap: * Review of the raster data model and various forms it takes. * Discussion of digital elevation models (DEMs) and other elevation representations. * Introduction to the Spatial Analyst extension for ArcGIS, a toolbox specifically designed for analyzing and manipulating raster data.
Hillshades and Surface Modeling: * Hillshades are grayscale 3D representations of a surface. * They account for the sun's relative position to shade the image. * The function uses specific properties to define the sun's location: * Altitude * Azimuth
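How altitude and azimuth drive the shading can be sketched for a single cell. This is a minimal illustration of a widely used illumination formula, not necessarily the exact computation ArcGIS performs. The function name `hillshade` and its default sun position (azimuth 315, altitude 45) are assumptions chosen for the example.

```python
import math

def hillshade(slope_deg, aspect_deg, altitude_deg=45.0, azimuth_deg=315.0):
    """Illumination (0-255) of one cell from its slope and aspect.

    aspect_deg and azimuth_deg are compass bearings (0 = north, clockwise).
    The defaults are assumed values for this sketch.
    """
    zenith = math.radians(90.0 - altitude_deg)  # angle of the sun from vertical
    slope = math.radians(slope_deg)
    # Angle between the sun's bearing and the direction the slope faces.
    relative = math.radians(azimuth_deg - aspect_deg)
    shade = (math.cos(zenith) * math.cos(slope)
             + math.sin(zenith) * math.sin(slope) * math.cos(relative))
    return max(0, round(255 * shade))

flat = hillshade(0, 0)            # level ground
facing_sun = hillshade(45, 315)   # slope tilted toward the sun
facing_away = hillshade(45, 135)  # slope tilted away from the sun
```

A slope facing the sun comes out brightest, level ground is mid-gray, and a slope facing away is dark.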

Distance Functions in ArcGIS

Overview of Distance Analysis: There are two primary methods for performing distance analysis in ArcGIS: straight-line (Euclidean) distance and cost-weighted distance.
Straight Line (Euclidean) Distance: * Calculates the distance from each cell to the closest source. * Source Definition: Identifies objects of interest such as oil wells, roads, or forest stands. * Calculations are performed as "the crow flies" using the projection units of the raster. * Distance is computed from the center of the source cell to the center of each surrounding cell. * If multiple sources exist, the cell value represents the distance to the single nearest source. * Source Format: If the source is a raster, it must contain only source values while all other cells are set to "No Data." If the source is a vector feature class, ArcGIS converts it to a raster internally during the process. * Standalone Applications: Used for emergency flight planning (e.g., finding the nearest hospital). * Suitability Analysis: Used for finding distances to specific features, such as a red-cockaded woodpecker cavity tree.
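The center-to-center calculation can be sketched with a brute-force search over a small grid. This is an illustration only, not the algorithm ArcGIS uses; the function name `euclidean_distance` and the 30-unit cell size are assumptions.

```python
import math

def euclidean_distance(sources, n_rows, n_cols, cell_size=1.0):
    """Distance from each cell center to the nearest source cell center.

    sources is a list of (row, col) positions; the result is in map units.
    """
    out = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            nearest = min(math.hypot(r - sr, c - sc) for sr, sc in sources)
            row.append(nearest * cell_size)
        out.append(row)
    return out

# Two hypothetical sources on a 3 x 4 raster with 30-unit cells.
dist = euclidean_distance([(0, 0), (2, 3)], n_rows=3, n_cols=4, cell_size=30.0)
```

Source cells receive a distance of 0, and every other cell keeps only the distance to its single nearest source.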
The Three Euclidean Tools: * Euclidean Distance: Provides the distance from every cell to the nearest source. * Euclidean Direction: Assigns a value in degrees ( $0$ to $360$ ) to each cell representing the compass direction to the nearest source. * Values follow a compass circle and increase clockwise: East is $90$ , South is $180$ , West is $270$ , and North is $360$ . * The value $0$ is reserved specifically for the source cells themselves. * Example: Direction from a hiker's location to the nearest town for medical evacuation. * Example Angles: Source 1 at $15^{\circ}$ , Source 2 at $135^{\circ}$ , Source 3 at $320^{\circ}$ . * Euclidean Allocation: Creates a raster where every cell receives the value of the source feature it is closest to. * Partitions the surface into zones/areas dedicated to one feature (e.g., store or hospital service areas).
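Direction and allocation can be sketched the same way as distance. The code below is an illustrative brute-force version, assuming row 0 is the northern edge of the raster; the function name and the source IDs are hypothetical.

```python
import math

def direction_and_allocation(sources, n_rows, n_cols):
    """sources maps (row, col) -> source id. Row 0 is assumed to be north."""
    direction, allocation = [], []
    for r in range(n_rows):
        d_row, a_row = [], []
        for c in range(n_cols):
            # Nearest source by straight-line distance between cell centers.
            (sr, sc), sid = min(
                sources.items(),
                key=lambda item: math.hypot(item[0][0] - r, item[0][1] - c))
            if (sr, sc) == (r, c):
                bearing = 0.0          # 0 is reserved for source cells
            else:
                east, north = sc - c, r - sr
                bearing = math.degrees(math.atan2(east, north)) % 360.0
                if bearing == 0.0:
                    bearing = 360.0    # due north is reported as 360
            d_row.append(bearing)
            a_row.append(sid)
        direction.append(d_row)
        allocation.append(a_row)
    return direction, allocation

direction, allocation = direction_and_allocation({(0, 0): 1, (2, 3): 2}, 3, 4)
```

Each cell in `allocation` carries the ID of its nearest source, which is what partitions the surface into service areas.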

Cost Weighted Distance Analysis

Definition: Modifies Euclidean distance by incorporating a cost factor—the effort or economic expense required to travel through any given cell. * Example: It might be shorter to climb over a mountain, but faster (lower cost) to walk around it.
Cost Surfaces: * Represent factors affecting travel like terrain slope, snow depth, or financial cost. * Slope as a Factor: Steep terrain increases road construction costs. These values are often transformed into rank values (e.g., a common scale of $1$ through $9$ ). * Value $1$ : Very low travel cost. * Value $9$ : Highest travel cost. * Ranking note: A value of $9$ is not necessarily nine times more costly than $1$ ; it is simply the most costly value in the index. * An analysis can combine multiple cost surfaces (e.g., slope and snow depth) into a single master cost surface.
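One way to build a master cost surface is a weighted sum of the ranked layers. The notes do not say which combination method is used, so the equal weights and the function name `combine_cost_surfaces` below are assumptions.

```python
def combine_cost_surfaces(surfaces, weights):
    """Weighted sum of equally sized rank rasters (lists of lists)."""
    n_rows, n_cols = len(surfaces[0]), len(surfaces[0][0])
    return [[sum(w * s[r][c] for s, w in zip(surfaces, weights))
             for c in range(n_cols)]
            for r in range(n_rows)]

# Hypothetical 2 x 2 rank surfaces on the 1-9 scale.
slope_rank = [[1, 5], [9, 3]]
snow_rank = [[2, 2], [4, 8]]

master = combine_cost_surfaces([slope_rank, snow_rank], [0.5, 0.5])
# master == [[1.5, 3.5], [6.5, 5.5]]
```

Changing the weights lets one factor, such as slope, count for more than another in the master surface.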
Cost Distance Tool Logic: * Calculates the least cumulative cost distance for each cell to the nearest source. * The tool evaluates neighbors starting from the source, multiplying the average cost between cell pairs by the distance between them. * The process iteratively moves to the cell with the lowest value, evaluating unknown neighbors.
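The accumulation logic described above resembles Dijkstra's shortest-path algorithm on a grid. The sketch below follows that idea and is not the tool's actual implementation; the function name and the small "ridge" cost raster are made up for illustration.

```python
import heapq
import math

NEIGHBORS = [(0, 1), (1, 1), (1, 0), (1, -1),
             (0, -1), (-1, -1), (-1, 0), (-1, 1)]

def cost_distance(cost, sources):
    """Least accumulated cost from every cell to its cheapest-to-reach source."""
    n_rows, n_cols = len(cost), len(cost[0])
    acc = [[math.inf] * n_cols for _ in range(n_rows)]
    heap = []
    for r, c in sources:
        acc[r][c] = 0.0
        heapq.heappush(heap, (0.0, r, c))
    while heap:
        d, r, c = heapq.heappop(heap)   # cell with the lowest value so far
        if d > acc[r][c]:
            continue                    # stale queue entry
        for dr, dc in NEIGHBORS:
            nr, nc = r + dr, c + dc
            if 0 <= nr < n_rows and 0 <= nc < n_cols:
                # Average cost of the two cells times the distance between
                # their centers (1 for orthogonal moves, sqrt(2) for diagonal).
                step = math.hypot(dr, dc) * (cost[r][c] + cost[nr][nc]) / 2.0
                if d + step < acc[nr][nc]:
                    acc[nr][nc] = d + step
                    heapq.heappush(heap, (d + step, nr, nc))
    return acc

# A costly ridge (9) separates the left and right columns.
cost = [[1, 9, 1],
        [1, 9, 1],
        [1, 1, 1]]
acc = cost_distance(cost, [(0, 0)])
```

Crossing the ridge directly from the top-left to the top-right cell would cost 10, while the route around it accumulates about 4.83, so the detour wins.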
Cost Back Link Tool: * Produces a direction raster showing which way to go from any cell to reach the source via the least-cost path. * Output values range from $0$ to $8$ . * $0$ : Reserved for the source cell. * $1$ : Reach the next cell by moving Right. * $2$ : Diagonally to the Lower Right. * $3$ : To the cell Below. * $4$ : Diagonally to the Lower Left. * $5$ : To the cell on the Left. * $6$ : Diagonally to the Upper Left. * $7$ : To the cell Above. * $8$ : Diagonally to the Upper Right. * This provides the sequence of cells for a "roadmap" back to the source.
Cost Path Generation: * Requires both a cost-weighted distance surface and a direction surface. * Evaluates eight neighbors at each cell and moves to the neighbor with the smallest accumulated value until the source and destination are connected.
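A back link raster and the path traced from it can be sketched together. The version below records a code for each cell while accumulating cost, then follows the codes from a destination back to the source. Codes 1 through 8 run clockwise starting from "right" (5 to 8 being left, upper left, up, upper right). The function names and the sample raster are assumptions, and the sketch follows the back links rather than re-evaluating neighbors at each step.

```python
import heapq
import math

# Back link code -> (row step, col step); row 0 is the top of the raster.
MOVES = {1: (0, 1), 2: (1, 1), 3: (1, 0), 4: (1, -1),
         5: (0, -1), 6: (-1, -1), 7: (-1, 0), 8: (-1, 1)}
CODES = {step: code for code, step in MOVES.items()}

def cost_back_link(cost, source):
    """Return (accumulated cost, back link codes) for a single source cell."""
    n_rows, n_cols = len(cost), len(cost[0])
    acc = [[math.inf] * n_cols for _ in range(n_rows)]
    link = [[0] * n_cols for _ in range(n_rows)]   # 0 marks the source
    acc[source[0]][source[1]] = 0.0
    heap = [(0.0, source[0], source[1])]
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > acc[r][c]:
            continue
        for dr, dc in MOVES.values():
            nr, nc = r + dr, c + dc
            if 0 <= nr < n_rows and 0 <= nc < n_cols:
                step = math.hypot(dr, dc) * (cost[r][c] + cost[nr][nc]) / 2.0
                if d + step < acc[nr][nc]:
                    acc[nr][nc] = d + step
                    # The neighbor's link points back toward the current cell.
                    link[nr][nc] = CODES[(-dr, -dc)]
                    heapq.heappush(heap, (d + step, nr, nc))
    return acc, link

def cost_path(link, start):
    """Follow back link codes from start until the source (code 0)."""
    path = [start]
    r, c = start
    while link[r][c] != 0:
        dr, dc = MOVES[link[r][c]]
        r, c = r + dr, c + dc
        path.append((r, c))
    return path

cost = [[1, 9, 1],
        [1, 9, 1],
        [1, 1, 1]]
acc, link = cost_back_link(cost, (0, 0))
path = cost_path(link, (0, 2))
# path == [(0, 2), (1, 2), (2, 1), (1, 0), (0, 0)]
```

The traced route goes down, lower left, upper left, then up, skirting the high-cost column instead of crossing it.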

Density Tools and Surface Spreading

Concept: Spreads the values of input features over a surface to show where points are concentrated.
Mechanism: A circular search area/radius is applied around each sample point. * Search Radius Impact: A larger radius results in a smoother surface and spreads point values over a wider area, leading to a less dense appearing surface.
Population Distribution Example: * Calculating density for town population points shows the predicted spread throughout a landscape rather than humans living at a single point coordinate. * Because density is expressed per unit area, multiplying each output density value by the cell area and summing the results should approximate the total population in the original point layer.
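The effect of the search radius can be sketched with a simple point density calculation: each cell sums the population of points within the radius and divides by the area of the search circle. The function name, the grid layout, and the single hypothetical town are assumptions for illustration.

```python
import math

def point_density(points, n_rows, n_cols, cell_size, radius):
    """points: list of (x, y, population). Returns population per unit area."""
    search_area = math.pi * radius ** 2
    out = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            # Cell centers are assumed at ((c + 0.5), (r + 0.5)) * cell_size.
            cx, cy = (c + 0.5) * cell_size, (r + 0.5) * cell_size
            total = sum(p for x, y, p in points
                        if math.hypot(x - cx, y - cy) <= radius)
            row.append(total / search_area)
        out.append(row)
    return out

town = [(2.5, 2.5, 1000)]   # one hypothetical town of 1000 people
narrow = point_density(town, 5, 5, cell_size=1.0, radius=1.0)
wide = point_density(town, 5, 5, cell_size=1.0, radius=2.0)

narrow_cells = sum(1 for row in narrow for v in row if v > 0)
wide_cells = sum(1 for row in wide for v in row if v > 0)
```

With the larger radius the same 1000 people are spread over 13 cells instead of 5, and the peak density at the town's own cell drops accordingly.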

Spatial Operation Categories

Local Operations: * Simplest map algebra functions performed cell-by-cell. * Only involves the single cell location across participating rasters (e.g., adding two rasters for a sum).
Focal (Neighborhood) Operations: * Computes an output value for a cell based on neighborhood values. * Also known as a "moving window" operation. * Typically results in smoothed values (for example, when the statistic is a mean).
Zonal Operations: * Computes output values based on "zones" (groups of cells sharing a common characteristic, such as a watershed, county, or soil series). * Areas in a zone do not need to be contiguous (can be separate "regions").
Global Operations: * Performs functions using all cells of the input raster to determine the value of each output cell. * Example: Euclidean distance.

Statistics Tools in Spatial Analyst

Cell Statistics (Local): * Calculates per-cell statistics from two or more input rasters. * Available statistics: SUM, MEAN, MAXIMUM, MINIMUM, RANGE, STD (Standard Deviation), VARIANCE, MEDIAN, etc. * The No Data Rule: By default, if any input cell in an expression is "No Data," the result is "No Data." * Example: Summing $100$ rasters where one has "No Data" at a specific cell results in "No Data." * Exception: Users can toggle the "Ignore No Data" option in the dialog box to calculate based on available values.
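The No Data rule can be illustrated with a per-cell sum in which `None` stands in for "No Data". The function name and the `ignore_nodata` flag are hypothetical stand-ins for the dialog option, not ArcGIS identifiers.

```python
def cell_sum(rasters, ignore_nodata=False):
    """Per-cell SUM across rasters; None represents No Data."""
    n_rows, n_cols = len(rasters[0]), len(rasters[0][0])
    out = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            values = [ras[r][c] for ras in rasters]
            valid = [v for v in values if v is not None]
            if not valid or (not ignore_nodata and len(valid) < len(values)):
                row.append(None)       # any missing input poisons the cell
            else:
                row.append(sum(valid))
        out.append(row)
    return out

a = [[1, 2], [3, None]]
b = [[10, None], [30, 40]]

strict = cell_sum([a, b])                       # [[11, None], [33, None]]
lenient = cell_sum([a, b], ignore_nodata=True)  # [[11, 2], [33, 40]]
```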
Neighborhood (Focal) Statistics: * Uses a moving window (default is a $3 \times 3$ rectangle). * Available Shapes: * Rectangle/Square. * Annulus (donut-shaped with inner and outer radii). * Circle. * Wedge (section of a circle). * Irregular (defined by a user-specified file).
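A moving-window mean over the default $3 \times 3$ rectangle can be sketched as follows. Edge cells here simply use whichever window cells fall inside the raster, which is an assumption of this example rather than a statement about how ArcGIS treats edges.

```python
def focal_mean(raster, size=3):
    """Mean of a size x size rectangular window centered on each cell."""
    half = size // 2
    n_rows, n_cols = len(raster), len(raster[0])
    out = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            window = [raster[rr][cc]
                      for rr in range(max(0, r - half), min(n_rows, r + half + 1))
                      for cc in range(max(0, c - half), min(n_cols, c + half + 1))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out

spike = [[0, 0, 0],
         [0, 9, 0],
         [0, 0, 0]]
smoothed = focal_mean(spike)
# smoothed[1][1] == 1.0: the single high value is averaged over nine cells
```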
Zonal Statistics: * Calculates statistics (e.g., minimum, mean) for a "value raster" based on zones defined in a separate dataset. * Tools: * Zonal Statistics: Outputs a raster where each cell in a zone gets the calculated value (e.g., the minimum suitability value for an entire watershed zone). * Zonal Statistics to Table: Outputs statistics to a non-spatial table. * Zonal Histogram: Produces a table and graph of frequency distribution of cell values within each zone.
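Both the table output and the raster output can be sketched in a few lines. The function name `zonal_statistic` and the sample rasters are made up for illustration; zone 2 is deliberately split into two separate regions.

```python
def zonal_statistic(zones, values, stat=min):
    """Return (table, raster) of a statistic of values within each zone."""
    by_zone = {}
    for zone_row, value_row in zip(zones, values):
        for z, v in zip(zone_row, value_row):
            by_zone.setdefault(z, []).append(v)
    table = {z: stat(vs) for z, vs in by_zone.items()}
    # Every cell in a zone receives that zone's statistic.
    raster = [[table[z] for z in zone_row] for zone_row in zones]
    return table, raster

zones = [[1, 1, 2],
         [2, 1, 2]]
suitability = [[5, 3, 8],
               [7, 4, 6]]

table, raster = zonal_statistic(zones, suitability, stat=min)
# table == {1: 3, 2: 6}
# raster == [[3, 3, 6], [6, 3, 6]]
```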

Map Algebra and Raster Calculator

Raster Calculator: * The primary tool in the Map Algebra toolset. * Uses Python syntax in a calculator interface. * Allows the execution of complex Spatial Analyst tools and logical operators in a single expression. * Highly effective for integration into ModelBuilder.

Reclassification Methods

Purpose of Reclassifying: * Assigning Preference: Changing land use types to values representing habitat quality (e.g., Forest = $3$ , Pasture = $2$ , Residential = $1$ , Urban = No Data). * Grouping Values: Merging multiple forest species codes into a single "Evergreen" class. * Normalizing Scales: Setting multiple layers to a common scale (e.g., $1$ to $10$ ) for suitability analysis to ensure "apples to apples" comparisons. * Data Management: Setting specific values to "No Data" or vice versa.
Combining Layers with Reclassification: * Problem: Adding two binary rasters (e.g., Summer Dry/Wet and Winter Dry/Wet where $1 = \text{Dry}$ and $2 = \text{Wet}$ ) can cause data loss. * Because $1+2=3$ and $2+1=3$ , a total of $3$ cannot tell you whether a cell was dry in summer or dry in winter. * Solution: Reclassify one layer to a different magnitude (e.g., Summer values $10$ and $20$ , Winter values $1$ and $2$ ). * Results in unique totals: $11$ (Dry/Dry), $12$ (Dry/Wet), $21$ (Wet/Dry), $22$ (Wet/Wet).
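The ambiguity and its fix can be checked on a small example covering all four season combinations. The variable names and the 2 x 2 layout are illustrative.

```python
# 1 = Dry, 2 = Wet in both input layers.
summer = [[1, 1],
          [2, 2]]
winter = [[1, 2],
          [1, 2]]

# Reclassify summer to a different magnitude before adding.
RECLASS_SUMMER = {1: 10, 2: 20}

naive = [[s + w for s, w in zip(s_row, w_row)]
         for s_row, w_row in zip(summer, winter)]
combined = [[RECLASS_SUMMER[s] + w for s, w in zip(s_row, w_row)]
            for s_row, w_row in zip(summer, winter)]

# naive    == [[2, 3], [3, 4]]      -> Dry/Wet and Wet/Dry both give 3
# combined == [[11, 12], [21, 22]]  -> every combination is unique
```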

Spatial Interpolation

Mechanism: Predicts values for unsampled raster cells based on a limited number of sample points.
Spatial Dependency: Interpolation assumes that closer points are more similar than distant points. * Rainfall Analogy: High confidence that it is raining on the other side of the street if it is raining here; less confidence for the other side of town; low confidence for a different country.
Requirement: Requires a sufficient density of representative points to create an accurate surface; inadequate sampling leads to inaccurate predictive results.
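Spatial dependency can be illustrated with inverse distance weighting (IDW), one common interpolation method. The notes do not name a specific method, so IDW, the `power` parameter, and the sample points are assumptions chosen for this sketch.

```python
import math

def idw(samples, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    samples is a list of (x, y, value); nearer samples receive larger weights.
    """
    numerator = denominator = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value               # exactly on a sample point
        weight = 1.0 / d ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator

# Two hypothetical rain gauges, 10 units apart.
gauges = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0)]

midpoint = idw(gauges, 5.0, 0.0)      # equally influenced by both gauges
near_first = idw(gauges, 1.0, 0.0)    # dominated by the first gauge
```

The estimate one unit from the first gauge stays close to that gauge's value, mirroring the "other side of the street" intuition, while the midpoint splits the difference.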