Image Segmentation and Morphological Image Processing Notes

Image Segmentation

Image segmentation is the partitioning of an image into meaningful regions to simplify analysis and interpretation. It is used in computer vision, medical imaging, object detection, and pattern recognition.

  • Segmentation Categories:
    • Discontinuity-based (Edge detection)
    • Similarity-based (Thresholding, Region-based methods)

Detection of Discontinuities (Edge Detection)

Discontinuities indicate rapid intensity changes, corresponding to object boundaries.

  • Types of Discontinuities:
    • Point Discontinuity: An isolated pixel whose intensity differs sharply from its neighbors.
    • Line Discontinuity: A thin line of pixels whose intensity differs from the regions on either side.
    • Edge Discontinuity: An abrupt intensity change at the boundary between two regions.

Point Detection

  • Detected when a pixel's intensity differs significantly from surrounding pixels.
  • Laplacian Operator is used to detect isolated points.
  • Point Detection Formula:
    L(x, y) = f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1) - 4f(x, y)
  • If |L(x, y)| exceeds a threshold, the pixel is flagged as an isolated point.
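The point-detection rule above can be sketched in pure Python (plain lists of rows stand in for an image array; the threshold value here is an arbitrary choice for this toy example):

```python
def laplacian_point_detect(img, threshold):
    """Flag pixels where the Laplacian response |L(x, y)| exceeds a threshold.

    img is a list of rows of grayscale values; border pixels are skipped.
    """
    h, w = len(img), len(img[0])
    points = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # L(x, y) = f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1) - 4 f(x, y)
            lap = (img[y][x + 1] + img[y][x - 1]
                   + img[y + 1][x] + img[y - 1][x] - 4 * img[y][x])
            if abs(lap) > threshold:
                points.append((x, y))
    return points

# A flat image with a single bright isolated pixel at (2, 2):
img = [[10] * 5 for _ in range(5)]
img[2][2] = 200
print(laplacian_point_detect(img, 500))  # → [(2, 2)]
```

With a lower threshold the four immediate neighbors of the bright pixel would also respond, which is why the threshold must be tuned to the image.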

Line Detection

  • Occurs when pixels form a line with different intensity values from the background.

  • Line detection masks are used.

  • Example Line Detection Masks:

    • Horizontal Mask: [-1, -1, -1; 2, 2, 2; -1, -1, -1]
    • Vertical Mask: [-1, 2, -1; -1, 2, -1; -1, 2, -1]
    • Diagonal Mask: [2, -1, -1; -1, 2, -1; -1, -1, 2]
  • Convolve masks with the image; high values indicate lines.
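The mask-convolution step can be sketched as follows (a pure-Python 3x3 correlation with borders skipped; the sample image is illustrative):

```python
# Horizontal line mask from above: responds strongly to 1-pixel-thick rows.
HORIZONTAL = [[-1, -1, -1],
              [ 2,  2,  2],
              [-1, -1, -1]]

def convolve3x3(img, mask):
    """Apply a 3x3 mask to every interior pixel (borders left at 0)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(mask[j][i] * img[y + j - 1][x + i - 1]
                            for j in range(3) for i in range(3))
    return out

# A one-pixel-thick bright horizontal line on row 2:
img = [[0] * 5 for _ in range(5)]
img[2] = [100] * 5
resp = convolve3x3(img, HORIZONTAL)
print(resp[2][2])  # → 600 (strong positive response on the line)
```

Pixels just above and below the line get a negative response, so thresholding the absolute value localizes the line to a single row.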

Edge Detection

Edges represent significant intensity changes, indicating object boundaries.

Gradient-Based Methods
  1. Sobel Operator
    • Uses 3x3 kernels to compute gradients in x and y directions.
    • Gradient magnitude: G = \sqrt{G_x^2 + G_y^2}
    • Direction of edge: \theta = \tan^{-1}(G_y / G_x)
  2. Prewitt Operator
    • Similar to Sobel but with uniform kernel weights; simpler, though slightly more sensitive to noise.
  3. Roberts Operator
    • Uses a 2x2 kernel, computationally efficient but sensitive to noise.
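These first-order operators all follow the same recipe; a minimal pure-Python sketch of the Sobel case at one pixel (the 4x4 test image is illustrative):

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to vertical edges
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to horizontal edges

def sobel(img, x, y):
    """Gradient magnitude and direction at one interior pixel."""
    gx = sum(SOBEL_X[j][i] * img[y + j - 1][x + i - 1]
             for j in range(3) for i in range(3))
    gy = sum(SOBEL_Y[j][i] * img[y + j - 1][x + i - 1]
             for j in range(3) for i in range(3))
    magnitude = math.hypot(gx, gy)        # G = sqrt(Gx^2 + Gy^2)
    direction = math.atan2(gy, gx)        # theta = atan(Gy / Gx)
    return magnitude, direction

# A vertical step edge: left half dark, right half bright.
img = [[0, 0, 100, 100] for _ in range(4)]
mag, theta = sobel(img, 1, 1)
print(mag, theta)  # → 400.0 0.0 (gradient points horizontally, across the edge)
```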
Second-Order Derivative Method
  1. Laplacian Operator
    • Detects edges by computing the second derivative.
    • Mask: H=[0,1,0;1,4,1;0,1,0]H = [0, -1, 0; -1, 4, -1; 0, -1, 0]
    • Enhances edges but is sensitive to noise.
Canny Edge Detector (Most Advanced)

Developed by John F. Canny in 1986, it detects edges with precision while minimizing false detections.

  • Steps:
    1. Noise Reduction (Gaussian Blur)
      • Smooth image using a Gaussian filter to reduce noise effects.
    2. Compute Gradient (Sobel Operator)
      • Apply Sobel operator to calculate gradient magnitude and direction.
      • Identifies areas of rapid intensity change.
    3. Non-Maximum Suppression (NMS)
      • Thins edges by removing non-maximum pixels in the gradient direction.
      • Keeps only local maxima.
    4. Double Thresholding
      • Applies high and low thresholds to classify pixels:
        • Strong edges: Above the high threshold.
        • Weak edges: Between the low and high thresholds.
        • Non-edges: Below the low threshold, discarded.
    5. Edge Tracking by Hysteresis
      • Weak edges are checked for connectivity to strong edges.
      • Preserves connected weak edges; removes others.
  • Advantages:
    • Low error rate: Detects true edges, avoids false edges.
    • Well-defined edges: Produces thin edges with good localization.
    • Robust to noise: Uses Gaussian filtering.
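Steps 4 and 5 above (double thresholding plus edge tracking) can be sketched as follows; this assumes the gradient magnitudes have already been non-maximum-suppressed, and the threshold values are arbitrary:

```python
from collections import deque

def hysteresis(mag, low, high):
    """Double thresholding + hysteresis: weak pixels (>= low) survive
    only if they are 8-connected, possibly via other weak pixels,
    to a strong pixel (>= high)."""
    h, w = len(mag), len(mag[0])
    edges = [[False] * w for _ in range(h)]
    queue = deque((y, x) for y in range(h) for x in range(w)
                  if mag[y][x] >= high)          # strong edges seed the search
    for y, x in queue:
        edges[y][x] = True
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w and not edges[ny][nx]
                        and mag[ny][nx] >= low):  # weak edge touching an edge pixel
                    edges[ny][nx] = True
                    queue.append((ny, nx))
    return edges

mag = [[0, 40, 0, 0],
       [0, 90, 0, 40],
       [0, 40, 0, 0]]
edges = hysteresis(mag, low=30, high=80)
# The weak 40s next to the strong 90 are kept; the isolated 40 is discarded.
```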

Edge Detection Operators

  • Roberts Operator
    • Uses a 2x2 gradient filter.
    • Detects diagonal edges.
  • Sobel Operator
    • Uses 3x3 kernels to detect horizontal and vertical edges.
    • Its weighted center coefficients provide mild smoothing, giving stronger edge responses with less noise sensitivity than Prewitt.
  • Prewitt Operator
    • Similar to Sobel but with uniform weights; less computationally expensive, slightly more noise-sensitive.
  • Canny Edge Detector (Most Popular)
    • Steps:
      • Gaussian smoothing to reduce noise.
      • Gradient computation using Sobel filter.
      • Non-maximum suppression to refine edges.
      • Hysteresis thresholding for strong and weak edge linking.

Edge Linking and Boundary Detection

Connect detected edges to form meaningful boundaries.

Edge Linking

Ensures detected edges form continuous contours.

  • Techniques:
    • Hysteresis Thresholding (Used in Canny)
      • Uses high and low thresholds.
      • Keeps strong edges (above high threshold).
      • Keeps weak edges (between thresholds) if connected to strong edges.
      • Discards edges below the low threshold.
    • Connectivity Analysis
      • Uses 4-connectivity or 8-connectivity.
        • 4-connectivity: Checks left, right, top, and bottom neighbors.
        • 8-connectivity: Includes diagonal neighbors.
    • Graph-Based Methods
      • Treats edges as nodes in a graph.
      • Uses algorithms like Dijkstra’s shortest path or minimum spanning trees.
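The connectivity analysis above can be sketched as a breadth-first component count over a binary edge map (pure Python; the diagonal test pattern shows how the two connectivity rules differ):

```python
from collections import deque

def count_components(binary, connect8=True):
    """Count connected foreground components via breadth-first search.

    binary is a 2D list of 0/1 values.
    """
    h, w = len(binary), len(binary[0])
    if connect8:  # 8-connectivity: includes diagonal neighbors
        neighbors = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                     (0, 1), (1, -1), (1, 0), (1, 1)]
    else:         # 4-connectivity: left, right, top, bottom only
        neighbors = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if binary[y][x] and not seen[y][x]:
                count += 1
                queue = deque([(y, x)])
                seen[y][x] = True
                while queue:
                    cy, cx = queue.popleft()
                    for dy, dx in neighbors:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count

img = [[1, 0, 0],
       [0, 1, 0],
       [0, 0, 1]]
print(count_components(img, True), count_components(img, False))  # → 1 3
```

The diagonal chain is one edge under 8-connectivity but three separate fragments under 4-connectivity, which is why edge linking usually uses 8-connectivity.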
Boundary Detection

Extends edge detection to form closed contours for identifying object shapes.

  • Techniques:
    • Contour Tracing (Moore-Neighbor Algorithm)
      • Follows detected edges to extract a continuous boundary.
      • Used in object segmentation.
    • Active Contours (Snakes Algorithm)
      • Uses an energy-minimizing curve that evolves toward object boundaries.
      • Adjusts based on image gradient and smoothness constraints.
    • Hough Transform (For Detecting Lines & Shapes)
      • Maps edge points into a parameter space to detect shapes (e.g., lines, circles).
      • Effective for detecting straight boundaries like roads or buildings.
Filtering with Local Masks (Convolution Filters)
  • A small kernel (e.g., 3x3 or 5x5) is convolved with the image.
  • Common Filters:
    • Smoothing (Low-pass filter) → Reduces noise.
    • Sharpening (High-pass filter) → Enhances edges.
    • Edge detection (Gradient filters) → Detects boundaries.

Thresholding

A simple and effective segmentation method based on pixel intensity levels.

  • Types of Thresholding
    1. Global Thresholding
      • A single threshold value (T) is applied to the entire image:
        g(x, y) = \begin{cases} 1, & f(x, y) \geq T \\ 0, & f(x, y) < T \end{cases}
      • Example: Otsu’s method (automatic threshold selection).
    2. Adaptive Thresholding
      • Different threshold values are applied to different regions.
      • Useful for images with varying illumination.
    3. Multi-Level Thresholding
      • Segments an image into multiple intensity ranges.
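Global thresholding with Otsu's automatic threshold selection can be sketched in pure Python: the method picks the T that maximizes the between-class variance of background (f < T) and foreground (f ≥ T). The toy pixel values are illustrative:

```python
def otsu_threshold(pixels, levels=256):
    """Return the T maximizing between-class variance (Otsu's method).

    pixels is a flat list of integer intensities in [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(i * hist[i] for i in range(levels))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t in range(1, levels):
        w0 += hist[t - 1]               # background: pixels with f < t
        sum0 += (t - 1) * hist[t - 1]
        if w0 == 0 or w0 == total:
            continue
        mean0 = sum0 / w0
        mean1 = (total_sum - sum0) / (total - w0)
        var_between = w0 * (total - w0) * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Two well-separated intensity clusters:
pixels = [10, 12, 11, 10, 200, 198, 201, 199]
t = otsu_threshold(pixels)
binary = [1 if p >= t else 0 for p in pixels]  # g = 1 where f >= T
print(binary)  # → [0, 0, 0, 0, 1, 1, 1, 1]
```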

Region-Oriented Segmentation

Region-based methods group similar pixels into regions instead of detecting edges.

Region Growing

  • A pixel (seed) is selected, and neighboring pixels with similar properties are added to the region.
  • Criteria:
    • Intensity similarity
    • Texture or color similarity

Region Splitting and Merging

  • The image is divided into quadrants recursively until a homogeneity criterion is met.
  • Adjacent similar regions are then merged.

Watershed Algorithm

  • Views the intensity of an image as a topographic surface.
  • Identifies ridges (boundaries) and basins (regions).
  • Used for medical image segmentation.

Techniques for Region-Oriented Segmentation

Region Growing
  • Starts from one or more seed points (initial pixels) and expands outward by including similar neighboring pixels.
  • The growth continues until no more pixels satisfy the similarity condition (e.g., intensity or color threshold).
  • Steps:
    1. Choose one or multiple seed pixels.
    2. Compare neighboring pixels to the region based on similarity (e.g., color, brightness).
    3. Add similar pixels to the region.
    4. Repeat the process until no more pixels can be added.
  • Advantages & Disadvantages:
    • Works well for objects with uniform color or texture.
    • Simple and effective for well-defined regions.
    • Sensitive to noise, which may cause overgrowth.
    • Seed selection is crucial—bad seeds may lead to poor segmentation.
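The four steps above amount to a breadth-first flood fill; a minimal pure-Python sketch, using intensity distance to the seed as a hypothetical similarity criterion:

```python
from collections import deque

def region_grow(img, seed, tol):
    """Grow a region from a (row, col) seed, adding 4-neighbors whose
    intensity is within tol of the seed's intensity."""
    h, w = len(img), len(img[0])
    sy, sx = seed
    base = img[sy][sx]
    region = {(sy, sx)}
    queue = deque([(sy, sx)])
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if (0 <= ny < h and 0 <= nx < w and (ny, nx) not in region
                    and abs(img[ny][nx] - base) <= tol):
                region.add((ny, nx))
                queue.append((ny, nx))
    return region

img = [[10, 12, 90],
       [11, 13, 95],
       [10, 11, 92]]
region = region_grow(img, (0, 0), tol=5)
print(len(region))  # → 6 (the dark left block; the bright column is excluded)
```

Comparing against the seed's intensity (rather than each neighbor's) is one design choice; comparing against the running region mean is a common alternative that tolerates gradual drift.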
Region Splitting and Merging
  • Starts with the entire image and divides it into smaller parts, then merges similar regions.
  • Steps:
    1. Start with the whole image as one region.
    2. Split the image into smaller regions if they are not homogeneous (contain different intensity values).
    3. Merge adjacent regions if they have similar properties.
    4. Repeat until no more splitting or merging is needed.
  • Advantages & Disadvantages:
    1. Handles complex images better than region growing.
    2. Reduces over-segmentation problems.
    3. More computationally expensive than region growing.
    4. Deciding when to stop splitting/merging can be difficult.
Comparison of Region-Based Methods
| Method | Description | Best Use Case | Limitations |
| --- | --- | --- | --- |
| Region Growing | Expands from a seed pixel to include similar neighbors | Objects with uniform colors or textures | Sensitive to noise, requires a good seed |
| Region Splitting & Merging | Recursively divides & merges regions based on similarity | Complex images with varying textures | Computationally expensive |

Where Are These Methods Used?

  • Medical Imaging (segmenting organs, tumors, etc.)
  • Remote Sensing (analyzing satellite images)
  • Object Recognition (identifying objects in images)
  • Face Detection (separating facial features)

Morphological Image Processing

A set of operations that process images based on shapes and structures, used for noise removal, shape extraction, and object segmentation, especially in binary images.

Basic Morphological Operations

Dilation (Expanding Objects)

Adds pixels to the boundaries of objects, making them larger, used for bridging small gaps, connecting broken objects, or highlighting features.

  • Effect: Thickens objects in the image.
Erosion (Shrinking Objects)

Removes pixels from object boundaries, making them smaller, used for removing noise, separating objects, and removing thin structures.

  • Effect: Thins objects in the image.
Opening (Erosion → Dilation)

Removes small objects (noise) while keeping larger structures intact. First, erodes the image to remove noise, then dilates it to restore main structures.

  • Effect: Removes small noise but keeps the original shape.
Closing (Dilation → Erosion)

Fills small holes and gaps inside objects. First, dilates the image to close gaps, then erodes to restore the shape.

  • Effect: Closes small holes in objects.
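The four operations can be sketched for binary images in pure Python. Dilation here ignores the formal reflection of the structuring element, which makes no difference for the symmetric SEs typically used:

```python
def erode(img, se):
    """Binary erosion: a pixel stays 1 only if every 1 in the
    structuring element lands on a 1 in the image."""
    h, w = len(img), len(img[0])
    sh, sw = len(se), len(se[0])
    cy, cx = sh // 2, sw // 2          # SE origin at its center
    return [[int(all(0 <= y + j - cy < h and 0 <= x + i - cx < w
                     and img[y + j - cy][x + i - cx]
                     for j in range(sh) for i in range(sw) if se[j][i]))
             for x in range(w)] for y in range(h)]

def dilate(img, se):
    """Binary dilation: a pixel becomes 1 if any 1 in the structuring
    element overlaps a 1 in the image."""
    h, w = len(img), len(img[0])
    sh, sw = len(se), len(se[0])
    cy, cx = sh // 2, sw // 2
    return [[int(any(0 <= y + j - cy < h and 0 <= x + i - cx < w
                     and img[y + j - cy][x + i - cx]
                     for j in range(sh) for i in range(sw) if se[j][i]))
             for x in range(w)] for y in range(h)]

def opening(img, se):   # erosion followed by dilation
    return dilate(erode(img, se), se)

def closing(img, se):   # dilation followed by erosion
    return erode(dilate(img, se), se)

se = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]  # 3x3 square SE
img = [[0, 0, 0, 0, 1],                 # isolated noise pixel at top right
       [0, 1, 1, 1, 0],
       [0, 1, 1, 1, 0],                 # 3x3 solid object
       [0, 1, 1, 1, 0],
       [0, 0, 0, 0, 0]]
o = opening(img, se)
# The noise pixel is gone; the 3x3 block survives intact.
```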

Advanced Morphological Operations

Hit-or-Miss Transformation
  • Detects specific shapes or patterns in an image.
  • Used in template matching for detecting objects of certain shapes.
Morphological Gradient
  • Finds the difference between dilation and erosion.
  • Highlights the edges of objects.
  • Used in edge detection and object contour extraction.
Top-Hat and Black-Hat Transformations
  • Top-Hat: Extracts bright regions from the background.
  • Black-Hat: Extracts dark regions from the background.
  • Used in illumination correction in medical and document images.

Structuring Elements (SE)

Morphological operations use a structuring element to process the image. Different structuring elements affect how shapes in the image are modified.

  • Types of Structuring Elements:
    • Square: Good for general processing.
    • Disk (Circle): Preserves round structures.
    • Cross: Detects thin structures like cracks or lines.

Comparison of Morphological Operations

| Operation | Purpose | Effect |
| --- | --- | --- |
| Dilation | Expands objects | Thickens objects |
| Erosion | Shrinks objects | Thins objects |
| Opening | Removes small noise | Preserves main shape |
| Closing | Fills small holes | Keeps objects intact |

Where is Morphological Processing Used?

  1. Medical Imaging – Extracting cells, removing noise from X-rays.
  2. Character Recognition (OCR) – Enhancing text for better readability.
  3. Biometrics (Fingerprint Recognition) – Refining fingerprint structures.
  4. Industrial Inspection – Detecting defects in manufactured products.
  5. Remote Sensing – Extracting objects from satellite images.

Dilation and Erosion

Dilation and erosion are the two fundamental morphological operations used in binary and grayscale image processing.

Dilation (Expanding Objects)

  • Adds pixels to the boundaries of objects, making them thicker and larger.
  • Used to fill small gaps, connect broken parts, and highlight important features.
  • How It Works:
    • A structuring element (SE) is placed at each pixel in the image.
    • If at least one pixel inside the SE overlaps with a white (foreground) pixel, the central pixel becomes white.
    • This results in the object expanding outward.
  • Effects of Dilation
    • Fills small holes inside objects.
    • Connects broken parts of an object.
    • Enhances boldness of text (useful in OCR).
    • Highlights bright regions in an image.
  • Example Applications
    • Text Recognition (OCR): Enhances letters by thickening them.
    • Medical Imaging: Enhances blood vessels or tumors.
    • Object Detection: Strengthens object features for better segmentation.

Erosion (Shrinking Objects)

  • Removes pixels from the boundaries of objects, making them thinner and smaller.
  • Used to remove noise, detach connected objects, and refine shapes.
  • How It Works
    • A structuring element is placed at each pixel.
    • If all pixels under the SE match the foreground, the central pixel stays white; otherwise, it turns black.
    • This results in shrinking objects and removing small noise.
  • Effects of Erosion
    • Reduces object size.
    • Removes small noise (isolated white pixels).
    • Separates connected objects.
    • Helps extract thin features (like fingerprint ridges).
  • Example Applications
    • Noise Removal: Eliminates small white specks in a binary image.
    • Fingerprint Recognition: Extracts thin ridge structures.
    • Medical Imaging: Removes small, irrelevant structures in scans.

Comparison of Dilation vs. Erosion

| Operation | Effect on Objects | Use Case |
| --- | --- | --- |
| Dilation | Enlarges objects, fills gaps | Text enhancement, object strengthening |
| Erosion | Shrinks objects, removes small noise | Noise removal, feature extraction |

When to Use Dilation or Erosion?

  • Use Dilation when you want to connect parts, highlight features, or thicken objects.
  • Use Erosion when you want to remove noise, separate objects, or thin structures.

Structuring Element Decomposition in Morphological Processing

Breaks a complex SE into simpler SEs to make morphological operations faster and more efficient.

What is Structuring Element Decomposition?

Instead of using a large or complex structuring element directly, we decompose it into smaller, simpler elements (e.g., lines, crosses).

  • Key Idea: Performing multiple morphological operations with smaller, simpler SEs is faster than using a large SE all at once.

Types of Structuring Element Decomposition

Rectangular Decomposition
  • A large rectangular SE (e.g., 5x5) can be decomposed into two 1D structuring elements (a row and a column).
  • Example:
    • Instead of using a 5x5 SE, we can decompose it into:
      1. A horizontal line (1x5)
      2. A vertical line (5x1)
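This equivalence can be checked directly: dilating with a 1x5 row and then a 5x1 column gives the same result as one pass with the full 5x5 square, while probing 5 + 5 instead of 25 SE positions per pixel. A pure-Python sketch for binary images (symmetric SEs assumed):

```python
def dilate(img, se):
    """Binary dilation with the SE origin at its center."""
    h, w = len(img), len(img[0])
    sh, sw = len(se), len(se[0])
    cy, cx = sh // 2, sw // 2
    return [[int(any(0 <= y + j - cy < h and 0 <= x + i - cx < w
                     and img[y + j - cy][x + i - cx]
                     for j in range(sh) for i in range(sw) if se[j][i]))
             for x in range(w)] for y in range(h)]

square5 = [[1] * 5 for _ in range(5)]       # full 5x5 SE
row5 = [[1, 1, 1, 1, 1]]                    # 1x5 horizontal line
col5 = [[1], [1], [1], [1], [1]]            # 5x1 vertical line

img = [[0] * 9 for _ in range(9)]
img[4][4] = 1                               # single foreground pixel

direct = dilate(img, square5)
decomposed = dilate(dilate(img, row5), col5)
print(direct == decomposed)  # → True (both produce the same 5x5 block)
```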
Disk (Circular) Decomposition
  • Instead of using a full disk, we approximate it using a series of smaller crosses or lines.
  • Reduces computation time while maintaining the circular shape.
Hit-or-Miss Decomposition

Complex patterns are detected by breaking them into smaller, simpler patterns using multiple structuring elements.

Benefits of Structuring Element Decomposition

  • Speeds up morphological operations
  • Reduces computational cost
  • Preserves accuracy while improving efficiency
  • Used in:
    • Text processing (OCR)
    • Medical imaging
    • Object detection in satellite images

Hit-or-Miss Transformation in Morphological Processing

Used to detect specific shapes or patterns in a binary image.

What is the Hit-or-Miss Transformation?

Checks whether a given pattern exactly matches a region in the image.

  • Uses two structuring elements (SEs):
    • Hit SE → Matches the object shape.
    • Miss SE → Matches the background around the object.
  • If both SEs fit perfectly, the transformation marks that location as a match.

How Does It Work?

  1. Eroding the image with the "Hit" SE (foreground match).
  2. Eroding the complement (inverse) of the image with the "Miss" SE (background match).
  3. Finding the intersection of the results.
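The three steps above can be sketched in pure Python. As an illustrative pattern, the SE pair below detects isolated foreground pixels: the hit SE matches a lone 1, the miss SE requires all eight neighbors to be background:

```python
def erode(img, se):
    """Binary erosion: every 1 in the SE must land on a 1 in the image."""
    h, w = len(img), len(img[0])
    sh, sw = len(se), len(se[0])
    cy, cx = sh // 2, sw // 2
    return [[int(all(0 <= y + j - cy < h and 0 <= x + i - cx < w
                     and img[y + j - cy][x + i - cx]
                     for j in range(sh) for i in range(sw) if se[j][i]))
             for x in range(w)] for y in range(h)]

def hit_or_miss(img, hit_se, miss_se):
    """Erode the image with the hit SE, erode its complement with the
    miss SE, and intersect the two results."""
    comp = [[1 - p for p in row] for row in img]
    a = erode(img, hit_se)
    b = erode(comp, miss_se)
    return [[x & y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

hit = [[0, 0, 0],
       [0, 1, 0],
       [0, 0, 0]]      # the pattern itself: a single foreground pixel
miss = [[1, 1, 1],
        [1, 0, 1],
        [1, 1, 1]]      # required background ring around it

img = [[0, 0, 0, 0, 0],
       [0, 1, 0, 0, 0],   # isolated pixel → match
       [0, 0, 0, 1, 1],   # two-pixel blob → no match
       [0, 0, 0, 0, 0]]
result = hit_or_miss(img, hit, miss)
# Only the isolated pixel at row 1, column 1 is marked.
```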

Applications of the Hit-or-Miss Transformation

  • Shape Detection – Finds specific patterns in an image (e.g., circles, corners).
  • Feature Extraction – Identifies junctions, corners, and edges in objects.
  • Character Recognition (OCR) – Detects specific letter shapes.
  • Medical Imaging – Identifies specific cell structures.

Example of Hit-or-Miss

  • Imagine detecting an L-shaped pattern in an image:
    • Hit SE → Matches the L shape.
    • Miss SE → Matches the surrounding empty space.
    • If both match, the transformation marks the location as a hit.

Comparison with Other Morphological Operations

| Operation | Purpose | Effect |
| --- | --- | --- |
| Dilation | Expands objects | Thickens shapes |
| Erosion | Shrinks objects | Thins shapes |
| Hit-or-Miss | Finds exact shapes | Detects specific patterns |