

Assembling the Training Data

Training data (also called training samples or signature sets) are the foundation of supervised image classification in remote sensing.
To assemble them, the analyst selects representative examples of each land-cover class (for example water, vegetation, urban areas, and soil) from the satellite image.

Preparing training data properly involves several analytical and interactive steps. These steps help ensure that the classes are well separated and that the classifier is trained on spectral information that truly represents each class.

1. Graphical Representation of Spectral Response Patterns

✔ What it means

For each class (e.g., water, forest, built-up), the training pixels have a spectral signature—a pattern of reflectance values across the image's spectral bands.

This pattern is visualized using:

  • Spectral reflectance curves

  • Band-by-band scatter plots

  • Histograms for each band
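As a minimal sketch of how these summaries can be computed, the snippet below derives a mean spectral signature (the values behind a spectral curve) and a per-band histogram for each class. The reflectance values, the band order (green, red, near-infrared), and the function names `spectral_signature` and `band_histogram` are all hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical training pixels: one row per pixel, one column per band
# (assumed order: green, red, near-infrared). Values are made up.
training = {
    "water": np.array([[0.06, 0.04, 0.02],
                       [0.07, 0.05, 0.03],
                       [0.05, 0.04, 0.02]]),
    "forest": np.array([[0.05, 0.04, 0.45],
                        [0.06, 0.05, 0.50],
                        [0.04, 0.03, 0.40]]),
}

def spectral_signature(pixels):
    """Return the per-band mean and sample standard deviation of one class."""
    pixels = np.asarray(pixels, dtype=float)
    return pixels.mean(axis=0), pixels.std(axis=0, ddof=1)

def band_histogram(pixels, band, bins=5, value_range=(0.0, 1.0)):
    """Return histogram counts and bin edges for a single band of one class."""
    values = np.asarray(pixels, dtype=float)[:, band]
    return np.histogram(values, bins=bins, range=value_range)

signatures = {name: spectral_signature(px) for name, px in training.items()}
for name, (mean, std) in signatures.items():
    print(name, "mean:", np.round(mean, 3), "std:", np.round(std, 3))

# Histogram of the near-infrared band (column 2) for the water class
nir_counts, nir_edges = band_histogram(training["water"], band=2)
print("water NIR histogram counts:", nir_counts)
```

Plotting each class mean against band number gives the spectral curve, and plotting two columns of the pixel arrays against each other gives the band-by-band scatter plot. A large standard deviation in any band is a hint that the training pixels of that class are not spectrally consistent.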

✔ Purpose

  • To understand how different classes behave in different bands

  • To check if the selected training pixels are spectrally consistent

  • To identify overlaps between classes (e.g., dark soil and turbid water)

✔ Key terminology

  • Spectral profile / spectral signature

  • Spectral separability

  • Spectral scatterplot

  • Feature space

2. Quantitative Expressions of Category Separation

This step uses mathematical measures to check if classes are well-separated in spectral space.

✔ Why it matters

Classification accuracy depends on how distinct the classes are from one another.
If the training classes overlap heavily in spectral space, pixels that fall in the overlap cannot be assigned reliably and will be misclassified.

✔ Common quantitative measures

  • Transformed Divergence (TD)

  • Jeffries–Matusita Distance (JM)

  • Bhattacharyya Distance (BD)

✔ What these values indicate

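  • JM and TD values close to 2.0 → excellent class separability (in the form used here both measures range from 0 to 2; some software rescales them)

  • Values close to 0.0 → poor separability; classes overlap

  • BD has no upper limit: 0 means the two class distributions coincide, and larger values mean better separation

  • These values help the analyst decide whether to: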

    • combine classes

    • redefine training samples

    • collect more samples

    • split a mixed class
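The measures above can be sketched in a few lines of NumPy. This is a minimal illustration that assumes each class is described by a Gaussian (mean vector and covariance matrix) and uses the common textbook forms JM = 2(1 − e^(−B)) and TD = 2(1 − e^(−D/8)), where B is the Bhattacharyya distance and D the divergence. The synthetic classes and all function names are hypothetical.

```python
import numpy as np

def class_stats(pixels):
    """Mean vector and covariance matrix of one class (rows = pixels)."""
    pixels = np.asarray(pixels, dtype=float)
    return pixels.mean(axis=0), np.cov(pixels, rowvar=False)

def bhattacharyya(mean1, cov1, mean2, cov2):
    """Bhattacharyya distance between two Gaussian class models."""
    diff = mean1 - mean2
    cov_avg = (cov1 + cov2) / 2.0
    # Mean term: (1/8) * d^T * cov_avg^-1 * d
    term_mean = diff @ np.linalg.solve(cov_avg, diff) / 8.0
    # Covariance term: (1/2) * ln(|cov_avg| / sqrt(|cov1| * |cov2|))
    _, logdet_avg = np.linalg.slogdet(cov_avg)
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    term_cov = 0.5 * (logdet_avg - 0.5 * (logdet1 + logdet2))
    return term_mean + term_cov

def jeffries_matusita(b):
    """JM distance on the 0 to 2 scale, from a Bhattacharyya distance."""
    return 2.0 * (1.0 - np.exp(-b))

def transformed_divergence(mean1, cov1, mean2, cov2):
    """Transformed divergence on the 0 to 2 scale."""
    diff = (mean1 - mean2).reshape(-1, 1)
    inv1, inv2 = np.linalg.inv(cov1), np.linalg.inv(cov2)
    divergence = (0.5 * np.trace((cov1 - cov2) @ (inv2 - inv1))
                  + 0.5 * np.trace((inv1 + inv2) @ diff @ diff.T))
    return 2.0 * (1.0 - np.exp(-divergence / 8.0))

# Synthetic two-band training classes (made-up means and spreads)
rng = np.random.default_rng(0)
water = rng.normal([0.05, 0.03], 0.01, size=(200, 2))
forest = rng.normal([0.04, 0.45], 0.03, size=(200, 2))
wet_soil = rng.normal([0.06, 0.05], 0.01, size=(200, 2))

w, f, s = class_stats(water), class_stats(forest), class_stats(wet_soil)
jm_water_forest = jeffries_matusita(bhattacharyya(*w, *f))
jm_water_soil = jeffries_matusita(bhattacharyya(*w, *s))
td_water_forest = transformed_divergence(*w, *f)

print("JM water/forest:", round(float(jm_water_forest), 3))
print("JM water/wet soil:", round(float(jm_water_soil), 3))
print("TD water/forest:", round(float(td_water_forest), 3))
```

In this made-up example, water and forest differ strongly in the second band and score near the upper limit, while water and wet soil sit close together and score much lower. A low-scoring pair like that is the kind the analyst would merge, resample, or redefine.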

✔ Key terminology

  • Separability index

  • Statistical distance

  • Cluster separation

  • Spectral overlap

3. Self-Classification of the Training Data Set

✔ Concept

Before performing classification on the full image, the classifier is run only on the training pixels themselves.

✔ Purpose

  • To check if the algorithm correctly "recognizes" the classes it was trained on.

  • If the classifier mislabels the training samples, the training data need to be corrected.

✔ What it reveals

  • Misclassified pixels → inaccurate training sets

  • Mixed or overlapping classes

  • Inconsistencies in class statistics (band means, variances)

  • Too much variability within a class
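A self-classification check can be sketched as follows. The example assumes a simple minimum-distance-to-means classifier as a stand-in for whatever classifier is actually used. The two-band pixel values, the integer class codes (0 = water, 1 = forest, 2 = soil), and the function names are hypothetical. One soil pixel is deliberately given water-like values to imitate a badly drawn training polygon.

```python
import numpy as np

def fit_class_means(pixels, labels):
    """Return the sorted class codes and the mean vector of each class."""
    classes = np.unique(labels)
    means = np.stack([pixels[labels == c].mean(axis=0) for c in classes])
    return classes, means

def classify_min_distance(pixels, classes, means):
    """Assign each pixel to the class with the nearest mean (Euclidean)."""
    dists = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return classes[np.argmin(dists, axis=1)]

def confusion_matrix(true_labels, predicted, classes):
    """Rows are the training labels, columns are the classifier output."""
    index = {int(c): i for i, c in enumerate(classes)}
    matrix = np.zeros((len(classes), len(classes)), dtype=int)
    for t, p in zip(true_labels, predicted):
        matrix[index[int(t)], index[int(p)]] += 1
    return matrix

# Hypothetical training set: 0 = water, 1 = forest, 2 = soil
pixels = np.array([
    [0.05, 0.02], [0.06, 0.03], [0.07, 0.02],   # water
    [0.04, 0.45], [0.05, 0.50], [0.05, 0.40],   # forest
    [0.20, 0.30], [0.22, 0.32], [0.06, 0.03],   # soil (last pixel looks like water)
])
labels = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])

classes, means = fit_class_means(pixels, labels)
predicted = classify_min_distance(pixels, classes, means)
matrix = confusion_matrix(labels, predicted, classes)
self_accuracy = np.trace(matrix) / matrix.sum()

print(matrix)
print("self-classification accuracy:", round(float(self_accuracy), 3))
```

The off-diagonal entry in the soil row shows that one soil training pixel was labelled as water, which points the analyst back to that training site. Because the classifier is tested on the pixels it was trained on, this accuracy tends to be optimistic. It says something about the training data, not about the accuracy of the final map.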

✔ Key terminology

  • Internal accuracy check

  • Confusion among training classes

  • Spectral homogeneity

4. Interactive Preliminary Classification

✔ What it is

A rough or temporary classification is generated on the image using preliminary training samples.

✔ Purpose

  • To visually inspect how the training data behave when applied to the entire image

  • To refine training sites

  • To identify new sub-classes or remove misidentified ones
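A trial classification of this kind can be sketched by applying preliminary class means to every pixel of an image array. The snippet assumes a minimum-distance-to-means rule, an image stored as a (rows, columns, bands) array, and made-up class means. The function `classify_image` and all values are hypothetical.

```python
import numpy as np

def classify_image(image, class_means):
    """Label every pixel with the index of the nearest class mean."""
    rows, cols, bands = image.shape
    pixels = image.reshape(-1, bands)
    dists = np.linalg.norm(pixels[:, None, :] - class_means[None, :, :], axis=2)
    return np.argmin(dists, axis=1).reshape(rows, cols)

# Hypothetical preliminary signatures (two bands): water, forest, soil
class_names = ["water", "forest", "soil"]
class_means = np.array([[0.06, 0.02],
                        [0.05, 0.45],
                        [0.21, 0.31]])

# Tiny synthetic 4 x 4 image: left half water-like, right half forest-like
image = np.empty((4, 4, 2))
image[:, :2] = [0.05, 0.03]
image[:, 2:] = [0.05, 0.45]

label_map = classify_image(image, class_means)
counts = np.bincount(label_map.ravel(), minlength=len(class_names))

print(label_map)
for name, count in zip(class_names, counts):
    print(f"{name}: {count} pixels")
```

In practice the label map would be displayed over the image for visual inspection. The per-class pixel counts give a quick first check: a class that receives no pixels, or far more than expected, suggests its training sites need revisiting.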

✔ What the analyst checks

  • Are water bodies correctly classified?

  • Are vegetation areas split properly (forest vs cropland)?

  • Are built-up areas being confused with dry soil?

✔ Why "interactive"?

The analyst reviews the output and actively adjusts:

  • training polygons

  • class definitions

  • band combinations

  • class separability (indirectly, as a result of the adjustments above)

✔ Key terminology

  • Pre-classification map

  • Trial classification

  • Interactive refinement

5. Representative Subscene Classification

✔ Concept

Before the whole image is classified, a small but representative subscene is classified first.

A subscene:

  • contains all major land-cover types

  • captures geographic and spectral variability

  • is easier to evaluate and test
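One way to check the first property, that a candidate subscene contains all major land-cover types, is sketched below. It assumes a preliminary label map is already available as a 2-D array of integer class codes. The map, the window positions, and the helper names `extract_subscene` and `missing_classes` are hypothetical.

```python
import numpy as np

def extract_subscene(array, row_start, col_start, height, width):
    """Cut a rectangular window out of an image or label map."""
    return array[row_start:row_start + height, col_start:col_start + width]

def missing_classes(label_window, expected_classes):
    """Return the expected class codes that do not occur in the window."""
    present = set(np.unique(label_window).tolist())
    return sorted(set(expected_classes) - present)

# Hypothetical 6 x 6 preliminary label map: 0 = water, 1 = forest, 2 = soil
label_map = np.zeros((6, 6), dtype=int)
label_map[:, 3:] = 1
label_map[4:, :] = 2

expected = [0, 1, 2]
window_a = extract_subscene(label_map, 0, 0, 3, 6)   # top three rows
window_b = extract_subscene(label_map, 2, 1, 4, 4)   # central block

missing_a = missing_classes(window_a, expected)
missing_b = missing_classes(window_b, expected)

print("window A is missing classes:", missing_a)
print("window B is missing classes:", missing_b)
```

Here window A lacks the soil class and would be a poor test area, while window B covers all three classes. The same slicing can be applied to the image array itself to cut out the pixels that will be classified.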

✔ Purpose

  • To test classifier performance on a manageable area

  • To refine spectral signatures before final classification

  • To avoid wasting processing time on the full image if training data are weak

✔ What it helps detect

  • Class confusion in specific regions

  • Spectral variability across the scene

  • Need for more training samples

  • Problems with similar classes (e.g., shallow water vs wet soil)

✔ Key terminology

  • Subscene

  • Training refinement

  • Pilot classification

  • Signature validation


Assembling training data for supervised image classification involves:

  1. Graphical representation of spectral response patterns – using spectral curves, histograms, and scatter plots to visualize class behavior.

  2. Quantitative expressions of category separation – using statistical measures (JM, TD, BD) to evaluate how distinct classes are.

  3. Self-classification of training data – testing if the classifier correctly labels its own training samples.

  4. Interactive preliminary classification – producing a trial classification to visually refine training sites.

  5. Representative subscene classification – testing the classifier on a smaller, diverse image subset to check accuracy and refine signatures.

