
Supervised Classification

In the context of Remote Sensing (RS) and Digital Image Processing (DIP), supervised classification is the process where an analyst defines "training sites" (Areas of Interest, AOIs, also called Regions of Interest, ROIs) representing known land cover classes (e.g., Water, Forest, Urban). The computer then uses the pixels in these training sites to train an algorithm, which classifies the rest of the image pixels.

The algorithms used to classify these pixels are generally divided into two broad categories: Parametric and Nonparametric decision rules.


Parametric Decision Rules

These algorithms assume that the pixel values in the training data for each class follow a specific statistical distribution, almost always the Gaussian (Normal) distribution (the "Bell Curve").

  • Key Concept: They model the data using statistical parameters: the Mean vector ($\mu$) and the Covariance matrix ($\Sigma$).

  • Analogy: Imagine trying to fit a smooth hill over the data points of each class. A new point is assigned to the class whose hill is highest at that point's location.
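As a minimal sketch of the two parameters named above, the following estimates a mean vector and covariance matrix from a handful of training pixels. The function name `class_statistics` and the two-band "forest" values are made up for illustration; real training sites would supply far more pixels.

```python
import numpy as np

def class_statistics(training_pixels):
    """Return (mean vector, covariance matrix) for an (n_pixels, n_bands) array."""
    pixels = np.asarray(training_pixels, dtype=float)
    mean_vector = pixels.mean(axis=0)           # one mean per band
    covariance = np.cov(pixels, rowvar=False)   # (n_bands, n_bands), sample covariance
    return mean_vector, covariance

# Hypothetical training pixels for one class: columns are (Red, NIR) brightness.
forest_samples = np.array([[30.0, 80.0],
                           [32.0, 84.0],
                           [34.0, 88.0],
                           [36.0, 92.0]])

mu, sigma = class_statistics(forest_samples)
print("mean vector:", mu)
print("covariance matrix:\n", sigma)
```

The off-diagonal entries of the covariance matrix capture how the two bands vary together, which is the information the simpler classifiers below ignore.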

Nonparametric Decision Rules

These algorithms make no assumptions about the statistical distribution of the data. They do not care if the data fits a bell curve.

  • Key Concept: They classify based on discrete geometric shapes (polygons, boxes) or the relative position of the data points themselves.

  • Analogy: Imagine drawing a literal box or fence around your data points. If a new point falls inside the fence, it belongs to that class.


A. Minimum-Distance-to-Means (MDM)

  • Classification: Generally considered a simple Parametric classifier (as it relies on the mean parameter), though it operates geometrically.

  • How it works:

    1. The algorithm calculates the spectral mean vector (the center point or centroid) for each training class.

    2. For every unclassified pixel in the image, it calculates the Euclidean distance to the mean of every class.

    3. The pixel is assigned to the class with the shortest distance.


  • Pros: Very fast computationally; mathematically simple.

  • Cons: It is insensitive to the variance (spread) of the data.

    • Example: If "Urban" data is very scattered (high variance) and "Water" is very tight (low variance), a pixel far from the Urban center might actually belong to Urban, but MDM might classify it as Water just because the Water mean is slightly closer geometrically.
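The three steps above can be sketched in a few lines of NumPy. This is an illustrative sketch, not a reference implementation: the function name `minimum_distance_classify` and the two-band class means for "Water" and "Urban" are assumptions chosen to mirror the example.

```python
import numpy as np

def minimum_distance_classify(pixels, class_means):
    """Assign each pixel to the class whose mean is nearest in Euclidean distance."""
    pixels = np.asarray(pixels, dtype=float)       # (n_pixels, n_bands)
    means = np.asarray(class_means, dtype=float)   # (n_classes, n_bands)
    diff = pixels[:, None, :] - means[None, :, :]  # (n_pixels, n_classes, n_bands)
    distances = np.sqrt((diff ** 2).sum(axis=2))   # (n_pixels, n_classes)
    return distances.argmin(axis=1), distances

# Hypothetical class means: index 0 = Water, index 1 = Urban.
class_means = [[10.0, 5.0], [60.0, 50.0]]
pixels = [[30.0, 25.0],   # closer to the Water mean, even if Urban is widely scattered
          [58.0, 49.0]]   # close to the Urban mean

mdm_labels, mdm_distances = minimum_distance_classify(pixels, class_means)
print(mdm_labels)
```

Because only the means enter the calculation, the first pixel goes to Water regardless of how scattered the Urban training data are.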

B. Parallelepiped Classification

  • Classification: Nonparametric.

  • How it works:

    1. For each class, the algorithm looks at the training data and finds the minimum and maximum brightness values in each band.

    2. It creates a rectangular box (a parallelepiped in multi-dimensional space) defined by these limits.

    3. If a pixel's value falls within the box, it is assigned to that class.

  • Pros: Extremely fast; easy to understand conceptually.

  • Cons:

    • The Correlation Problem: Real remote sensing data (like vegetation in Red vs. NIR bands) are often correlated between bands, so the data cloud is elongated diagonally. A rectangular box cannot fit a diagonal data cloud efficiently, leading to large "empty corners" in the box that capture noise/wrong pixels.

    • Overlapping and Gaps: Pixels often fall into the overlapping area of two boxes, leaving the computer unable to decide without an extra tie-breaking rule. Pixels that fall outside every box remain unclassified.
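A minimal sketch of the box rule follows. The sentinel labels `UNCLASSIFIED` and `AMBIGUOUS`, the function names, and the training values are all assumptions for illustration; operational software usually applies a tie-breaking rule instead of flagging overlaps.

```python
import numpy as np

UNCLASSIFIED = -1  # pixel falls in no box
AMBIGUOUS = -2     # pixel falls in more than one box

def fit_boxes(training_sets):
    """Per-class, per-band minimum and maximum from a list of (n_i, n_bands) arrays."""
    lows = np.array([np.min(t, axis=0) for t in training_sets])
    highs = np.array([np.max(t, axis=0) for t in training_sets])
    return lows, highs

def parallelepiped_classify(pixels, lows, highs):
    """Label each pixel by the single box containing it, else a sentinel value."""
    pixels = np.asarray(pixels, dtype=float)[:, None, :]             # (n_pixels, 1, n_bands)
    inside = np.all((pixels >= lows) & (pixels <= highs), axis=2)    # (n_pixels, n_classes)
    hits = inside.sum(axis=1)                                        # boxes containing each pixel
    labels = np.full(inside.shape[0], UNCLASSIFIED)
    labels[hits == 1] = inside[hits == 1].argmax(axis=1)
    labels[hits > 1] = AMBIGUOUS
    return labels

# Hypothetical two-band training data for two classes whose boxes overlap.
training_sets = [np.array([[10.0, 5.0], [20.0, 15.0]]),
                 np.array([[15.0, 10.0], [40.0, 30.0]])]
lows, highs = fit_boxes(training_sets)

pixels = [[12.0, 6.0], [30.0, 20.0], [18.0, 12.0], [50.0, 50.0]]
box_labels = parallelepiped_classify(pixels, lows, highs)
print(box_labels)
```

The third pixel sits in the overlap of both boxes and the fourth sits in neither, which are the two failure cases this decision rule cannot resolve on its own.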

C. Gaussian Maximum Likelihood (GML/MLC)

  • Classification: Parametric (The standard industry workhorse).

  • How it works:

    1. It assumes the data for each class is normally distributed.

    2. It uses both the Mean vector AND the Covariance matrix to calculate the probability density function.

    3. It calculates the likelihood of a pixel belonging to each class and assigns the pixel to the class with the highest value.

    4. The resulting surfaces of equal probability are ellipsoidal contours (rather than circles or boxes).

  • Pros: Usually the most accurate of the three when the Gaussian assumption roughly holds, because it accounts for the variance (spread) and covariance (correlation/direction) of the data. It handles "diagonal" data clouds well.

  • Cons: Computationally expensive (slow on massive images); requires a large number of training pixels per class to estimate a stable covariance matrix (a common rule of thumb is $10N$ to $100N$ pixels per class, where $N$ is the number of bands).
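As a hedged sketch of the decision rule, each class can be scored with the log of the Gaussian density, $-\tfrac{1}{2}\left[N\ln(2\pi) + \ln|\Sigma| + (x-\mu)^{T}\Sigma^{-1}(x-\mu)\right]$, and the pixel given to the highest-scoring class. The code assumes equal prior probabilities for all classes; the function names and the class statistics (a tight "Water" class and a scattered "Urban" class) are invented for illustration.

```python
import numpy as np

def gaussian_log_likelihood(pixels, mean, cov):
    """Log of the multivariate normal density for each row of pixels."""
    pixels = np.asarray(pixels, dtype=float)
    diff = pixels - np.asarray(mean, dtype=float)
    inv_cov = np.linalg.inv(cov)
    _, log_det = np.linalg.slogdet(cov)
    # Squared Mahalanobis distance of every pixel to the class mean.
    mahalanobis_sq = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
    n_bands = pixels.shape[1]
    return -0.5 * (n_bands * np.log(2 * np.pi) + log_det + mahalanobis_sq)

def maximum_likelihood_classify(pixels, means, covs):
    """Assign each pixel to the class with the highest log-likelihood (equal priors assumed)."""
    scores = np.stack(
        [gaussian_log_likelihood(pixels, m, c) for m, c in zip(means, covs)], axis=1
    )
    return scores.argmax(axis=1)

# Hypothetical statistics: index 0 = Water (tight), index 1 = Urban (scattered).
means = [np.array([10.0, 5.0]), np.array([60.0, 50.0])]
covs = [np.diag([4.0, 4.0]), np.diag([900.0, 900.0])]

pixels = [[30.0, 25.0],   # nearer the Water mean, but far outside Water's tight spread
          [11.0, 6.0]]    # well inside the Water cluster

ml_labels = maximum_likelihood_classify(pixels, means, covs)
print(ml_labels)
```

With these made-up statistics the first pixel is assigned to Urban even though the Water mean is geometrically closer, because the likelihood takes each class's spread into account.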


| Feature | Parallelepiped | Minimum Distance | Maximum Likelihood |
|---|---|---|---|
| Type | Nonparametric | Parametric (Simple) | Parametric (Advanced) |
| Geometry | Rectangular Boxes | Circles/Spheres | Ellipsoids |
| Assumptions | None (Min/Max thresholds) | Mean Center Point | Gaussian Distribution |
| Speed | Very Fast | Fast | Slow / Intensive |
| Accuracy | Low to Moderate | Moderate | High |
| Best Used For | Quick looks; Uncorrelated data | Well-separated classes | Complex, correlated data |

