Step 5 – Feature Extraction (The Morphometrics Phase)

Scikit-Image (The Measuring Tape) & Pillow (The Image Manipulator)

Scikit-Image (skimage) is a collection of scientific image-processing algorithms. It focuses on measurement: geometric features (area, perimeter, eccentricity) and texture features (Haralick/GLCM, Local Binary Patterns). Pillow handles basic image I/O and simple transformations before measurement.
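Local Binary Patterns are named above but not demonstrated in the worked blocks of this step, so here is a minimal sketch of how an LBP texture histogram might be computed. The patches are synthetic stand-ins, and the choices of `P = 8` and `R = 1` and the helper name `lbp_histogram` are illustrative assumptions, not recommended settings.

```python
import numpy as np
from skimage.feature import local_binary_pattern

P, R = 8, 1  # 8 sampling points on a circle of radius 1 pixel (illustrative choice)

def lbp_histogram(gray_patch):
    """Return the normalized histogram of uniform LBP codes for a uint8 patch."""
    codes = local_binary_pattern(gray_patch, P, R, method="uniform")
    # The "uniform" method yields codes 0..P+1, so there are P + 2 bins.
    hist, _ = np.histogram(codes, bins=np.arange(P + 3), density=True)
    return hist

rng = np.random.default_rng(0)
# Stand-in patches: a smooth horizontal ramp and random noise.
smooth_patch = np.tile(np.linspace(0, 255, 64), (64, 1)).astype(np.uint8)
rough_patch = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

smooth_hist = lbp_histogram(smooth_patch)
rough_hist = lbp_histogram(rough_patch)
print("Smooth patch LBP histogram:", np.round(smooth_hist, 2))
print("Rough patch LBP histogram: ", np.round(rough_hist, 2))
```

The histogram, not any single code, is the feature: two tissue patches can be compared by comparing their histograms.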

This is your “Digital Ruler and Scale.” A pathologist says “The nuclei are large, irregular, hyperchromatic.” A computer only understands numbers. Scikit-Image translates adjectives into measurements:

  • “Large” → Area = 450 pixels
  • “Irregular” → Circularity = 0.45
  • “Hyperchromatic” → Mean_Intensity = 20 (dark)

In practice it covers three kinds of measurement:

  • Morphometry: Measure the size and shape of every cell.
  • Texture Analysis: Quantify “roughness” vs “smoothness” (for example, stroma vs tumor).
  • Color Quantification: Measure how “blue” (hematoxylin-stained) a nucleus is, a rough proxy for DNA content.
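One way to put a number on “how blue” is color deconvolution, which scikit-image offers as `rgb2hed` (hematoxylin, eosin, DAB). The sketch below is only an illustration: the two RGB colors are hypothetical stand-ins for a nucleus and cytoplasm, not values taken from a real slide.

```python
import numpy as np
from skimage.color import rgb2hed

# Hypothetical RGB colors in [0, 1]: a blue-purple "nucleus" and a pink "cytoplasm".
nucleus_rgb = np.array([0.41, 0.38, 0.67])
cytoplasm_rgb = np.array([0.90, 0.60, 0.75])

# Build a tiny 2 x 2 image: top row nucleus-colored, bottom row cytoplasm-colored.
patch = np.empty((2, 2, 3), dtype=float)
patch[0, :] = nucleus_rgb
patch[1, :] = cytoplasm_rgb

hed = rgb2hed(patch)          # channels: hematoxylin, eosin, DAB
hematoxylin = hed[..., 0]

nucleus_h = float(hematoxylin[0].mean())
cytoplasm_h = float(hematoxylin[1].mean())
print(f"Hematoxylin signal, nucleus-colored pixels:   {nucleus_h:.3f}")
print(f"Hematoxylin signal, cytoplasm-colored pixels: {cytoplasm_h:.3f}")
```

A higher hematoxylin value means more blue stain at that pixel. Real slides vary in staining and scanner color, so compare values within a consistently prepared dataset.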

Situations where it’s used (Medical Examples)

  1. Grading Cancer: For nuclear pleomorphism, measure 1,000 nuclei and compute size variability.
  2. Stromal Analysis: Use texture features (Haralick) to test whether tumor stroma is measurably more chaotic than normal stroma.
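For the pleomorphism case, “size variability” can be summarized with the coefficient of variation (standard deviation divided by mean) of nuclear area. This is a minimal sketch on a synthetic mask with four discs. In practice the mask would come from your segmentation step, and the choice of CV as the variability measure is an assumption made here for illustration.

```python
import numpy as np
from skimage import draw, measure

# Synthetic stand-in for a segmentation mask: four non-touching "nuclei".
mask = np.zeros((200, 200), dtype=np.uint8)
discs = [((40, 40), 10), ((40, 140), 14), ((140, 40), 18), ((140, 140), 25)]
for center, radius in discs:
    rr, cc = draw.disk(center, radius, shape=mask.shape)
    mask[rr, cc] = 1

# Label each blob and collect its area in pixels.
labels = measure.label(mask)
areas = np.array([region.area for region in measure.regionprops(labels)], dtype=float)

# Coefficient of variation: spread of nuclear area relative to its mean.
cv = areas.std(ddof=1) / areas.mean()
print(f"Nuclei measured: {len(areas)}")
print(f"Mean area: {areas.mean():.1f} px, CV: {cv:.2f}")
```

A higher CV means more variable nuclear size. What counts as “high” has to be calibrated against reference tissue.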

This step turns “It looks bad” into “Nuclei are 2.5× larger than normal.” Quantitative pathology needs numbers, not just adjectives.

Run in terminal:

pip install scikit-image pillow numpy matplotlib
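The texture example in this step imports `graycomatrix` and `graycoprops`, a spelling introduced in scikit-image 0.19; earlier releases used `greycomatrix` and `greycoprops`. If you are unsure which version is installed, a fallback import along these lines may help:

```python
import skimage

try:
    # Spelling used from scikit-image 0.19 onward.
    from skimage.feature import graycomatrix, graycoprops
except ImportError:
    # Older releases only provide the "grey" spelling.
    from skimage.feature import greycomatrix as graycomatrix
    from skimage.feature import greycoprops as graycoprops

print(f"scikit-image version: {skimage.__version__}")
```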

Block A: Geometric Features (Measuring Shape)

The Situation: You have a binary mask of a nucleus and want to know if it is “good” (round) or “bad” (irregular).
The Solution: Use measure.regionprops to compute properties and circularity.

import math

import numpy as np
from skimage import measure

# 1. Simulate an irregular nucleus mask (in practice, use your segmentation output)
mask = np.zeros((100, 100), dtype=np.uint8)
mask[30:70, 30:70] = 1  # square (not a circle)

# 2. Label blobs and compute properties
label_img = measure.label(mask)
props = measure.regionprops(label_img)
nucleus = props[0]

# 3. Circularity: (4 * pi * area) / (perimeter^2)
perimeter = nucleus.perimeter
area = nucleus.area
circularity = (4 * math.pi * area) / (perimeter ** 2)

print(f"Nucleus Area: {area} pixels")
print(f"Nucleus Perimeter: {perimeter:.2f} pixels")
print(f"Circularity Score: {circularity:.2f}")

if circularity < 0.8:
    print("Conclusion: Irregular Shape (Possible Atypia)")
else:
    print("Conclusion: Round Shape (Benign)")

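Simulated output (idealized, using the geometric perimeter of a 40 × 40 square, 4 × 40 = 160):

Nucleus Area: 1600 pixels
Nucleus Perimeter: 160.00 pixels
Circularity Score: 0.79
Conclusion: Irregular Shape (Possible Atypia)

Note: scikit-image estimates perimeter from boundary pixels, so the measured value for this mask should come out a little under 160 (about 156). That gives a circularity near 0.83 and the “Round Shape” branch instead. Treat 0.8 as a starting threshold to tune on real data, not a fixed cutoff.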
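To see how the circularity formula separates shapes, it helps to measure a round blob and an elongated blob in the same mask. The sketch below uses synthetic shapes (a disc and a 10 × 80 bar), and the helper name `circularity` is introduced only for this illustration. It also prints `eccentricity`, another shape property that `regionprops` provides.

```python
import math
import numpy as np
from skimage import draw, measure

def circularity(region):
    """4 * pi * area / perimeter^2, using scikit-image's pixel-based perimeter estimate."""
    return 4 * math.pi * region.area / (region.perimeter ** 2)

mask = np.zeros((120, 220), dtype=np.uint8)
rr, cc = draw.disk((60, 60), 30, shape=mask.shape)  # round "nucleus"
mask[rr, cc] = 1
mask[55:65, 120:200] = 1                            # elongated 10 x 80 "nucleus"

labels = measure.label(mask)
regions = measure.regionprops(labels)

# The disc has the larger area, so area is used to tell the two regions apart.
round_region = max(regions, key=lambda r: r.area)
long_region = min(regions, key=lambda r: r.area)

for name, region in [("Round", round_region), ("Elongated", long_region)]:
    print(f"{name}: area={region.area:.0f}, "
          f"circularity={circularity(region):.2f}, "
          f"eccentricity={region.eccentricity:.2f}")
```

The round region should score much closer to 1 on circularity and much closer to 0 on eccentricity than the elongated one. Exact values depend on how the perimeter is estimated on the pixel grid.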

Block B: Texture Features (GLCM)

The Situation: You want to separate “Normal Collagen” (smooth) from “Desmoplasia” (chaotic).
The Solution: Use the gray-level co-occurrence matrix (GLCM) to measure contrast and homogeneity.

from skimage import data
from skimage.feature import graycomatrix, graycoprops

# 1. Load a sample texture (replace with your grayscale tissue patch)
image = data.gravel()

# 2. Calculate GLCM: distance=1 pixel, angle=0 radians (to the right)
glcm = graycomatrix(
    image,
    distances=[1],
    angles=[0],
    levels=256,
    symmetric=True,
    normed=True,
)

# 3. Extract features
contrast = graycoprops(glcm, "contrast")[0, 0]        # roughness
homogeneity = graycoprops(glcm, "homogeneity")[0, 0]  # smoothness

print(f"Texture Contrast: {contrast:.2f}")
print(f"Texture Homogeneity: {homogeneity:.2f}")

if contrast > 50:
    print("Conclusion: High texture variation (Rough/Chaotic)")
else:
    print("Conclusion: Low texture variation (Smooth)")

Simulated output:

Texture Contrast: 125.40
Texture Homogeneity: 0.35
Conclusion: High texture variation (Rough/Chaotic)

Block C: Intensity Features (Hyperchromasia)

The Situation: You need to quantify how dark nuclei are (hyperchromasia).
The Solution: Measure mean intensity; darker nuclei have lower mean values (0 = black, 255 = white).

import numpy as np

# 1. Simulate nuclei (grayscale)
nucleus_roi_dark = np.full((10, 10), 50, dtype=np.uint8)    # dark
nucleus_roi_light = np.full((10, 10), 200, dtype=np.uint8)  # light

# 2. Measure mean intensity
mean_intensity_dark = float(np.mean(nucleus_roi_dark))
mean_intensity_light = float(np.mean(nucleus_roi_light))

print(f"Nucleus A Intensity: {mean_intensity_dark} (Darker)")
print(f"Nucleus B Intensity: {mean_intensity_light} (Lighter)")

# 3. Decision rule (tune threshold to your data)
if mean_intensity_dark < 100:
    print("Nucleus A is Hyperchromatic.")
else:
    print("Nucleus A is Normal.")

Simulated output:

Nucleus A Intensity: 50.0 (Darker)
Nucleus B Intensity: 200.0 (Lighter)
Nucleus A is Hyperchromatic.
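Real nuclei are not rectangular, so the mean is usually taken only over the pixels inside each segmented nucleus. Here is a minimal sketch of that idea, assuming a grayscale image and a binary mask of the same shape. Both are synthetic here, and the cutoff of 100 is the same illustrative threshold as above.

```python
import numpy as np
from skimage import measure

# Synthetic grayscale image: light background (220) with two "nuclei".
image = np.full((60, 60), 220, dtype=np.uint8)
mask = np.zeros_like(image)
image[10:25, 10:25] = 50    # dark nucleus
mask[10:25, 10:25] = 1
image[35:50, 35:50] = 160   # paler nucleus
mask[35:50, 35:50] = 1

labels = measure.label(mask)
mean_by_label = {}
for label_id in range(1, int(labels.max()) + 1):
    # Boolean indexing keeps only the pixels that belong to this nucleus.
    mean_by_label[label_id] = float(image[labels == label_id].mean())

THRESHOLD = 100  # illustrative cutoff; tune on real data
for label_id, mean_val in mean_by_label.items():
    status = "Hyperchromatic" if mean_val < THRESHOLD else "Normal"
    print(f"Nucleus {label_id}: mean intensity = {mean_val:.1f} ({status})")
```

Because background pixels are excluded, the light background does not pull each nucleus's mean upward, which matters most for small or irregular nuclei.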