Step 2 – Slides & Viewing (The Digital Microscope)

2A - Viewing Whole Slide Images (The Lens)

OpenSlide (The WSI Reader)

OpenSlide is a C library with a Python wrapper that provides a simple interface to read Whole-Slide Images (WSIs). It abstracts proprietary formats (for example Aperio .svs, Hamamatsu .ndpi) and allows random access to pixel data at different resolution levels without loading the entire multi-gigabyte file into RAM.

This is your “Digital Microscope Stage.”
At a real microscope you do not view the entire glass slide at once; you move the stage to specific coordinates and switch objectives (4x, 10x, 40x). OpenSlide lets Python do the same: “Go to (x, y) and show cells at 40x.” It uses the image pyramid (the same slide stored at several zoom levels) to read only the pixels for the requested region, much as Google Earth loads only the map tiles in view.

[Figure: WSI image pyramid structure]

  • Read: Open proprietary scanner formats (.svs, .tif, .ndpi, .mrxs) without vendor software.
  • Navigate: Jump to any coordinate on the slide.
  • Zoom: Extract image data at Level 0 (full resolution, individual cells) or at a lower-resolution level such as Level 3, when the file has one (tissue architecture).
  • Thumbnail: Generate a low-res overview of the whole slide instantly.
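
To make the pyramid idea concrete, here is a minimal sketch that lists each level and picks one for a desired zoom-out factor. It relies only on the level_count, level_dimensions, and level_downsamples attributes that an open OpenSlide object exposes. The helper names summarize_levels and pick_level, and the fake_slide stand-in with made-up dimensions, are illustrative assumptions so the example runs without a slide file.

```python
from types import SimpleNamespace


def summarize_levels(slide):
    """Return one (level, width, height, downsample) tuple per pyramid level."""
    rows = []
    for level in range(slide.level_count):
        width, height = slide.level_dimensions[level]
        rows.append((level, width, height, float(slide.level_downsamples[level])))
    return rows


def pick_level(slide, target_downsample):
    """Return the most zoomed-out level whose downsample does not exceed the target.

    Assumes level_downsamples is sorted ascending (Level 0 first).
    """
    best = 0
    for level, downsample in enumerate(slide.level_downsamples):
        if downsample <= target_downsample:
            best = level
    return best


# Hypothetical stand-in for an open slide (values are made up for illustration)
fake_slide = SimpleNamespace(
    level_count=3,
    level_dimensions=((40000, 30000), (10000, 7500), (2500, 1875)),
    level_downsamples=(1.0, 4.0, 16.0),
)

for level, width, height, downsample in summarize_levels(fake_slide):
    print(f"Level {level}: {width} x {height} px (downsample {downsample:g}x)")

print("Level chosen for ~8x zoom-out:", pick_level(fake_slide, 8))
```

With a real file, pass the object returned by openslide.OpenSlide(...) instead of fake_slide. OpenSlide also provides get_best_level_for_downsample() for the level-selection step.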

Situations where it’s used (Medical Examples)

  1. The “Low Power” Scan: Find tissue on the glass to avoid analyzing blank areas; request a thumbnail to create a tissue mask.
  2. The “High Power” Field: Extract a Level 0 high-res patch from a suspicious region for AI analysis.
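
The tissue-mask step in the “Low Power” scan can be sketched as a brightness threshold on the thumbnail: glass is near-white, tissue is darker. This is a simplified illustration, not a production method. The function names and the threshold of 220 are assumptions, and real pipelines often use Otsu or saturation-based thresholds. Pen marks and dust would also pass this test as “tissue.”

```python
import numpy as np


def tissue_mask(rgb, threshold=220):
    """Return a boolean mask that is True where a pixel is darker than the background.

    rgb: array-like of shape (height, width, 3 or 4) with 0-255 values.
    threshold: assumed brightness cutoff; tune it per scanner and stain.
    """
    rgb = np.asarray(rgb, dtype=np.float32)
    gray = rgb[..., :3].mean(axis=2)  # average the R, G, B channels
    return gray < threshold


def tissue_fraction(mask):
    """Fraction of pixels flagged as tissue (0.0 to 1.0)."""
    return float(mask.mean())


# Synthetic 4x4 "thumbnail": white glass with a 2x2 pink block standing in for tissue
demo = np.full((4, 4, 3), 255, dtype=np.uint8)
demo[1:3, 1:3] = (200, 120, 160)

mask = tissue_mask(demo)
print(f"Tissue covers {tissue_fraction(mask):.0%} of the thumbnail")
```

On a real slide, the input would be np.asarray(slide.get_thumbnail((512, 512)).convert("RGB")).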

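Without a WSI reader such as OpenSlide, standard Python imaging libraries cannot open most proprietary scanner files. General-purpose image editors (for example Photoshop) are not built for them either and typically fail on a 40GB WSI. OpenSlide bridges the medical device and your code.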

OpenSlide needs both the Python package and the system binary.

Windows:

pip install openslide-python
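# Note: Windows also needs the OpenSlide binaries. Recent releases can get them with "pip install openslide-bin".
# Otherwise, download the Windows binaries and register their bin folder with os.add_dll_directory()
# before importing openslide (Python 3.8+ no longer searches PATH for DLLs).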

Mac:

brew install openslide
pip3 install openslide-python

Linux:

sudo apt-get install python3-openslide
pip3 install openslide-python

The Situation: You have a raw .svs file. You need basic stats: dimensions and magnification.
The Solution: Open the file and read metadata headers.

import openslide
from pathlib import Path
# 1. Define the path to your slide
# TODO: Change to your actual file path
slide_path = Path("/path/to/project/data/raw_images/Case_001.svs")
# 2. Open the slide (connection only; no pixels loaded yet)
slide = openslide.OpenSlide(str(slide_path))
# 3. Read dimensions at Level 0 (full resolution)
w, h = slide.dimensions
print(f"Slide Dimensions: {w} x {h} pixels")
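# 4. Read resolution (microns per pixel) and nominal magnification (objective power)
# Either property can be missing on some files; .get() then returns None
mpp = slide.properties.get("openslide.mpp-x")
objective = slide.properties.get("openslide.objective-power")
print(f"Resolution: {mpp} microns per pixel")
print(f"Objective power: {objective}x")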
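
As a quick plausibility check on those numbers, pixel dimensions and microns per pixel can be combined into a physical size in millimetres. The sketch below is plain arithmetic. The helper name physical_size_mm is an assumption, and the example values are made up. OpenSlide returns property values as strings, so the helper converts them first.

```python
def physical_size_mm(width_px, height_px, mpp_x, mpp_y):
    """Convert Level 0 pixel dimensions to millimetres.

    mpp_x / mpp_y may be strings (as stored in slide properties) or None
    when the scanner did not record them; None yields None.
    """
    if mpp_x is None or mpp_y is None:
        return None
    width_mm = width_px * float(mpp_x) / 1000.0   # microns -> millimetres
    height_mm = height_px * float(mpp_y) / 1000.0
    return (width_mm, height_mm)


# Made-up example: 100,000 x 80,000 px at 0.25 microns per pixel
size_mm = physical_size_mm(100000, 80000, "0.25", "0.25")
print(f"Scanned area: {size_mm[0]} mm x {size_mm[1]} mm")
```

A result far larger than a glass slide (roughly 75 mm x 25 mm) suggests the metadata is wrong or missing.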

The Situation: You want a quick check for pen marks or gross blur before processing.
The Solution: Grab a small image from the top of the pyramid (zoomed out).

import matplotlib.pyplot as plt
# 1. Ask for a thumbnail (fits inside 512x512)
thumbnail = slide.get_thumbnail((512, 512))
# 2. Display it
plt.figure(figsize=(5, 5))
plt.imshow(thumbnail)
plt.title("Whole Slide Overview")
plt.axis("off")
plt.show()

Block C: The High-Power View (Virtual Microscopy)

The Situation: Simulate a high-power look at a tumor region (40x if the slide was scanned at 40x).
The Solution: Use read_region with Level 0 coordinates, a pyramid level, and an output size to fetch a “virtual high-power field.”

# 1. Define coordinates (top-left in Level 0 pixels)
start_x = 15000
start_y = 24000
level = 0 # 0 = Max zoom (highest resolution)
size = (512, 512) # Width, height of the view
# 2. Extract the region
region = slide.read_region((start_x, start_y), level, size)
# 3. Convert to RGB (remove transparency)
region_rgb = region.convert("RGB")
# 4. Display
plt.figure(figsize=(6, 6))
plt.imshow(region_rgb)
plt.title(f"High Power View @ ({start_x}, {start_y})")
plt.axis("off")
plt.show()
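
One detail of read_region that often causes confusion: the location is always given in Level 0 pixels, while the size is measured in pixels of the level being read. The sketch below shows the conversion for viewing the same area at a lower-resolution level. Both helper names are assumptions for illustration, and the downsample value would normally come from slide.level_downsamples.

```python
def region_args_for_level(x0, y0, width0, height0, downsample):
    """Translate a box given in Level 0 pixels into (location, size) for read_region.

    location stays in Level 0 coordinates; size shrinks by the level's downsample.
    """
    size = (
        max(1, round(width0 / downsample)),
        max(1, round(height0 / downsample)),
    )
    return (x0, y0), size


def to_level0(x, y, downsample):
    """Map a pixel position seen at a lower-resolution level back to Level 0."""
    return (int(x * downsample), int(y * downsample))


# A 2048 x 2048 px Level 0 area viewed at a level with 4x downsample
location, size = region_args_for_level(15000, 24000, 2048, 2048, 4.0)
print("read_region location:", location, "size:", size)

# A point at (100, 50) on that 4x level corresponds to this Level 0 position
print("Level 0 position:", to_level0(100, 50, 4.0))
```

With an open slide, the call would then be slide.read_region(location, level, size), where level is the index whose downsample was used.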

Matplotlib (The Plotter) & Pillow (The Image Opener)

Matplotlib visualizes data arrays. Pillow (PIL) handles standard image formats (.jpg, .png). Here we use them to visualize the datasets created in Step 1 (the CSV of patches).

This is your “Multi-Head Microscope” or “Lightbox.” When a spreadsheet says “File_01 is Tumor,” you must verify it. This tool pulls random images from your cohort and shows them in a grid so you can spot wrong labels or artifacts.

  • Sanity Check: Confirm the file paths in your CSV actually load.
  • Label Verification: Check if images labeled “Tumor” look like tumor.
  • Batch View: See 16, 25, or 50 images at once instead of opening them individually.
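
The sanity check can be done before any plotting by testing whether each listed path exists on disk. This is a minimal sketch using only the standard library. The function name find_missing_files is an assumption, and the temporary files exist only to make the example runnable. Note that it confirms the file exists, not that it decodes as an image.

```python
import tempfile
from pathlib import Path


def find_missing_files(paths):
    """Return the paths (as strings) that do not point to an existing file."""
    return [str(p) for p in paths if not Path(p).is_file()]


# Demo with one real (empty) file and one path that does not exist
with tempfile.TemporaryDirectory() as tmp:
    real = Path(tmp) / "patch_001.png"
    real.write_bytes(b"")
    ghost = Path(tmp) / "patch_002.png"
    missing = find_missing_files([real, ghost])

print(f"{len(missing)} file(s) missing:", [Path(m).name for m in missing])
```

With a cohort table loaded in pandas, the same call could take the path column directly, for example find_missing_files(df["full_path"]) if your CSV uses that column name.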

Situations where it’s used (Medical Examples)

  • The “Label Flip”: Images labeled “Benign” look highly atypical; you discover labels were flipped.
  • The “Crazy Data”: A “Lung” dataset looks like “Liver”; you catch the error immediately.

Pathology is visual. Never trust a CSV blindly. Looking at images ties the math (spreadsheet) back to morphology (image) and prevents training on noise.

Run in terminal:

pip install matplotlib pillow pandas

The Situation: You have a CSV list of 3,200 images. You want 16 random ones to inspect.
The Solution: Use pandas to sample rows.

import pandas as pd
from pathlib import Path
# 1. Load your cohort (from Step 1)
df = pd.read_csv("metadata/master_cohort.csv")
# 2. Pick 16 random rows
sample_df = df.sample(n=16, random_state=42)
print("Selected 16 random images for review.")
print(sample_df.head())
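
A purely random sample can miss a rare label entirely, which makes label verification harder. As a hedged alternative, the sketch below draws up to a fixed number of rows per label. The function name sample_per_label and the demo table are assumptions, and the column names follow the "full_path" and "diagnosis" columns used elsewhere on this page.

```python
import pandas as pd


def sample_per_label(df, label_col, n_per_label, seed=42):
    """Draw up to n_per_label random rows from each label group."""
    parts = []
    for _, group in df.groupby(label_col):
        n = min(n_per_label, len(group))  # small groups contribute all their rows
        parts.append(group.sample(n=n, random_state=seed))
    return pd.concat(parts)


# Made-up cohort: 8 "Tumor" rows and 2 "Benign" rows
demo_df = pd.DataFrame({
    "full_path": [f"patches/img_{i:03d}.png" for i in range(10)],
    "diagnosis": ["Tumor"] * 8 + ["Benign"] * 2,
})

balanced = sample_per_label(demo_df, "diagnosis", n_per_label=4)
print(balanced["diagnosis"].value_counts())
```

The result can be passed to the same grid-plotting code in place of sample_df.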

The Situation: Display the sampled images with labels in a grid.
The Solution: Open with Pillow and plot with Matplotlib.

from pathlib import Path
from PIL import Image
import matplotlib.pyplot as plt
import math
# Setup grid (4x4)
cols = 4
rows = math.ceil(len(sample_df) / cols)
plt.figure(figsize=(12, 12))
for i, (_, row) in enumerate(sample_df.iterrows()):
    img_path = Path(row["full_path"])  # Ensure this matches your CSV column
    label = row["diagnosis"]
    try:
        img = Image.open(img_path).convert("RGB")
        plt.subplot(rows, cols, i + 1)
        plt.imshow(img)
        plt.title(label)
        plt.axis("off")
    except Exception as e:
        # A failed image leaves its grid slot empty and is reported here
        print(f"Error opening {img_path}: {e}")
plt.tight_layout()
plt.show()