Step 2 – Slides & Viewing (The Digital Microscope)
2 - Slides & Viewing (The Digital Microscope)
2A - Viewing Whole Slide Images (The Lens)
Name of Tool
OpenSlide (The WSI Reader)
Technical Explanation
OpenSlide is a C library with a Python wrapper that provides a simple interface to read Whole-Slide Images (WSIs). It abstracts proprietary formats (for example Aperio .svs, Hamamatsu .ndpi) and allows random access to pixel data at different resolution levels without loading the entire multi-gigabyte file into RAM.
Simplified Explanation
This is your “Digital Microscope Stage.”
At a real microscope you do not view the entire glass slide at once; you move the stage to specific coordinates and switch objectives (4x, 10x, 40x). OpenSlide lets Python do the same: “Go to (x, y) and show cells at 40x.” It uses the image pyramid (stacked zoom levels) so that only the requested region is read from disk, much like Google Earth loads only the area in view.
Image of WSI image pyramid structure
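The pyramid idea can be sketched with plain arithmetic. This is a minimal illustration, not OpenSlide's own code: the function names `pyramid_levels` and `best_level_for_downsample` and the example slide size are assumptions made for this sketch. On a real slide the equivalent values come from `slide.level_dimensions`, `slide.level_downsamples`, and `slide.get_best_level_for_downsample()`.

```python
def pyramid_levels(level0_size, downsamples):
    """Approximate (width, height) of each pyramid level.

    level0_size: (width, height) of the full-resolution level.
    downsamples: one downsample factor per level, in ascending order.
    """
    width, height = level0_size
    return [(int(width // d), int(height // d)) for d in downsamples]


def best_level_for_downsample(downsamples, target):
    """Return the most zoomed-out level whose downsample is <= target.

    Falls back to level 0 when the target is smaller than every factor.
    """
    best = 0
    for level, factor in enumerate(downsamples):
        if factor <= target:
            best = level
    return best


# Hypothetical slide: 80,000 x 60,000 pixels with four pyramid levels.
example_downsamples = (1.0, 4.0, 16.0, 64.0)
for level, size in enumerate(pyramid_levels((80000, 60000), example_downsamples)):
    print(f"Level {level}: {size[0]} x {size[1]} pixels")
```

Each step up the pyramid trades detail for a much smaller image, which is why an overview can be drawn without touching the full-resolution data.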
What can it do?
- Read: Open proprietary scanner formats (.svs, .tif, .ndpi, .mrxs) without vendor software.
- Navigate: Jump to any coordinate on the slide.
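- Zoom: Extract image data at Level 0 (full resolution, individual cells) or at a lower-resolution level such as Level 3 (tissue architecture), when the slide provides one.
- Thumbnail: Generate a low-res overview of the whole slide quickly, using the lower-resolution pyramid levels.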
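One detail trips up many first attempts at navigating and zooming: `read_region` expects the top-left location in Level 0 coordinates, while the size is given in pixels of the level being read. The sketch below converts a point picked on a zoomed-out level into the arguments for `read_region`. The helper names `level_to_level0` and `region_request` are hypothetical, and the downsample factors in the example are made up.

```python
def level_to_level0(x, y, downsample):
    """Convert a point in level-N pixel coordinates to Level 0 coordinates."""
    return int(round(x * downsample)), int(round(y * downsample))


def region_request(x, y, width, height, level, downsamples):
    """Build (location, level, size) for a region picked on `level`.

    x, y          : top-left corner, measured in pixels of `level`.
    width, height : size of the region, also in pixels of `level`.
    downsamples   : per-level downsample factors (slide.level_downsamples).
    """
    location = level_to_level0(x, y, downsamples[level])
    return location, level, (width, height)


# Example with made-up downsample factors for a three-level slide.
location, level, size = region_request(100, 200, 512, 512, 2, (1.0, 4.0, 16.0))
print(location, level, size)
```

With an open slide this could be used as `slide.read_region(*region_request(100, 200, 512, 512, 2, slide.level_downsamples))`.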
Situations where it’s used (Medical Examples)
- The “Low Power” Scan: Find tissue on the glass to avoid analyzing blank areas; request a thumbnail to create a tissue mask.
- The “High Power” Field: Extract a Level 0 high-res patch from a suspicious region for AI analysis.
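The “Low Power” scan can be sketched as a simple brightness threshold on the thumbnail. This is a deliberately crude illustration, assuming that blank glass is near-white and tissue is darker. The threshold of 220 and the function names are assumptions for this sketch; real pipelines often use more robust methods such as Otsu thresholding.

```python
import numpy as np


def tissue_mask(rgb, white_threshold=220):
    """Boolean mask that is True where a pixel is darker than near-white glass.

    rgb: array-like of shape (height, width, 3 or 4), values 0-255.
    white_threshold: assumed cutoff on the mean of R, G and B.
    """
    rgb = np.asarray(rgb, dtype=np.uint8)
    brightness = rgb[..., :3].mean(axis=-1)
    return brightness < white_threshold


def tissue_fraction(mask):
    """Fraction of the thumbnail covered by tissue (0.0 to 1.0)."""
    return float(mask.mean())


# Synthetic thumbnail: white glass with one pink "tissue" block.
thumb = np.full((100, 100, 3), 255, dtype=np.uint8)
thumb[10:30, 40:70] = (200, 120, 170)

mask = tissue_mask(thumb)
print(f"Tissue covers {tissue_fraction(mask):.1%} of the thumbnail")
```

For a real slide, the input could be `np.array(slide.get_thumbnail((512, 512)))`. Pen marks and dark artifacts will also pass this threshold, so the mask is a starting point rather than a final answer.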
Why it’s important to pathologists
Without a reader such as OpenSlide, Python has no built-in way to open most scanner files. General-purpose image programs (for example Photoshop) are not designed for WSIs that can reach 40GB and will often fail or crash on them. OpenSlide bridges the medical device and your code.
Installation Instructions
OpenSlide needs both the Python package and the system binary.
Windows:
pip install openslide-python
# Note: Download Windows binaries and add them to PATH if needed.
Mac:
brew install openslide
pip3 install openslide-python
Linux:
sudo apt-get install python3-openslide
pip3 install openslide-python
Lego Building Blocks (Code)
Block A: The Unboxing (Opening the Slide)
The Situation: You have a raw .svs file. You need basic stats: dimensions and magnification.
The Solution: Open the file and read metadata headers.
import openslide
from pathlib import Path

# 1. Define the path to your slide
# TODO: Change to your actual file path
slide_path = Path("/path/to/project/data/raw_images/Case_001.svs")

# 2. Open the slide (connection only; no pixels loaded yet)
slide = openslide.OpenSlide(str(slide_path))

# 3. Read dimensions at Level 0 (full resolution)
w, h = slide.dimensions
print(f"Slide Dimensions: {w} x {h} pixels")

# 4. Read resolution (microns per pixel); None if the scanner did not record it
mpp = slide.properties.get("openslide.mpp-x")
print(f"Resolution: {mpp} microns per pixel")

# 5. Read the objective magnification (for example "40"), if recorded
objective = slide.properties.get("openslide.objective-power")
print(f"Objective power: {objective}x")

Block B: The Low-Power View (Thumbnail)
The Situation: You want to check blur or pen marks before processing.
The Solution: Grab a small image from the top of the pyramid (zoomed out).
import matplotlib.pyplot as plt

# 1. Ask for a thumbnail (fits inside 512x512)
thumbnail = slide.get_thumbnail((512, 512))

# 2. Display it
plt.figure(figsize=(5, 5))
plt.imshow(thumbnail)
plt.title("Whole Slide Overview")
plt.axis("off")
plt.show()

Block C: The High-Power View (Virtual Microscopy)
The Situation: Simulate a 40x look at a tumor region.
The Solution: Use read_region with coordinates and zoom level to fetch a “virtual high-power field.”
# 1. Define coordinates (top-left in Level 0 pixels)
start_x = 15000
start_y = 24000
level = 0  # 0 = Max zoom (highest resolution)
size = (512, 512)  # Width, height of the view

# 2. Extract the region
region = slide.read_region((start_x, start_y), level, size)

# 3. Convert to RGB (remove transparency)
region_rgb = region.convert("RGB")

# 4. Display
plt.figure(figsize=(6, 6))
plt.imshow(region_rgb)
plt.title(f"High Power View @ ({start_x}, {start_y})")
plt.axis("off")
plt.show()

Resource Sites
- OpenSlide Python API docs: https://openslide.org/api/python/
- OpenSlide home: https://openslide.org/
2B - Viewing Patches (The Lightbox)
Name of Tool
Matplotlib (The Plotter) & Pillow (The Image Opener)
Technical Explanation
Matplotlib visualizes data arrays. Pillow (PIL) handles standard image formats (.jpg, .png). Here we use them to visualize the datasets created in Step 1 (the CSV of patches).
Simplified Explanation
This is your “Multi-Head Microscope” or “Lightbox.” When a spreadsheet says “File_01 is Tumor,” you must verify it. This tool pulls random images from your cohort and shows them in a grid so you can spot wrong labels or artifacts.
What can it do?
- Sanity Check: Confirm the file paths in your CSV actually load.
- Label Verification: Check if images labeled “Tumor” look like tumor.
- Batch View: See 16, 25, or 50 images at once instead of opening them individually.
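The sanity check can be run before any plotting. The sketch below lists every path in the CSV that does not point to a file on disk. It assumes the path column is named `full_path`, as in the grid example in this section; the function name `find_missing_files` is hypothetical.

```python
import csv
from pathlib import Path


def find_missing_files(csv_path, path_column="full_path"):
    """Return the paths listed in the CSV that do not exist on disk.

    csv_path    : CSV file with one image per row.
    path_column : assumed name of the column holding the image path.
    """
    missing = []
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            if not Path(row[path_column]).is_file():
                missing.append(row[path_column])
    return missing
```

A typical call would be `find_missing_files("metadata/master_cohort.csv")`. An empty list means every path resolves to a file, although it does not prove that each file is a readable image.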
Situations where it’s used (Medical Examples)
- The “Label Flip”: Images labeled “Benign” look highly atypical; you discover labels were flipped.
- The “Crazy Data”: A “Lung” dataset looks like “Liver”; you catch the error immediately.
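A label flip is easier to catch when every class is represented in the review. The sketch below picks a few random rows per label instead of sampling the cohort as a whole. It is one possible approach, not part of the original workflow: the function name `sample_per_label` is hypothetical, and the `diagnosis` key is assumed to match the label column used in the grid example in this section.

```python
import random
from collections import defaultdict


def sample_per_label(rows, label_key="diagnosis", per_label=4, seed=42):
    """Pick up to `per_label` random rows for each label.

    rows      : iterable of dicts, for example from csv.DictReader.
    label_key : assumed name of the label column.
    seed      : fixed so the same images are picked on every run.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[row[label_key]].append(row)

    picked = {}
    for label, members in groups.items():
        count = min(per_label, len(members))
        picked[label] = rng.sample(members, count)
    return picked


# Made-up cohort: 10 "Tumor" rows and 3 "Benign" rows.
demo_rows = [{"full_path": f"tumor_{i}.png", "diagnosis": "Tumor"} for i in range(10)]
demo_rows += [{"full_path": f"benign_{i}.png", "diagnosis": "Benign"} for i in range(3)]

for label, members in sample_per_label(demo_rows).items():
    print(label, [m["full_path"] for m in members])
```

Rare classes are returned in full when they have fewer rows than requested, so a small “Benign” group is not silently skipped.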
Why it’s important to pathologists
Pathology is visual. Never trust a CSV blindly. Looking at images ties the math (spreadsheet) back to morphology (image) and prevents training on noise.
Installation Instructions
Run in terminal:
pip install matplotlib pillow pandas
Lego Building Blocks (Code)
Block A: Random Sampling (The Tray)
The Situation: You have a CSV list of 3,200 images. You want 16 random ones to inspect.
The Solution: Use pandas to sample rows.
import pandas as pd
from pathlib import Path

# 1. Load your cohort (from Step 1)
df = pd.read_csv("metadata/master_cohort.csv")

# 2. Pick 16 random rows
sample_df = df.sample(n=16, random_state=42)

print("Selected 16 random images for review.")
print(sample_df.head())

Block B: The Grid View (The Microscope)
The Situation: Display the sampled images with labels in a grid.
The Solution: Open with Pillow and plot with Matplotlib.
from pathlib import Path
from PIL import Image
import matplotlib.pyplot as plt
import math

# Setup grid (4x4)
cols = 4
rows = math.ceil(len(sample_df) / cols)

plt.figure(figsize=(12, 12))

for i, (_, row) in enumerate(sample_df.iterrows()):
    img_path = Path(row["full_path"])  # Ensure this matches your CSV column
    label = row["diagnosis"]

    try:
        img = Image.open(img_path).convert("RGB")
        plt.subplot(rows, cols, i + 1)
        plt.imshow(img)
        plt.title(label)
        plt.axis("off")
    except Exception as e:
        # A failed image leaves an empty cell in the grid and is reported here
        print(f"Error opening {img_path}: {e}")

plt.tight_layout()
plt.show()

Resource Sites
- Pillow (PIL) handbook: https://pillow.readthedocs.io/en/stable/handbook/tutorial.html
- Matplotlib gallery: https://matplotlib.org/stable/gallery/index.html
- OpenSlide Python API: https://openslide.org/api/python/