
Initial System Setup

Initial System Setup: your lab computer and tools


This is where you get your “computational lab” ready: installing Python and a few core libraries, choosing a notebook or editor, and creating a clean project folder so every file has a sensible home. Most of the time you only need to do this once per machine, and then you can reuse the same setup for many projects.

Technical name: Initial System Setup

All-in-one Linux setup script (Ubuntu & Arch)


This is a personal bootstrap script to get a fresh Linux machine ready for digital/computational pathology work.

It will:

  • detect whether you’re on Ubuntu/Debian or an Arch-based distro (Arch, Manjaro, EndeavourOS)
  • install:
    • basic build tools
    • git, curl, wget
    • Docker
    • Visual Studio Code
    • Miniconda (Python + conda env manager)

It does not install NVIDIA drivers or CUDA (those are hardware-specific).


Do this on the Linux machine you want to prepare (native Ubuntu/Arch or inside WSL2).

  1. Open a terminal

    • Ubuntu: “Terminal” from the app menu.
    • Arch: your usual terminal emulator.
    • WSL2: “Ubuntu” / your distro from the Start menu.
  2. Go to your home directory

    Terminal window
    cd ~
  3. Create a new file for the script

    Open it with a simple editor like nano:

    Terminal window
    nano setup_dcp_env.sh

    This opens an empty file called setup_dcp_env.sh.

  4. In your browser, select the entire script block below and copy it:

    #!/usr/bin/env bash
    # setup_dcp_env.sh - quick environment bootstrap for Ubuntu/Debian and Arch
    set -euo pipefail

    echo "=== Digital / computational pathology environment bootstrap ==="

    if [ ! -f /etc/os-release ]; then
      echo "Cannot detect Linux distribution (no /etc/os-release). Exiting."
      exit 1
    fi

    # shellcheck disable=SC1091
    . /etc/os-release
    DISTRO_ID="${ID:-unknown}"
    echo "Detected distro: ${DISTRO_ID}"

    install_ubuntu_like() {
      echo "Running Ubuntu/Debian setup..."
      sudo apt update
      sudo apt install -y \
        build-essential \
        git \
        curl \
        wget \
        ca-certificates \
        software-properties-common

      # VS Code repo
      wget -qO- https://packages.microsoft.com/keys/microsoft.asc | \
        gpg --dearmor | \
        sudo tee /usr/share/keyrings/packages.microsoft.gpg >/dev/null
      echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/packages.microsoft.gpg] https://packages.microsoft.com/repos/code stable main" | \
        sudo tee /etc/apt/sources.list.d/vscode.list >/dev/null
      sudo apt update
      sudo apt install -y code

      # Docker (simple Ubuntu Docker package; adjust as needed)
      sudo apt install -y docker.io
      sudo systemctl enable --now docker
      sudo usermod -aG docker "$USER" || true
    }

    install_arch_like() {
      echo "Running Arch setup..."
      sudo pacman -Syu --noconfirm
      sudo pacman -S --noconfirm --needed \
        base-devel \
        git \
        curl \
        wget \
        ca-certificates \
        code \
        docker
      sudo systemctl enable --now docker
      sudo usermod -aG docker "$USER" || true
    }

    case "$DISTRO_ID" in
      ubuntu|debian)
        install_ubuntu_like
        ;;
      arch|manjaro|endeavouros)
        install_arch_like
        ;;
      *)
        echo "This script only supports Ubuntu/Debian and Arch-like distros right now."
        echo "Detected ID='$DISTRO_ID'. Exiting."
        exit 1
        ;;
    esac

    # Miniconda (user install)
    if [ ! -d "$HOME/miniconda3" ]; then
      echo "Installing Miniconda into $HOME/miniconda3 ..."
      TMP_INSTALLER="/tmp/miniconda.sh"
      curl -fsSL https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -o "$TMP_INSTALLER"
      bash "$TMP_INSTALLER" -b -p "$HOME/miniconda3"
      rm -f "$TMP_INSTALLER"
      if ! grep -q 'miniconda3' "$HOME/.bashrc" 2>/dev/null; then
        echo 'export PATH="$HOME/miniconda3/bin:$PATH"' >> "$HOME/.bashrc"
      fi
    else
      echo "Miniconda directory already exists at $HOME/miniconda3, skipping."
    fi

    echo
    echo "Done."
    echo "- You may need to log out and back in for Docker group changes to take effect."
    echo "- Open a new shell so the Miniconda PATH in .bashrc is picked up."
  5. Go back to the terminal where nano is open and paste

    • Click inside the terminal window with nano.
    • Paste:
      • Right-click → Paste in most terminals, or
      • Shift+Insert (or Ctrl+Shift+V), depending on the terminal.

    You should now see the full script inside nano.

  6. Save and exit the editor

    In nano:

    • Ctrl+O → Enter (save)
    • Ctrl+X (exit)
  7. Make the script executable

    Terminal window
    chmod +x setup_dcp_env.sh
  8. Run the script

    Terminal window
    ./setup_dcp_env.sh
    • Enter your password when sudo prompts.
    • Let it finish; watch for obvious errors.
  9. Log out and back in

    • Log out of your session (or reboot) and log back in.
    • This makes sure:
      • your docker group membership is active
      • your shell picks up the Miniconda PATH added to .bashrc
  10. Quick checks

    Terminal window
    code --version
    git --version
    docker ps
    conda --version

    If those print versions or basic output, your base environment is ready.

Quick guide: install Linux on a Windows PC (dual-boot)


If you prefer native Linux instead of WSL, here is the shortest safe path:

  1. Prep & backup: confirm 30–60 GB free space and back up Windows data.

  2. Download an ISO: Ubuntu LTS is the easiest first distro.

  3. Create a bootable USB: use Rufus or balenaEtcher on Windows with an 8 GB+ USB stick.

  4. Enter firmware (BIOS/UEFI): reboot and press your vendor key (F2/F10/Del).

    • Disable Secure Boot if required.
    • Enable virtualization (Intel VT-x / AMD-V) so WSL/VMs work properly.
    • Set USB as the temporary boot device.
  5. Install alongside Windows: boot from USB, pick “Install Ubuntu”, choose “Install alongside Windows” (the installer will shrink Windows safely), set user/timezone.

  6. First boot on Ubuntu: run updates:

    Terminal window
    sudo apt update && sudo apt upgrade -y

That leaves Windows intact while giving you a native Linux environment for CUDA, Docker, and imaging tools.


Linux environment (native, WSL2, macOS terminal)


The OS context where all your tools run. In practice:

  • Native Linux (for example Ubuntu Desktop / Server)
  • Windows with WSL2 (Linux userland inside Windows)
  • macOS terminal (Unix-like, close enough for most user-level tasks)

Think of this as the building where your lab lives.

  • Windows, macOS, Linux are different buildings.
  • Inside you place your equipment: Python, Docker, libvips, etc.
  • Most hospital servers and research clusters use the Linux building.
  • Almost all serious back-end systems and clusters you will touch are Linux.
  • Most ML examples and GitHub repos assume Linux commands and paths.
  • If you get comfortable in a Linux shell (native, WSL2, or macOS), working on hospital servers feels much less alien.
  • For a dedicated GPU workstation, native Ubuntu keeps drivers and Docker simpler than Windows.

On Windows, run this from an elevated PowerShell to enable WSL2 with the default distro:

Terminal window
wsl --install
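
If you want a specific distro, or want to check what is already installed, wsl has a few more options. These are standard wsl.exe flags on recent Windows builds, but run wsl --help to confirm what your build supports:

Terminal window
# install a specific distro instead of the default
wsl --install -d Ubuntu-22.04
# list installed distros and their WSL version
wsl -l -v
# make WSL2 the default for future installs
wsl --set-default-version 2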

On Ubuntu/Arch you normally install the whole OS from ISO; there is no single one-liner beyond boot + installer wizard. Use the official guides instead.


Visual Studio Code (VS Code)

A cross-platform editor / lightweight IDE for editing code, configs, and notes and for using Git, with a built-in terminal and powerful extensions.

Think of VS Code as a good lab notebook and pen for code:

  • One place to write scripts and notes
  • See your project folders
  • Run commands in a small terminal pane
  • Same tool on Windows, macOS, Linux, and inside WSL2.
  • Nice Git integration for committing and reviewing changes.
  • Great Python support and Jupyter integration.
  • With “Remote” extensions, you can edit files on a remote GPU server from your laptop.

Normally you just download the installer from the website. On Ubuntu, after adding Microsoft’s repo (as the script does), you can:

Terminal window
sudo apt update
sudo apt install -y code

On Arch (community repo):

Terminal window
sudo pacman -Syu --noconfirm
sudo pacman -S --noconfirm code

After install:

  • Extensions: “Python”, “Docker”, “Remote - SSH”, “GitHub Pull Requests”, “YAML”, “Quarto” (if writing notebooks).
  • Enable autosave (File → Auto Save) and set a default formatter (Prettier/Black).
  • Use “Remote - SSH” to work on GPU servers without copying files locally.
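
If you prefer the command line, extensions can also be installed with code --install-extension. A quick sketch; the marketplace IDs below are the usual ones, but verify them in the Extensions view if anything fails:

Terminal window
code --install-extension ms-python.python
code --install-extension ms-azuretools.vscode-docker
code --install-extension ms-vscode-remote.remote-ssh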

Python and Miniconda (conda)

  • Python is the main programming language you will use for data handling, tiling, ML, and evaluation.
  • Miniconda / conda is a package and environment manager that installs Python and keeps each project’s dependencies in its own environment.
  • Python is the language you talk to the computer in.
  • conda is the medicine cabinet system that keeps each project’s drugs (packages) in its own labeled drawer, so they do not mix.
  • Most modern pathology ML tooling (PyTorch, MONAI, scikit-learn, etc.) is Python-based.
  • Separate environments prevent “I installed this for project A and broke project B”.
  • An environment.yml or requirements.txt makes it easy to recreate a working setup months later or on another machine.

After installing Miniconda from the official installer:

Terminal window
# create a project environment
conda create -n dcp-env python=3.11
# activate it
conda activate dcp-env
# install some common packages
conda install numpy pandas matplotlib
# optional: add conda-forge for broader packages
conda config --add channels conda-forge
conda config --set channel_priority strict

The all-in-one script above already installs Miniconda into ~/miniconda3 and adds it to .bashrc.
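
To recreate the same environment later or on another machine (the environment.yml idea above), you can write the environment down as a file. A minimal sketch; the name and package list are illustrative:

# environment.yml - example project environment (contents illustrative)
name: dcp-env
channels:
  - conda-forge
dependencies:
  - python=3.11
  - numpy
  - pandas
  - matplotlib

Terminal window
conda env create -f environment.yml
conda activate dcp-env

You can also export an existing environment with conda env export > environment.yml, then trim the result down to the packages you actually asked for.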


Git (version control)

A version control system that tracks changes to files in a project, lets you create named snapshots (“commits”), and lets you roll back when needed.

Git is a time machine for your project folder:

  • Every save (commit) records what changed and a short message.
  • You can later say “show me the project as it looked last Monday”.
  • Undo bad changes without losing everything.
  • See exactly what changed between two versions of a pipeline.
  • Share code and configs with collaborators via GitHub/GitLab.
  • Necessary if you want your computational pipelines to be reproducible and reviewable.

On Ubuntu/Debian:

Terminal window
sudo apt update
sudo apt install -y git

On Arch:

Terminal window
sudo pacman -Syu --noconfirm
sudo pacman -S --noconfirm git

Basic first-time use in a project directory:

Terminal window
git init
git add .
git commit -m "Initial snapshot"

Configure identity and GitHub access:

Terminal window
git config --global user.name "Your Name"
git config --global user.email "you@example.com"
git config --global init.defaultBranch main # optional but recommended
# create an SSH key
ssh-keygen -t ed25519 -C "you@example.com"
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_ed25519
# copy the public key and paste into GitHub Settings > SSH and GPG keys
cat ~/.ssh/id_ed25519.pub
# test
ssh -T git@github.com

Tips:

  • git status to see changes; git log --oneline to review history.
  • Use the GitHub CLI (gh auth login) if you prefer HTTPS / personal access token (PAT) flows.
  • Set git config --global pull.rebase false (or true) depending on your workflow.
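
One more habit worth adopting early: keep large images and generated outputs out of Git with a .gitignore file in the project root. A minimal sketch; the paths and extensions are illustrative, so adjust them to your project layout:

# .gitignore - keep big/generated files out of version control (paths illustrative)
data/                  # raw WSIs and datasets (far too large for Git)
outputs/               # model checkpoints, figures, logs
*.svs                  # common whole-slide image formats
*.ndpi
__pycache__/
.ipynb_checkpoints/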

Docker (containers)

Docker is a platform for building and running containers: packaged environments that include your code plus all required system and Python dependencies.

A container is like a small, self-contained lab room in a box:

  • Same tools and reagents wherever you ship it
  • Less “it works on my machine, not on the server”
  • Freeze a working environment so you can rerun or deploy it later without reinstalling everything.
  • Give IT a concrete artifact (“run this container”) instead of a long “how to set up my pipeline” document.
  • Aligns with how many hospital IT teams run internal services.

On Ubuntu (simple case):

Terminal window
sudo apt update
sudo apt install -y docker.io
sudo systemctl enable --now docker
sudo usermod -aG docker "$USER" # then log out and back in

On Arch:

Terminal window
sudo pacman -Syu --noconfirm
sudo pacman -S --noconfirm docker
sudo systemctl enable --now docker
sudo usermod -aG docker "$USER"

Basic check:

Terminal window
docker run hello-world

If Docker needs sudo, add yourself to the docker group and re-login:

Terminal window
sudo usermod -aG docker "$USER"
newgrp docker
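
To “freeze a working environment” as described above, you describe the image in a Dockerfile and build it. A minimal sketch, assuming a simple Python pipeline with a requirements.txt; the file and script names are illustrative:

# Dockerfile - minimal image for a Python pipeline (names illustrative)
FROM python:3.11-slim

WORKDIR /app

# install Python dependencies first so this layer is cached across rebuilds
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# copy the rest of the project code
COPY . .

CMD ["python", "run_pipeline.py"]

Terminal window
docker build -t dcp-pipeline .
docker run --rm dcp-pipeline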

NVIDIA GPU and CUDA Toolkit

  • An NVIDIA GPU is a hardware accelerator for heavy compute tasks (matrix multiplications, convolutions, etc.).
  • CUDA Toolkit is NVIDIA’s software stack that ML libraries use to talk to the GPU.
  • A GPU is like a room full of many simple assistants who all do tiny calculations at the same time.
  • CUDA is the instruction set and tools they understand.
  • Cuts training time from days to hours for realistic WSI-scale models.
  • Makes it feasible to run deep models on large cohorts on your own hardware.
  • Allows more experimentation with model architecture and hyperparameters on real data, not just toy patches.

This is hardware- and distro-specific. In practice you will:

  • Install an appropriate NVIDIA driver for your card and OS.
  • Install a CUDA Toolkit version compatible with your deep learning framework (or use a container image that bundles the right stack).

The exact commands change over time; rely on official docs and framework “get started” pages.

Typical sanity check once things are installed:

Terminal window
nvidia-smi

and in Python:

import torch
print(torch.cuda.is_available())  # True means PyTorch can see a CUDA GPU
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # prints the name of your card
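
If you go the container route mentioned above, the NVIDIA Container Toolkit lets Docker containers use the GPU. A sketch, assuming the toolkit is already installed; the CUDA image tag is illustrative, so pick a current one from NVIDIA's registry:

Terminal window
# run nvidia-smi inside a CUDA container to confirm GPU passthrough
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi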

SSH and VS Code Remote - SSH

  • SSH (OpenSSH) is a protocol and toolset for securely logging into another machine and running commands there.
  • VS Code Remote SSH lets you use VS Code to edit files on that remote machine.

SSH is a secure hallway to another computer:

  • you stay at your desk,
  • but your commands run in the hospital server room where the slides and GPUs live.
  • Run code close to where WSIs and databases live instead of copying terabytes to your laptop.
  • Use your familiar editor (VS Code) while the work runs on a powerful remote server.
  • This is how IT will usually give you access to institutional compute resources.

On most Linux/macOS systems, an SSH client is already installed.

On Ubuntu (if needed):

Terminal window
sudo apt update
sudo apt install -y openssh-client

On Arch:

Terminal window
sudo pacman -Syu --noconfirm
sudo pacman -S --noconfirm openssh

Basic usage:

Terminal window
ssh yourname@your-hospital-server

For VS Code Remote SSH, install the “Remote - SSH” extension from the VS Code marketplace and follow its first-run prompts.
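
Both plain ssh and the Remote - SSH extension read ~/.ssh/config, so it is worth defining your hosts there once. A minimal sketch; the alias, host name, user, and key path are illustrative:

# ~/.ssh/config - example host entry (names illustrative)
Host hospital-gpu
    HostName gpu01.example-hospital.org
    User yourname
    IdentityFile ~/.ssh/id_ed25519

After that, ssh hospital-gpu connects directly, and the same alias shows up in the Remote - SSH host picker in VS Code.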