Initial System Setup
Initial System Setup: your lab computer and tools
This is where you get your “computational lab” ready: installing Python and a few core libraries, choosing a notebook or editor, and creating a clean project folder so every file has a sensible home. Most of the time you only need to do this once per machine, and then you can reuse the same setup for many projects.
Technical name: Initial System Setup
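The “clean project folder” mentioned above can be sketched as a small shell helper. This is only an illustration: the function name make_project_skeleton and the folder names (data/raw, data/processed, notebooks, src, results) are assumptions, not a layout this guide requires.

```bash
#!/usr/bin/env bash
# Hypothetical helper: create a small, predictable folder layout for one project.
# The folder names are illustrative assumptions; rename them to suit your work.
make_project_skeleton() {
    local project_root="$1" sub
    for sub in data/raw data/processed notebooks src results; do
        mkdir -p "$project_root/$sub"
    done
    # Add a short README once, so the folder explains itself; never overwrite one.
    if [ ! -f "$project_root/README.md" ]; then
        printf '# %s\n\nSee notebooks/ for analyses and src/ for reusable code.\n' \
            "$(basename "$project_root")" > "$project_root/README.md"
    fi
}

# Example (uncomment to run):
# make_project_skeleton "$HOME/projects/my-first-study"
```

Because mkdir -p is used and the README is only written when missing, running the helper twice on the same folder is harmless.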
All-in-one Linux setup script (Ubuntu & Arch)
This is a personal bootstrap script to get a fresh Linux machine ready for digital/computational pathology work.
It will:
- detect whether you’re on Ubuntu/Debian or an Arch-based distro (Arch, Manjaro, EndeavourOS)
- install:
  - basic build tools
  - git, curl, wget
  - Docker
  - Visual Studio Code
  - Miniconda (Python + conda env manager)
It does not install NVIDIA drivers or CUDA (those are hardware-specific).
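The distro detection step can be sketched on its own. The bootstrap script in this section matches only on the ID field of /etc/os-release. The sketch below also consults ID_LIKE, which derivatives such as Linux Mint use to name their parent distro. The function name distro_family and its return strings are hypothetical. Treating a derivative like its parent is an assumption that may not hold for every package.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print "debian-like", "arch-like", or "unsupported"
# for a given os-release file (defaults to /etc/os-release).
distro_family() {
    local os_release="${1:-/etc/os-release}"
    if [ ! -f "$os_release" ]; then
        echo "unsupported"
        return 0
    fi
    # Source the file in a subshell so its variables do not leak into this shell.
    (
        ID=""
        ID_LIKE=""
        . "$os_release"
        # ID_LIKE is a space-separated list, so unquoted expansion is intentional.
        for token in $ID $ID_LIKE; do
            case "$token" in
                ubuntu|debian) echo "debian-like"; exit 0 ;;
                arch|manjaro|endeavouros) echo "arch-like"; exit 0 ;;
            esac
        done
        echo "unsupported"
    )
}

distro_family
```

On a machine without /etc/os-release (macOS, for example) the call prints unsupported instead of failing.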
How to use this script (step by step)
Do this on the Linux machine you want to prepare (native Ubuntu/Arch or inside WSL2).
- Open a terminal
  - Ubuntu: “Terminal” from the app menu.
  - Arch: your usual terminal emulator.
  - WSL2: “Ubuntu” / your distro from the Start menu.
- Go to your home directory

      cd ~

- Create a new file for the script. Open it with a simple editor like nano:

      nano setup_dcp_env.sh

  This opens an empty file called setup_dcp_env.sh.
- In your browser, select the entire script block below and copy it:

      #!/usr/bin/env bash
      # setup_dcp_env.sh - quick environment bootstrap for Ubuntu/Debian and Arch
      set -euo pipefail

      echo "=== Digital / computational pathology environment bootstrap ==="

      if [ ! -f /etc/os-release ]; then
        echo "Cannot detect Linux distribution (no /etc/os-release). Exiting."
        exit 1
      fi

      # shellcheck disable=SC1091
      . /etc/os-release
      DISTRO_ID="${ID:-unknown}"
      echo "Detected distro: ${DISTRO_ID}"

      install_ubuntu_like() {
        echo "Running Ubuntu/Debian setup..."
        sudo apt update
        # gnupg provides gpg, which is used below to import the VS Code key
        sudo apt install -y \
          build-essential \
          git \
          curl \
          wget \
          ca-certificates \
          gnupg \
          software-properties-common

        # VS Code repo
        wget -qO- https://packages.microsoft.com/keys/microsoft.asc | \
          gpg --dearmor | \
          sudo tee /usr/share/keyrings/packages.microsoft.gpg >/dev/null
        echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/packages.microsoft.gpg] https://packages.microsoft.com/repos/code stable main" | \
          sudo tee /etc/apt/sources.list.d/vscode.list >/dev/null
        sudo apt update
        sudo apt install -y code

        # Docker (simple Ubuntu Docker package; adjust as needed)
        sudo apt install -y docker.io
        sudo systemctl enable --now docker
        sudo usermod -aG docker "$USER" || true
      }

      install_arch_like() {
        echo "Running Arch setup..."
        sudo pacman -Syu --noconfirm
        sudo pacman -S --noconfirm --needed \
          base-devel \
          git \
          curl \
          wget \
          ca-certificates \
          code \
          docker
        sudo systemctl enable --now docker
        sudo usermod -aG docker "$USER" || true
      }

      case "$DISTRO_ID" in
        ubuntu|debian)
          install_ubuntu_like
          ;;
        arch|manjaro|endeavouros)
          install_arch_like
          ;;
        *)
          echo "This script only supports Ubuntu/Debian and Arch-like distros right now."
          echo "Detected ID='$DISTRO_ID'. Exiting."
          exit 1
          ;;
      esac

      # Miniconda (user install; the installer URL below assumes an x86_64 machine)
      if [ ! -d "$HOME/miniconda3" ]; then
        echo "Installing Miniconda into $HOME/miniconda3 ..."
        TMP_INSTALLER="/tmp/miniconda.sh"
        curl -fsSL https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -o "$TMP_INSTALLER"
        bash "$TMP_INSTALLER" -b -p "$HOME/miniconda3"
        rm -f "$TMP_INSTALLER"
        if ! grep -q 'miniconda3' "$HOME/.bashrc" 2>/dev/null; then
          echo 'export PATH="$HOME/miniconda3/bin:$PATH"' >> "$HOME/.bashrc"
        fi
      else
        echo "Miniconda directory already exists at $HOME/miniconda3, skipping."
      fi

      echo
      echo "Done."
      echo "- You may need to log out and back in for Docker group changes to take effect."
      echo "- Open a new shell so the Miniconda PATH in .bashrc is picked up."
- Go back to the terminal where nano is open and paste
  - Click inside the terminal window with nano.
  - Paste:
    - Often right-click → Paste, or Shift+Insert, depending on the terminal.

  You should now see the full script inside nano.
- Save and exit the editor. In nano:
  - Ctrl+O → Enter (save)
  - Ctrl+X (exit)
- Make the script executable

      chmod +x setup_dcp_env.sh
- Run the script

      ./setup_dcp_env.sh

  - Enter your password when sudo prompts.
  - Let it finish; watch for obvious errors.
- Log out and back in
  - Log out of your session (or reboot) and log back in.
  - This makes sure:
    - your docker group membership is active
    - your shell picks up the Miniconda PATH added to .bashrc
- Quick checks

      code --version
      git --version
      docker ps
      conda --version

  If those print versions or basic output, your base environment is ready.
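The quick checks can also be bundled into one pass that reports which tools are on your PATH. This is a sketch and not part of the setup script: the function name check_tools is hypothetical, and it only tests that each command exists, not that it works (it does not confirm that the Docker daemon is running, for example).

```bash
#!/usr/bin/env bash
# Hypothetical helper: report which of the given commands are available on PATH.
# Returns non-zero if at least one of them is missing.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf 'ok: %s\n' "$tool"
        else
            printf 'missing: %s\n' "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example (uncomment to run):
# check_tools git curl wget code docker conda
```

If conda shows as missing right after the script finishes, open a new shell first so the PATH line in .bashrc takes effect.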
Quick guide: install Linux on a Windows PC (dual-boot)
If you prefer native Linux instead of WSL, here is the shortest safe path:
- Prep & backup: confirm 30–60 GB of free space and back up your Windows data.
- Download an ISO: Ubuntu LTS is the easiest first distro.
- Create a bootable USB: use Rufus or balenaEtcher on Windows with an 8 GB+ USB stick.
- Enter firmware (BIOS/UEFI): reboot and press your vendor key (commonly F2, F10, or Del).
  - Disable Secure Boot if required.
  - Enable virtualization (Intel VT-x / AMD-V) so WSL/VMs work properly.
  - Set USB as the temporary boot device.
- Install alongside Windows: boot from USB, pick “Install Ubuntu”, choose “Install alongside Windows” (the installer shrinks the Windows partition for you), then set user/timezone.
- First boot on Ubuntu: run updates:

      sudo apt update && sudo apt upgrade -y
That leaves Windows intact while giving you a native Linux environment for CUDA, Docker, and imaging tools.
Initial System Setup Tools
Linux environment (native, WSL2, macOS terminal)
What it is
The OS context where all your tools run. In practice:
- Native Linux (for example Ubuntu Desktop / Server)
- Windows with WSL2 (Linux userland inside Windows)
- macOS terminal (Unix-like, close enough for most user-level tasks)
What it is (Simplified)
Think of this as the building where your lab lives.
- Windows, macOS, Linux are different buildings.
- Inside you place your equipment: Python, Docker, libvips, etc.
- Most hospital servers and research clusters use the Linux building.
Reasons a clinician would want to use it
- Almost all serious back-end systems and clusters you will touch are Linux.
- Most ML examples and GitHub repos assume Linux commands and paths.
- If you get comfortable in a Linux shell (native, WSL2, or macOS), working on hospital servers feels much less alien.
- For a dedicated GPU workstation, native Ubuntu keeps drivers and Docker simpler than Windows.
Quick installation / setup code
On Windows, from an elevated PowerShell, enable WSL2 with the default distro:

    wsl --install

On Ubuntu/Arch you normally install the whole OS from an ISO; there is no single one-liner beyond boot + installer wizard. Use the official guides instead.
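Since the same Linux commands may run on native Linux or inside WSL2, it can help to know which one a shell is in. The sketch below uses a common heuristic: WSL kernels report “microsoft” in /proc/version. The function name is_wsl is hypothetical, and the check is a convention, not an official API.

```bash
#!/usr/bin/env bash
# Hypothetical helper: return 0 if a kernel version string looks like WSL.
# With no argument, read the running kernel's string from /proc/version.
is_wsl() {
    local version_string="${1:-}"
    if [ -z "$version_string" ] && [ -r /proc/version ]; then
        version_string="$(cat /proc/version)"
    fi
    # Heuristic: WSL kernels include "Microsoft" or "microsoft" in this string.
    case "$version_string" in
        *[Mm]icrosoft*) return 0 ;;
        *) return 1 ;;
    esac
}

if is_wsl; then
    echo "Running inside WSL"
else
    echo "Not running inside WSL"
fi
```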
Official documentation
- Ubuntu downloads: https://ubuntu.com/download
- Ubuntu install tutorials (desktop/server): https://ubuntu.com/tutorials/install-ubuntu-desktop
- WSL overview and install (Windows Subsystem for Linux): https://learn.microsoft.com/windows/wsl/install
- Ubuntu on WSL2 tutorial: https://ubuntu.com/tutorials/install-ubuntu-on-wsl2-on-windows-10
Visual Studio Code (VS Code)
What it is
A cross-platform editor / lightweight IDE to edit code, configs, notes, and use Git, with a built-in terminal and powerful extensions.
What it is (Simplified)
Think of VS Code as a good lab notebook and pen for code:
- One place to write scripts and notes
- See your project folders
- Run commands in a small terminal pane
Reasons a clinician would want to use it
- Same tool on Windows, macOS, Linux, and inside WSL2.
- Nice Git integration for committing and reviewing changes.
- Great Python support and Jupyter integration.
- With “Remote” extensions, you can edit files on a remote GPU server from your laptop.
Quick installation / setup code
Normally you just download the installer from the website. On Ubuntu, after adding Microsoft’s repo (as the script does), you can:

    sudo apt update
    sudo apt install -y code

On Arch (the code package in the official repos is the open-source build, Code - OSS, so some Microsoft-published extensions may not be available in it):

    sudo pacman -Syu --noconfirm
    sudo pacman -S --noconfirm code

After install:
- Extensions: “Python”, “Docker”, “Remote - SSH”, “GitHub Pull Requests”, “YAML”, “Quarto” (if writing notebooks).
- Enable autosave (File -> Auto Save) and set a default formatter (Prettier/Black).
- Use “Remote - SSH” to work on GPU servers without copying files locally.
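Extensions can also be installed from the terminal with code --install-extension. The sketch below prints the commands in dry-run mode, so you can review them before running anything. The helper name install_vscode_extensions is hypothetical. The extension IDs are my best understanding of the Marketplace identifiers for “Python”, “Docker”, and “Remote - SSH”, and should be checked on each extension's Marketplace page. It assumes the Microsoft build of VS Code.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print (dry run) or run the install command per extension.
# Usage: install_vscode_extensions --dry-run|--apply <extension-id>...
install_vscode_extensions() {
    local mode="$1" ext
    shift
    for ext in "$@"; do
        if [ "$mode" = "--dry-run" ]; then
            echo "code --install-extension $ext"
        else
            code --install-extension "$ext"
        fi
    done
}

# Extension IDs are assumptions; verify them in the Marketplace before applying.
install_vscode_extensions --dry-run \
    ms-python.python \
    ms-azuretools.vscode-docker \
    ms-vscode-remote.remote-ssh
```

Switching --dry-run to --apply runs the same commands for real, which requires code to be on your PATH.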
Official documentation
- VS Code download (all platforms): https://code.visualstudio.com/
- VS Code documentation / getting started: https://code.visualstudio.com/docs
Python + Miniconda / conda
What it is
- Python is the main programming language you will use for data handling, tiling, ML, and evaluation.
- Miniconda / conda is a package and environment manager that installs Python and keeps each project’s dependencies in its own environment.
What it is (Simplified)
- Python is the language you talk to the computer in.
- conda is the medicine cabinet system that keeps each project’s drugs (packages) in its own labeled drawer, so they do not mix.
Reasons a clinician would want to use it
- Most modern pathology ML tooling (PyTorch, MONAI, scikit-learn, etc.) is Python-based.
- Separate environments prevent “I installed this for project A and broke project B”.
- An environment.yml or requirements.txt makes it easy to recreate a working setup months later or on another machine.
Quick installation / setup code
After installing Miniconda from the official installer:

    # one-time shell setup so that "conda activate" works (then open a new shell)
    conda init bash

    # create a project environment
    conda create -n dcp-env python=3.11

    # activate it
    conda activate dcp-env

    # install some common packages
    conda install numpy pandas matplotlib

    # optional: add conda-forge for broader packages
    conda config --add channels conda-forge
    conda config --set channel_priority strict

The all-in-one script above already installs Miniconda into ~/miniconda3 and adds it to your PATH via .bashrc.
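To make the environment.yml idea concrete, here is a minimal sketch that writes one matching the commands above. The helper name write_environment_yml is hypothetical, and the file contents (environment name, conda-forge channel, package list) are assumptions based on this page, not a recommended pin set.

```bash
#!/usr/bin/env bash
# Hypothetical helper: write a minimal environment.yml into the given directory.
# The package list is an illustrative assumption, not a tested pin set.
write_environment_yml() {
    local target_dir="${1:-.}"
    cat > "$target_dir/environment.yml" <<'EOF'
name: dcp-env
channels:
  - conda-forge
dependencies:
  - python=3.11
  - numpy
  - pandas
  - matplotlib
EOF
}

# Example (uncomment to run; the second command needs conda on PATH):
# write_environment_yml .
# conda env create -f environment.yml
```

Going the other way, conda env export --from-history > environment.yml records the packages you explicitly asked for in an existing environment.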
Official documentation
- Miniconda overview and downloads: https://docs.anaconda.com/miniconda/
- General conda documentation (getting started, managing environments): https://docs.conda.io/projects/conda/en/latest/user-guide/index.html
Git
What it is
A version control system that tracks changes to files in a project, lets you create named snapshots (“commits”), and roll back if needed.
What it is (Simplified)
Git is a time machine for your project folder:
- Every save (commit) records what changed and a short message.
- You can later say “show me the project as it looked last Monday”.
Reasons a clinician would want to use it
- Undo bad changes without losing everything.
- See exactly what changed between two versions of a pipeline.
- Share code and configs with collaborators via GitHub/GitLab.
- Necessary if you want your computational pipelines to be reproducible and reviewable.
Quick installation / setup code
On Ubuntu/Debian:

    sudo apt update
    sudo apt install -y git

On Arch:

    sudo pacman -Syu --noconfirm
    sudo pacman -S --noconfirm git

Basic first-time use in a project directory:

    git init
    git add .
    git commit -m "Initial snapshot"

Configure identity and GitHub access (set the identity before your first commit, otherwise git commit will ask for it):

    git config --global user.name "Your Name"
    git config --global user.email "you@example.com"
    git config --global init.defaultBranch main   # optional but recommended

    # create an SSH key
    ssh-keygen -t ed25519 -C "you@example.com"
    eval "$(ssh-agent -s)"
    ssh-add ~/.ssh/id_ed25519

    # copy the public key and paste into GitHub Settings > SSH and GPG keys
    cat ~/.ssh/id_ed25519.pub

    # test
    ssh -T git@github.com

Tips:
- git status to see changes; git log --oneline to review history.
- Use GitHub CLI (gh auth login) if you prefer HTTPS/PAT flows.
- Set git config --global pull.rebase false (or true) depending on your workflow.
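Before the first git add . in a project that sits next to slide data, a .gitignore keeps large or sensitive files out of the snapshot. The sketch below is an assumption-heavy starting point: the helper name write_gitignore is hypothetical, and the patterns (a data/ folder, the .svs and .ndpi slide formats, Python byproducts) are examples to adapt.

```bash
#!/usr/bin/env bash
# Hypothetical helper: write a starter .gitignore unless one already exists.
# The patterns are illustrative assumptions; adapt them to your own project.
write_gitignore() {
    local project_dir="${1:-.}"
    if [ -e "$project_dir/.gitignore" ]; then
        echo "Refusing to overwrite existing $project_dir/.gitignore" >&2
        return 1
    fi
    cat > "$project_dir/.gitignore" <<'EOF'
# Large or sensitive inputs (keep these outside version control)
data/
*.svs
*.ndpi

# Python and notebook byproducts
__pycache__/
*.pyc
.ipynb_checkpoints/
EOF
}

# Example (uncomment to run inside your project directory):
# write_gitignore .
```

Run git status afterwards to confirm that only code, configs, and notes are listed as untracked.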
Official documentation
- Git install page (all platforms): https://git-scm.com/downloads
- “Installing Git” in the Pro Git book: https://git-scm.com/book/en/v2/Getting-Started-Installing-Git
- GitHub “Set up Git” (nice walkthrough): https://docs.github.com/en/get-started/getting-started-with-git/set-up-git
Docker / containers
What it is
Docker is a platform for building and running containers: packaged environments that include your code plus all required system and Python dependencies.
What it is (Simplified)
A container is like a small, self-contained lab room in a box:
- Same tools and reagents wherever you ship it
- Less “it works on my machine, not on the server”
Reasons a clinician would want to use it
- Freeze a working environment so you can rerun or deploy it later without reinstalling everything.
- Give IT a concrete artifact (“run this container”) instead of a long “how to set up my pipeline” document.
- Aligns with how many hospital IT teams run internal services.
Quick installation / setup code
On Ubuntu (simple case):

    sudo apt update
    sudo apt install -y docker.io
    sudo systemctl enable --now docker
    sudo usermod -aG docker "$USER"   # then log out and back in

On Arch:

    sudo pacman -Syu --noconfirm
    sudo pacman -S --noconfirm docker
    sudo systemctl enable --now docker
    sudo usermod -aG docker "$USER"

Basic check:

    docker run hello-world

If Docker needs sudo, add yourself to the docker group, then either log out and back in or start a shell with the new group active using newgrp:

    sudo usermod -aG docker "$USER"
    newgrp docker

Official documentation
- Docker Engine install docs (Linux): https://docs.docker.com/engine/install/
- Docker Desktop (for dev use on Windows/macOS/WSL): https://www.docker.com/products/docker-desktop
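For the docker group step in this section, a quick way to tell whether a re-login is still needed is to look at the groups of the current session. The sketch below relies on id -nG, which without a username prints the groups of the running process. The function name has_docker_group is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: return 0 if a space-separated group list contains "docker".
# With no argument, use the groups of the current session (id -nG).
has_docker_group() {
    local groups_list="${1:-$(id -nG)}" group
    # Unquoted expansion is intentional: split the list on whitespace.
    for group in $groups_list; do
        if [ "$group" = "docker" ]; then
            return 0
        fi
    done
    return 1
}

if has_docker_group; then
    echo "docker group is active in this session"
else
    echo "docker group is not active here; log out and back in, or run: newgrp docker"
fi
```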
NVIDIA GPU & CUDA (high-level)
What it is
- An NVIDIA GPU is a hardware accelerator for heavy compute tasks (matrix multiplications, convolutions, etc.).
- CUDA Toolkit is NVIDIA’s software stack that ML libraries use to talk to the GPU.
What it is (Simplified)
- A GPU is like a room full of many simple assistants who all do tiny calculations at the same time.
- CUDA is the instruction set and tools they understand.
Reasons a clinician would want to use it
- Cuts training time from days to hours for realistic WSI-scale models.
- Makes it feasible to run deep models on large cohorts on your own hardware.
- Allows more experimentation with model architecture and hyperparameters on real data, not just toy patches.
Quick installation / setup code
This is hardware- and distro-specific. In practice you will:
- Install an appropriate NVIDIA driver for your card and OS.
- Install a CUDA Toolkit version compatible with your deep learning framework (or use a container image that bundles the right stack).
The exact commands change over time; rely on official docs and framework “get started” pages.
Typical sanity check once things are installed:

    nvidia-smi

and in Python (assuming PyTorch is installed in the active environment):

    import torch
    print(torch.cuda.is_available())

Official documentation
- CUDA Toolkit docs and downloads: https://developer.nvidia.com/cuda-toolkit
- NVIDIA documentation hub: https://docs.nvidia.com/
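The nvidia-smi sanity check in this section can be wrapped so that a machine without the driver gets a readable message instead of “command not found”. This is a sketch: the function name gpu_summary is hypothetical, while --query-gpu and --format=csv,noheader are standard nvidia-smi options.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print one line per GPU (name, driver version, total memory),
# or a readable message when the NVIDIA driver tools are not on PATH.
gpu_summary() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: the NVIDIA driver is probably not installed"
        return 1
    fi
    nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader
}

# "|| true" keeps a surrounding script running on machines without a GPU.
gpu_summary || true
```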
SSH and remote development
What it is
- SSH (OpenSSH) is a protocol and toolset to securely log into another machine and run commands.
- VS Code Remote SSH lets you use VS Code to edit files on that remote machine.
What it is (Simplified)
SSH is a secure hallway to another computer:
- you stay at your desk,
- but your commands run in the hospital server room where the slides and GPUs live.
Reasons a clinician would want to use it
- Run code close to where WSIs and databases live instead of copying terabytes to your laptop.
- Use your familiar editor (VS Code) while the work runs on a powerful remote server.
- This is how IT will usually give you access to institutional compute resources.
Quick installation / setup code
On most Linux/macOS systems, an SSH client is already installed.
On Ubuntu (if needed):

    sudo apt update
    sudo apt install -y openssh-client

On Arch:

    sudo pacman -Syu --noconfirm
    sudo pacman -S --noconfirm openssh

Basic usage:

    ssh yourname@your-hospital-server

For VS Code Remote SSH, install the “Remote - SSH” extension from the VS Code marketplace and follow its first-run prompts.
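Typing the full user and host each time gets tedious. An entry in ~/.ssh/config lets you connect with a short nickname, and such entries are also what “Remote - SSH” offers in its host picker. The sketch below only prints such an entry. The helper name ssh_host_block and the nickname gpu-box are hypothetical, and the key path assumes an ed25519 key at the default location.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print an ssh_config entry for one server.
# Usage: ssh_host_block <nickname> <hostname> <user>
ssh_host_block() {
    local nickname="$1" host="$2" user="$3"
    printf 'Host %s\n' "$nickname"
    printf '    HostName %s\n' "$host"
    printf '    User %s\n' "$user"
    # Assumes an ed25519 key at the default path; adjust if yours differs.
    printf '    IdentityFile ~/.ssh/id_ed25519\n'
}

# Prints the entry; review it, then append it to ~/.ssh/config yourself.
ssh_host_block gpu-box your-hospital-server yourname
```

With that entry in place, ssh gpu-box is equivalent to the longer command above.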
Official documentation
- OpenSSH project page: https://www.openssh.com/
- OpenSSH manual pages (e.g. ssh): https://man.openbsd.org/ssh
- Microsoft OpenSSH on Windows docs (if you ever need the server side there): https://learn.microsoft.com/windows-server/administration/openssh/openssh_overview
- VS Code Remote SSH documentation: https://code.visualstudio.com/docs/remote/ssh