315 changes: 315 additions & 0 deletions src/python/examples/segmentation_workbook/cpsam_segmentation.ipynb
@@ -0,0 +1,315 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "07ed918f",
"metadata": {},
"source": [
"# CellposeSAM segmentation across an AVITI24 cytoprofiling run\n",
"\n",
"Run [CellposeSAM (CPSAM)](https://github.com/mouseland/cellpose) across every tile in a Teton or Teton Atlas run and write the segmentation masks Cells2Stats expects. CPSAM is a general-purpose model that integrates the [Segment Anything Model](https://segment-anything.com/) architecture; use it when your cell type is not represented in the Element Biosciences model library or when the General Element Biosciences model produces poor results even after diameter tuning.\n",
"\n",
"This notebook is the companion to the [Custom segmentation tutorial](https://docs.elembio.io/docs/tutorials/cytoprofiling/custom-segmentation/). Read the tutorial first for the full context: run-type identification, when to use CPSAM, and the post-segmentation Cells2Stats re-run."
]
},
{
"cell_type": "markdown",
"id": "6fd24dd9",
"metadata": {},
"source": [
"## Prerequisites\n",
"\n",
"Confirm the following before running this notebook:\n",
"\n",
"- Your run is a **Teton** or **Teton Atlas** run. CPSAM requires the actin channel and fails on Cell Paint only runs because no actin `.tif` file exists. For Cell Paint only runs, use an Element Biosciences 2-channel model and follow [Run a tile evaluation](https://docs.elembio.io/docs/tutorials/cytoprofiling/custom-segmentation/#tile-evaluation) and [Run full segmentation](https://docs.elembio.io/docs/tutorials/cytoprofiling/custom-segmentation/#full-segmentation).\n",
"- You created the separate `cpsam` Python environment with Cellpose 4.x installed from the MouseLand GitHub HEAD. See [Set up the CPSAM environment](https://docs.elembio.io/docs/tutorials/cytoprofiling/custom-segmentation/#cpsam-setup-env). Do not install Cellpose 4.x into your `cytoprofiling-seg` environment.\n",
"- The `cpsam` environment is selected as this notebook's kernel (top menu: **Kernel → Change kernel → CellposeSAM**).\n",
"- A GPU is available. CPSAM on CPU is prohibitively slow for full-run processing.\n",
"\n",
"> **First run downloads ~1.15 GB.** The first time you initialize the CPSAM model, Cellpose automatically downloads the model weights (~1.15 GB) from HuggingFace to `~/.cellpose/models/`. Subsequent runs use the cached weights and do not require an internet connection."
]
},
{
"cell_type": "markdown",
"id": "06e16ef5",
"metadata": {},
"source": [
"## Step 1 — Import packages\n",
"\n",
"Load the imaging, numerics, and Cellpose packages used throughout the rest of the notebook."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "48e24551",
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"import os\n",
"\n",
"import numpy as np\n",
"import skimage.io\n",
"from cellpose import core, models, transforms\n"
]
},
{
"cell_type": "markdown",
"id": "9ce6d73e",
"metadata": {},
"source": [
"## Step 2 — Provide input and output paths\n",
"\n",
"Set the two required paths. Use a fresh `output_location` per re-segmentation pass so CPSAM masks do not overwrite Element Biosciences masks."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "24ab308f",
"metadata": {},
"outputs": [],
"source": [
"# Edit both paths before running the rest of the notebook.\n",
"\n",
"# Path to your AVITI24 run output folder\n",
"run_directory = r\"/path/to/your/Run/Output/Folder\"\n",
"\n",
"# Where to write the CPSAM segmentation mask outputs (must not be a folder that already holds masks)\n",
"output_location = r\"/path/to/your/Run/Output/Folder/Segmentation_Output\""
]
},
{
"cell_type": "markdown",
"id": "65c4954e",
"metadata": {},
"source": [
"## Step 3 — Confirm GPU and load CPSAM\n",
"\n",
"Verify that a GPU is available, then load the CPSAM model. The first time this cell runs, Cellpose downloads ~1.15 GB of model weights from HuggingFace to `~/.cellpose/models/`. Subsequent runs use the cached weights."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b567dd74",
"metadata": {},
"outputs": [],
"source": [
"use_gpu = core.use_gpu()\n",
"if not use_gpu:\n",
"    print(\"WARNING: No GPU detected. Runtime will be very long (8+ hours for 12-well).\")\n",
"    print(\"Consider running on a GPU-equipped machine for production use.\")\n",
"else:\n",
"    print(\"GPU confirmed. Proceeding with CPSAM segmentation.\")\n",
"\n",
"# Load model. Downloads on first run (~1.15 GB). Runs on CPU when no GPU was detected.\n",
"model = models.CellposeModel(gpu=use_gpu)\n"
]
},
{
"cell_type": "markdown",
"id": "32c8be23",
"metadata": {},
"source": [
"## Step 4 — Define the normalization helper\n",
"\n",
"`normalize_image` applies Cellpose's per-region normalization across 1824-pixel sub-tiles, matching the preprocessing used by the Element Biosciences segmentation workflow."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "8b14c6cf",
"metadata": {},
"outputs": [],
"source": [
"def normalize_image(image, region_size=1824):\n",
"    \"\"\"Normalize each full region_size x region_size block of a 2D image independently.\"\"\"\n",
"    # Pixels beyond the last full block along either axis are left at 0.\n",
"    image_norm = np.zeros_like(image, np.single)\n",
"\n",
"    for xi in range(image.shape[1] // region_size):\n",
"        for yi in range(image.shape[0] // region_size):\n",
"            rows = slice(yi * region_size, (yi + 1) * region_size)\n",
"            cols = slice(xi * region_size, (xi + 1) * region_size)\n",
"            cropped = image[rows, cols]\n",
"            # normalize_img expects a trailing channel axis, so add one and drop it again.\n",
"            cropped = transforms.normalize_img(\n",
"                cropped.reshape(cropped.shape[0], cropped.shape[1], 1)\n",
"            ).reshape(cropped.shape[0], cropped.shape[1])\n",
"            image_norm[rows, cols] = cropped\n",
"\n",
"    return image_norm\n"
]
},
{
"cell_type": "markdown",
"id": "18fa08aa",
"metadata": {},
"source": [
"## Step 5 — Build the tile list from `RunParameters.json`\n",
"\n",
"Read `RunParameters.json` to enumerate every well and tile in the run and build the `tile2well` map used by the segmentation loop. The cell prints the total tile count so you can confirm the workload before committing to Step 6."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d4e628cb",
"metadata": {},
"outputs": [],
"source": [
"with open(os.path.join(run_directory, \"RunParameters.json\")) as f:\n",
"    run_parameters = json.load(f)\n",
"\n",
"tile2well, tiles = {}, []\n",
"for well in run_parameters[\"Wells\"]:\n",
"    for tile in well[\"Tiles\"]:\n",
"        tile2well[tile[\"Name\"]] = well[\"WellLocation\"]\n",
"        tiles.append(tile[\"Name\"])\n",
"\n",
"print(f\"Total tiles to process: {len(tiles)}\")\n"
]
},
{
"cell_type": "markdown",
"id": "39ec5de5",
"metadata": {},
"source": [
"## Step 6 — Segment every tile and write masks\n",
"\n",
"Run CPSAM on each tile and write the cell and nuclear masks to `output_location/Well{well}/`. Unlike the Element Biosciences workflow, CPSAM uses a single model for every well, so no per-well model lookup is needed.\n",
"\n",
"To monitor progress, watch for the rolling `Done: ...` lines. Each line corresponds to one tile fully processed and saved. See the **Runtime expectations** table at the bottom of the notebook for typical wall-clock times."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "733015fe",
"metadata": {},
"outputs": [],
"source": [
"flow_threshold = 0.4\n",
"cellprob_threshold = 0.0\n",
"tile_norm_blocksize = 0\n",
"\n",
"print(f\"Beginning segmentation across {len(tiles)} tiles\")\n",
"\n",
"for tile in tiles:\n",
"    well = tile2well[tile]\n",
"    well_input = os.path.join(run_directory, \"Projection\", f\"Well{well}\")\n",
"    well_output = os.path.join(output_location, f\"Well{well}\")\n",
"    os.makedirs(well_output, exist_ok=True)\n",
"\n",
"    # Read the three projection images for this tile.\n",
"    cell_image = skimage.io.imread(os.path.join(well_input, f\"CP01_{tile}_Cell-Membrane.tif\"))\n",
"    nuclear_image = skimage.io.imread(os.path.join(well_input, f\"CP01_{tile}_Nucleus.tif\"))\n",
"    actin_image = skimage.io.imread(os.path.join(well_input, f\"CP01_{tile}_Actin.tif\"))\n",
"\n",
"    cell_image = normalize_image(cell_image)\n",
"    nuclear_image = normalize_image(nuclear_image)\n",
"    actin_image = normalize_image(actin_image)\n",
"\n",
"    # Stack membrane, nucleus, and actin into one 3-channel image for cell segmentation.\n",
"    composite = np.zeros((cell_image.shape[0], cell_image.shape[1], 3))\n",
"    composite[:, :, 0] = cell_image\n",
"    composite[:, :, 1] = nuclear_image\n",
"    composite[:, :, 2] = actin_image\n",
"\n",
"    print(f\"Segmenting cell membrane, tile: {tile}\")\n",
"    cell_mask, _, _ = model.eval(\n",
"        composite,\n",
"        batch_size=2,\n",
"        flow_threshold=flow_threshold,\n",
"        cellprob_threshold=cellprob_threshold,\n",
"        normalize={\"tile_norm_blocksize\": tile_norm_blocksize},\n",
"        resample=False,\n",
"    )\n",
"    cell_mask = cell_mask.astype(np.uint32)\n",
"    # The mask is saved as uint16, so stop rather than silently wrap labels above 65535.\n",
"    if cell_mask.max() > np.iinfo(np.uint16).max:\n",
"        raise ValueError(f\"Tile {tile}: cell labels exceed the uint16 range\")\n",
"\n",
"    print(f\"Segmenting nuclei, tile: {tile}\")\n",
"    nuclear_mask, _, _ = model.eval(\n",
"        nuclear_image,\n",
"        batch_size=2,\n",
"        flow_threshold=flow_threshold,\n",
"        cellprob_threshold=cellprob_threshold,\n",
"        normalize={\"tile_norm_blocksize\": tile_norm_blocksize},\n",
"        resample=False,\n",
"    )\n",
"    # Collapse the nuclear label mask to a binary mask: 1 where any nucleus is present.\n",
"    binary_nuclei = nuclear_mask.copy()\n",
"    binary_nuclei[nuclear_mask > 0] = 1\n",
"\n",
"    skimage.io.imsave(\n",
"        os.path.join(well_output, f\"{tile}_Cell.tif\"),\n",
"        cell_mask.astype(np.uint16),\n",
"    )\n",
"    skimage.io.imsave(\n",
"        os.path.join(well_output, f\"{tile}_Nuclear.tif\"),\n",
"        binary_nuclei.astype(np.uint8),\n",
"    )\n",
"    print(f\"Done: {tile}\")\n"
]
},
{
"cell_type": "markdown",
"id": "8dba8d14",
"metadata": {},
"source": [
"## Reference\n",
"\n",
"### Runtime expectations\n",
"\n",
"CPSAM processes every tile in the run. CPU runtimes are prohibitive; the table below assumes a GPU.\n",
"\n",
"| Plate format | Approximate tiles | GPU estimate |\n",
"| ------------ | ----------------- | ------------ |\n",
"| 1-well | ~18 tiles | ~30 minutes |\n",
"| 12-well | ~216 tiles | ~6–10 hours |\n",
"| 48-well | ~864 tiles | ~24–36 hours |\n",
"\n",
"### Output files\n",
"\n",
"For each tile, the loop writes two files to `output_location/Well{well}/`:\n",
"\n",
"- `{tile}_Cell.tif`: a `uint16` label mask where each unique nonzero integer represents one segmented cell and `0` is background.\n",
"- `{tile}_Nuclear.tif`: a `uint8` binary mask where `0` indicates no nucleus and `1` indicates a nucleus is present.\n",
"\n",
"### Validation\n",
"\n",
"CPSAM does not produce a built-in quality metrics table. Verify outputs visually or by comparing cell and nucleus counts against the baseline you established in [Interpret results and choose a model](https://docs.elembio.io/docs/tutorials/cytoprofiling/custom-segmentation/#interpret-results).\n",
"\n",
"### Third-party tool disclaimer\n",
"\n",
"CellposeSAM is provided by the [MouseLand open-source project](https://github.com/mouseland/cellpose) and is not affiliated with or endorsed by Element Biosciences. CPSAM has not been formally validated against AVITI24 cytoprofiling runs and results may vary. For CPSAM-specific issues, installation support, or model updates, refer to the [official MouseLand repository](https://github.com/mouseland/cellpose).\n",
"\n",
"### Next step\n",
"\n",
"After all tiles finish, re-run Cells2Stats with `--segmentation` pointing to `output_location` to regenerate the cell table. See [Re-run Cells2Stats for cell assignment](https://docs.elembio.io/docs/tutorials/cytoprofiling/custom-segmentation/#cell-assignment)."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "CellposeSAM",
"language": "python",
"name": "cpsam"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.2"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
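
Because the Step 6 loop can run for many hours, it may be worth confirming that both masks exist for every tile before re-running Cells2Stats. The sketch below is not part of the notebook: `find_missing_outputs` is a hypothetical helper. It assumes the `RunParameters.json` layout read in Step 5 (`Wells`, `WellLocation`, `Tiles`, `Name`) and the output file naming used in Step 6.

```python
import json
import os


def find_missing_outputs(run_directory, output_location):
    """Return (well, tile, filename) for every expected mask file that is absent.

    Hypothetical helper: assumes the RunParameters.json keys used in Step 5
    and the "{tile}_Cell.tif" / "{tile}_Nuclear.tif" naming used in Step 6.
    """
    with open(os.path.join(run_directory, "RunParameters.json")) as f:
        run_parameters = json.load(f)

    missing = []
    for well in run_parameters["Wells"]:
        well_name = well["WellLocation"]
        for tile in well["Tiles"]:
            tile_name = tile["Name"]
            # Step 6 writes one cell mask and one nuclear mask per tile.
            for suffix in ("Cell", "Nuclear"):
                filename = f"{tile_name}_{suffix}.tif"
                path = os.path.join(output_location, f"Well{well_name}", filename)
                if not os.path.isfile(path):
                    missing.append((well_name, tile_name, filename))
    return missing
```

Calling `find_missing_outputs(run_directory, output_location)` after Step 6 returns an empty list when every tile has both masks. A non-empty list names the tiles to re-segment. The check only tests that the files exist and does not inspect their contents.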