33 commits
5dc2f9c
update tool.ruff target-version to py311
fujikosu Mar 17, 2025
313d1c7
migrate notebook container to uv
fujikosu Mar 17, 2025
94e043d
migrate sample_cpu_project to uv
fujikosu Mar 17, 2025
e9a0261
fix comment in Dockerfile
fujikosu Mar 17, 2025
586536a
remove unused requirements.txt
fujikosu Mar 17, 2025
f8b9686
migrate gpu container to uv
fujikosu Mar 17, 2025
34ded4d
use uv in CI
fujikosu Mar 17, 2025
eb44d7a
add cuda version check in sample. switch base container from cuda to …
fujikosu Mar 18, 2025
2a2005c
update dockerfiles to use UID 1000
fujikosu Mar 26, 2025
205caae
add uv support to dependabot
fujikosu Mar 28, 2025
4e12d15
add uv related docs
fujikosu Mar 28, 2025
0da499b
add .DS_Store to gitignore
fujikosu Mar 28, 2025
befd97d
update docs to uv version
fujikosu Mar 28, 2025
c2ad412
tested uv install commands and updated readme
fujikosu Mar 28, 2025
9c46482
Merge branch 'main' into feat/migrate-to-uv-from-pip
fujikosu Apr 13, 2025
e927088
update uv lock procedure doc
fujikosu Apr 13, 2025
b24c6c7
update azdo pipelines
fujikosu Apr 13, 2025
246331d
fix permission issue for certifi
fujikosu Apr 13, 2025
8048cf8
permission fix
fujikosu Apr 13, 2025
bff3a98
get rid of cuda 12.6 setting to default back to 12.4
fujikosu Apr 13, 2025
0a73e70
updated lockfile
fujikosu Apr 13, 2025
7523729
update README to include more mentions about uv and PR template
fujikosu Apr 13, 2025
3e38a95
update Dockerfiles to change UV_LINK_MODE and make use of mounted cache
fujikosu Jul 19, 2025
a2b10fb
fix the mount path in gpu container
fujikosu Jul 19, 2025
5de5a74
Merge branch 'main' into feat/migrate-to-uv-from-pip
fujikosu Apr 6, 2026
812e2e0
build(docker): update Dockerfiles to allow non-root user package mana…
fujikosu Apr 6, 2026
5b9a57b
Update README and environment configuration for Azure ML
fujikosu Apr 8, 2026
d6e575b
build(deps): remove pip updates from Dependabot configuration
fujikosu Apr 8, 2026
c499913
build(pipeline): update README for correct uv command for removing ml…
fujikosu Apr 10, 2026
af5e904
build(docs): update README and devcontainer configurations for uv mig…
fujikosu Apr 16, 2026
5cd11bb
build(docs): update README and Docker configurations for UV integration
fujikosu Apr 16, 2026
86ac406
build(docker): remove .dockerignore file for UV migration
fujikosu Apr 16, 2026
8a4a6b4
build(docs): add Copilot instructions and update Python conventions f…
fujikosu Apr 16, 2026
13 changes: 5 additions & 8 deletions .azuredevops/ado-ci-pipeline-ms-hosted.yml
@@ -23,17 +23,14 @@ steps:
inputs:
versionSpec: 3.11

# Using pip here instead of uv/uvx because Azure DevOps doesn't have native uv support
# (unlike GitHub Actions which has astral-sh/setup-uv), and installing uv just to run
# these tools would add unnecessary overhead.
- script: |
python -m venv venv
source venv/bin/activate
python -m pip install --upgrade pip
pip install -r requirements-dev.txt
pip install pytest-azurepipelines
displayName: "Install requirements"
pip install ruff pytest-azurepipelines
displayName: "Install ruff and pytest-azurepipelines"

# files under venv will be automatically excluded from ruff check by default https://docs.astral.sh/ruff/settings/#exclude
- bash: |
source venv/bin/activate
ruff check --output-format azure
displayName: "Run ruff linter"

13 changes: 5 additions & 8 deletions .azuredevops/ado-ci-pipeline-self-hosted.yml
@@ -35,23 +35,20 @@ steps:
inputs:
versionSpec: 3.11

# Using pip here instead of uv/uvx because Azure DevOps doesn't have native uv support
# (unlike GitHub Actions which has astral-sh/setup-uv), and installing uv just to run
# these tools would add unnecessary overhead.
- script: |
python -m venv venv
source venv/bin/activate
python -m pip install --upgrade pip
pip install -r requirements-dev.txt
pip install pytest-azurepipelines
displayName: "Install requirements"
pip install ruff pytest-azurepipelines
displayName: "Install ruff and pytest-azurepipelines"

- task: UseDotNet@2
inputs:
packageType: 'sdk'
workingDirectory: "src/"
version: '6.x'

# files under venv will be automatically excluded from ruff check by default https://docs.astral.sh/ruff/settings/#exclude
- bash: |
source venv/bin/activate
ruff check --output-format azure
displayName: "Run ruff linter"

2 changes: 1 addition & 1 deletion .azuredevops/pull_request_template.md
@@ -12,7 +12,7 @@

* [ ] No PII in logs or output
* [ ] Made corresponding changes to the documentation
* [ ] All new packages used are included in requirements.txt
* [ ] All new packages used are included in pyproject.toml
* [ ] Functions use type hints, and there are no type hint errors

## Pull Request Type
17 changes: 0 additions & 17 deletions .dockerignore

This file was deleted.

6 changes: 6 additions & 0 deletions .github/copilot-instructions.md
@@ -0,0 +1,6 @@
# Copilot Instructions

## Python Conventions

- Use Google-style docstrings for all Python functions, classes, and modules.
- Use `uv` for dependency management.
4 changes: 2 additions & 2 deletions .github/dependabot.yml
@@ -5,13 +5,13 @@

version: 2
updates:
- package-ecosystem: "pip"
- package-ecosystem: "uv"
directories:
- "**/*"
schedule:
interval: "monthly"
groups:
pip-minor-patch-updates:
uv-minor-patch-updates:
applies-to: version-updates
update-types:
- "minor"
2 changes: 1 addition & 1 deletion .github/pull_request_template.md
@@ -12,7 +12,7 @@

* [ ] No PII in logs or output
* [ ] Made corresponding changes to the documentation
* [ ] All new packages used are included in requirements.txt
* [ ] All new packages used are included in pyproject.toml
* [ ] Functions use type hints, and there are no type hint errors

## Pull Request Type
17 changes: 3 additions & 14 deletions .github/workflows/ci.yaml
@@ -21,23 +21,12 @@ jobs:
- name: Checkout code
uses: actions/checkout@v6

- name: Setup Python 3.11
uses: actions/setup-python@v6
with:
python-version: 3.11

- name: Install requirements
run: |
python -m venv venv
source venv/bin/activate
python -m pip install --upgrade pip
pip install -r requirements-dev.txt
- name: Install uv
uses: astral-sh/setup-uv@v5

- name: Run ruff linter
# files under venv will be automatically excluded from ruff check by default https://docs.astral.sh/ruff/settings/#exclude
run: |
source venv/bin/activate
ruff check --output-format github
uvx ruff check --output-format github

- name: Run pytest in docker containers
run: ./ci-tests.sh
1 change: 1 addition & 0 deletions .gitignore
@@ -1,6 +1,7 @@
data
logs
.vscode
.DS_Store

# Byte-compiled / optimized / DLL files
__pycache__/
43 changes: 31 additions & 12 deletions README.md
@@ -1,6 +1,8 @@
# Dev Containers for ML feasibility study with VS Code

A machine learning and data science project template that makes it easy to work with multiple Docker based [VSCode Dev Containers](https://code.visualstudio.com/docs/devcontainers/containers) in the same repository. The template also makes it easy to transition projects to the cloud and production by including automated code quality checks, pytest configuration, CI pipeline templates and a sample for running on Azure Machine Learning.
[![CI](https://github.com/microsoft/dstoolkit-devcontainers/actions/workflows/ci.yaml/badge.svg?branch=main)](https://github.com/microsoft/dstoolkit-devcontainers/actions/workflows/ci.yaml)

A machine learning and data science project template that makes it easy to work with multiple Docker based [VSCode Dev Containers](https://code.visualstudio.com/docs/devcontainers/containers) in the same repository. The template leverages [uv](https://github.com/astral-sh/uv), an extremely fast Python package and project manager as a base for better productivity. The template also makes it easy to transition projects to the cloud and production by including automated code quality checks, pytest configuration, CI pipeline templates and a sample for running on Azure Machine Learning.

## Contents

@@ -11,6 +13,7 @@ A machine learning and data science project template that makes it easy to work
- [Getting Started](#getting-started)
- [How to setup dev environment?](#how-to-setup-dev-environment)
- [How to create a new directory under src with a new environment](#how-to-create-a-new-directory-under-src-with-a-new-environment)
- [How to update python packages in the dev container](#how-to-update-python-packages-in-the-dev-container)
- [Directory Structure](#directory-structure)
- [`notebooks` directory vs `src` directory](#notebooks-directory-vs-src-directory)
- [AML Example](#aml-example)
@@ -28,7 +31,7 @@ A machine learning and data science project template that makes it easy to work

This repository provides a [VSCode Dev Container](https://code.visualstudio.com/docs/devcontainers/containers) based project template that can help accelerate your Machine Learning inner-loop development phase. The template covers the phases from early ML experimentation (local training/testing) until production oriented ML model training (cloud based training/testing with bigger CPUs and GPUs).

During the early phase of Machine Learning project, you may face challenges such as each data scientist creating various different python environments that span across CPU and GPU that tend to have different setup procedures. With the power of Dev Containers, you can automate environment setup process across the team and every data scientist will get the exact same environment automatically. This template provides both CPU and GPU Dev Container setup as examples. To support multiple different ML approaches with different python environments to be experimented in one project, this solution allows multiple different Dev Containers to be used in one repository while having a "common" module that will be installed into all Dev Container to enable code reuse across different Dev Containers.
During the early phase of Machine Learning project, you may face challenges such as each data scientist creating various different python environments that span across CPU and GPU that tend to have different setup procedures. With the power of Dev Containers, you can automate environment setup process across the team and every data scientist will get the exact same environment automatically. This template provides both CPU and GPU Dev Container setup as examples. To support multiple different ML approaches with different python environments to be experimented in one project, this solution allows multiple different Dev Containers to be used in one repository.

Another challenge you may face is each data scientist creating a low quality codebase. That is fine during the experimentation stage to keep the team agility high and maximize your team’s experimentation throughput. But when you move to the model productionization stage, you experience the burden of bringing code quality up to production level. With the power of python tools and VSCode extensions configured for this template on top of Dev Containers, you can keep the code quality high automatically without losing your team’s agility and experimentation throughput and ease the transition to the productionization phase.

@@ -57,15 +60,31 @@ This section provides a comprehensive guide on how to set up a development envir
1. Run `Dev Containers: Open Folder in Container...` from the Command Palette (F1) and select the `notebooks` directory.
1. VS Code will then build and start up a container, connect this window to Dev Container: `notebooks`, and install VS Code extensions specified in `notebooks/.devcontainer/devcontainer.json`. `pre-commit install --overwrite` runs as part of `postCreateCommand` in `devcontainer.json` and this will setup your git precommit hook automatically.
1. Setup is now done. If you want to develop in another directory, for example under `src`, run `Dev Containers: Open Folder in Container...` and select the directory that has a `.devcontainer`; this will set up a dev environment for that directory.
1. When you or others update either `requirements.txt` or `Dockerfile` in your working directory, make sure to rebuild your container to apply those changes to container. Run `Dev Containers: Rebuild and Reopen in Container...` for that.
1. When you or others update either `pyproject.toml` or `Dockerfile` in your working directory, make sure to rebuild your container to apply those changes to container. Run `Dev Containers: Rebuild and Reopen in Container...` for that.

## How to create a new directory under src with a new environment

1. Copy `src/sample_cpu_project/` under `src` and rename it. If you need gpu environment, base off of `src/sample_pytorch_gpu_project` instead
1. Update `COPY sample_cpu_project/.devcontainer/requirements.txt` in `Dockerfile` with a new path
1. Update other parts of `Dockerfile` if you need
1. Update `requirements.txt` if you need
1. Update `Dockerfile` if you need
1. Run `Dev Containers: Open Folder in Container...` from the Command Palette (F1) and select the new directory and make sure you can successfully open the new directory on VS Code running in a container
1. If you need to update python packages, stay inside DevContainer you just built and follow the steps below
1. Update `.devcontainer/pyproject.toml` and add/remove new python packages you need in `project.dependencies` section
1. Run `uv lock` to update the project's lockfile `.devcontainer/uv.lock` with the updated python packages. `UV_PROJECT` is already set automatically via `remoteEnv` in `devcontainer.json` so you don't need to manually specify the project path
1. Rerun `Dev Containers: Open Folder in Container...` from the Command Palette (F1) and select the new directory and make sure you can successfully open the new directory on VS Code running in a container
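As a rough sketch of the `project.dependencies` section the steps above refer to, a `.devcontainer/pyproject.toml` might look like this (the project name, Python version, and packages below are placeholders, not taken from this template):

```toml
# Hypothetical pyproject.toml excerpt; adjust the name, requires-python,
# and dependency list to your project.
[project]
name = "my-new-project"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = [
    "pandas>=2.2",
    "scikit-learn>=1.4",
]
```

After editing, `uv lock` resolves these version ranges and pins exact versions in `.devcontainer/uv.lock`.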

## How to update python packages in the dev container

This solution uses [uv](https://docs.astral.sh/uv) to manage python packages in the dev container. `uv` is a fast and efficient Python package and project manager that simplifies dependency management and ensures consistency across environments. It creates a lock file (`uv.lock`) listing every python package and the exact version installed in the dev container, so the same versions of the packages and dependencies are installed in every dev container build.

To manage Python packages within your active dev container, execute the following commands according to your needs:

- To add a package: `uv add requests`
- To add a specific version of a package: `uv add 'requests==2.31.0'`
- To remove a package: `uv remove requests`
- To upgrade a package: `uv lock --upgrade-package requests`

These commands update both `pyproject.toml` and `uv.lock` files automatically.
For more details, see [the official uv documentation on managing dependencies](https://docs.astral.sh/uv/guides/projects/#managing-dependencies).

## Directory Structure

@@ -83,17 +102,17 @@ This section gives you overview of the directory structure of this template. Onl
│ ├── .devcontainer # dev container related configuration files goes to here following VSCode convention
│ │ ├── devcontainer.json # dev container configuration and VS Code settings, extensions etc.
│ │ ├── Dockerfile # referred in devcontainer.json
│ │ └── requirements.txt # includes python package list for notebooks. used in Dockerfile
│ │ ├── pyproject.toml # includes python package list for notebooks. used in Dockerfile
│ │ └── uv.lock # lock file for python packages. used in Dockerfile
│ └── sample_notebook.py # example of interactive python script
├── pyproject.toml # Setting file for ruff, pytest and pytest-cov
└── src
├── common # this module is accessible from all modules under src. put functions you want to import across the projects here
│ └── requirements.txt # python package list for common module. installed in all Dockerfile under src. python tools for src goes to here too
├── sample_cpu_project # cpu project example. Setup process is covered in Section: How to setup dev environment?
│ ├── .devcontainer # dev container related configuration files goes to here following VSCode convention
│ │ ├── devcontainer.json # dev container configuration and VS Code settings, extensions etc.
│ │ ├── Dockerfile # referred in devcontainer.json. Supports only CPU
│ │ └── requirements.txt # includes python package list for sample_cpu_project. used in Dockerfile
│ │ ├── pyproject.toml # includes python package list for sample_cpu_project. used in Dockerfile
│ │ └── uv.lock # lock file for python packages. used in Dockerfile
│ ├── sample_main.py
│ └── tests # pytest scripts for sample_cpu_project goes here
│ └── test_dummy.py # pytest script example
@@ -102,7 +121,8 @@ This section gives you overview of the directory structure of this template. Onl
├── .devcontainer # dev container related configuration files goes to here following VSCode convention
│ ├── devcontainer.json # dev container configuration and VS Code settings, extensions etc.
│ ├── Dockerfile # referred in devcontainer.json. Supports GPU
│ └── requirements.txt # includes python package list for sample_pytorch_gpu_project. used in Dockerfile
│ ├── pyproject.toml # includes python package list for sample_pytorch_gpu_project. used in Dockerfile
│ └── uv.lock # lock file for python packages. used in Dockerfile
├── aml_example/ # Sample AML CLI v2 Components-based pipeline, including setup YAML. See sample_pytorch_gpu_project/README for full details of files in this directory.
├── sample_main.py
├── inference.py # Example pytorch inference/eval script that also works with aml_example
@@ -205,7 +225,6 @@ ssh-add
## Future Roadmap

- Add Docker build caching to Azure DevOps MS hosted CI pipeline
- Investigate making `src/common` installed with `pip -e`

## Contributing

2 changes: 1 addition & 1 deletion ci-tests.sh
@@ -19,7 +19,7 @@ for test_dir_parent in $(find "${repo_root}/src" -type d -name 'tests' -exec dir
count_test_py_files=$(find "${repo_root}/src/${test_dir_parent}/tests"/*.py 2>/dev/null | wc -l)
if [ $count_test_py_files != 0 ]; then
# Use the devcontainer Dockerfile to build a Docker image for the module to run tests
docker build "${repo_root}" -f "${repo_root}/src/${test_dir_parent}/.devcontainer/Dockerfile" -t "${test_dir_parent}"
docker build "${repo_root}/src/${test_dir_parent}/.devcontainer" -t "${test_dir_parent}"

echo "Running tests for ${test_dir_parent}, found ${count_test_py_files} test files"

35 changes: 21 additions & 14 deletions notebooks/.devcontainer/Dockerfile
@@ -1,4 +1,7 @@
FROM python:3.11.15
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/


# create non-root user and set the default user
ARG USERNAME=devuser
ARG USER_UID=1000
@@ -12,20 +12,24 @@ RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
apt-get update \
&& apt-get install -y sudo \
&& echo $USERNAME ALL=\(root\) NOPASSWD:ALL > /etc/sudoers.d/$USERNAME \
&& chmod 0440 /etc/sudoers.d/$USERNAME
USER $USERNAME
&& chmod 0440 /etc/sudoers.d/$USERNAME \
&& rm -rf /var/lib/apt/lists/*

# Make pip-installed tools accessible
ENV PATH=$PATH:/home/$USERNAME/.local/bin
# it tends to improve startup time (at the cost of increased installation time)
ENV UV_COMPILE_BYTECODE=1
# This prevents uv from creating venv but instead makes it use the system python in container
ENV UV_PROJECT_ENVIRONMENT=/usr/local

# Install development tools (linters, formatters, test runners, etc.)
RUN --mount=type=cache,target=/root/.cache/pip \
pip install --cache-dir=/root/.cache/pip pip --upgrade
COPY requirements-dev.txt .
RUN --mount=type=cache,target=/root/.cache/pip \
pip install --cache-dir=/root/.cache/pip -r requirements-dev.txt
# Changing the default UV_LINK_MODE silences warnings about not being able to use hard links since the cache and sync target are on separate file systems
ENV UV_LINK_MODE=copy
# Install dependencies using bind mounts instead of COPY to avoid extra layers
# This is the recommended approach by uv: https://docs.astral.sh/uv/guides/integration/docker/#installing-a-project
RUN --mount=type=cache,target=/root/.cache/uv \
--mount=type=bind,source=uv.lock,target=uv.lock \
--mount=type=bind,source=pyproject.toml,target=pyproject.toml \
uv sync --locked

# Install notebooks related dependencies
COPY notebooks/.devcontainer/requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip \
pip install --cache-dir=/root/.cache/pip -r requirements.txt
# Allow devuser to manage packages at runtime without sudo (e.g. uv add)
RUN chown -R $USERNAME:$USERNAME /usr/local
USER $USERNAME
ENV PATH=$PATH:/home/$USERNAME/.local/bin
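The README mentions that `UV_PROJECT` is set automatically via `remoteEnv` in `devcontainer.json`; a minimal excerpt might look like the following (key names follow the Dev Containers spec, but the exact values used in this template may differ):

```json
{
  "name": "notebooks",
  "build": { "dockerfile": "Dockerfile" },
  "remoteEnv": {
    "UV_PROJECT": "${containerWorkspaceFolder}/.devcontainer"
  }
}
```

With `UV_PROJECT` pointing at the directory holding `pyproject.toml` and `uv.lock`, commands like `uv add` and `uv lock` work from anywhere in the workspace without specifying a project path.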