This example demonstrates how to use devloop for a typical Python data science workflow, including Jupyter Lab development, model training, testing, and data preprocessing pipelines.
This project showcases a complete machine learning pipeline with:
- Data preprocessing that triggers when raw data changes
- Model training that triggers when source code or configs change
- Automated testing that runs when code is modified
- Jupyter Lab for interactive development and analysis
```
03-python-datascience/
├── .devloop.yaml              # Devloop configuration
├── requirements.txt           # Python dependencies
├── README.md                  # This file
├── Makefile                   # Build automation
├── src/                       # Source code
│   ├── train.py               # Model training script
│   └── data/
│       └── preprocess.py      # Data preprocessing pipeline
├── tests/                     # Test suite
│   ├── test_training.py       # Training pipeline tests
│   └── test_preprocessing.py  # Data preprocessing tests
├── notebooks/                 # Jupyter notebooks
│   ├── 01_data_exploration.ipynb
│   └── 02_model_evaluation.ipynb
├── configs/                   # Configuration files
│   └── model.yaml             # Model training configuration
├── data/                      # Data storage
│   ├── raw/                   # Raw data files
│   └── processed/             # Processed data files
├── models/                    # Trained models and metrics
└── logs/                      # Log files from devloop rules
```
- Python 3.8 or higher
- devloop installed (see the installation guide)
- Navigate to the example directory:

  ```bash
  cd examples/03-python-datascience
  ```

- Create a virtual environment (recommended):

  ```bash
  python -m venv venv  # or use python3
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```
Start devloop to watch for file changes and automatically manage your development workflow:

```bash
devloop -c .devloop.yaml
```

This will start four concurrent processes:

- Jupyter Lab (`jupyter` rule) - Interactive development environment
- Model Training (`train` rule) - Automatic retraining when code/config changes
- Tests (`test` rule) - Continuous testing when code changes
- Data Pipeline (`pipeline` rule) - Data preprocessing when raw data changes
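Each of these processes corresponds to one rule in `.devloop.yaml`. As a sketch using the same rule schema shown later in this README (the shipped config may differ in detail), the `pipeline` rule could look like:

```yaml
rules:
  - name: "Data Pipeline"
    prefix: "pipeline"
    watch:
      - action: "include"
        patterns:
          - "data/raw/**"
    commands:
      - "python src/data/preprocess.py"
```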
You can also run individual components manually:

```bash
# Run data preprocessing
python src/data/preprocess.py

# Train model
python src/train.py --config configs/model.yaml

# Run tests
pytest tests/ -v

# Start Jupyter Lab
jupyter lab --no-browser --port=8888
```

When you first run devloop, it will:
- Generate sample datasets in `data/raw/`
- Process the data and save it to `data/processed/`
- Train an initial model and save it to `models/`
- Start Jupyter Lab on port 8888
- Run the test suite
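To make the preprocessing step concrete, here is a minimal stdlib-only sketch of what a script like `src/data/preprocess.py` might do; the example's actual script is more involved, and the cleaning rule below (dropping incomplete rows) is purely illustrative:

```python
import csv
from pathlib import Path

RAW_DIR = Path("data/raw")
PROCESSED_DIR = Path("data/processed")

def preprocess(raw_dir=RAW_DIR, processed_dir=PROCESSED_DIR):
    """Copy each raw CSV to the processed directory, dropping incomplete rows."""
    processed_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(raw_dir.glob("*.csv")):
        with src.open(newline="") as f:
            rows = list(csv.DictReader(f))
        # Illustrative cleaning step: drop rows with any empty field.
        rows = [row for row in rows if all(value != "" for value in row.values())]
        if not rows:
            continue
        dst = processed_dir / src.name
        with dst.open("w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
        written.append(dst)
    return written

if __name__ == "__main__" and RAW_DIR.exists():
    for path in preprocess():
        print(f"[pipeline] wrote {path}")
```

Because the script is idempotent and reads only from `data/raw/`, devloop can safely re-run it on every raw-data change.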
Modify Python source files (in `src/`):
- Triggers model retraining
- Runs test suite
- Updates Jupyter Lab environment
Modify configuration (in `configs/`):
- Triggers model retraining with new parameters
- Saves new model and metrics
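When the configuration changes, the training script rereads its parameters. One way a script like `src/train.py` might guard against typos in the config is to mirror the expected keys in a small dataclass; this helper is hypothetical and not part of the example:

```python
from dataclasses import dataclass, fields

@dataclass
class ModelParams:
    """Mirrors the model_params block of configs/model.yaml."""
    n_estimators: int = 100
    max_depth: int = 10
    random_state: int = 42

    @classmethod
    def from_dict(cls, raw):
        # Reject keys the training code does not understand, so a typo
        # in the config fails loudly instead of being silently ignored.
        known = {f.name for f in fields(cls)}
        unknown = set(raw) - known
        if unknown:
            raise ValueError(f"Unknown model_params: {sorted(unknown)}")
        return cls(**raw)
```

With this in place, a misspelled key such as `n_trees` raises immediately on the retrain that devloop triggers, rather than training with defaults.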
Modify raw data (in `data/raw/`):
- Triggers data preprocessing pipeline
- Updates processed datasets
- May trigger model retraining if using processed data
Modify Jupyter notebooks (in `notebooks/`):
- Restarts Jupyter Lab server
- Preserves notebook state and outputs
Each rule outputs logs with prefixes for easy identification:
- `[jupyter]` - Jupyter Lab server logs
- `[train]` - Model training progress and metrics
- `[test]` - Test execution results
- `[pipeline]` - Data preprocessing status
```yaml
watch:
  - action: "include"
    patterns:
      - "src/**/*.py"        # All Python files in src/
      - "configs/**/*.yaml"  # All YAML configs
  - action: "exclude"
    patterns:
      - "**/__pycache__/**"  # Ignore Python cache
      - "**/*.pyc"           # Ignore compiled Python files
```

```yaml
commands:
  - "echo 'Starting model training...'"
  - "python src/train.py --config configs/model.yaml"
workdir: "."  # Run commands from project root
```

```yaml
settings:
  prefix_logs: true
  prefix_max_length: 10
```

To add a new development task, edit `.devloop.yaml`:
```yaml
rules:
  - name: "Code Formatting"
    prefix: "format"
    watch:
      - action: "include"
        patterns:
          - "src/**/*.py"
    commands:
      - "black src/"
      - "flake8 src/"
```

Edit `configs/model.yaml` to change model training parameters:
```yaml
model_params:
  n_estimators: 200  # Increase the number of trees
  max_depth: 15      # Allow deeper trees
  random_state: 42   # Keep results reproducible
```

Add new packages to `requirements.txt` and reinstall:
```bash
echo "xgboost>=1.6.0" >> requirements.txt
pip install -r requirements.txt
```

- Start devloop: `devloop -c .devloop.yaml`
- Open Jupyter Lab: http://localhost:8888
- Open `notebooks/01_data_exploration.ipynb`
- Modify the notebook - Jupyter will restart automatically
- Changes to `src/` files are reflected immediately in notebooks
- Edit `src/train.py` to modify the training pipeline
- Save the file - devloop automatically retrains the model
- Check `models/metrics.yaml` for updated performance metrics
- View results in `notebooks/02_model_evaluation.ipynb`
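The exact layout of `models/metrics.yaml` is up to the training script; a stdlib-only sketch that writes a flat metrics dict as simple `key: value` lines (an assumed format, not necessarily what the example produces) could be:

```python
from pathlib import Path

def save_metrics(metrics, path="models/metrics.yaml"):
    """Write a flat metrics dict as `key: value` lines, creating parent dirs."""
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    lines = [f"{key}: {value}" for key, value in sorted(metrics.items())]
    out.write_text("\n".join(lines) + "\n")
    return out
```

Writing metrics to a file rather than stdout is what lets the evaluation notebook pick up fresh numbers after every devloop-triggered retrain.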
- Write new tests in `tests/`
- Modify source code in `src/`
- Tests run automatically on every change
- Fix failures and see immediate feedback
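A new test in `tests/` can be as small as a plain pytest-style function; the helper below is illustrative and not a function the example actually ships:

```python
# tests/test_preprocessing.py (sketch)

def drop_incomplete(rows):
    """Illustrative helper: keep only rows with no empty or missing values."""
    return [row for row in rows
            if all(value not in ("", None) for value in row.values())]

def test_drop_incomplete_removes_rows_with_missing_values():
    rows = [{"x": 1, "y": 2}, {"x": 3, "y": ""}]
    assert drop_incomplete(rows) == [{"x": 1, "y": 2}]
```

The moment you save a file like this, devloop's `test` rule reruns `pytest tests/ -v` and the new assertion appears in the `[test]` log output.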
- Add new raw data files to `data/raw/`
- Preprocessing runs automatically
- Check `data/processed/` for updated datasets
- Model retraining may trigger if using processed data
This example works great with various development environments:
- Install the Python extension
- Use the integrated terminal to run `devloop`
- Edit files normally - devloop handles the rest
- Open project directory
- Use the terminal to run `devloop`
- Leverage PyCharm's debugging alongside the running processes
- Run devloop in a tmux/screen session
- Edit files as usual
- Check devloop output for immediate feedback
- Exclude unnecessary files from watching:

  ```yaml
  - action: "exclude"
    patterns:
      - "models/**"  # Don't watch model outputs
      - "logs/**"    # Don't watch log files
      - ".git/**"    # Don't watch git files
  ```

- Use specific patterns instead of `**/*`:

  ```yaml
  - action: "include"
    patterns:
      - "src/**/*.py"  # Only Python files in src/
  ```

- Optimize Jupyter startup for faster restarts:

  ```bash
  jupyter lab --no-browser --port=8888 --NotebookApp.token=''
  ```
Port 8888 already in use:

```bash
# Find and kill existing Jupyter processes
lsof -ti:8888 | xargs kill -9

# Or use a different port in .devloop.yaml
```

Module import errors:
- Ensure the virtual environment is activated
- Check that `src/` is on the Python path
- Verify all dependencies are installed
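If imports from `src/` fail under pytest, one common stdlib-only fix is a `conftest.py` at the project root that prepends `src/` to `sys.path`; this is a sketch (an editable install of a proper package is the more robust long-term option):

```python
# conftest.py (project root)
import sys
from pathlib import Path

# Make `import train` and friends resolvable from tests/ and notebooks.
SRC = Path(__file__).resolve().parent / "src"
if str(SRC) not in sys.path:
    sys.path.insert(0, str(SRC))
```

pytest imports a root `conftest.py` before collecting tests, so this takes effect for every run devloop triggers.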
Model training fails:
- Check that `configs/model.yaml` exists
- Verify data files are present
- Review training logs for specific errors
Tests fail:
- Run tests manually first: `pytest tests/ -v`
- Check that test dependencies are installed
- Ensure test data is available
- Check devloop logs for error messages
- Run individual commands manually to isolate issues
- Verify file permissions and paths
- Check Python environment and dependencies
After exploring this example:
- Customize for your project: Adapt the structure and configuration
- Add more rules: Include linting, documentation generation, etc.
- Scale up: Use devloop's agent/gateway mode for multi-project setups
- Integrate CI/CD: Use similar patterns in your deployment pipeline
- Full-Stack Web Application - Multi-language development
- Microservices - Distributed development with gateway mode
- Docker Integration - Container-based development