Quick setup for Domino Model Monitor with automated prediction capture and ground truth tracking.
In addition to the setup scripts, the example folder provides a complete working example with a trained Random Forest model and a deployment script ready for monitoring setup. The app-model-monitor-extension folder contains a Streamlit dashboard for visualizing model monitoring metrics and comparing models side by side. See the INSTRUCTIONS.md file in each folder for detailed usage instructions.
- Access to create or use a Domino data source
- Running in a Domino workspace (for automatic API key access)
Note: DOMINO_USER_API_KEY is automatically available in Domino workspaces - no manual configuration needed.
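If you want to confirm the key is available before running the scripts, here is a minimal check (the variable name comes from the note above; the check itself is only illustrative):

```python
import os

# DOMINO_USER_API_KEY is injected automatically inside Domino workspaces
api_key = os.environ.get("DOMINO_USER_API_KEY")
if not api_key:
    raise RuntimeError("DOMINO_USER_API_KEY not set - run this inside a Domino workspace or export it manually")
```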
Register your training data as a baseline for drift detection. Note that the training set must have a unique name across your Domino instance.
```bash
python 1_register_training_set.py --file /path/to/training.csv --name "My Training Set"
```

Note: Training set names are automatically cleaned (spaces → underscores) and made unique with your username suffix.
The script accepts:
- CSV files (`.csv`)
- Parquet files (`.parquet`, `.pq`)
- Any pandas-readable format
Note: Feature names in your training data should match the features captured during prediction.
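As a quick sanity check, you can compare the training file's columns against the feature names you plan to capture at prediction time. A minimal sketch (the file path and feature list are placeholders for your own values):

```python
import pandas as pd

# Placeholders - substitute your own training file and feature names
feature_names = ['feature_1', 'feature_2', 'feature_3']
df = pd.read_csv('/path/to/training.csv')

missing = [f for f in feature_names if f not in df.columns]
if missing:
    print(f"Training data is missing expected feature columns: {missing}")
```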
If you need to export a registered Domino model:

```bash
python 2_optional_export_model.py --metric accuracy
python 2_optional_export_model.py --run-id YOUR_RUN_ID
```

Modify your model deployment script to capture predictions. See prediction_capture_example.py for reference.
Basic Setup:

```python
from domino_data_capture.data_capture_client import DataCaptureClient
import uuid
from datetime import datetime, timezone

# Define feature and prediction names
feature_names = ['feature_1', 'feature_2', 'feature_3']
predict_names = ['predicted_class', 'confidence_score']

# Initialize client
data_capture_client = DataCaptureClient(feature_names, predict_names)

def predict(input_data):
    # Your prediction logic
    feature_values = [input_data['feature_1'], input_data['feature_2'], input_data['feature_3']]
    # Wrap in a list: scikit-learn estimators expect a 2D array of rows
    predicted_class = model.predict([feature_values])[0]
    confidence = model.predict_proba([feature_values]).max()

    # Capture prediction
    event_id = str(uuid.uuid4())
    event_time = datetime.now(timezone.utc).isoformat()

    data_capture_client.capturePrediction(
        feature_values,
        [predicted_class, confidence],
        event_id=event_id,
        timestamp=event_time
    )

    return {
        "predicted_class": predicted_class,
        "confidence_score": confidence,
        "event_id": event_id,
        "timestamp": event_time
    }
```

Note: Feature values must be explicitly provided to `capturePrediction()` in the order defined in `feature_names`.
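Before publishing, you can smoke-test the function locally (a minimal sketch; it assumes `model` is already loaded, and outside a deployed Model API the capture client may simply log rather than persist data):

```python
# Arbitrary sample input matching the feature_names defined above
sample = {'feature_1': 0.4, 'feature_2': 1.7, 'feature_3': 0.2}
result = predict(sample)
print(result)  # expect predicted_class, confidence_score, event_id, timestamp
```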
Deploy your model as a Domino Model API endpoint:
- Navigate to your project in Domino
- Go to Deployments > Model APIs
- Click New Model API
- Configure:
  - File: Your deployment script (e.g., `predict.py`)
  - Function: Your prediction function (e.g., `predict`)
  - Environment: Select appropriate environment
- Click Publish
📖 Details: Deploy from the UI
Once your model endpoint is running, enable drift detection by selecting the training set:
- Navigate to your deployed model endpoint
- Go to Settings > Monitoring
- Select the training set you registered in Step 1
- Click Save
This enables automatic drift detection based on your training data baseline.
📖 Details: Endpoint Drift Detection
You need to configure data sources in three places:
First, create a data source in Domino for storing ground truth data (note that an Admin may have done this for you):
- Navigate to Data > Data Sources in the Domino UI
- Click New Data Source
- Select your storage type (S3, Azure Blob, GCS, etc.)
- Configure connection settings and choose a descriptive name (e.g., `my-model-ground-truth`)
📖 Details: Connect a Data Source
Make the data source available in your project:
- Go to Project Settings > Data
- Click Add Data Source
- Select your ground truth data source
📖 Details: Use Data Sources
Configure the same data source in Model Monitor for ground truth ingestion (note that an Admin may have done this for you):
- Navigate to Model Monitor in Domino UI
- Go to the Monitoring Data Sources page
- Click + Add Data Source
- Add the data source you created in step 4a
Important: Use the same name for the data source in both places (main data source and Model Monitor) to avoid confusion.
Run the interactive setup to configure monitoring:
```bash
python 3_setup_monitoring.py
```

You'll be prompted for:
- Domino base URL - Your Domino instance URL
- Model API endpoint URL - From your deployed model
- Model API token - From model API settings
- Model Monitor ID - From Model Monitor UI
- Ground truth data source name - Name from Step 3
- Test data path - Location of test data for predictions
The script creates monitoring_config.json with your settings.
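To double-check what was saved, you can inspect the file directly; the exact key names inside it are determined by the script, so the sketch below simply prints whatever is there:

```python
import json

# Print the generated monitoring configuration
with open("monitoring_config.json") as f:
    config = json.load(f)
print(json.dumps(config, indent=2))
```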
Generate test predictions and upload ground truth:
```bash
python 4_generate_predictions.py --count 30
```

This will:
- Call your model API with test data
- Capture predictions automatically (via your deployment script)
- Upload ground truth to the data source
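For reference, calling a Domino Model API endpoint from Python typically looks like the sketch below (this is what `4_generate_predictions.py` wraps in its `call_model_api` method); the URL, token, and feature values are placeholders you would take from your deployed model and your own input format:

```python
import requests

# Placeholders - copy these from your model's overview page / monitoring_config.json
MODEL_API_URL = "https://your-domino-instance/models/YOUR_MODEL_ID/latest/model"
MODEL_API_TOKEN = "your-model-api-token"

payload = {"data": {"feature_1": 0.4, "feature_2": 1.7, "feature_3": 0.2}}

response = requests.post(
    MODEL_API_URL,
    auth=(MODEL_API_TOKEN, MODEL_API_TOKEN),  # Domino Model APIs accept the access token via basic auth
    json=payload,
)
response.raise_for_status()
print(response.json())  # should echo predicted_class, confidence_score, event_id, timestamp
```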
⏰ IMPORTANT: Wait at least 1 hour after generating predictions before running this step.
This allows ground truth data to be fully uploaded to S3.
```bash
python 5_upload_ground_truth.py
python 5_upload_ground_truth.py --hours 168  # Last 7 days
```

This registers the ground truth datasets with Model Monitor for quality metrics.
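Conceptually, the ground truth data being registered is just a table keyed by the same `event_id` values captured at prediction time, plus the observed outcome. A minimal sketch of that shape (column names follow the defaults used above; the values are placeholders):

```python
import pandas as pd

# Each row pairs a captured prediction's event_id with the actual outcome
ground_truth = pd.DataFrame({
    "event_id": ["example-uuid-1", "example-uuid-2"],  # must match the event_ids from capturePrediction()
    "target": ["class_a", "class_b"],                  # rename to your ground truth column
})
ground_truth.to_csv("ground_truth_batch.csv", index=False)
```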
- Check Predictions: Navigate to Model Monitor UI → Model Details → Predictions
- Check Ground Truth: Model Monitor UI → Ground Truth Status
- Check Metrics: Quality metrics appear after ground truth is matched (24-48 hours)
- `1_register_training_set.py` - Register training baseline
- `prediction_capture_example.py` - Example code for deployment script
- `2_optional_export_model.py` - Export models from MLflow (optional)
- `3_setup_monitoring.py` - Interactive configuration
- `4_generate_predictions.py` - Generate test predictions
- `5_upload_ground_truth.py` - Register ground truth datasets
- `config_loader.py` - Configuration management utility
The scripts 4_generate_predictions.py and 5_upload_ground_truth.py are designed to work with different model types and data formats. Here's what you need to customize:
Required Customizations:
- Model API Input Format (`call_model_api` method):

  ```python
  # Update payload structure to match your model's expected input
  payload = {'data': your_input_format}
  ```
- Ground Truth Extraction (Line ~137):

  ```python
  # Option 1: From folder structure (current)
  actual_class = file_path.parent.name

  # Option 2: From CSV data
  # df = pd.read_csv(file_path, nrows=1)
  # actual_class = df['your_target_column'].iloc[0]

  # Option 3: From filename pattern
  # actual_class = file_path.stem.split('_')[0]
  ```
- Response Parsing (`call_model_api` method):

  ```python
  # Update to match your model's output format
  return {
      'predicted_class': result['your_prediction_key'],
      'confidence_score': result['your_confidence_key'],
      'event_id': result.get('event_id'),
      'timestamp': result.get('timestamp')
  }
  ```
- Ground Truth Column Name (Line ~151):

  ```python
  'target': actual_class,  # Change 'target' to match your training data column
  ```
Command Line Options:
```bash
# For classification models (default)
python 5_upload_ground_truth.py --ground-truth-column "target"

# For regression models
python 5_upload_ground_truth.py --ground-truth-column "score" --regression

# Standard usage with S3 (default)
python 5_upload_ground_truth.py
```

Required Customizations:
- Ground Truth Column Name: Must match your training data
- Model Type: Use the `--regression` flag for numerical targets
Image Classification:
- Ground truth from folder names: `actual_class = file_path.parent.name`
- Base64 image input to API
Tabular Data:
- Ground truth from CSV column: `actual_class = df['target'].iloc[0]`
- JSON payload with feature values
Regression Models:
- Use the `--regression` flag in script 5
- Ensure numerical targets in ground truth
Data Source Configuration:
- Verify data source configuration in Domino UI
- Ensure storage credentials and permissions are correct
No quality metrics appearing:
- Verify ground truth status is "Active" in Model Monitor UI
- Ensure `event_id` matches between predictions and ground truth
- Wait 24-48 hours for initial ingestion
Configuration errors:
- Run `python 3_setup_monitoring.py` to reconfigure
- Check data source name matches exactly
Upload failures:
- Verify data source is connected to project
- Check credentials and permissions
- Ensure Model Monitor ID is correct