From 3d638c2480c2bf7ab8d45158356463353946c9b4 Mon Sep 17 00:00:00 2001
From: kKAPS22
Date: Mon, 23 Feb 2026 09:13:02 +0000
Subject: [PATCH 1/3] Add environment variables documentation

---
 .../environment-variables.md                  | 170 ++++++++++++++++++
 1 file changed, 170 insertions(+)
 create mode 100644 docs/metaflow/configuring-flows/environment-variables.md

diff --git a/docs/metaflow/configuring-flows/environment-variables.md b/docs/metaflow/configuring-flows/environment-variables.md
new file mode 100644
index 00000000..55895516
--- /dev/null
+++ b/docs/metaflow/configuring-flows/environment-variables.md
@@ -0,0 +1,170 @@
+---
+id: environment-variables
+title: Environment Variables
+---
+
+Metaflow configuration values can be overridden using environment variables.
+
+All configuration variables defined in `metaflow/metaflow_config.py`
+can be overridden using the following naming convention:
+```
+METAFLOW_
+```
+For example, the configuration:
+```python
+DEFAULT_DATASTORE = from_conf("DEFAULT_DATASTORE", "local")
+```
+can be overridden with:
+```bash
+export METAFLOW_DEFAULT_DATASTORE=s3
+```
+Environment variables take precedence over configuration files,
+which in turn override internal defaults.
+---
+## Commonly Used Environment Variables
+Below are commonly configured environment variables grouped by category.
+---
+### User & Runtime
+### METAFLOW_USER
+Overrides the detected username used for runs.
+Useful in CI environments or containers where the system user cannot be determined.
+```bash
+export METAFLOW_USER=your_username
+```
+---
+### METAFLOW_RUNTIME_NAME
+Defines the runtime environment name associated with a run.
+If not set, it defaults to `dev`.
+Used by the metadata service to identify the runtime context of executions.
+```bash
+export METAFLOW_RUNTIME_NAME=prod
+```
+---
+### METAFLOW_PRODUCTION_TOKEN
+Defines a production token used for identifying or managing
+production deployments.
+This variable is used by deployment backends such as
+AWS Step Functions, Argo Workflows, and Airflow.
+```bash
+export METAFLOW_PRODUCTION_TOKEN=
+```
+---
+### METAFLOW_DEFAULT_ENVIRONMENT
+Specifies the default execution environment.
+Default: `local`
+```bash
+export METAFLOW_DEFAULT_ENVIRONMENT=local
+```
+---
+### METAFLOW_DEFAULT_DATASTORE
+Specifies the default datastore backend (`local`, `s3`, `azure`, `gs`).
+Default: `local`
+```bash
+export METAFLOW_DEFAULT_DATASTORE=s3
+```
+---
+## Metadata Service
+### METAFLOW_SERVICE_URL
+Base URL for the Metaflow metadata service.
+```bash
+export METAFLOW_SERVICE_URL=https://metaflow.example.com
+```
+---
+### METAFLOW_SERVICE_AUTH_KEY
+Authentication key used when connecting to the metadata service.
+```bash
+export METAFLOW_SERVICE_AUTH_KEY=
+```
+---
+## Datastore Configuration
+### METAFLOW_DATASTORE_SYSROOT_LOCAL
+Root directory for the local datastore.
+---
+### METAFLOW_DATASTORE_SYSROOT_S3
+S3 root path used when running with the S3 datastore.
+```bash
+export METAFLOW_DATASTORE_SYSROOT_S3=s3://my-bucket/metaflow
+```
+---
+### METAFLOW_DATASTORE_SYSROOT_AZURE
+Azure Blob Storage root path.
+---
+### METAFLOW_DATASTORE_SYSROOT_GS
+Google Cloud Storage root path.
+---
+## AWS Batch
+### METAFLOW_BATCH_JOB_QUEUE
+AWS Batch job queue used for execution.
+```bash
+export METAFLOW_BATCH_JOB_QUEUE=my-queue
+```
+---
+### METAFLOW_BATCH_CONTAINER_IMAGE
+Default container image used for AWS Batch jobs.
+---
+## Kubernetes
+### METAFLOW_KUBERNETES_NAMESPACE
+Kubernetes namespace used for execution.
+Default: `default`
+```bash
+export METAFLOW_KUBERNETES_NAMESPACE=ml-workflows
+```
+---
+### METAFLOW_KUBERNETES_CONTAINER_IMAGE
+Default container image used for Kubernetes jobs.
+---
+## Secrets & Security
+### METAFLOW_DEFAULT_SECRETS_BACKEND_TYPE
+Specifies the default secrets backend.
+---
+### METAFLOW_AWS_SECRETS_MANAGER_DEFAULT_REGION
+AWS region used for AWS Secrets Manager.
+---
+## Debugging
+Debug flags can be enabled using:
+```
+METAFLOW_DEBUG_
+```
+Examples:
+```bash
+export METAFLOW_DEBUG_S3CLIENT=True
+export METAFLOW_DEBUG_TRACING=True
+```
+---
+## Naming Rule
+Every configuration defined as:
+```python
+SOME_NAME = from_conf("SOME_NAME", default)
+```
+can be overridden using:
+```
+METAFLOW_SOME_NAME
+```
+Refer to `metaflow/metaflow_config.py`
+for the authoritative and complete list of configuration values.
+---
+## Troubleshooting
+### "Unknown user" error
+If you encounter an **"unknown user"** error:
+```bash
+export METAFLOW_USER=your_username
+```
+---
+### Environment variable not taking effect
+1. Ensure the variable is exported in your shell.
+2. Restart your shell session if necessary.
+3. Confirm it is set:
+```bash
+env | grep METAFLOW
+```
+4. Verify that the variable name matches the pattern:
+```
+METAFLOW_
+```
+---
+## Notes
+- Environment variables override configuration files.
+- Configuration files override internal defaults.
+- Some runtime-specific variables (e.g., `METAFLOW_RUNTIME_NAME`, `METAFLOW_PRODUCTION_TOKEN`) are read directly from the environment and are not defined via `from_conf()` in `metaflow_config.py`.
+- The complete and authoritative list of configuration values
+is defined in `metaflow/metaflow_config.py`.
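
Note on the precedence rule documented in PATCH 1/3: the stated order (environment variable, then configuration file, then internal default) can be illustrated with a small Python sketch. This is only an illustration of the documented behavior, not Metaflow's actual `from_conf()` implementation. The names `resolve_setting`, `file_config`, and `env` are hypothetical, and the sketch assumes the configuration file has already been loaded into a plain dict keyed by the unprefixed setting name.

```python
import os


def resolve_setting(name, default=None, file_config=None, env=None):
    """Hypothetical helper mirroring the documented precedence.

    Lookup order: METAFLOW_<name> in the environment, then the value
    from the (already loaded) config file, then the built-in default.
    This is NOT Metaflow's real from_conf(); it only models the rule.
    """
    env = os.environ if env is None else env
    file_config = file_config or {}

    env_key = "METAFLOW_" + name
    if env_key in env:
        # An exported environment variable always wins.
        return env[env_key]
    if name in file_config:
        # The config file overrides the internal default.
        return file_config[name]
    return default


if __name__ == "__main__":
    fake_env = {"METAFLOW_DEFAULT_DATASTORE": "s3"}
    fake_file = {"DEFAULT_DATASTORE": "azure", "DEFAULT_METADATA": "service"}
    # Environment beats file; file beats default; default is the fallback.
    print(resolve_setting("DEFAULT_DATASTORE", "local", fake_file, fake_env))
    print(resolve_setting("DEFAULT_METADATA", "local", fake_file, fake_env))
    print(resolve_setting("DEFAULT_ENVIRONMENT", "local", fake_file, fake_env))
```

Passing `env` and `file_config` explicitly keeps the sketch testable without touching the real process environment.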
From 1d6d59d4842e9cbc8f4e77d5e3a246737dd5a637 Mon Sep 17 00:00:00 2001
From: kKAPS22
Date: Mon, 23 Feb 2026 10:30:17 +0000
Subject: [PATCH 2/3] Better Version of Environment-variable.md

---
 .../environment-variables.md                  | 121 ++++++++++--------
 1 file changed, 69 insertions(+), 52 deletions(-)

diff --git a/docs/metaflow/configuring-flows/environment-variables.md b/docs/metaflow/configuring-flows/environment-variables.md
index 55895516..212401eb 100644
--- a/docs/metaflow/configuring-flows/environment-variables.md
+++ b/docs/metaflow/configuring-flows/environment-variables.md
@@ -20,27 +20,30 @@ export METAFLOW_DEFAULT_DATASTORE=s3
 ```
 Environment variables take precedence over configuration files,
 which in turn override internal defaults.
+
 ---
 ## Commonly Used Environment Variables
 Below are commonly configured environment variables grouped by category.
----
+
+
 ### User & Runtime
-### METAFLOW_USER
+#### METAFLOW_USER
 Overrides the detected username used for runs.
 Useful in CI environments or containers where the system user cannot be determined.
 ```bash
 export METAFLOW_USER=your_username
 ```
----
-### METAFLOW_RUNTIME_NAME
+
+#### METAFLOW_RUNTIME_NAME
 Defines the runtime environment name associated with a run.
 If not set, it defaults to `dev`.
 Used by the metadata service to identify the runtime context of executions.
 ```bash
 export METAFLOW_RUNTIME_NAME=prod
 ```
----
-### METAFLOW_PRODUCTION_TOKEN
+> Note: This variable is read directly from the environment and does not use the `from_conf()` configuration pattern.
+
+#### METAFLOW_PRODUCTION_TOKEN
 Defines a production token used for identifying or managing
 production deployments.
 This variable is used by deployment backends such as
@@ -48,79 +51,92 @@ AWS Step Functions, Argo Workflows, and Airflow.
 ```bash
 export METAFLOW_PRODUCTION_TOKEN=
 ```
----
-### METAFLOW_DEFAULT_ENVIRONMENT
+> Note: This variable is read directly from the environment and does not use the `from_conf()` configuration pattern.
+
+
+#### METAFLOW_DEFAULT_ENVIRONMENT
 Specifies the default execution environment.
 Default: `local`
 ```bash
 export METAFLOW_DEFAULT_ENVIRONMENT=local
 ```
----
-### METAFLOW_DEFAULT_DATASTORE
+
+
+#### METAFLOW_DEFAULT_DATASTORE
 Specifies the default datastore backend (`local`, `s3`, `azure`, `gs`).
 Default: `local`
 ```bash
 export METAFLOW_DEFAULT_DATASTORE=s3
 ```
----
-## Metadata Service
-### METAFLOW_SERVICE_URL
+
+### Metadata Service
+#### METAFLOW_SERVICE_URL
 Base URL for the Metaflow metadata service.
 ```bash
 export METAFLOW_SERVICE_URL=https://metaflow.example.com
 ```
----
-### METAFLOW_SERVICE_AUTH_KEY
+
+
+#### METAFLOW_SERVICE_AUTH_KEY
 Authentication key used when connecting to the metadata service.
 ```bash
 export METAFLOW_SERVICE_AUTH_KEY=
 ```
----
-## Datastore Configuration
-### METAFLOW_DATASTORE_SYSROOT_LOCAL
+
+
+### Datastore Configuration
+#### METAFLOW_DATASTORE_SYSROOT_LOCAL
 Root directory for the local datastore.
----
-### METAFLOW_DATASTORE_SYSROOT_S3
+
+
+#### METAFLOW_DATASTORE_SYSROOT_S3
 S3 root path used when running with the S3 datastore.
 ```bash
 export METAFLOW_DATASTORE_SYSROOT_S3=s3://my-bucket/metaflow
 ```
----
-### METAFLOW_DATASTORE_SYSROOT_AZURE
+
+
+#### METAFLOW_DATASTORE_SYSROOT_AZURE
 Azure Blob Storage root path.
----
-### METAFLOW_DATASTORE_SYSROOT_GS
+
+
+#### METAFLOW_DATASTORE_SYSROOT_GS
 Google Cloud Storage root path.
----
-## AWS Batch
-### METAFLOW_BATCH_JOB_QUEUE
+
+
+### AWS Batch
+#### METAFLOW_BATCH_JOB_QUEUE
 AWS Batch job queue used for execution.
 ```bash
 export METAFLOW_BATCH_JOB_QUEUE=my-queue
 ```
----
-### METAFLOW_BATCH_CONTAINER_IMAGE
+
+#### METAFLOW_BATCH_CONTAINER_IMAGE
 Default container image used for AWS Batch jobs.
----
-## Kubernetes
-### METAFLOW_KUBERNETES_NAMESPACE
+
+
+### Kubernetes
+#### METAFLOW_KUBERNETES_NAMESPACE
 Kubernetes namespace used for execution.
 Default: `default`
 ```bash
 export METAFLOW_KUBERNETES_NAMESPACE=ml-workflows
 ```
----
-### METAFLOW_KUBERNETES_CONTAINER_IMAGE
+
+#### METAFLOW_KUBERNETES_CONTAINER_IMAGE
 Default container image used for Kubernetes jobs.
----
-## Secrets & Security
-### METAFLOW_DEFAULT_SECRETS_BACKEND_TYPE
+
+
+### Secrets & Security
+#### METAFLOW_DEFAULT_SECRETS_BACKEND_TYPE
 Specifies the default secrets backend.
----
-### METAFLOW_AWS_SECRETS_MANAGER_DEFAULT_REGION
+
+
+#### METAFLOW_AWS_SECRETS_MANAGER_DEFAULT_REGION
 AWS region used for AWS Secrets Manager.
----
-## Debugging
+
+
+### Debugging
 Debug flags can be enabled using:
 ```
 METAFLOW_DEBUG_
@@ -130,8 +146,8 @@ Examples:
 export METAFLOW_DEBUG_S3CLIENT=True
 export METAFLOW_DEBUG_TRACING=True
 ```
----
-## Naming Rule
+
+### Naming Rule
 Every configuration defined as:
 ```python
 SOME_NAME = from_conf("SOME_NAME", default)
@@ -142,15 +158,16 @@ METAFLOW_SOME_NAME
 ```
 Refer to `metaflow/metaflow_config.py`
 for the authoritative and complete list of configuration values.
----
-## Troubleshooting
-### "Unknown user" error
+
+
+### Troubleshooting
+#### "Unknown user" error
 If you encounter an **"unknown user"** error:
 ```bash
 export METAFLOW_USER=your_username
 ```
----
-### Environment variable not taking effect
+
+#### Environment variable not taking effect
 1. Ensure the variable is exported in your shell.
 2. Restart your shell session if necessary.
 3. Confirm it is set:
@@ -161,10 +178,10 @@ env | grep METAFLOW
 ```
 METAFLOW_
 ```
----
-## Notes
+
+### Notes
 - Environment variables override configuration files.
 - Configuration files override internal defaults.
 - Some runtime-specific variables (e.g., `METAFLOW_RUNTIME_NAME`, `METAFLOW_PRODUCTION_TOKEN`) are read directly from the environment and are not defined via `from_conf()` in `metaflow_config.py`.
-- The complete and authoritative list of configuration values
-is defined in `metaflow/metaflow_config.py`.
+- The complete and authoritative list of configuration values is defined in `metaflow/metaflow_config.py`.
+

From d89ecd8348306b4c173c408694a19416cce8e88d Mon Sep 17 00:00:00 2001
From: kKAPS22
Date: Mon, 23 Feb 2026 12:31:30 +0000
Subject: [PATCH 3/3] docs: add comprehensive environment variables guide (incl. metadata config)

---
 .../environment-variables.md                  | 37 ++++++++-----------
 1 file changed, 16 insertions(+), 21 deletions(-)

diff --git a/docs/metaflow/configuring-flows/environment-variables.md b/docs/metaflow/configuring-flows/environment-variables.md
index 212401eb..8fd90515 100644
--- a/docs/metaflow/configuring-flows/environment-variables.md
+++ b/docs/metaflow/configuring-flows/environment-variables.md
@@ -56,9 +56,10 @@ export METAFLOW_PRODUCTION_TOKEN=
 
 #### METAFLOW_DEFAULT_ENVIRONMENT
 Specifies the default execution environment.
+Valid values: `local`, `conda`, `pypi`, `uv`.
 Default: `local`
 ```bash
-export METAFLOW_DEFAULT_ENVIRONMENT=local
+export METAFLOW_DEFAULT_ENVIRONMENT=pypi
 ```
 
 
@@ -70,13 +71,20 @@ export METAFLOW_DEFAULT_DATASTORE=s3
 ```
 
 ### Metadata Service
+#### METAFLOW_DEFAULT_METADATA
+Specifies the metadata provider to use.
+Valid values: `local`, `service`.
+Default: `local`
+```bash
+export METAFLOW_DEFAULT_METADATA=service
+```
+
 #### METAFLOW_SERVICE_URL
 Base URL for the Metaflow metadata service.
 ```bash
 export METAFLOW_SERVICE_URL=https://metaflow.example.com
 ```
 
-
 #### METAFLOW_SERVICE_AUTH_KEY
 Authentication key used when connecting to the metadata service.
 ```bash
@@ -143,31 +151,18 @@ METAFLOW_DEBUG_
 ```
 Examples:
 ```bash
-export METAFLOW_DEBUG_S3CLIENT=True
-export METAFLOW_DEBUG_TRACING=True
+export METAFLOW_DEBUG_S3CLIENT=1
+export METAFLOW_DEBUG_TRACING=1
 ```
 
-### Naming Rule
-Every configuration defined as:
-```python
-SOME_NAME = from_conf("SOME_NAME", default)
-```
-can be overridden using:
-```
-METAFLOW_SOME_NAME
-```
-Refer to `metaflow/metaflow_config.py`
-for the authoritative and complete list of configuration values.
-
-
-### Troubleshooting
-#### "Unknown user" error
+## Troubleshooting
+### "Unknown user" error
 If you encounter an **"unknown user"** error:
 ```bash
 export METAFLOW_USER=your_username
 ```
 
-#### Environment variable not taking effect
+### Environment variable not taking effect
 1. Ensure the variable is exported in your shell.
 2. Restart your shell session if necessary.
 3. Confirm it is set:
@@ -179,7 +174,7 @@ env | grep METAFLOW
 METAFLOW_
 ```
 
-### Notes
+## Notes
 - Environment variables override configuration files.
 - Configuration files override internal defaults.
 - Some runtime-specific variables (e.g., `METAFLOW_RUNTIME_NAME`, `METAFLOW_PRODUCTION_TOKEN`) are read directly from the environment and are not defined via `from_conf()` in `metaflow_config.py`.
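
Note on the troubleshooting steps in this series: the "confirm it is set" and "verify the name matches the pattern" checks can also be run from inside the Python process that will launch the flow, which avoids confusion about which shell exported what. The sketch below is a rough stand-in for `env | grep METAFLOW`, not part of Metaflow. The helper names `metaflow_env_vars` and `suspicious_names` are hypothetical, and the upper-case name pattern is an assumption inferred from the examples in the patch rather than a documented rule.

```python
import os
import re

# Assumption of this sketch: override names are METAFLOW_ followed by
# upper-case letters, digits, or underscores, as in the patch's examples.
_OVERRIDE_NAME = re.compile(r"METAFLOW_[A-Z0-9_]+")


def metaflow_env_vars(env=None):
    """Return variables whose name starts with 'metaflow' in any letter case.

    Matching case-insensitively surfaces near-misses such as
    'metaflow_user' that would not act as overrides.
    """
    env = os.environ if env is None else env
    return {
        key: value
        for key, value in sorted(env.items())
        if key.upper().startswith("METAFLOW")
    }


def suspicious_names(env=None):
    """List METAFLOW-like names that do not match the assumed pattern."""
    return [
        key for key in metaflow_env_vars(env)
        if not _OVERRIDE_NAME.fullmatch(key)
    ]


if __name__ == "__main__":
    for key, value in metaflow_env_vars().items():
        print(f"{key}={value}")
    for key in suspicious_names():
        print(f"warning: {key} does not look like METAFLOW_<NAME>")
```

Unlike `grep`, this only inspects variable names, so a value that happens to contain the string `METAFLOW` is not reported.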