Kubernetes Variables for AWS MWAA support with Airflow #1234

Closed
valayDave wants to merge 4 commits into valay/airflow-gcp-final from valay/airflow-mwaa-support

Conversation

@valayDave
Collaborator

Stacked on top of #1226

Introduces new config variables:

METAFLOW_AIRFLOW_KUBERNETES_KUBECONFIG_CONTEXT: Context to use in the kubeconfig file.

METAFLOW_AIRFLOW_KUBERNETES_KUBECONFIG_FILE: Path to the kubeconfig file to provide to the Airflow KubernetesOperator.

Setting either of these, or METAFLOW_AIRFLOW_KUBERNETES_CONN_ID, will set in_cluster=True in the Airflow KubernetesOperator.

Changes done to support AWS MWAA
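
For readers following along, here is a minimal sketch of how the three variables named above could be collected into operator keyword arguments. The mapping to `config_file`, `cluster_context` and `kubernetes_conn_id` (parameters of Airflow's `KubernetesPodOperator`) is an assumption about how this PR wires things up, and the helper name `kubernetes_operator_kwargs` is hypothetical. The `in_cluster` handling is deliberately left out of the sketch.

```python
import os
from typing import Dict, Mapping, Optional

# Assumed mapping from the Metaflow config variables in this PR to
# KubernetesPodOperator keyword arguments. Not taken from the diff.
ENV_TO_OPERATOR_KWARG = {
    "METAFLOW_AIRFLOW_KUBERNETES_KUBECONFIG_FILE": "config_file",
    "METAFLOW_AIRFLOW_KUBERNETES_KUBECONFIG_CONTEXT": "cluster_context",
    "METAFLOW_AIRFLOW_KUBERNETES_CONN_ID": "kubernetes_conn_id",
}


def kubernetes_operator_kwargs(
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Hypothetical helper: return only the kwargs whose variable is set."""
    env = os.environ if env is None else env
    return {
        kwarg: env[var]
        for var, kwarg in ENV_TO_OPERATOR_KWARG.items()
        if env.get(var)  # skip unset or empty variables
    }
```

Unset variables are skipped, so the operator would fall back to its own defaults for anything not configured.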

@valayDave valayDave force-pushed the valay/airflow-gcp-final branch from 4499490 to 2a55cb6 Compare January 23, 2023 22:46
@valayDave valayDave force-pushed the valay/airflow-mwaa-support branch from 5c97b11 to 172c228 Compare January 23, 2023 22:47
@valayDave valayDave force-pushed the valay/airflow-gcp-final branch from 2a55cb6 to ecc43a0 Compare January 23, 2023 23:00
@valayDave valayDave force-pushed the valay/airflow-mwaa-support branch from 172c228 to fbe23d8 Compare January 23, 2023 23:01
@valayDave valayDave force-pushed the valay/airflow-gcp-final branch from ecc43a0 to 8c9881f Compare January 24, 2023 00:09
@valayDave valayDave force-pushed the valay/airflow-mwaa-support branch from fbe23d8 to aff5a37 Compare January 24, 2023 00:10
@valayDave valayDave force-pushed the valay/airflow-gcp-final branch from 8c9881f to 322969f Compare January 24, 2023 00:20
@valayDave valayDave force-pushed the valay/airflow-mwaa-support branch from aff5a37 to 0821a0a Compare January 24, 2023 00:20
@valayDave valayDave force-pushed the valay/airflow-gcp-final branch from 322969f to 68d46d9 Compare January 24, 2023 00:28
@valayDave valayDave force-pushed the valay/airflow-mwaa-support branch from 0821a0a to 85f219c Compare January 24, 2023 00:28
@valayDave valayDave force-pushed the valay/airflow-gcp-final branch from 68d46d9 to 110ec3b Compare January 27, 2023 19:48
@valayDave valayDave force-pushed the valay/airflow-mwaa-support branch from 832191e to 98f2db1 Compare January 27, 2023 19:48
@valayDave valayDave force-pushed the valay/airflow-gcp-final branch from 110ec3b to cbc9b8e Compare January 27, 2023 19:54
@valayDave valayDave force-pushed the valay/airflow-mwaa-support branch from 98f2db1 to 5adba1f Compare January 27, 2023 19:54
@valayDave valayDave requested a review from savingoyal January 27, 2023 19:57
@valayDave valayDave force-pushed the valay/airflow-gcp-final branch from cbc9b8e to b0f7b72 Compare January 27, 2023 21:26
@valayDave valayDave force-pushed the valay/airflow-mwaa-support branch from 5adba1f to 3862a14 Compare January 27, 2023 21:26
Contributor

@romain-intel romain-intel left a comment

Did not review. No concern except the proliferation of config variables...

Collaborator Author

@valayDave valayDave left a comment

Added some small changes relating to docstrings and SQL sensor support for Airflow versions > 2.4.0.

Comment on lines +21 to +60
"""
The `@airflow_external_task_sensor` decorator attaches an Airflow [ExternalTaskSensor](https://airflow.apache.org/docs/apache-airflow/stable/_api/airflow/sensors/external_task/index.html#airflow.sensors.external_task.ExternalTaskSensor) before the start step of the flow.
This decorator only works when a flow is scheduled on Airflow and is compiled using `airflow create`. More than one `@airflow_external_task_sensor` can be added as flow decorators. Adding more than one decorator will ensure that the `start` step starts only after all sensors finish.

Parameters
----------
timeout : int
    Time, in seconds, before the task times out and fails. (Default: 3600)
poke_interval : int
    Time, in seconds, that the job should wait in between each try. (Default: 60)
mode : string
    How the sensor operates. Options are: { poke | reschedule }. (Default: "poke")
exponential_backoff : bool
    Allow progressively longer waits between pokes by using an exponential backoff algorithm. (Default: True)
pool : string
    The slot pool this task should run in;
    slot pools are a way to limit concurrency for certain tasks. (Default: None)
soft_fail : bool
    Set to true to mark the task as SKIPPED on failure. (Default: False)
name : string
    Name of the sensor on Airflow.
description : string
    Description of the sensor in the Airflow UI.
external_dag_id : string
    The dag_id that contains the task you want to wait for.
external_task_ids : List[string]
    The list of task_ids that you want to wait for.
    If None (default value) the sensor waits for the DAG. (Default: None)
allowed_states : List[string]
    Iterable of allowed states. (Default: ['success'])
failed_states : List[string]
    Iterable of failed or disallowed states. (Default: None)
execution_delta : datetime.timedelta
    Time difference with the previous execution to look at;
    the default is the same logical date as the current task or DAG. (Default: None)
check_existence : bool
    Set to True to check if the external task exists or check if
    the DAG to wait for exists. (Default: True)
"""
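
To make the `timeout`, `poke_interval` and `soft_fail` parameters documented above more concrete, here is a simplified sketch of a poke-style wait loop. It is not Airflow's implementation: `run_sensor`, `SensorTimeout`, and the injectable `sleep`/`clock` arguments are hypothetical names for illustration, and `exponential_backoff` and `reschedule` mode are not modeled.

```python
import time
from typing import Callable


class SensorTimeout(Exception):
    """Raised when the condition is not met before `timeout` elapses."""


def run_sensor(
    poke: Callable[[], bool],
    timeout: float = 3600,
    poke_interval: float = 60,
    soft_fail: bool = False,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call `poke` until it returns True or `timeout` seconds have passed."""
    started = clock()
    while not poke():
        if clock() - started >= timeout:
            if soft_fail:
                # soft_fail marks the task as skipped instead of failed.
                return "skipped"
            raise SensorTimeout("condition not met within %s seconds" % timeout)
        # Wait a fixed interval between tries (no backoff in this sketch).
        sleep(poke_interval)
    return "success"
```

With `soft_fail=False` a timeout surfaces as an exception, which corresponds to the task failing. With `soft_fail=True` the same timeout yields a skipped state instead.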

Collaborator Author

@savingoyal: Added docstrings for all 3 decorators. Please review when you get the chance.

Comment on lines +427 to +437
try:
    from airflow.sensors.sql import SqlSensor
except ImportError as e:
    try:
        from airflow.providers.common.sql.sensors.sql import SqlSensor
    except ImportError as e:
        raise AirflowSensorNotFound(
            "This DAG requires a `SqlSensor`. "
            "Install the Airflow SQL provider using : "
            "`pip install apache-airflow-providers-common-sql`"
        )
Collaborator Author

Had to make this change so that SQL sensors can be supported. Airflow recently removed SQL sensors from Airflow v2.4.0, which is why this additional code path was added here.
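
The fallback described here can also be written as a small generic helper, sketched below under the assumption that the same pattern may be needed for other relocated classes. `import_first_available` is a hypothetical name and is not part of this PR or of Airflow.

```python
import importlib
from typing import Any, Sequence


def import_first_available(
    module_paths: Sequence[str], attr: str, hint: str = ""
) -> Any:
    """Return `attr` from the first module in `module_paths` that provides it."""
    for path in module_paths:
        try:
            module = importlib.import_module(path)
        except ImportError:
            # Module is missing in this installation; try the next location.
            continue
        if hasattr(module, attr):
            return getattr(module, attr)
    raise ImportError(
        "Could not import `%s` from any of: %s. %s"
        % (attr, ", ".join(module_paths), hint)
    )
```

For the sensor above, the call would pass `["airflow.sensors.sql", "airflow.providers.common.sql.sensors.sql"]` as `module_paths`, `"SqlSensor"` as `attr`, and the `pip install apache-airflow-providers-common-sql` message as `hint`.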

@valayDave valayDave force-pushed the valay/airflow-gcp-final branch from f1b429f to a1edca6 Compare January 31, 2023 00:47
@valayDave valayDave force-pushed the valay/airflow-mwaa-support branch from 0f00a44 to cb27496 Compare January 31, 2023 00:47
3 participants