5 changes: 5 additions & 0 deletions docs/best-practices/vscode/README.md
@@ -0,0 +1,5 @@
---
title: VS Code
sidebar_position: 5
---
# VS Code Best Practices and Tips
File renamed without changes.
8 changes: 8 additions & 0 deletions docs/getting-started/Admin/_category_.yaml
@@ -0,0 +1,8 @@
label: Administrator
position: 3
link:
type: generated-index
title: Administrator
description: Welcome Administrator! You have the important role of configuring the Datacoves platform to fit your needs. No worries, you are not alone. We are here to help you every step of the way so you and your team can start delivering valuable insights in no time! Please follow the documentation below to get your dbt project up and running.


2 changes: 1 addition & 1 deletion docs/getting-started/Admin/configure-airflow.md
@@ -1,5 +1,5 @@
---
title: Configure Airflow
title: Airflow - Configure
sidebar_position: 2
---
# Configuring Airflow
@@ -1,6 +1,6 @@
---
title: Configure Git Repository Using dbt-coves
sidebar_position: 4
sidebar_position: 12
---
# Initial Datacoves Repository Setup

4 changes: 2 additions & 2 deletions docs/getting-started/Admin/configure-repository.md
@@ -1,6 +1,6 @@
---
title: Configure Git Repository
sidebar_position: 3
sidebar_position: 8
---
# Update Repository for Airflow

@@ -12,7 +12,7 @@ Now that you have configured your Airflow settings you must ensure that your rep

**Step 3:** **This step is optional** if you would like to make use of the [dbt-coves](https://github.com/datacoves/dbt-coves?tab=readme-ov-file#airflow-dags-generation-arguments) `dbt-coves generate airflow-dags` command. Create the `dags_yml_definitions` folder inside your newly created `orchestrate` folder. This will leave you with two folders inside `orchestrate`: `orchestrate/dags` and `orchestrate/dags_yml_definitions`.

**Step 4:** **This step is optional** if you would like to make use of the dbt-coves extension's `dbt-coves generate airflow-dags` command. You must create a config file for dbt-coves. Please follow the [generate DAGs from yml](how-tos/airflow/generate-dags-from-yml.md) docs.
**Step 4:** **This step is optional** if you would like to make use of the dbt-coves extension's `dbt-coves generate airflow-dags` command. You must create a config file for dbt-coves. Please follow the [generate DAGs from yml](how-tos/airflow/DAGs/generate-dags-from-yml.md) docs.

## Create a profiles.yml

10 changes: 5 additions & 5 deletions docs/getting-started/Admin/creating-airflow-dags.md
@@ -1,6 +1,6 @@
---
title: Creating Airflow Dags
sidebar_position: 7
title: Airflow - Creating Dags
sidebar_position: 3
---
# Creating Airflow Dags

@@ -19,12 +19,12 @@ During the Airflow configuration step you added the `orchestrate` folder and the
## DAG 101 in Datacoves
1. If you are eager to see Airflow and dbt in action within Datacoves, here is the simplest way to run dbt with Airflow (a minimal sketch follows this list).

[Run dbt](/docs/how-tos/airflow/run-dbt)
[Run dbt](/docs/how-tos/airflow/DAGs/run-dbt)

2. You have two options when it comes to writing DAGs in Datacoves: write them out in Python and place them in the `orchestrate/dags` directory, or generate your DAGs with `dbt-coves` from a YML definition.

[Generate DAGs from yml definitions](/docs/how-tos/airflow/generate-dags-from-yml). This is simpler for users not accustomed to using Python.
[Generate DAGs from yml definitions](/docs/how-tos/airflow/DAGs/generate-dags-from-yml). This is simpler for users not accustomed to using Python.

3. You may also wish to use external libraries in your DAGs such as Pandas. In order to do that effectively, you can create custom Python scripts in a separate directory such as `orchestrate/python_scripts` and use the `DatacovesBashOperator` to handle all the behind-the-scenes work as well as run your custom script. **You will need to contact us beforehand to pre-configure any python libraries you need.**

[External Python DAG](/docs/how-tos/airflow/external-python-dag)
[External Python DAG](/docs/how-tos/airflow/DAGs/external-python-dag)
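
For orientation, here is a minimal sketch of the Python-file approach: a single task that runs dbt through the `DatacovesDbtOperator`. The import path and the dbt command follow patterns from Datacoves' examples and should be treated as assumptions, not a verbatim template.

```python
from pendulum import datetime

from airflow.decorators import dag
from operators.datacoves.dbt import DatacovesDbtOperator  # assumed Datacoves-provided import path


@dag(
    start_date=datetime(2024, 1, 1),
    schedule="0 6 * * *",  # daily at 06:00; adjust to your load windows
    catchup=False,
    tags=["getting-started"],
)
def simple_dbt_dag():
    # Runs inside the Datacoves environment, so the dbt project and the
    # profiles.yml configured above are already available to the worker.
    DatacovesDbtOperator(
        task_id="run_dbt",
        bash_command="dbt build",  # illustrative command
    )


dag = simple_dbt_dag()
```
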
2 changes: 1 addition & 1 deletion docs/getting-started/Admin/user-management.md
@@ -1,6 +1,6 @@
---
title: User Management
sidebar_position: 8
sidebar_position: 20
---
# User Management

7 changes: 7 additions & 0 deletions docs/getting-started/developer/_category_.yaml
@@ -0,0 +1,7 @@
label: Developer
position: 5
link:
type: generated-index
title: Developer
description: Welcome Developer! Please use these getting started guides to accelerate your onboarding. The guides cover everything from configuring your user settings to fundamental tips and shortcuts.

2 changes: 1 addition & 1 deletion docs/how-tos/_category_.yaml
@@ -1,5 +1,5 @@
label: How-tos
position: 2
position: 1
link:
type: generated-index
title: How-to Guides
6 changes: 6 additions & 0 deletions docs/how-tos/airflow/DAGs/_category_.yaml
@@ -0,0 +1,6 @@
label: DAGs
position: 11
link:
type: generated-index
title: DAGs
description: Here you will find information on DAGs in Datacoves.
@@ -34,7 +34,7 @@ You can use Airflow in Datacoves to trigger a Microsoft Azure Data Factory pipel

**Step 1:** In Datacoves, a user with the `securityadmin` role must go to the `Airflow Admin -> Connection` menu.

![Airflow Connection](assets/admin-connections.png)
![Airflow Connection](./assets/admin-connections.png)

**Step 2:** Create a new connection using the following details.

@@ -58,7 +58,7 @@ You can use Airflow in Datacoves to trigger a Microsoft Azure Data Factory pipel
Replace the values in the screenshot below with the actual values found above.
:::

![adf connection](assets/airflow_adf_connection.png)
![adf connection](./assets/airflow_adf_connection.png)

## Example DAG
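
A hedged sketch of such a DAG using the Azure provider's `AzureDataFactoryRunPipelineOperator`; the connection id, pipeline, resource group, and factory names are placeholders:

```python
from pendulum import datetime

from airflow.decorators import dag
from airflow.providers.microsoft.azure.operators.data_factory import (
    AzureDataFactoryRunPipelineOperator,
)


@dag(start_date=datetime(2024, 1, 1), schedule=None, catchup=False)
def trigger_adf_pipeline():
    # Uses the Airflow connection created in the steps above.
    AzureDataFactoryRunPipelineOperator(
        task_id="run_adf_pipeline",
        azure_data_factory_conn_id="azure_data_factory",  # placeholder connection id
        pipeline_name="my_pipeline",  # placeholder pipeline name
        resource_group_name="my-resource-group",  # placeholder
        factory_name="my-data-factory",  # placeholder
    )


dag = trigger_adf_pipeline()
```
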

@@ -15,7 +15,7 @@ Support for Fivetran Tasks coming soon. More Information in [run Fivetran sync j

### Ensure your Airflow environment is properly configured

Follow this guide on [How to set up Airflow](./initial-setup)'s environment.
Follow this guide on [How to set up Airflow](../initial-setup)'s environment.

### Airbyte connection

@@ -19,7 +19,7 @@ You can use Airflow in Datacoves to trigger a Databricks notebook. This guide wi

**Step 2:** Navigate to compute.

![databricks compute](assets/databricks_compute.png)
![databricks compute](./assets/databricks_compute.png)

**Step 3:** Click on your desired cluster.

@@ -37,7 +37,7 @@ If you do not have admin privileges, work with an admin to get the token. Follow

**Step 2:** To the right of the notebook name, there will be three dots. Click on this and select the option to copy the full path to your clipboard.

![copy url](assets/databricks_copyurl.png)
![copy url](./assets/databricks_copyurl.png)

## Handling Databricks Variables in Airflow

@@ -56,7 +56,7 @@ It is possible to hardcode these two variables in your DAG if you don’t see th

**Step 1:** A user with Airflow admin privileges must go to the `Airflow Admin -> Connection` menu.

![admin connection](assets/admin-connections.png)
![admin connection](./assets/admin-connections.png)

**Step 2:** Create a new connection using the following details:

@@ -65,7 +65,7 @@ It is possible to hardcode these two variables in your DAG if you don’t see th
- **Host:** Your Databricks host. E.g. `https://<databricks-instance>.databricks.com`
- **Password:** Enter your `Databricks Token`

![Databricks Connection](assets/airflow_databricks_connection.png)
![Databricks Connection](./assets/airflow_databricks_connection.png)

**Step 3:** Click `Save`
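
With the connection saved, a DAG can submit the notebook as a one-time run. Below is a hedged sketch using the Databricks provider's `DatabricksSubmitRunOperator`; the connection id, cluster id, and notebook path are placeholders:

```python
from pendulum import datetime

from airflow.decorators import dag
from airflow.providers.databricks.operators.databricks import DatabricksSubmitRunOperator


@dag(start_date=datetime(2024, 1, 1), schedule=None, catchup=False)
def run_databricks_notebook():
    DatabricksSubmitRunOperator(
        task_id="run_notebook",
        databricks_conn_id="databricks_default",  # placeholder connection id
        existing_cluster_id="1234-567890-abcde123",  # placeholder cluster id
        notebook_task={
            # Placeholder path; copy the full path from the notebook's menu (Step 2).
            "notebook_path": "/Users/me@example.com/my_notebook",
        },
    )


dag = run_databricks_notebook()
```
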

File renamed without changes.
@@ -4,7 +4,7 @@ sidebar_position: 31
---
# Run Fivetran sync jobs

In addition to triggering Airbyte load jobs ([run Airbyte sync jobs](/docs/how-tos/airflow/run-airbyte-sync-jobs)), you can also trigger Fivetran jobs from your Airflow DAG.
In addition to triggering Airbyte load jobs ([run Airbyte sync jobs](/docs/how-tos/airflow/DAGs/run-airbyte-sync-jobs)), you can also trigger Fivetran jobs from your Airflow DAG.

## Before you start

@@ -108,7 +108,7 @@ dag = daily_loan_run()
### Fields reference

- **extract_and_load_fivetran**: The name of the task group. This can be named whatever you like and will show up in Airflow.
![Extract and Load DAG](assets/extract_load_airflow_dag.png)
![Extract and Load DAG](./assets/extract_load_airflow_dag.png)
- **tooltip**: The tooltip argument allows you to provide explanatory text or helpful hints about specific elements in the Airflow UI.
- **tasks**: Define all of your tasks within the task group.

@@ -118,7 +118,7 @@ You will need to define two operators: `fivetran_provider.operators.fivetran.Fiv
- **operator**: `fivetran_provider.operators.fivetran.FivetranOperator`
- **connector_id**: Find in Fivetran UI. Select your desired source. Click into `Setup` and locate the `Fivetran Connector ID`

![Fivetran Connection ID](assets/fivetran_connector_id.png)
![Fivetran Connection ID](./assets/fivetran_connector_id.png)

- **do_xcom_push**: Indicate that the output of the task should be sent to XCom, making it available for other tasks to use.
- **fivetran_conn_id**: This is the `connection_id` that was configured above in the Fivetran UI as seen [above](#fivetran-connection).
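
Putting these fields together, a hedged sketch of the equivalent hand-written Python follows: an operator/sensor pair inside a task group. The connector id and connection id are placeholders.

```python
from pendulum import datetime

from airflow.decorators import dag
from airflow.utils.task_group import TaskGroup
from fivetran_provider.operators.fivetran import FivetranOperator
from fivetran_provider.sensors.fivetran import FivetranSensor


@dag(start_date=datetime(2024, 1, 1), schedule=None, catchup=False)
def daily_loan_run():
    with TaskGroup(
        "extract_and_load_fivetran",
        tooltip="Fivetran sync for the loans connector",  # hover text in the Airflow UI
    ):
        trigger = FivetranOperator(
            task_id="trigger_fivetran_sync",
            fivetran_conn_id="fivetran_connection",  # placeholder connection id
            connector_id="abcdefgh_ijklmnop",  # placeholder; copy from the Fivetran UI
            do_xcom_push=True,  # expose the sync start time to downstream tasks
        )
        wait = FivetranSensor(
            task_id="wait_for_fivetran_sync",
            fivetran_conn_id="fivetran_connection",
            connector_id="abcdefgh_ijklmnop",
            poke_interval=60,  # check sync status once a minute
        )
        trigger >> wait


dag = daily_loan_run()
```
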
@@ -34,7 +34,7 @@ CREATE OR REPLACE STAGE TEST_STAGE

This example will use SnowSQL, and our data looks like this:

![Sample data](assets/s3_sample_data.jpg)
![Sample data](../assets/s3_sample_data.jpg)

The SnowSQL command to upload the file to the Snowflake stage is below:
@@ -61,11 +61,11 @@ CREATE OR REPLACE TABLE TEST_TABLE

The TEST_TABLE schema now looks like this:

![test_table](assets/s3_test_table_schema.jpg)
![test_table](../assets/s3_test_table_schema.jpg)

However, the table does not have any data loaded into it.

![Empty test_table](assets/s3_test_table_empty.jpg)
![Empty test_table](../assets/s3_test_table_empty.jpg)

**Step 6:** To load the data from the file we used as a template, we use the following COPY INTO SQL.

@@ -78,11 +78,11 @@ COPY INTO TEST_TABLE

And we can now see the data in the table:

![test_table copied](assets/s3_test_table_copied.jpg)
![test_table copied](../assets/s3_test_table_copied.jpg)

**Step 7:** Now we’re going to load another file into TEST_TABLE that has an additional column.

![Test data additional column](assets/s3_test_table_additional_col.jpg)
![Test data additional column](../assets/s3_test_table_additional_col.jpg)

Again, we will use the SnowSQL PUT command seen below:

@@ -100,7 +100,7 @@ COPY INTO TEST_TABLE

And now the table has an additional column called COUNTRY_CODE:

![test_table additional column](assets/s3_test_table_additional_call_snowflake.jpg)
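
The PUT and COPY INTO steps above can also be scripted end to end. Here is a hedged sketch using the Snowflake Python connector instead of the SnowSQL CLI; the account, credentials, file path, and file format options are placeholders:

```python
import snowflake.connector

# Placeholder credentials; in practice, read these from a secrets manager.
conn = snowflake.connector.connect(
    account="my_account",
    user="my_user",
    password="my_password",
    warehouse="my_wh",
    database="my_db",
    schema="my_schema",
)

with conn.cursor() as cur:
    # Upload the local file to the internal stage (compressed by default).
    cur.execute("PUT file:///tmp/test_data.csv @TEST_STAGE AUTO_COMPRESS=TRUE")
    # Load the staged file into the table; format options are placeholders.
    cur.execute(
        """
        COPY INTO TEST_TABLE
        FROM @TEST_STAGE
        FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
        """
    )
conn.close()
```
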
![test_table additional column](../assets/s3_test_table_additional_call_snowflake.jpg)

## Loading JSON data into a variant column

Expand Down Expand Up @@ -136,5 +136,5 @@ COPY INTO VARIANT_TABLE

Our variant table now looks like this:

![json variant table](assets/json_variant_table.jpg)
![json variant table](../assets/json_variant_table.jpg)

@@ -9,12 +9,12 @@ dbt-coves generate airflow-dags does not support reading variables/connections,
:::
The best way to store and retrieve information within Airflow is to use `Variables` and `Connections`, both available in the upper `Admin` dropdown.

![select More](./assets/variables_connections_ui.png)
![select More](../assets/variables_connections_ui.png)

The main difference between them is that [Variables](https://airflow.apache.org/docs/apache-airflow/2.3.1/howto/variable.html) are a generic multi-purpose store, while [Connections](https://airflow.apache.org/docs/apache-airflow/2.3.1/howto/connection.html) are aimed at third-party providers.

:::tip
Rather than using connections or variables stored in Airflow’s database, we recommend using a Secrets Manager. These secrets are encrypted and can be stored either in [Datacoves Secrets manager](./use-datacoves-secrets-manager.mdx) or a third-party secrets manager like [AWS Secrets Manager](./use-aws-secrets-manager.mdx)
Rather than using connections or variables stored in Airflow’s database, we recommend using a Secrets Manager. These secrets are encrypted and can be stored either in [Datacoves Secrets manager](../use-datacoves-secrets-manager.mdx) or a third-party secrets manager like [AWS Secrets Manager](../use-aws-secrets-manager.mdx)
:::

## Usage
@@ -24,7 +24,7 @@ Rather than using connections or variables stored in Airflow’s database, we re
After creating a variable in Airflow's UI, using it is as simple as importing the `Variable` model in your DAG and `get`ting it by name. If a variable contains `SECRET` in its name, its value will be hidden:


![select More](./assets/variable_creation.png)
![select More](../assets/variable_creation.png)

```python
from pendulum import datetime
```

@@ -56,7 +56,7 @@ Consuming connections data is also straightforward, though you need to take its

In the following example, a connection of type `Airbyte` is created, and its `host` is echoed in a DAG.

![select More](./assets/connection_creation.png)
![select More](../assets/connection_creation.png)

```python
from pendulum import datetime
```
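
A minimal, self-contained sketch of the pattern both snippets use, reading one variable and one connection inside a TaskFlow DAG; the variable and connection names are illustrative:

```python
from pendulum import datetime

from airflow.decorators import dag, task
from airflow.hooks.base import BaseHook
from airflow.models import Variable


@dag(start_date=datetime(2024, 1, 1), schedule=None, catchup=False)
def variables_and_connections():
    @task
    def read_settings():
        # Variable.get returns the string stored under this key in Airflow.
        my_var = Variable.get("my_project_setting")  # illustrative key
        # get_connection returns a Connection object with .host, .login, etc.
        airbyte_conn = BaseHook.get_connection("airbyte_connection")  # illustrative id
        print(my_var, airbyte_conn.host)

    read_settings()


dag = variables_and_connections()
```
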
19 changes: 4 additions & 15 deletions docs/how-tos/airflow/README.md
@@ -1,20 +1,9 @@
---
title: Airflow
sidebar_position: 15
title: Airflow - What to know
sidebar_position: 1
id: airflow-index
---
# Airflow in Datacoves

These how to guides are dedicated to Airflow within Datacoves. Here you will find information on how to **Enable and configure Airflow**

* Create Airflow jobs(DAGs)
* Run dbt with in an Airflow DAG
* Sending notifications
* Customizing Airflow worker environments (docker images)
* Requesting Airflow worker resources

And more!

## What to know
# Airflow - What to know
- `Ruff` is installed to flag unused imports and unused variables, as well as to provide Python linting.
- [Datacoves Decorators](/reference/airflow/datacoves-decorators.md) simplify working with dbt, syncing databases, and running commands in Airflow.
- [My Airflow](/docs/how-tos/my_airflow/) can help speed up your DAG writing experience.
6 changes: 6 additions & 0 deletions docs/how-tos/airflow/_category_.yaml
@@ -0,0 +1,6 @@
label: Airflow
position: 1
link:
type: generated-index
title: Airflow in Datacoves
description: These how-to guides are dedicated to Airflow within Datacoves. Here you will find information on how to enable and configure Airflow.
6 changes: 0 additions & 6 deletions docs/how-tos/airflow/_category_bk.yaml

This file was deleted.

2 changes: 1 addition & 1 deletion docs/how-tos/airflow/send-emails.md
@@ -45,7 +45,7 @@ In Datacoves 3.3 and up, the `SMTP` will be automatically added to your environm

Voila! 🎉 The Airflow service will be restarted shortly and will now include the SMTP configuration required to send emails.

:::note **Getting Started Guide:** If you are making your way through our [getting started guide](../../getting-started/Admin/), please continue on to [developing DAGs](getting-started/Admin/creating-airflow-dags.md).
:::note **Getting Started Guide:** If you are making your way through our [getting started guide](/docs/category/administrator), please continue on to [developing DAGs](getting-started/Admin/creating-airflow-dags.md).
:::
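
With SMTP in place, failure emails are enabled per DAG through standard Airflow arguments. A minimal hedged sketch; the address is a placeholder:

```python
from pendulum import datetime

from airflow.decorators import dag, task


@dag(
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args={
        "email": ["alerts@example.com"],  # placeholder address
        "email_on_failure": True,  # mail is sent via the configured SMTP on task failure
        "email_on_retry": False,
    },
)
def notify_on_failure():
    @task
    def flaky():
        raise RuntimeError("boom")  # forces a failure email for testing

    flaky()


dag = notify_on_failure()
```
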
## Set up a custom SMTP (Optional)

File renamed without changes.
6 changes: 6 additions & 0 deletions docs/how-tos/datacoves/_category_.yaml
@@ -0,0 +1,6 @@
label: Datacoves
position: 2
link:
type: generated-index
title: Datacoves
description: These how-to guides are dedicated to **Datacoves admin configurations**.
@@ -86,7 +86,7 @@ Please follow the [AWS Secrets Manager documentation](https://docs.aws.amazon.c

![Secrets Backend](../assets/aws_secrets_connection.jpg)

To learn how to read a variable from the AWS Secrets Manager, check out our [How To](../README.md)
To learn how to read a variable from the AWS Secrets Manager, check out our [How To](/docs/category/how-tos/)

:::tip
For security purposes, once this has been saved you will not be able to view the values. To modify the Secrets backend you will need to set the Secrets backend to `None` and save the changes. Then start the setup again.
File renamed without changes.
7 changes: 7 additions & 0 deletions docs/how-tos/datahub/_category_.yaml
@@ -0,0 +1,7 @@

label: Datahub
position: 3
link:
type: generated-index
title: Datahub
description: These how-to guides are dedicated to DataHub in Datacoves.
2 changes: 1 addition & 1 deletion docs/how-tos/dataops/README.md
@@ -1,6 +1,6 @@
---
title: DataOps
sidebar_position: 62
sidebar_position: 4
---
# DataOps in Datacoves

File renamed without changes.
7 changes: 7 additions & 0 deletions docs/how-tos/dbt/_category_.yaml
@@ -0,0 +1,7 @@

label: dbt
position: 5
link:
type: generated-index
title: dbt
description: These how-to guides cover how to resolve common dbt issues.
2 changes: 1 addition & 1 deletion docs/how-tos/git/README.md
@@ -1,6 +1,6 @@
---
title: Git
sidebar_position: 70
sidebar_position: 6
---
# How to use Git

4 changes: 2 additions & 2 deletions docs/how-tos/metrics-and-logs/README.md
@@ -1,10 +1,10 @@
---
title: Metrics & Logs
sidebar_position: 71
sidebar_position: 7
---
# Metrics and Logs How Tos

Datacoves provides [Grafana](/reference/metrics-and-logs/grafana.md) to monitor Airflow, Docker image builds, and more!
Datacoves provides [Grafana](/reference/datacoves/metrics-and-logs/grafana.md) to monitor Airflow, Docker image builds, and more!

A user must have a Datacoves role with Grafana access. These include `Datacoves Admin`, `Project Admin`, or `Environment Admin`.

2 changes: 1 addition & 1 deletion docs/how-tos/my_airflow/README.mdx
@@ -1,6 +1,6 @@
---
title: My Airflow
sidebar_position: 72
sidebar_position: 8
---
# My Airflow 101

File renamed without changes.