Kestra: Usage guide #291
Walkthrough
Updated Kestra integration docs: replaced an external grid-item link with an internal cross-reference in the index and added a hidden toctree entry. Added a new usage guide describing end-to-end CrateDB ↔ Kestra integration using the PostgreSQL plugin with examples and execution/monitoring steps.
Sequence Diagram(s)
sequenceDiagram
autonumber
actor User
participant Kestra as Kestra UI/Executor
participant Plugin as Kestra PostgreSQL Plugin
participant CrateA as CrateDB Cluster A
participant CrateB as CrateDB Cluster B
User->>Kestra: Submit Flow (YAML)
Kestra->>Plugin: Run query task
Plugin->>CrateA: SELECT / export data
Plugin-->>Kestra: Result / status
Kestra->>Plugin: Run batch insert task
Plugin->>CrateB: INSERT data
Plugin-->>Kestra: Acknowledgement
Kestra-->>User: Logs, status, metrics
Note over Kestra,Plugin: Errors and execution details surfaced in Logs view
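For orientation, the sequence above maps onto a two-task Kestra flow. The following is a minimal sketch assembled from the snippets quoted in the review comments on this PR, not the guide's final text. The host names are the example clusters from the guide, and the secret names `CRATEDB_A_PASSWORD` and `CRATEDB_B_PASSWORD` are hypothetical placeholders.

```yaml
# Sketch only: secret names are assumptions, host names are the guide's examples.
id: cratedb-kestra
namespace: io.kestra.crate

tasks:
  # Step 1: read rows from cluster A and store the result for the next task.
  - id: query
    type: io.kestra.plugin.jdbc.postgresql.Query
    url: jdbc:postgresql://cratedb-kestra.aks1.westeurope.azure.cratedb.net:5432/
    username: admin
    password: "{{ secret('CRATEDB_A_PASSWORD') }}"  # hypothetical secret name
    sql: SELECT * FROM doc.nyc_taxi LIMIT 1000
    store: true

  # Step 2: batch-insert the stored rows into cluster B.
  - id: update
    type: io.kestra.plugin.jdbc.postgresql.Batch
    from: "{{ outputs.query.uri }}"
    url: jdbc:postgresql://cratedb-kestra2.aks1.eastus2.azure.cratedb.net:5432/
    username: admin
    password: "{{ secret('CRATEDB_B_PASSWORD') }}"  # hypothetical secret name
    sql: |
      INSERT INTO doc.nyc_taxi VALUES( ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ? )
```

The `from` property is what links the two tasks: it points the batch insert at the stored output of the `query` task.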
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks and finishing touches
✅ Passed checks (3 passed)
Actionable comments posted: 0
🧹 Nitpick comments (9)
docs/integrate/kestra/tutorial.md (9)
4-4: Use https and tighten wording (active voice).
Apply:
-[Kestra.io](http://kestra.io/) is an open-source workflow automation and orchestration tool that enables users to automate and manage complex workflows in a streamlined and efficient manner. The tool provides a wide range of features and integrations, including Postgres, Git, Docker, Kubernetes, and more, making automating processes across different platforms and environments easy. Kestra comes with a user-friendly web-based interface, allowing users to create, modify, and manage workflows ***without the need for any coding skills***.
+[Kestra.io](https://kestra.io/) is an open‑source workflow automation and orchestration tool that helps you automate and manage complex workflows efficiently. It integrates with Postgres, Git, Docker, Kubernetes, and more. Kestra’s web UI lets you create, modify, and manage workflows without writing code.
10-15: Convert inline command to fenced code block with language.
Fixes MD038/MD040 and improves readability.
Apply:
-`docker run -d -p 8080:8080 kestra/kestra:latest `
+```bash
+docker run -d -p 8080:8080 kestra/kestra:latest
+```
69-80: Add code fence language and avoid embedding secrets.
Declare as YAML and reference a secret for the password.
Apply:
-```
+```yaml
 id: cratedb-kestra
 namespace: io.kestra.crate
 tasks:
   - id: query
     type: io.kestra.plugin.jdbc.postgresql.Query
     url: jdbc:postgresql://cratedb-kestra.aks1.westeurope.azure.cratedb.net:5432/
     username: admin
-    password: my_super_secret_password
+    password: "{{ secret('CRATEDB_PASSWORD') }}"
     sql: SELECT * FROM doc.nyc_taxi LIMIT 1000
     store: true
83-83: Grammar nit.
Apply:
-In this task, we set the `store` parameter is set to `true` to allow storing the results that will be used as input in the following task.
+In this task, we set the `store` parameter to `true` to store results for the next task.
91-100: Add YAML language and use secrets for credentials.
Apply:
-```
+```yaml
   - id: update
     type: io.kestra.plugin.jdbc.postgresql.Batch
     from: "{{ outputs.query.uri }}"
     url: jdbc:postgresql://cratedb-kestra2.aks1.eastus2.azure.cratedb.net:5432/
     username: admin
-    password: my_super_secret_password
+    password: "{{ secret('CRATEDB_PASSWORD') }}"
     sql: |
       INSERT INTO doc.nyc_taxi VALUES( ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ? )
108-116: Add YAML language for the third snippet.
Apply:
-```
+```yaml
   - id: select
     type: io.kestra.plugin.jdbc.postgresql.Query
     url: jdbc:postgresql://kestra-testing-cluster2.aks1.eastus2.azure.cratedb.net:5432/
     username: admin
     password: my_super_secret_password
     sql: SELECT MAX_BY(passenger_count, fare_amount) FROM doc.nyc_taxi
     store: false
21-21: Avoid time‑sensitive promo amounts.
Either qualify with a date or generalize to “free trial credit.”
Apply:
-To deploy a new cluster on CrateDB Cloud, you need to sign up for a [CrateDB Cloud account](https://console.cratedb.cloud/). When creating a new organization, you are entitled to a [$200 free credit](https://crate.io/lp-free-trial) to spend on cluster deployment, scaling, and other operations as you see fit. Once you've signed up, you can create a new cluster by selecting the *Create Cluster* button and choosing your preferred cloud provider and region. You can then configure your cluster by selecting the number of nodes and the amount of storage you need. In this example, we used the 1-node cluster with 4GiB of storage which is enough for development environments and low-traffic applications.
+To deploy a new cluster on CrateDB Cloud, sign up for a [CrateDB Cloud account](https://console.cratedb.cloud/). New organizations receive a free trial credit (as of September 2025) for cluster deployment, scaling, and other operations. After signing up, create a cluster by selecting *Create Cluster* and choosing your preferred cloud provider and region.
+In this example, we use a 1‑node cluster with 4 GiB of storage, sufficient for development and low‑traffic applications.
16-16: Improve image alt text (a11y) and sizing.
The Discourse export alt text is non‑descriptive. Use meaningful alt and explicit size attributes.
Apply (and mirror for similar images at Lines 23, 124, 131):
- +{width=690px height=290px}
136-136: Use https and tighten wrap‑up sentence (active voice).
Apply:
-If you need to automatically manage CrateDB data pipelines, [kestra.io](http://kestra.io/) can be a good choice. It allows you to specify workflows without requiring coding skills. Furthermore, it supports integrations with various systems including Postgres (and CrateDB), Kubernetes, Docker, Git, and many others.
+If you need to automatically manage CrateDB data pipelines, [kestra.io](https://kestra.io/) is a good choice. It lets you define workflows without writing code and integrates with Postgres (and CrateDB), Kubernetes, Docker, Git, and more.
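Several of the suggestions above replace the inline password with `{{ secret('CRATEDB_PASSWORD') }}`, which only resolves if the secret is actually provided to Kestra. As a hedged sketch: the open-source edition is understood to read secrets from environment variables that carry a `SECRET_` prefix and a base64-encoded value. The Compose service name and the placeholder value below are assumptions for illustration and are not part of this PR.

```yaml
# docker-compose.yml fragment (sketch, not from the PR).
services:
  kestra:                      # hypothetical service name
    image: kestra/kestra:latest
    ports:
      - "8080:8080"
    environment:
      # Assumption: SECRET_<NAME> holds the base64-encoded secret value,
      # referenced in flows as {{ secret('CRATEDB_PASSWORD') }}.
      SECRET_CRATEDB_PASSWORD: "<base64-encoded password>"
```

Check the Kestra documentation for the secret backend that matches your edition before relying on this.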
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
docs/integrate/kestra/index.md (2 hunks)
docs/integrate/kestra/tutorial.md (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: amotl
PR: crate/cratedb-guide#263
File: docs/integrate/kafka/docker-python.md:112-120
Timestamp: 2025-08-22T18:11:12.776Z
Learning: In documentation and tutorial repositories like cratedb-guide, code examples should prioritize simplicity, clarity, and educational value over production-ready features. Comprehensive error handling, extensive validation, and other production concerns can distract from the main learning objectives and make examples harder to follow. Review suggestions should focus on correctness and clarity rather than production hardening.
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:8-20
Timestamp: 2025-05-18T12:50:38.681Z
Learning: In the CrateDB guide repository, references with the `ctk:` prefix (like `ctk:query/mcp/landscape`) are intersphinx references that link to resources in the cratedb-toolkit repository (https://github.com/crate/cratedb-toolkit/tree/main/doc), which are rendered at https://cratedb-toolkit.readthedocs.io/. These are valid cross-references between separate Sphinx documentation sets, not local file references.
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:22-33
Timestamp: 2025-05-18T12:50:36.393Z
Learning: In the CrateDB Guide repository, the prefix `ctk:` in documentation links (like `ctk:query/mcp/server`) is an intersphinx reference that points to external content in the CrateDB Toolkit documentation at https://cratedb-toolkit.readthedocs.io/. These references are intentionally not pointing to local files within the repository.
Learnt from: amotl
PR: crate/cratedb-guide#235
File: docs/start/going-further.md:20-38
Timestamp: 2025-08-23T04:22:52.597Z
Learning: In the CrateDB Guide repository, the prefix `crate-reference:` in documentation links (like `crate-reference:index`) is an intersphinx reference that points to external content in the CrateDB Reference Manual documentation. These references are intentionally not pointing to local files within the repository.
📚 Learning: 2025-05-18T12:50:38.681Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:8-20
Timestamp: 2025-05-18T12:50:38.681Z
Learning: In the CrateDB guide repository, references with the `ctk:` prefix (like `ctk:query/mcp/landscape`) are intersphinx references that link to resources in the cratedb-toolkit repository (https://github.com/crate/cratedb-toolkit/tree/main/doc), which are rendered at https://cratedb-toolkit.readthedocs.io/. These are valid cross-references between separate Sphinx documentation sets, not local file references.
Applied to files:
docs/integrate/kestra/index.md
📚 Learning: 2025-05-18T12:50:36.393Z
Learnt from: amotl
PR: crate/cratedb-guide#204
File: docs/integrate/mcp/community.md:22-33
Timestamp: 2025-05-18T12:50:36.393Z
Learning: In the CrateDB Guide repository, the prefix `ctk:` in documentation links (like `ctk:query/mcp/server`) is an intersphinx reference that points to external content in the CrateDB Toolkit documentation at https://cratedb-toolkit.readthedocs.io/. These references are intentionally not pointing to local files within the repository.
Applied to files:
docs/integrate/kestra/index.md
🪛 markdownlint-cli2 (0.17.2)
docs/integrate/kestra/tutorial.md
12-12: Spaces inside code span elements
(MD038, no-space-in-code)
69-69: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
91-91: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
108-108: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build docs
🔇 Additional comments (3)
docs/integrate/kestra/index.md (2)
39-41: Cross‑ref looks correct; nice switch to internal link.
Label `(kestra-tutorial)=` exists in tutorial.md, so `:link-type: ref` to `kestra-tutorial` should resolve.
Please confirm the docs build and linkcheck pass for this ref.
54-58: Hidden toctree is appropriate.
Including `tutorial` via a hidden toctree keeps it in nav without duplicating visible content. LGTM.
docs/integrate/kestra/tutorial.md (1)
69-69: Add fenced code languages to satisfy markdownlint MD040.
You already did for SQL; add `yaml` to these three blocks.
Run markdownlint locally to confirm no remaining MD038/MD040 hits.
Also applies to: 91-91, 108-108
Actionable comments posted: 3
🧹 Nitpick comments (18)
docs/integrate/kestra/usage.md (18)
4-8: Consider converting passive voice to active voice.
The phrase "is an open‑source workflow automation" uses passive construction. Per the PR objectives requesting active voice conversion, consider rephrasing.
Example revision:
-[Kestra.io](https://kestra.io/) is an open‑source workflow automation and
-orchestration tool that helps you automate and manage complex workflows
-efficiently. It integrates with Postgres, Git, Docker, Kubernetes, and
-more. Kestra's web UI lets you create, modify, and manage workflows
-without writing code.
+[Kestra.io](https://kestra.io/) provides open‑source workflow automation and
+orchestration. You can use it to automate and manage complex workflows
+efficiently. It integrates with Postgres, Git, Docker, Kubernetes, and
+more. The web UI lets you create, modify, and manage workflows
+without writing code.
14-14: Minor: Consider more active phrasing.
"Getting started with Kestra using Docker is a straightforward process" could be more direct.
Consider:
-Getting started with Kestra using Docker is a straightforward process. First, you'll need to install Docker on your machine, if you haven't already.
+You can quickly start Kestra using Docker. First, install Docker on your machine if you haven't already.
20-20: Fix passive voice construction.
"which you can access by navigating" contains passive elements.
-This will start the Kestra server on your local machine, which you can access by navigating to [http://localhost:8080](http://localhost:8080/) in your web browser.
+This starts the Kestra server on your local machine. Access it by navigating to [http://localhost:8080](http://localhost:8080/) in your web browser.
38-38: Remove passive voice construction.
"you can start using" can be made more direct.
-Once your cluster is up and running, you can start using CrateDB's powerful distributed SQL database features via a web-based Admin UI.
+Once your cluster is up and running, use CrateDB's powerful distributed SQL database features via the web-based Admin UI.
42-42: Fix passive voice: "you can use to move".
Passive construction can be simplified.
-There are several ways you can use to move data between CrateDB clusters and in the following example, we will illustrate how to simply do this with Kestra.
+You can move data between CrateDB clusters in several ways. The following example shows how to do this with Kestra.
44-44: Clarify the navigation instruction.
"go back to the cluster overview page" assumes context that may not be clear.
-Now, let's import some data on the first cluster. To do so, go back to the cluster overview page and click on *Learn how to import data* link.
+Now, let's import some data on the first cluster. Navigate to the cluster overview page and click the *Learn how to import data* link.
73-73: Fix passive voice: "are used to implement".
-Flows in Kestra are used to implement workflows.
+Flows in Kestra implement workflows.
73-73: Fix passive voice: "is defined as".
-Each flow is defined as a declarative model in the YAML file and it contains all the tasks and the order in which the tasks will be run.
+Each flow uses a declarative YAML model and contains all the tasks in the order they will run.
87-89: Security concern: Hardcoded credentials in documentation.
While this is example code, consider using placeholders or environment variables to demonstrate security best practices.
-    url: jdbc:postgresql://cratedb-kestra.aks1.westeurope.azure.cratedb.net:5432/
-    username: admin
-    password: my_super_secret_password
+    url: jdbc:postgresql://YOUR_CLUSTER_URL:5432/
+    username: ${CRATEDB_USERNAME}
+    password: ${CRATEDB_PASSWORD}
Add a note explaining that users should replace these with their actual credentials or use Kestra's secret management.
107-109: Apply consistent credential security practice.
Same as the previous task, use placeholders for credentials.
-    url: jdbc:postgresql://cratedb-kestra2.aks1.eastus2.azure.cratedb.net:5432/
-    username: admin
-    password: my_super_secret_password
+    url: jdbc:postgresql://YOUR_SECOND_CLUSTER_URL:5432/
+    username: ${CRATEDB_USERNAME}
+    password: ${CRATEDB_PASSWORD}
114-116: Remove passive voice: "is used to capture".
-The `output` object is used to capture information about the results of the task, including any resources that were created or modified.
+The `output` object captures information about the task results, including any resources created or modified.
114-114: Fix passive voice: "may create or modify" and "were created".
-When a Kestra task is executed, it may create or modify a resource, such as a file, database record, or API endpoint.
+When a Kestra task executes, it may create or modify a resource, such as a file, database record, or API endpoint.
116-116: Fix passive voice: "that was created".
-In our example, the `output.query.uri` refers to the URI of the resource that was created by the previous task.
+In our example, `output.query.uri` refers to the URI of the resource the previous task created.
123-125: Apply consistent credential security practice.
Use placeholders for credentials in the third task as well.
-    url: jdbc:postgresql://kestra-testing-cluster2.aks1.eastus2.azure.cratedb.net:5432/
-    username: admin
-    password: my_super_secret_password
+    url: jdbc:postgresql://YOUR_SECOND_CLUSTER_URL:5432/
+    username: ${CRATEDB_USERNAME}
+    password: ${CRATEDB_PASSWORD}
130-130: Fix passive voice: "is the highest" and "is one of" and "supported by".
-In the last task, we select the `passenger_count` value for which the `fare_amount` is the highest and to achieve that we use the `MAX_BY` aggregation function. `MAX_BY` is one of the latest aggregation functions supported by CrateDB and to learn more about it, check out our [recent blog post](https://crate.io/blog/find-the-latest-reported-values-with-ease.-introducing-max_by-and-min_by-aggregations-in-cratedb-5.2).
+In the last task, we select the `passenger_count` value with the highest `fare_amount` using the `MAX_BY` aggregation function. `MAX_BY` ranks among CrateDB's latest aggregation functions. Learn more in our [recent blog post](https://crate.io/blog/find-the-latest-reported-values-with-ease.-introducing-max_by-and-min_by-aggregations-in-cratedb-5.2).
141-141: Fix passive voice: "got inserted".
-Finally, let's check the data in the second cluster. As illustrated below, we can see that exactly 1000 records got inserted:
+Finally, let's check the data in the second cluster. As illustrated below, exactly 1000 records were inserted:
148-148: Remove passive voice: "is a good choice".
-If you need to automatically manage CrateDB data pipelines, [kestra.io](https://kestra.io/) is a good choice.
+If you need to automatically manage CrateDB data pipelines, [kestra.io](https://kestra.io/) provides a strong solution.
150-150: Check for broken or tracking-laden URL.
The URL contains multiple tracking parameters (`utm_campaign`, `utm_source`, `utm_medium`, `hsCtaTracking`) which may not be appropriate for documentation and could break over time.
Consider using a clean URL without tracking parameters:
-In this usage guide, we have also shown how to deploy your CrateDB cluster in a few clicks. If you want to try it out and enjoy all of the CrateDB features, sign up for the [CrateDB Cloud](https://console.cratedb.cloud/?utm_campaign=2022-Q2-WS-Free-Trial&utm_source=website&utm_medium=free-trial-overhaul&hsCtaTracking=a7e2a487-cfb9-4a50-8e75-3029b9e176fb%7C7863166c-05e4-4334-9dd5-58dfdd6e78c1) trial.
+In this usage guide, we have also shown how to deploy your CrateDB cluster in a few clicks. If you want to try it out and enjoy all of the CrateDB features, sign up for the [CrateDB Cloud](https://console.cratedb.cloud/) trial.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
docs/integrate/kestra/index.md (2 hunks)
docs/integrate/kestra/usage.md (1 hunks)
🧰 Additional context used
🪛 GitHub Actions: docs
docs/integrate/kestra/index.md
[error] 1-1: Command 'cd docs && make check' failed. Sphinx build finished with problems (3 warnings). Exit code: 2.
docs/integrate/kestra/usage.md
[error] 1-1: Command 'cd docs && make check' failed. Sphinx build finished with problems (3 warnings). Exit code: 2.
🔇 Additional comments (13)
docs/integrate/kestra/index.md (3)
40-41: LGTM! Link correctly references the new usage page.
The change from an external URL to an internal cross-reference using `kestra-usage` is appropriate and correctly links to the new usage documentation.
54-58: LGTM! Toctree structure is correct.
The hidden toctree with `maxdepth: 1` properly includes the usage page in the documentation navigation structure.
1-1: Investigate Sphinx build warnings causing pipeline failure.
The CI logs report three warnings that must be fixed before merging. To reproduce and capture them locally, from the repo root install docs’ requirements and run:
pip install -r docs/requirements.txt
python -m sphinx -b html docs docs/_build/html 2>&1 | tee sphinx_build.log
grep -i "warning" sphinx_build.log
Share the exact warning messages so they can be addressed.
docs/integrate/kestra/usage.md (10)
1-2: LGTM! Target anchor correctly defined.
The `(kestra-usage)=` anchor properly defines the cross-reference target that's used in the index page.
17-17: LGTM! Docker command is correct.
The Docker command properly starts Kestra with port mapping.
29-29: Verify the current date reference.
The text references "as of September 2025" which appears to be the current month. Verify this is intentional and update if needed to avoid the text becoming stale quickly.
47-64: LGTM! SQL DDL is correct.
The `CREATE TABLE` statement for `nyc_taxi` is properly formatted with appropriate column types and table settings.
67-69: LGTM! COPY statement is correct.
The `COPY` command properly references the S3 sample data with gzip compression.
81-92: LGTM! Kestra query task configuration is correct.
The YAML configuration properly defines the PostgreSQL query task with appropriate parameters.
104-112: LGTM! Batch task configuration is correct.
The batch insert task properly references the previous task's output and uses parameterized SQL.
126-126: LGTM! MAX_BY usage is correct.
The SQL query correctly demonstrates CrateDB's `MAX_BY` aggregation function.
152-152: LGTM! Community link is appropriate.
The link to the CrateDB community forum is relevant and well-placed.
22-22: Consider adding alt text for accessibility.
The image lacks alternative text, which is important for screen readers and accessibility compliance.
-{width=690px height=290px}
+{width=690px height=290px}
Likely an incorrect or invalid review comment.
About
Continue adding integration guides from the community forum.
Preview
References
Trivia
@coderabbitai: Please review and convert to active voice.