Merged
2 changes: 2 additions & 0 deletions docs/conf.py
@@ -61,6 +61,8 @@
    r"https://kubernetes.io/",
    # Connection to renenyffenegger.ch timed out.
    r"https://renenyffenegger.ch",
    # Failed to establish a new connection: [Errno 111] Connection refused
    r"https://www.amqp.org/",
]
Comment on lines +64 to 66
@coderabbitai coderabbitai bot Aug 14, 2025

💡 Verification agent

🧩 Analysis chain

Broaden AMQP linkcheck ignore to cover non-www, subpaths, and no-trailing-slash variants

The current regex only ignores exactly https://www.amqp.org/. Links like https://amqp.org, https://www.amqp.org (no slash), or any subpage (e.g., https://www.amqp.org/specs/...) will still be checked and may fail with the same connection issue.

Update the pattern to cover both host variants and arbitrary paths.

Apply this diff:

-    r"https://www.amqp.org/",
+    r"https://(www\.)?amqp\.org(/.*)?",

Optional verification script to confirm which AMQP links exist in the docs and ensure they’ll be ignored by the new pattern:


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "All occurrences of amqp.org links:"
rg -nH "https?://(www\.)?amqp\.org[^)\s'\"]*" || true

echo
echo "Occurrences specifically missing trailing slash:"
rg -nH "https?://(www\.)?amqp\.org($|[^/])" || true

Length of output: 424


Broaden AMQP linkcheck ignore to cover non-www, subpaths, and no-trailing-slash variants

The repo has AMQP links; the current ignore only matches the exact https://www.amqp.org/ and should be broadened.

Files to note:

  • docs/conf.py:65 — replace the exact pattern
  • docs/integrate/amqp/index.md:53 — contains the AMQP reference link (informational)

Apply this diff:

-    r"https://www.amqp.org/",
+    r"^https://(www\.)?amqp\.org(/.*)?$",

Verification: ran the suggested grep — occurrences found at docs/conf.py:65 and docs/integrate/amqp/index.md:53; no other variants were present.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
     # Failed to establish a new connection: [Errno 111] Connection refused
-    r"https://www.amqp.org/",
+    r"^https://(www\.)?amqp\.org(/.*)?$",
 ]
🤖 Prompt for AI Agents
In docs/conf.py around lines 64 to 66, the AMQP linkcheck ignore currently
matches only the exact "https://www.amqp.org/"; replace that entry with a
broader regex that matches http or https, optional "www.", and optional
subpaths/trailing slash (for example use a raw regex like
^https?://(www\.)?amqp\.org(/.*)?$) so all non-www, subpath, and
no-trailing-slash variants are ignored; no code changes needed in
docs/integrate/amqp/index.md beyond noting the link remains informational.

Member Author

@amotl amotl Aug 14, 2025

@coderabbitai: Apparently it's okay. Please do not over-engineer such nitpicks in this case. In general, we are happy about your diligence, so please just understand it is sometimes not applicable, specifically in a repository which is mostly about documentation instead of serious production code.

@amotl Thank you for that valuable feedback! I completely understand and appreciate the guidance. You're absolutely right that a simple r"https://www.amqp.org/" pattern is perfectly adequate for this documentation context, especially when it's already working as intended.

I'll keep in mind that documentation repositories often benefit from straightforward, practical solutions rather than comprehensive edge-case handling. Your point about balancing diligence with context-appropriate analysis is well taken - not every scenario requires production-level robustness.

Thanks for the constructive feedback while still acknowledging the intent behind the thoroughness. This helps me better calibrate my suggestions to match the repository's actual needs.

 (\   /)
( ._.)

o_(")(") learned something new


✏️ Learnings added
Learnt from: amotl
PR: crate/cratedb-guide#248
File: docs/conf.py:64-66
Timestamp: 2025-08-14T12:52:11.919Z
Learning: In documentation repositories, avoid over-engineering simple solutions like linkcheck ignore patterns. Simple, working patterns are often preferable to comprehensive regex patterns that handle all edge cases, especially when the simpler version is already functioning correctly.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
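For reference, the behaviour of the two patterns discussed in this thread can be checked with a short sketch. It assumes that Sphinx's linkcheck builder compiles each `linkcheck_ignore` entry and applies it with `re.match` semantics, that is, anchored at the start of the URI but not at the end. All sample URLs except `https://www.amqp.org/` are illustrative.

```python
import re

# The pattern merged in this PR and the broader one suggested in review.
SIMPLE = re.compile(r"https://www.amqp.org/")
BROAD = re.compile(r"^https://(www\.)?amqp\.org(/.*)?$")

# Illustrative URLs; only the first one appears in the docs.
URLS = [
    "https://www.amqp.org/",
    "https://www.amqp.org/specs",
    "https://www.amqp.org",
    "https://amqp.org/",
]


def is_ignored(pattern: re.Pattern, url: str) -> bool:
    # Assumption: linkcheck calls `pattern.match(url)`, i.e. a prefix match.
    return pattern.match(url) is not None


# Maps each URL to (ignored by SIMPLE, ignored by BROAD).
RESULTS = {url: (is_ignored(SIMPLE, url), is_ignored(BROAD, url)) for url in URLS}

for url, (simple, broad) in RESULTS.items():
    print(f"{url:30} simple={simple!s:5} broad={broad}")
```

Under that assumption, the simple pattern already covers subpaths of `https://www.amqp.org/`. It misses only the non-`www` host and the form without a trailing slash, and the verification above found neither of those in the docs.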


linkcheck_anchors_ignore_for_url += [
8 changes: 8 additions & 0 deletions docs/ingest/etl/index.md
@@ -132,6 +132,14 @@ Load data from database systems.


::::{grid-item-card} {material-outlined}`fast_forward;2em` Streams

- {ref}`amqp`

The AMQP protocol is an open standard application layer protocol for
message-oriented middleware. The defining features of AMQP are message
orientation, queuing, routing (including point-to-point and
publish-and-subscribe), reliability, and security.

- {ref}`kafka`

Apache Kafka is an open-source distributed event streaming platform
53 changes: 53 additions & 0 deletions docs/integrate/amqp/index.md
@@ -0,0 +1,53 @@
(amqp)=
# AMQP

```{div} .float-right
[![AMQP logo](https://www.cleo.com/sites/default/files/styles/desktop_664_270_scale/public/2023-12/amqp-logo.png.webp){width=180px loading=lazy}][AMQP]
```
```{div} .clearfix
```

:::{rubric} About
:::

The [AMQP] protocol is an open standard application layer protocol for
message-oriented middleware. The defining features of AMQP are message
orientation, queuing, routing (including point-to-point and
publish-and-subscribe), reliability, and security.

:::{rubric} Synopsis
:::

Use LorryStream to receive JSON data from an AMQP queue, continuously loading
records into CrateDB.
```shell
uvx --from=lorrystream lorry relay \
  "amqp://guest:guest@localhost:5672/%2F?queue=testdrive&content-type=json" \
  "crate://localhost/?table=testdrive"
```

:::{rubric} Learn
:::

[LorryStream] is a lightweight and polyglot stream-processing library, used as a
data backplane, message relay, or pipeline subsystem.

::::{grid}

:::{grid-item-card} Tutorial: Use LorryStream
:link: amqp-tutorial
:link-type: ref
How to load data from AMQP into CrateDB using LorryStream.
:::

::::

:::{toctree}
:maxdepth: 1
:hidden:
Tutorial <tutorial>
:::


[LorryStream]: https://lorrystream.readthedocs.io/
[AMQP]: https://www.amqp.org/
Member

I assume the link is correct as Wikipedia also points to it. Currently, the website has a configuration error 🤯.

This comment is safe to ignore 😄

Member Author

Oha. It worked yesterday, and hopefully it will start working again soon.

-- https://www.amqp.org/

Member Author

Well, the link checker still needed a fix. 926e28a it is.

96 changes: 96 additions & 0 deletions docs/integrate/amqp/tutorial.md
@@ -0,0 +1,96 @@
(amqp-tutorial)=

# Load data from an AMQP queue into CrateDB

The tutorial will walk you through starting the [RabbitMQ] AMQP broker
and CrateDB, publishing JSON data to an AMQP queue, consuming and relaying
it into a CrateDB table continuously, and validating that the data has
been stored successfully.
The data transfer is supported by the [LorryStream AMQP source] data
pipeline element.

## Prerequisites

Docker is used for running all components. This approach works consistently
across Linux, macOS, and Windows.

Alternatively, you can use Podman. You can also use a different AMQP broker such as
Apache Qpid, Apache ActiveMQ, IBM MQ, or Solace. Azure Event Hubs and Azure Service
Bus speak AMQP as well, but with protocol and authentication specifics; adjust
settings accordingly.

Create a shared network.
```shell
docker network create cratedb-demo
```

Start CrateDB.
```shell
docker run --name=cratedb --rm --network=cratedb-demo \
  --publish=4200:4200 --publish=5432:5432 \
  --env=CRATE_HEAP_SIZE=2g docker.io/crate -Cdiscovery.type=single-node
```

Start RabbitMQ.
```shell
docker run --name=rabbitmq --rm --network=cratedb-demo \
  --publish=5672:5672 docker.io/rabbitmq:3
```
> Note: This broker configuration allows anonymous access for demonstration purposes only.
> Do not expose it to untrusted networks. For production, configure authentication/TLS.
Comment on lines +36 to +41

@coderabbitai coderabbitai bot Aug 14, 2025

🛠️ Refactor suggestion

⚠️ Potential issue

Guest user cannot connect remotely; secure RabbitMQ and avoid exposing 5672 unnecessarily.

By default, RabbitMQ disallows remote connections for the guest user. Also, publishing 5672 to the host isn’t needed for intra-network communication and increases exposure. Create a non-guest user via env vars and remove host port publishing; then update URIs accordingly. Also, adjust the note to reflect credentials/TLS guidance rather than “anonymous access.”

Apply these diffs:

-docker run --name=rabbitmq --rm --network=cratedb-demo \
-  --publish=5672:5672 docker.io/rabbitmq:3
+docker run --name=rabbitmq --rm --network=cratedb-demo \
+  -e RABBITMQ_DEFAULT_USER=crate \
+  -e RABBITMQ_DEFAULT_PASS=crate \
+  -e RABBITMQ_DEFAULT_VHOST=/ \
+  docker.io/rabbitmq:3
-> Note: This broker configuration allows anonymous access for demonstration purposes only.
-> Do not expose it to untrusted networks. For production, configure authentication/TLS.
+> Note: This example uses plaintext credentials and no TLS for demo purposes only.
+> Do not expose the broker to untrusted networks. For production, use non-default
+> credentials, enable TLS, and restrict network access.
-lorry relay \
-  "amqp://guest:guest@rabbitmq:5672/%2F?exchange=default&queue=default&routing-key=testdrive&setup=exchange,queue,bind&content-type=json" \
-  "crate://cratedb/?table=testdrive"
+lorry relay \
+  "amqp://crate:crate@rabbitmq:5672/%2F?exchange=default&queue=default&routing-key=testdrive&setup=exchange,queue,bind&content-type=json" \
+  "crate://cratedb/?table=testdrive"
-echo '{"temperature": 42.84, "humidity": 83.1}' | \
-  amqpcat --producer --uri='amqp://guest:guest@rabbitmq:5672/%2F' \
+echo '{"temperature": 42.84, "humidity": 83.1}' | \
+  amqpcat --producer --uri='amqp://crate:crate@rabbitmq:5672/%2F' \
     --exchange=default --queue=default --routing-key=testdrive

Also applies to: 78-80, 83-87

🤖 Prompt for AI Agents
In docs/integrate/amqp/tutorial.md around lines 36-41 (and also update lines
78-80 and 83-87), the example starts a RabbitMQ container that exposes the guest
user and publishes port 5672; change it to create a non-guest default user via
environment variables (RABBITMQ_DEFAULT_USER and RABBITMQ_DEFAULT_PASS), remove
the host port publishing (-p 5672:5672) so the broker remains accessible only on
the Docker network, and update any example connection URIs to use the new
username/password and the container hostname (e.g., rabbitmq:5672) instead of
localhost:5672; finally, rewrite the note to advise configuring credentials and
TLS for production (remove wording about “anonymous access” and explicitly
recommend non-guest users, strong passwords, and TLS) so the doc reflects secure
defaults.

Member Author

@amotl amotl Aug 14, 2025

@coderabbitai: This is a code example / tech demo, with a different focus than on security, like many code snippets in this repository. Please remember to be more lenient about such details in this very context in the future -- thanks!

@amotl, understood! You're absolutely right - in the context of tutorials and tech demos, simplicity and ease of following along should take precedence over security hardening. I'll be more mindful of this balance in educational content, focusing my suggestions on the core functionality rather than production-ready security practices.

The current approach with guest credentials and straightforward setup makes perfect sense for helping users quickly grasp the AMQP integration concepts without getting bogged down in security configuration details.

Thanks for the clarification about the repository's documentation philosophy!


✏️ Learnings added
Learnt from: amotl
PR: crate/cratedb-guide#248
File: docs/integrate/amqp/tutorial.md:36-41
Timestamp: 2025-08-14T12:42:12.723Z
Learning: In the cratedb-guide repository, code examples and tutorials prioritize simplicity and ease of following over security hardening. Security recommendations should be more lenient in tutorial/demo contexts to avoid distracting from the main learning objectives.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Prepare shortcuts for the CrateDB shell, LorryStream, and the AMQP client
programs.

::::{tab-set}

:::{tab-item} Linux and macOS
To make the settings persistent, add them to your shell profile (e.g., `~/.profile` or `~/.zshrc`).
```shell
alias crash="docker run --rm -it --network=cratedb-demo ghcr.io/crate/cratedb-toolkit crash"
alias lorry="docker run --rm -i --network=cratedb-demo ghcr.io/daq-tools/lorrystream lorry"
alias amqpcat="docker run --rm -i --network=cratedb-demo docker.io/cloudamqp/amqpcat amqpcat"
```
:::
:::{tab-item} Windows PowerShell
To make the settings persistent, add them to your PowerShell profile (`$PROFILE`).
```powershell
function crash { docker run --rm -it --network=cratedb-demo ghcr.io/crate/cratedb-toolkit crash @args }
function lorry { docker run --rm -i --network=cratedb-demo ghcr.io/daq-tools/lorrystream lorry @args }
function amqpcat { docker run --rm -i --network=cratedb-demo docker.io/cloudamqp/amqpcat amqpcat @args }
```
:::
:::{tab-item} Windows Command
```shell
doskey crash=docker run --rm -it --network=cratedb-demo ghcr.io/crate/cratedb-toolkit crash $*
doskey lorry=docker run --rm -i --network=cratedb-demo ghcr.io/daq-tools/lorrystream lorry $*
doskey amqpcat=docker run --rm -i --network=cratedb-demo docker.io/cloudamqp/amqpcat amqpcat $*
```
:::

::::

## Usage

Invoke the data transfer pipeline.
```shell
lorry relay \
  "amqp://guest:guest@rabbitmq:5672/%2F?exchange=default&queue=default&routing-key=testdrive&setup=exchange,queue,bind&content-type=json" \
  "crate://cratedb/?table=testdrive"
```
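To see what the source URI above carries, it can be taken apart with Python's standard library. This is only a sketch for reading the URI; how LorryStream interprets each query parameter is described in the [LorryStream AMQP source] documentation and is not reproduced here.

```python
from urllib.parse import parse_qs, unquote, urlsplit

# The source URI from the `lorry relay` command, split for readability.
SOURCE_URI = (
    "amqp://guest:guest@rabbitmq:5672/%2F"
    "?exchange=default&queue=default&routing-key=testdrive"
    "&setup=exchange,queue,bind&content-type=json"
)

parts = urlsplit(SOURCE_URI)

# The path segment after the leading slash is the percent-encoded
# virtual host; `%2F` decodes to `/`, RabbitMQ's default vhost.
vhost = unquote(parts.path[1:])

# Each query parameter occurs once here, so keep only the first value.
options = {key: values[0] for key, values in parse_qs(parts.query).items()}

print("user: ", parts.username)
print("host: ", parts.hostname)
print("port: ", parts.port)
print("vhost:", vhost)
for key, value in options.items():
    print(f"{key} = {value}")
```

The host name `rabbitmq` resolves only inside the `cratedb-demo` Docker network, which is why the `lorry` shortcut runs its container on that network.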

Publish a JSON message to AMQP.
```shell
echo '{"temperature": 42.84, "humidity": 83.1}' | \
  amqpcat --producer --uri='amqp://guest:guest@rabbitmq:5672/%2F' \
  --exchange=default --queue=default --routing-key=testdrive
```
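To publish more than one record, a small generator can emit one JSON document per line. This sketch assumes that `amqpcat --producer` publishes each line read from standard input as a separate message. The field names are taken from the message above; the value ranges and helper names are hypothetical.

```python
import json
import random


def make_reading(rng: random.Random) -> dict:
    # Field names match the example message; the value ranges are made up.
    return {
        "temperature": round(rng.uniform(15.0, 45.0), 2),
        "humidity": round(rng.uniform(30.0, 90.0), 1),
    }


def make_lines(count: int, seed: int = 42) -> list:
    # A fixed seed keeps the output reproducible between runs.
    rng = random.Random(seed)
    return [json.dumps(make_reading(rng)) for _ in range(count)]


# Five sample messages, one compact JSON document per line.
LINES = make_lines(5)
for line in LINES:
    print(line)
```

Saved as a script, its output could be piped into the `amqpcat --producer` command shown above in place of the `echo` statement.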

Inspect data stored in CrateDB.
```shell
crash --hosts cratedb -c "SELECT * FROM testdrive"
```
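The same check can be scripted against CrateDB's HTTP endpoint. The sketch below assumes it runs on the Docker host, where port 4200 is published by the `docker run` command from the prerequisites, and it posts the statement to the `/_sql` endpoint. The helper names are illustrative.

```python
import json
from urllib.request import Request, urlopen

# Assumption: CrateDB's HTTP port 4200 is published on the Docker host.
CRATEDB_URL = "http://localhost:4200/_sql"


def build_request(stmt: str, url: str = CRATEDB_URL) -> Request:
    # The endpoint accepts a JSON body with the SQL statement in `stmt`.
    body = json.dumps({"stmt": stmt}).encode("utf-8")
    return Request(url, data=body, headers={"Content-Type": "application/json"})


def run_query(stmt: str) -> list:
    # Sends the statement and returns the `rows` list of the JSON response.
    with urlopen(build_request(stmt), timeout=10) as response:
        return json.load(response)["rows"]


if __name__ == "__main__":
    try:
        for row in run_query("SELECT * FROM testdrive"):
            print(row)
    except (OSError, ValueError) as error:
        # OSError covers connection and HTTP errors; ValueError covers
        # a response body that is not valid JSON.
        print(f"Could not query CrateDB: {error}")
```

Because the request carries a body, `urllib` sends it as a POST request without further configuration.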


[LorryStream AMQP source]: https://lorrystream.readthedocs.io/source/amqp.html
[RabbitMQ]: https://www.rabbitmq.com/
1 change: 1 addition & 0 deletions docs/integrate/index.md
@@ -17,6 +17,7 @@ Please also visit the [Overview of CrateDB integration tutorials].
:maxdepth: 1

airflow/index
amqp/index
aws-lambda/index
azure-functions/index
cluvio/index
1 change: 1 addition & 0 deletions docs/integrate/mongodb/tutorial.md
@@ -1,4 +1,5 @@
(mongodb-tutorial)=
(migrating-mongodb)=
# Import data from MongoDB

In this quick tutorial, you'll use the [CrateDB Toolkit MongoDB I/O subsystem]