Merged
18 changes: 11 additions & 7 deletions docs/integrate/kafka/docker-python.md
@@ -50,7 +50,7 @@ docker compose up -d
* CrateDB Admin UI: `http://localhost:4200`
* Kafka broker (inside-compose hostname): kafka:9092

### Create a CrateDB table

The easiest way is to open the CrateDB Admin UI at `http://localhost:4200` and execute the statement in the console:

@@ -69,26 +69,30 @@
```sql
CREATE TABLE IF NOT EXISTS sensor_readings (
    device_id TEXT,
    ts TIMESTAMPTZ,
    temperature DOUBLE PRECISION,
    humidity DOUBLE PRECISION,
    PRIMARY KEY (device_id, ts)
);
```

But this can also be done using `curl`:

```bash
curl -sS -H 'Content-Type: application/json' -X POST http://localhost:4200/_sql -d '{"stmt":"CREATE TABLE IF NOT EXISTS sensor_readings (device_id TEXT, ts TIMESTAMPTZ, temperature DOUBLE PRECISION, humidity DOUBLE PRECISION, PRIMARY KEY (device_id, ts))"}'
```
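The same statement can also be submitted from Python using only the standard library; the request below targets the same `/_sql` endpoint as the `curl` call above:

```python
import json
from urllib import request

CREATE_TABLE = (
    "CREATE TABLE IF NOT EXISTS sensor_readings ("
    "device_id TEXT, ts TIMESTAMPTZ, "
    "temperature DOUBLE PRECISION, humidity DOUBLE PRECISION, "
    "PRIMARY KEY (device_id, ts))"
)

def sql_request(stmt: str, url: str = "http://localhost:4200/_sql") -> request.Request:
    """Build a POST request against CrateDB's HTTP endpoint."""
    body = json.dumps({"stmt": stmt}).encode("utf-8")
    return request.Request(url, data=body,
                           headers={"Content-Type": "application/json"})

# With CrateDB running at localhost:4200, execute it with:
#   request.urlopen(sql_request(CREATE_TABLE))
```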

### Create a Kafka topic

A Kafka topic can be created in several ways; here, we use `docker exec`:

```bash
docker exec -it kafka kafka-topics.sh --create --topic sensors --bootstrap-server kafka:9092 --partitions 3 --replication-factor 1
```

## Process events

### Submit events to Kafka
```bash
docker exec -it kafka kafka-console-producer.sh --bootstrap-server kafka:9092 --topic sensors <<'EOF'
{"device_id":"alpha","ts":"2025-08-19T12:00:00Z","temperature":21.4,"humidity":48.0}
{"device_id":"alpha","ts":"2025-08-19T12:01:00Z","temperature":21.5,"humidity":47.6}
{"device_id":"beta","ts":"2025-08-19T12:00:00Z","temperature":19.8,"humidity":55.1}
EOF
```

Messages are newline-delimited JSON for simplicity.
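The same events can also be produced from code. A minimal Python sketch (the `kafka-python` package and the host-side broker address are assumptions, not part of the compose stack above):

```python
import json

EVENTS = [
    {"device_id": "alpha", "ts": "2025-08-19T12:00:00Z", "temperature": 21.4, "humidity": 48.0},
    {"device_id": "alpha", "ts": "2025-08-19T12:01:00Z", "temperature": 21.5, "humidity": 47.6},
    {"device_id": "beta", "ts": "2025-08-19T12:00:00Z", "temperature": 19.8, "humidity": 55.1},
]

def to_ndjson(events: list) -> str:
    """Serialize events as newline-delimited JSON, one message per line."""
    return "\n".join(json.dumps(event, separators=(",", ":")) for event in events)

def produce(bootstrap: str = "localhost:9092") -> None:
    # Assumed third-party dependency: pip install kafka-python
    from kafka import KafkaProducer
    producer = KafkaProducer(bootstrap_servers=[bootstrap])
    for line in to_ndjson(EVENTS).splitlines():
        producer.send("sensors", line.encode("utf-8"))
    producer.flush()
```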

## Data loading
### Consume events into CrateDB

Create a simple consumer application using Python. It consumes events from the
Kafka topic and inserts them into the CrateDB database table.

```python
# quick_consumer.py
# (full consumer code truncated in this diff view)
```
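The full `quick_consumer.py` is not visible in this diff. A rough sketch of such a consumer, assuming the `kafka-python` and `crate` client packages (both hypothetical choices here), with the parsing logic in plain Python:

```python
import json

INSERT_SQL = (
    "INSERT INTO sensor_readings (device_id, ts, temperature, humidity) "
    "VALUES (?, ?, ?, ?)"
)

def parse_event(raw: bytes) -> tuple:
    """Decode one newline-delimited JSON message into an insert row."""
    doc = json.loads(raw)
    return (doc["device_id"], doc["ts"], doc["temperature"], doc["humidity"])

def consume(bootstrap: str = "localhost:9092", crate: str = "localhost:4200") -> None:
    # Assumed third-party dependencies: pip install kafka-python crate
    from kafka import KafkaConsumer
    from crate import client

    consumer = KafkaConsumer("sensors", bootstrap_servers=[bootstrap],
                             auto_offset_reset="earliest")
    cursor = client.connect(crate).cursor()
    batch = []
    for message in consumer:
        batch.append(parse_event(message.value))
        if len(batch) >= 100:  # insert in batches rather than row by row
            cursor.executemany(INSERT_SQL, batch)
            batch.clear()
```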
30 changes: 16 additions & 14 deletions docs/integrate/node-red/mqtt-tutorial.md
@@ -1,5 +1,5 @@
(node-red-tutorial)=
# Load MQTT messages into CrateDB using Node-RED

:::{article-info}
---
@@ -23,19 +23,7 @@ You need:
2. The [node-red-contrib-postgresql](https://github.com/alexandrainst/node-red-contrib-postgresql) module installed.
3. A running MQTT broker. This tutorial uses [HiveMQ Cloud](https://www.hivemq.com/).

## Provision CrateDB

First, create the target table in CrateDB:
```sql
-- CREATE TABLE statement truncated in this diff view
```
@@ -49,6 +37,20 @@
Store the payload as CrateDB’s {ref}`OBJECT data type
<crate-reference:type-object>` to accommodate an evolving schema.
For production, also consider the {ref}`partitioning and sharding guide <sharding-partitioning>`.

## Publish messages to MQTT

First, generate data to populate the MQTT topic with Node-RED. If you already
have an MQTT topic with regular messages, you can skip this part.
![Node-RED flow publishing to an MQTT topic](https://us1.discourse-cdn.com/flex020/uploads/crate/original/1X/5722946039148ca6ce69702d963f9f842c4f972c.png){width=480px}

The `inject` node creates a JSON payload with three attributes:
![Inject node configuration building the JSON payload](https://us1.discourse-cdn.com/flex020/uploads/crate/original/1X/8084a53e544d681e79f85d780c621a340a7d0d30.png){width=480px}

In this example, two fields are static; only the timestamp changes.
Download the full workflow definition: [flows-producer.json](https://community.cratedb.com/uploads/short-url/eOvAk3XzDkRbNZjcZV0pZ0SnGu4.json) (1.3 KB)

## Consume messages into CrateDB

To ingest efficiently, group messages into batches and use
{ref}`multi-value INSERT statements <inserts-multiple-values>`
to avoid generating one INSERT per message:
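A batch of three messages then collapses into a single statement of roughly this shape (table and column names are illustrative):

```sql
INSERT INTO mqtt_data (ts, payload)
VALUES (?, ?), (?, ?), (?, ?);
```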
2 changes: 1 addition & 1 deletion docs/performance/inserts/tuning.md
@@ -117,7 +117,7 @@ value.

### Calculating statistics

After inserting larger amounts of data into new or existing tables, it is
recommended to re-calculate the statistics by executing the `ANALYZE`
command. The statistics will be used by the query optimizer to generate
better execution plans.
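In its simplest form, the statement takes no arguments and recomputes statistics for all tables:

```sql
ANALYZE;
```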