Less "loading". Re-organize Kafka and Node-RED sections. #464

amotl · 2025-10-30T18:04:41Z

Use "Load", "inserting", and "process events" instead of "data loading".

coderabbitai · 2025-10-30T18:05:08Z

Walkthrough

Documentation updates reorganize and expand Kafka and MQTT integration guides. The Kafka guide adds a Python consumer tutorial with code examples for processing events. The MQTT guide restructures content, adds CrateDB provisioning details, and expands consuming workflows with batching guidance and security notes. A minor wording adjustment appears in performance documentation.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Kafka integration guide `docs/integrate/kafka/docker-python.md` | Renamed section header; added "Process events" section with subsections for submitting events to Kafka and consuming events into CrateDB; includes a Python consumer code example (`quick_consumer.py`) with batching and bulk-insert logic, plus setup instructions |
| MQTT integration guide `docs/integrate/node-red/mqtt-tutorial.md` | Restructured title; removed "Producing data" section; added "Provision CrateDB" section with table creation and OBJECT type guidance; renamed "Consuming and ingesting data" to "Consume messages into CrateDB" with expanded workflow details including batching and multi-value INSERTs; added inline security note on parameterized queries |
| Performance documentation `docs/performance/inserts/tuning.md` | Changed wording in "Calculating statistics" subsection from "loading" to "inserting" larger amounts of data |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Documentation restructuring and content reorganization with no logic or control flow changes
Code examples are straightforward (consumer pattern with batching and HTTP inserts)
Minor editorial wording update requires minimal attention

Possibly related PRs

PR Integrate/Kafka: Rework section. Add Python tutorial. #263: Also modifies the Kafka integration guide with Python consumer tutorial additions
PR Integrate: Improve guidance for data import / ingest / load #462: Modifies the same Kafka documentation file with heading reformatting
PR Node-RED: MQTT tutorial #295: Modifies the MQTT tutorial with overlapping changes to producing/consuming workflows and provisioning

Suggested labels

reorganize

Suggested reviewers

kneth
bmunkholm
surister

Poem

🐰 A rabbit hops through docs with glee,
Reorganizing guides so all can see,
Kafka flows and MQTT streams so bright,
With Python consumers and provisioning might,
From produce to consume—the integration's right!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title Check | ✅ Passed | The pull request title "Less 'loading'. Re-organize Kafka and Node-RED sections." clearly captures the two main aspects of the changeset. It specifically references reducing use of the "loading" terminology (evident in the wording change from "loading" to "inserting" in the performance tuning docs) and the reorganization of the Kafka and Node-RED integration documentation. The title is concise, specific enough that a teammate scanning history would understand the primary changes, and avoids vague language. It accurately represents the scope and intent of the modifications across all three affected files. |
| Description Check | ✅ Passed | The pull request description directly relates to the changeset by listing two specific objectives with linked commits. The first objective about "Naming things: Use less 'loading' and 'ingesting' in other headlines" aligns with the wording change in the performance tuning documentation. The second objective about "Integrations: Slightly improve / re-organize Kafka and Node-RED sections" aligns with the structural reorganization of both the Kafka and Node-RED documentation files. The description is meaningful and conveys concrete information about what the changes accomplish, rather than being vague or generic. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6ef920c and 4e1f287.

📒 Files selected for processing (3)

docs/integrate/kafka/docker-python.md (2 hunks)
docs/integrate/node-red/mqtt-tutorial.md (3 hunks)
docs/performance/inserts/tuning.md (1 hunk)

🧰 Additional context used

🧠 Learnings (9)

📓 Common learnings

Learnt from: amotl
PR: crate/cratedb-guide#420
File: docs/connect/erlang.md:100-113
Timestamp: 2025-10-19T19:21:49.864Z
Learning: In the cratedb-guide repository, when documenting cloud or alternative connection methods, show only the differences from the canonical example (e.g., SSL configuration) rather than repeating the entire workflow (query execution, cleanup, etc.). This pattern is used consistently across pages to avoid repetition.

Learnt from: amotl
PR: crate/cratedb-guide#0
File: :0-0
Timestamp: 2025-10-06T16:40:16.322Z
Learning: In the crate/cratedb-guide repository, documentation is organized by content type: files in `docs/feature/`, `docs/integrate/`, and `docs/ingest/` are reference material designed for quick access, while `docs/start/` contains learning path content. Reference material sections use top-right reference layouts for efficient navigation.

Learnt from: amotl
PR: crate/cratedb-guide#263
File: docs/integrate/kafka/docker-python.md:112-120
Timestamp: 2025-08-22T18:11:12.776Z
Learning: In documentation and tutorial repositories like cratedb-guide, code examples should prioritize simplicity, clarity, and educational value over production-ready features. Comprehensive error handling, extensive validation, and other production concerns can distract from the main learning objectives and make examples harder to follow. Review suggestions should focus on correctness and clarity rather than production hardening.

📚 Learning: 2025-10-19T19:21:49.864Z

Learnt from: amotl
PR: crate/cratedb-guide#420
File: docs/connect/erlang.md:100-113
Timestamp: 2025-10-19T19:21:49.864Z
Learning: In the cratedb-guide repository, when documenting cloud or alternative connection methods, show only the differences from the canonical example (e.g., SSL configuration) rather than repeating the entire workflow (query execution, cleanup, etc.). This pattern is used consistently across pages to avoid repetition.

Applied to files:

docs/integrate/node-red/mqtt-tutorial.md

📚 Learning: 2025-08-15T18:56:11.508Z

Learnt from: amotl
PR: crate/cratedb-guide#255
File: docs/integrate/influxdb/tutorial.md:120-123
Timestamp: 2025-08-15T18:56:11.508Z
Learning: In CrateDB integration tutorials, when using `ctk load table` with `--cluster-url="crate://userhost:port/schema/table"`, the verification query should reference the target schema.table combination specified in the cluster URL, not derive it from the source system's naming. For example, if loading from InfluxDB bucket/measurement `testdrive/demo` to CrateDB `doc/testdrive`, the verification query should be `SELECT * FROM doc.testdrive`.

Applied to files:

docs/integrate/node-red/mqtt-tutorial.md
docs/integrate/kafka/docker-python.md

📚 Learning: 2025-09-25T19:31:54.320Z

Learnt from: bmunkholm
PR: crate/cratedb-guide#340
File: docs/home/index.md:84-97
Timestamp: 2025-09-25T19:31:54.320Z
Learning: In the CrateDB Guide docs (MyST), the CrateDB Cloud card on the homepage should link to `getting-started` using `:link-type: ref` instead of the previous `cloud:index` intersphinx target. This change was implemented in PR #340 to direct users to the getting started section rather than directly to the Cloud documentation.

Applied to files:

docs/integrate/node-red/mqtt-tutorial.md

📚 Learning: 2025-08-14T19:02:43.308Z

Learnt from: amotl
PR: crate/cratedb-guide#253
File: docs/integrate/mindsdb/index.md:21-33
Timestamp: 2025-08-14T19:02:43.308Z
Learning: In CrateDB integration documentation examples, default connection parameters (user="crate", password="", host="127.0.0.1") are intentionally used to provide working out-of-the-box examples for users with local CrateDB instances, rather than using placeholder values that require customization.

Applied to files:

docs/integrate/node-red/mqtt-tutorial.md
docs/integrate/kafka/docker-python.md

📚 Learning: 2025-10-06T16:40:16.322Z

Learnt from: amotl
PR: crate/cratedb-guide#0
File: :0-0
Timestamp: 2025-10-06T16:40:16.322Z
Learning: In the crate/cratedb-guide repository, documentation is organized by content type: files in `docs/feature/`, `docs/integrate/`, and `docs/ingest/` are reference material designed for quick access, while `docs/start/` contains learning path content. Reference material sections use top-right reference layouts for efficient navigation.

Applied to files:

docs/integrate/node-red/mqtt-tutorial.md
docs/integrate/kafka/docker-python.md

📚 Learning: 2025-08-15T22:25:48.433Z

Learnt from: amotl
PR: crate/cratedb-guide#226
File: docs/integrate/queryzen/index.md:51-51
Timestamp: 2025-08-15T22:25:48.433Z
Learning: CrateDB provides a built-in `sys.summits` table that is suitable for demonstration purposes of any kind. This is a system table that can be used in examples and tutorials.

Applied to files:

docs/integrate/node-red/mqtt-tutorial.md
docs/integrate/kafka/docker-python.md

📚 Learning: 2025-10-08T01:34:18.867Z

Learnt from: amotl
PR: crate/cratedb-guide#385
File: docs/connect/java.md:48-51
Timestamp: 2025-10-08T01:34:18.867Z
Learning: CrateDB JDBC driver uses the `jdbc:crate://` protocol scheme but communicates via the PostgreSQL wire protocol on port 5432, just like the PostgreSQL JDBC driver (`jdbc:postgresql://`). Do not confuse the `jdbc:crate://` scheme with other protocol schemes like `crate://` (used by SQLAlchemy dialect for CrateDB) or assume it uses the HTTP endpoint on port 4200.

Applied to files:

docs/integrate/node-red/mqtt-tutorial.md

📚 Learning: 2025-10-16T10:58:18.401Z

Learnt from: amotl
PR: crate/cratedb-guide#408
File: docs/connect/elixir.md:18-22
Timestamp: 2025-10-16T10:58:18.401Z
Learning: When connecting to CrateDB using PostgreSQL-compatible drivers like Postgrex (Elixir), the `database` parameter handling may differ from standard PostgreSQL usage because CrateDB has no notion of databases like PostgreSQL does. CrateDB uses schemas to separate concerns instead. The `database` parameter might need to be omitted or handled differently to avoid connection issues.

Applied to files:

docs/integrate/node-red/mqtt-tutorial.md

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Build docs

🔇 Additional comments (7)

docs/performance/inserts/tuning.md (1)

120-120: Improved terminology aligns with insert-focused workflows.

The change from "loading" to "inserting" is more precise and aligns with the emphasis on bulk-insert patterns in related integration docs.

docs/integrate/kafka/docker-python.md (2)

97-140: Excellent tutorial example demonstrating batching and idempotent bulk inserts.

The consumer code clearly illustrates key patterns: batching for performance, bulk-insert via HTTP, and using ON CONFLICT for idempotent upserts. Error handling and resource cleanup are appropriate for the tutorial scope.
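For readers of this review, the batching and idempotent bulk-insert pattern described above can be sketched roughly as follows. This is not the tutorial's `quick_consumer.py`. The table `doc.sensor_readings`, its columns, and the helper names `batched` and `build_bulk_request` are assumptions made for illustration, and the sketch only builds the JSON body that would be posted to CrateDB's HTTP `/_sql` endpoint, without talking to Kafka or CrateDB.

```python
import json
from itertools import islice
from typing import Any, Iterable, Iterator

# Hypothetical table and columns, for illustration only. The ON CONFLICT
# clause assumes a primary key on (device_id, ts) in the target table.
UPSERT_STMT = (
    "INSERT INTO doc.sensor_readings (device_id, ts, temperature) "
    "VALUES (?, ?, ?) "
    "ON CONFLICT (device_id, ts) DO UPDATE SET temperature = excluded.temperature"
)


def batched(events: Iterable[dict[str, Any]], size: int) -> Iterator[list[dict[str, Any]]]:
    """Yield lists of up to `size` events from any iterable of events."""
    iterator = iter(events)
    while batch := list(islice(iterator, size)):
        yield batch


def build_bulk_request(batch: list[dict[str, Any]]) -> str:
    """Serialize one batch as a JSON body with `stmt` and `bulk_args`."""
    # One inner list per row; values are bound to the `?` placeholders in order.
    bulk_args = [[e["device_id"], e["ts"], e["temperature"]] for e in batch]
    return json.dumps({"stmt": UPSERT_STMT, "bulk_args": bulk_args})
```

Because the statement is an upsert keyed on the assumed primary key, re-sending the same batch after a consumer restart would overwrite rows rather than duplicate them, which is what makes the insert idempotent.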

80-92: Well-structured flow from event submission to consumption.

The new "Process events" section logically separates concerns and complements the provisioning steps above. The newline-delimited JSON format is practical for the docker exec example and simplifies the consumer parsing.

docs/integrate/node-red/mqtt-tutorial.md (4)

2-2: Title change clarifies the focus on consuming/loading data into CrateDB.

The shift from "Ingesting" to "Load" aligns with the PR's terminology goals and better reflects the tutorial's emphasis on the consumption workflow.

26-38: Provision section establishes clear target schema and architectural guidance.

Moving CrateDB setup to the front improves clarity. The OBJECT(DYNAMIC) explanation and reference to partitioning guide provide helpful context for production use.

52-65: Clear workflow steps emphasize batching for efficiency.

The four-step breakdown (HiveMQ → join → function → postgresql) effectively explains the batching strategy. Tuning guidance for array size and timeout is practical and addresses performance/latency tradeoffs.
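The array-size and timeout tradeoff mentioned above can be illustrated outside Node-RED with a small Python sketch. It only loosely mirrors what a join node does: the class name `Batcher` and its parameters `max_size` and `max_age_s` are hypothetical, and the clock is injectable so the behavior can be checked without waiting.

```python
import time
from typing import Any, Callable, Optional


class Batcher:
    """Collect messages and release them as one batch when either the
    batch is full (max_size) or its first message is too old (max_age_s)."""

    def __init__(self, max_size: int, max_age_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_size = max_size
        self.max_age_s = max_age_s
        self.clock = clock
        self._items: list[Any] = []
        self._started: float = 0.0

    def add(self, item: Any) -> Optional[list[Any]]:
        """Add one message; return a batch if it is due, else None."""
        if not self._items:
            # The timeout is measured from the first message of the batch.
            self._started = self.clock()
        self._items.append(item)
        return self._flush_if_due()

    def tick(self) -> Optional[list[Any]]:
        """Call periodically so a partial batch is released after the timeout."""
        return self._flush_if_due()

    def _flush_if_due(self) -> Optional[list[Any]]:
        if not self._items:
            return None
        full = len(self._items) >= self.max_size
        stale = self.clock() - self._started >= self.max_age_s
        if full or stale:
            batch, self._items = self._items, []
            return batch
        return None
```

A larger `max_size` means fewer, bigger INSERTs and higher throughput, while a smaller `max_age_s` bounds how long a message can wait before it is written.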

67-71: Security note strengthens guidance on safe data ingestion.

The explicit recommendation for parameterized queries and warning about escaping is valuable for practitioners building on this example. Positioning after the workflow steps keeps the security context clear.
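As a rough illustration of the parameterized-query recommendation, the following Python sketch builds a multi-value INSERT with PostgreSQL-style numbered placeholders and a flat parameter list, so that message values never become part of the SQL text. The helper name `build_multi_value_insert` and the table and column names in the usage note are assumptions for illustration; how the statement and parameters are handed to the Node-RED node is not shown here.

```python
from typing import Any, Sequence


def build_multi_value_insert(
    table: str, columns: Sequence[str], rows: Sequence[Sequence[Any]]
) -> tuple[str, list[Any]]:
    """Return (statement, params) for a multi-value INSERT.

    `table` and `columns` must be trusted constants, not message content:
    only the row values are passed as parameters.
    """
    if not rows:
        raise ValueError("at least one row is required")
    width = len(columns)
    groups: list[str] = []
    params: list[Any] = []
    for row in rows:
        if len(row) != width:
            raise ValueError("row length does not match column count")
        # Placeholders are numbered continuously across all rows: $1, $2, ...
        start = len(params) + 1
        placeholders = ", ".join(f"${i}" for i in range(start, start + width))
        groups.append(f"({placeholders})")
        params.extend(row)
    stmt = f"INSERT INTO {table} ({', '.join(columns)}) VALUES {', '.join(groups)}"
    return stmt, params
```

For example, calling it with a hypothetical table `doc.mqtt_messages`, columns `topic` and `payload`, and two rows produces a statement ending in `VALUES ($1, $2), ($3, $4)` plus four parameters. A payload that contains quotes or SQL fragments stays in the parameter list and needs no escaping.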


amotl added 2 commits October 30, 2025 18:58

Naming things: Use less "loading" and "ingesting" in other headlines

81b8acb

Use "Load", "inserting", and "process events" instead of "data loading".

Integrations: Slightly improve / re-organize Kafka and Node-RED sections

4e1f287

amotl added the `sanding-1200` (Fine sanding.) and `maintenance` (C'est la vie.) labels Oct 30, 2025

amotl merged commit af7d264 into main Oct 30, 2025
3 checks passed

amotl deleted the less-loading branch October 30, 2025 18:08
