amotl added the labels sanding-1200 (Fine sanding.) and maintenance (C'est la vie.) on Oct 30, 2025

coderabbitai bot commented Oct 30, 2025

Walkthrough

Documentation updates reorganize and expand Kafka and MQTT integration guides. The Kafka guide adds a Python consumer tutorial with code examples for processing events. The MQTT guide restructures content, adds CrateDB provisioning details, and expands consuming workflows with batching guidance and security notes. A minor wording adjustment appears in performance documentation.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Kafka integration guide**<br>`docs/integrate/kafka/docker-python.md` | Renamed section header, added "Process events" section with subsections for submitting events to Kafka and consuming events into CrateDB, includes Python consumer code example (quick_consumer.py) with batching and bulk-insert logic, and setup instructions |
| **MQTT integration guide**<br>`docs/integrate/node-red/mqtt-tutorial.md` | Restructured title, removed "Producing data" section, added "Provision CrateDB" section with table creation and OBJECT type guidance, renamed "Consuming and ingesting data" to "Consume messages into CrateDB" with expanded workflow details including batching and multi-value INSERTs, added inline security note on parameterized queries |
| **Performance documentation**<br>`docs/performance/inserts/tuning.md` | Changed wording in Calculating statistics subsection from "loading" to "inserting" larger amounts of data |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

  • Documentation restructuring and content reorganization with no logic or control flow changes
  • Code examples are straightforward (consumer pattern with batching and HTTP inserts)
  • Minor editorial wording update requires minimal attention

Possibly related PRs

Suggested labels

reorganize

Suggested reviewers

  • kneth
  • bmunkholm
  • surister

Poem

🐰 A rabbit hops through docs with glee,
Reorganizing guides so all can see,
Kafka flows and MQTT streams so bright,
With Python consumers and provisioning might,
From produce to consume—the integration's right!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title Check | ✅ Passed | The pull request title "Less 'loading'. Re-organize Kafka and Node-RED sections." clearly captures the two main aspects of the changeset. It specifically references reducing use of the "loading" terminology (evident in the wording change from "loading" to "inserting" in the performance tuning docs) and the reorganization of the Kafka and Node-RED integration documentation. The title is concise, specific enough that a teammate scanning history would understand the primary changes, and avoids vague language. It accurately represents the scope and intent of the modifications across all three affected files. |
| Description Check | ✅ Passed | The pull request description directly relates to the changeset by listing two specific objectives with linked commits. The first objective about "Naming things: Use less 'loading' and 'ingesting' in other headlines" aligns with the wording change in the performance tuning documentation. The second objective about "Integrations: Slightly improve / re-organize Kafka and Node-RED sections" aligns with the structural reorganization of both the Kafka and Node-RED documentation files. The description is meaningful and conveys concrete information about what the changes accomplish, rather than being vague or generic. |
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch less-loading

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6ef920c and 4e1f287.

📒 Files selected for processing (3)
  • docs/integrate/kafka/docker-python.md (2 hunks)
  • docs/integrate/node-red/mqtt-tutorial.md (3 hunks)
  • docs/performance/inserts/tuning.md (1 hunk)
🧰 Additional context used
🧠 Learnings (9)
📓 Common learnings
Learnt from: amotl
PR: crate/cratedb-guide#420
File: docs/connect/erlang.md:100-113
Timestamp: 2025-10-19T19:21:49.864Z
Learning: In the cratedb-guide repository, when documenting cloud or alternative connection methods, show only the differences from the canonical example (e.g., SSL configuration) rather than repeating the entire workflow (query execution, cleanup, etc.). This pattern is used consistently across pages to avoid repetition.
Learnt from: amotl
PR: crate/cratedb-guide#0
File: :0-0
Timestamp: 2025-10-06T16:40:16.322Z
Learning: In the crate/cratedb-guide repository, documentation is organized by content type: files in `docs/feature/`, `docs/integrate/`, and `docs/ingest/` are reference material designed for quick access, while `docs/start/` contains learning path content. Reference material sections use top-right reference layouts for efficient navigation.
Learnt from: amotl
PR: crate/cratedb-guide#263
File: docs/integrate/kafka/docker-python.md:112-120
Timestamp: 2025-08-22T18:11:12.776Z
Learning: In documentation and tutorial repositories like cratedb-guide, code examples should prioritize simplicity, clarity, and educational value over production-ready features. Comprehensive error handling, extensive validation, and other production concerns can distract from the main learning objectives and make examples harder to follow. Review suggestions should focus on correctness and clarity rather than production hardening.
📚 Learning: 2025-10-19T19:21:49.864Z
Learnt from: amotl
PR: crate/cratedb-guide#420
File: docs/connect/erlang.md:100-113
Timestamp: 2025-10-19T19:21:49.864Z
Learning: In the cratedb-guide repository, when documenting cloud or alternative connection methods, show only the differences from the canonical example (e.g., SSL configuration) rather than repeating the entire workflow (query execution, cleanup, etc.). This pattern is used consistently across pages to avoid repetition.

Applied to files:

  • docs/integrate/node-red/mqtt-tutorial.md
📚 Learning: 2025-08-15T18:56:11.508Z
Learnt from: amotl
PR: crate/cratedb-guide#255
File: docs/integrate/influxdb/tutorial.md:120-123
Timestamp: 2025-08-15T18:56:11.508Z
Learning: In CrateDB integration tutorials, when using `ctk load table` with `--cluster-url="crate://userhost:port/schema/table"`, the verification query should reference the target schema.table combination specified in the cluster URL, not derive it from the source system's naming. For example, if loading from InfluxDB bucket/measurement `testdrive/demo` to CrateDB `doc/testdrive`, the verification query should be `SELECT * FROM doc.testdrive`.

Applied to files:

  • docs/integrate/node-red/mqtt-tutorial.md
  • docs/integrate/kafka/docker-python.md
📚 Learning: 2025-09-25T19:31:54.320Z
Learnt from: bmunkholm
PR: crate/cratedb-guide#340
File: docs/home/index.md:84-97
Timestamp: 2025-09-25T19:31:54.320Z
Learning: In the CrateDB Guide docs (MyST), the CrateDB Cloud card on the homepage should link to `getting-started` using `:link-type: ref` instead of the previous `cloud:index` intersphinx target. This change was implemented in PR #340 to direct users to the getting started section rather than directly to the Cloud documentation.

Applied to files:

  • docs/integrate/node-red/mqtt-tutorial.md
📚 Learning: 2025-08-14T19:02:43.308Z
Learnt from: amotl
PR: crate/cratedb-guide#253
File: docs/integrate/mindsdb/index.md:21-33
Timestamp: 2025-08-14T19:02:43.308Z
Learning: In CrateDB integration documentation examples, default connection parameters (user="crate", password="", host="127.0.0.1") are intentionally used to provide working out-of-the-box examples for users with local CrateDB instances, rather than using placeholder values that require customization.

Applied to files:

  • docs/integrate/node-red/mqtt-tutorial.md
  • docs/integrate/kafka/docker-python.md
📚 Learning: 2025-10-06T16:40:16.322Z
Learnt from: amotl
PR: crate/cratedb-guide#0
File: :0-0
Timestamp: 2025-10-06T16:40:16.322Z
Learning: In the crate/cratedb-guide repository, documentation is organized by content type: files in `docs/feature/`, `docs/integrate/`, and `docs/ingest/` are reference material designed for quick access, while `docs/start/` contains learning path content. Reference material sections use top-right reference layouts for efficient navigation.

Applied to files:

  • docs/integrate/node-red/mqtt-tutorial.md
  • docs/integrate/kafka/docker-python.md
📚 Learning: 2025-08-15T22:25:48.433Z
Learnt from: amotl
PR: crate/cratedb-guide#226
File: docs/integrate/queryzen/index.md:51-51
Timestamp: 2025-08-15T22:25:48.433Z
Learning: CrateDB provides a built-in `sys.summits` table that is suitable for demonstration purposes of any kind. This is a system table that can be used in examples and tutorials.

Applied to files:

  • docs/integrate/node-red/mqtt-tutorial.md
  • docs/integrate/kafka/docker-python.md
📚 Learning: 2025-10-08T01:34:18.867Z
Learnt from: amotl
PR: crate/cratedb-guide#385
File: docs/connect/java.md:48-51
Timestamp: 2025-10-08T01:34:18.867Z
Learning: CrateDB JDBC driver uses the `jdbc:crate://` protocol scheme but communicates via the PostgreSQL wire protocol on port 5432, just like the PostgreSQL JDBC driver (`jdbc:postgresql://`). Do not confuse the `jdbc:crate://` scheme with other protocol schemes like `crate://` (used by SQLAlchemy dialect for CrateDB) or assume it uses the HTTP endpoint on port 4200.

Applied to files:

  • docs/integrate/node-red/mqtt-tutorial.md
📚 Learning: 2025-10-16T10:58:18.401Z
Learnt from: amotl
PR: crate/cratedb-guide#408
File: docs/connect/elixir.md:18-22
Timestamp: 2025-10-16T10:58:18.401Z
Learning: When connecting to CrateDB using PostgreSQL-compatible drivers like Postgrex (Elixir), the `database` parameter handling may differ from standard PostgreSQL usage because CrateDB has no notion of databases like PostgreSQL does. CrateDB uses schemas to separate concerns instead. The `database` parameter might need to be omitted or handled differently to avoid connection issues.

Applied to files:

  • docs/integrate/node-red/mqtt-tutorial.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build docs
🔇 Additional comments (7)
docs/performance/inserts/tuning.md (1)

120-120: Improved terminology aligns with insert-focused workflows.

The change from "loading" to "inserting" is more precise and aligns with the emphasis on bulk-insert patterns in related integration docs.

docs/integrate/kafka/docker-python.md (2)

97-140: Excellent tutorial example demonstrating batching and idempotent bulk inserts.

The consumer code clearly illustrates key patterns: batching for performance, bulk-insert via HTTP, and using ON CONFLICT for idempotent upserts. Error handling and resource cleanup are appropriate for the tutorial scope.
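
For readers without the diff at hand, the pattern described here can be sketched roughly as follows. This is not the `quick_consumer.py` from the PR: the table name `sensor_readings`, its columns, and the assumption that `device_id` and `ts` form the primary key are all hypothetical. The sketch only builds the JSON bodies that would be posted to CrateDB's HTTP `/_sql` endpoint (a statement plus `bulk_args`); it leaves out the Kafka client and the HTTP call.

```python
import json
from itertools import islice
from typing import Iterable, Iterator


def batched(items: Iterable[dict], size: int) -> Iterator[list]:
    """Yield lists of at most `size` items until the input is exhausted."""
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


def build_bulk_request(batch: list) -> str:
    """Build the JSON body for one bulk request to CrateDB's HTTP `/_sql` endpoint."""
    # Hypothetical table and columns; (device_id, ts) is assumed to be the primary key.
    stmt = (
        "INSERT INTO sensor_readings (device_id, ts, temperature) "
        "VALUES (?, ?, ?) "
        "ON CONFLICT (device_id, ts) DO UPDATE SET temperature = excluded.temperature"
    )
    # One parameter list per row; the statement is executed once per entry.
    bulk_args = [[e["device_id"], e["ts"], e["temperature"]] for e in batch]
    return json.dumps({"stmt": stmt, "bulk_args": bulk_args})


# Made-up events standing in for messages read from a topic.
events = [
    {"device_id": f"d{i}", "ts": 1_700_000_000_000 + i, "temperature": 20.0 + i}
    for i in range(5)
]
requests_to_send = [build_bulk_request(batch) for batch in batched(events, size=2)]
```

Because the statement carries an `ON CONFLICT ... DO UPDATE` clause, replaying the same batch after a consumer restart would overwrite rows rather than fail on duplicate keys, which is what makes the insert idempotent under the stated primary-key assumption.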


80-92: Well-structured flow from event submission to consumption.

The new "Process events" section logically separates concerns and complements the provisioning steps above. The newline-delimited JSON format is practical for the docker exec example and simplifies the consumer parsing.
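
A minimal sketch of why newline-delimited JSON keeps the parsing step small, assuming each message value is one UTF-8 encoded JSON object per line. The payload and the `parse_event` helper are illustrative and not taken from the guide.

```python
import json
from typing import Optional


def parse_event(raw: bytes) -> Optional[dict]:
    """Decode one newline-delimited JSON record; return None for blank lines."""
    line = raw.decode("utf-8").strip()
    if not line:
        return None
    return json.loads(line)


# Hypothetical payload: one JSON object per line, with a stray blank line in between.
stream = b'{"device_id": "d1", "temperature": 21.5}\n\n{"device_id": "d2", "temperature": 19.0}\n'
events = [
    event
    for raw in stream.split(b"\n")
    if (event := parse_event(raw)) is not None
]
```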

docs/integrate/node-red/mqtt-tutorial.md (4)

2-2: Title change clarifies the focus on consuming/loading data into CrateDB.

The shift from "Ingesting" to "Load" aligns with the PR's terminology goals and better reflects the tutorial's emphasis on the consumption workflow.


26-38: Provision section establishes clear target schema and architectural guidance.

Moving CrateDB setup to the front improves clarity. The OBJECT(DYNAMIC) explanation and reference to partitioning guide provide helpful context for production use.


52-65: Clear workflow steps emphasize batching for efficiency.

The four-step breakdown (HiveMQ → join → function → postgresql) effectively explains the batching strategy. Tuning guidance for array size and timeout is practical and addresses performance/latency tradeoffs.
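
The size/timeout tradeoff can be illustrated outside Node-RED with a small Python analogue. In the tutorial this job is done by the join node, so the `Batcher` class below and its parameter names are assumptions for illustration only: a batch is released when it reaches `max_size` messages or when its oldest message is older than `max_age_s` seconds.

```python
import time
from typing import Callable, Optional


class Batcher:
    """Collect messages and release them when the batch is full or too old."""

    def __init__(self, max_size: int, max_age_s: float,
                 clock: Callable[[], float] = time.monotonic):
        self.max_size = max_size
        self.max_age_s = max_age_s
        self.clock = clock
        self._items: list = []
        self._started: Optional[float] = None

    def add(self, item) -> Optional[list]:
        """Add one message; return a full batch once the size limit is reached."""
        if not self._items:
            # Remember when the first message of this batch arrived.
            self._started = self.clock()
        self._items.append(item)
        if len(self._items) >= self.max_size:
            return self._flush()
        return None

    def poll(self) -> Optional[list]:
        """Return a partial batch if its oldest message exceeded the age limit."""
        if self._items and self.clock() - self._started >= self.max_age_s:
            return self._flush()
        return None

    def _flush(self) -> list:
        batch, self._items, self._started = self._items, [], None
        return batch


# A fake clock keeps the example deterministic.
fake_now = [0.0]
batcher = Batcher(max_size=3, max_age_s=5.0, clock=lambda: fake_now[0])
full = [batcher.add(m) for m in ("a", "b", "c")][-1]  # size limit reached
batcher.add("d")
fake_now[0] = 6.0
partial = batcher.poll()  # age limit reached
```

A larger `max_size` means fewer, bigger INSERT statements; a smaller `max_age_s` bounds how long a message can wait before it is written.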


67-71: Security note strengthens guidance on safe data ingestion.

The explicit recommendation for parameterized queries and warning about escaping is valuable for practitioners building on this example. Positioning after the workflow steps keeps the security context clear.
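
As a rough illustration of the recommendation, here is a Python sketch that builds a multi-value INSERT with numbered placeholders instead of splicing payloads into the SQL text. The tutorial's function node is written in JavaScript; the helper name, the table `mqtt_messages`, and the column `payload` are hypothetical. Table and column names are still interpolated into the statement, so they must come from trusted code, never from message contents.

```python
from typing import Sequence


def build_multi_value_insert(table: str, columns: Sequence[str],
                             rows: Sequence[Sequence]) -> tuple:
    """Return an INSERT statement with numbered placeholders and a flat parameter list."""
    if not rows:
        raise ValueError("at least one row is required")
    width = len(columns)
    groups = []
    params: list = []
    for row in rows:
        if len(row) != width:
            raise ValueError("row length does not match column count")
        start = len(params)
        # One "($n, $n+1, ...)" group per row, numbered across the whole statement.
        groups.append("(" + ", ".join(f"${start + i + 1}" for i in range(width)) + ")")
        params.extend(row)
    stmt = f"INSERT INTO {table} ({', '.join(columns)}) VALUES {', '.join(groups)}"
    return stmt, params


# The second payload contains SQL metacharacters; it stays a plain parameter value.
stmt, params = build_multi_value_insert(
    "mqtt_messages",
    ["payload"],
    [['{"t": 1}'], ["x'); DROP TABLE mqtt_messages; --"]],
)
```

The statement text depends only on the number of rows and columns, so hostile payload content cannot change its structure.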


amotl merged commit af7d264 into main on Oct 30, 2025 (3 checks passed)
amotl deleted the less-loading branch on October 30, 2025, 18:08