@amotl
Member

@amotl amotl commented Sep 14, 2025

@coderabbitai

coderabbitai bot commented Sep 14, 2025

Walkthrough

Renames the Integrations anchor, replaces a LangChain tutorial badge with a Sphinx cross-reference and hidden toctree, and adds a new LangChain tutorial page that documents setup, examples, and a LangChain→CrateDB workflow.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Integrations anchor rename (docs/integrate/index.md) | Rename top section anchor from (integrate)= to (integrations)=; heading and content unchanged. |
| LangChain docs navigation (docs/integrate/langchain/index.md) | Replace “Navigate to Tutorial” badge with a Sphinx {ref} to langchain-tutorial; add a hidden toctree (maxdepth: 1) referencing the tutorial. |
| New LangChain tutorial (docs/integrate/langchain/tutorial.md) | Add tutorial "How to set up LangChain with CrateDB" with setup steps, sample table/data, Python example using OpenAI LLM + SQLDatabase/SQLDatabaseToolkit and create_sql_agent, warnings about data exposure, and links. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant Docs as Docs (Tutorial)
  participant Agent as LangChain Agent
  participant LLM as LLM Provider
  participant Toolkit as SQLDatabase / Toolkit
  participant Crate as CrateDB

  User->>Docs: Follow tutorial steps
  Docs->>Toolkit: Initialize SQLDatabase.from_uri / Toolkit
  Toolkit->>Crate: Inspect schema / fetch metadata
  Crate-->>Toolkit: Schema & tables
  User->>Agent: Ask natural-language question
  Agent->>LLM: Plan / generate SQL
  LLM-->>Agent: Proposed SQL
  Agent->>Toolkit: Execute SQL
  Toolkit->>Crate: Run query
  Crate-->>Toolkit: Result rows
  Toolkit-->>Agent: Results
  Agent-->>User: Final answer
  note right of Agent: Warning: watch data exposure (production)
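
The workflow in the diagram can be sketched without LangChain or CrateDB installed. The following is a minimal illustration, not the tutorial's code: it assumes an in-memory SQLite database as a stand-in for CrateDB and a hypothetical `fake_llm` function in place of the LLM provider; the table name `sensor_readings` is made up for the example.

```python
import sqlite3
from typing import Callable, List, Tuple


def describe_schema(conn: sqlite3.Connection) -> str:
    """Return the CREATE TABLE statements, i.e. the metadata a toolkit would fetch."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def answer_question(
    conn: sqlite3.Connection,
    question: str,
    propose_sql: Callable[[str, str], str],
) -> Tuple[str, List[tuple]]:
    """Mirror the diagram: inspect schema, ask the LLM for SQL, run it, return rows."""
    schema = describe_schema(conn)        # Toolkit -> database: inspect schema
    sql = propose_sql(question, schema)   # Agent -> LLM: plan / generate SQL
    rows = conn.execute(sql).fetchall()   # Toolkit -> database: run query
    return sql, rows                      # Agent -> User: basis for the final answer


def fake_llm(question: str, schema: str) -> str:
    """Hypothetical stand-in for the LLM call; it always returns one fixed query."""
    return "SELECT COUNT(*) FROM sensor_readings"


# SQLite stands in for CrateDB here (assumption made for a dependency-free sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sensor_readings (sensor_id TEXT, reading REAL)")
conn.executemany(
    "INSERT INTO sensor_readings VALUES (?, ?)",
    [("a", 1.5), ("b", 2.5), ("a", 3.0)],
)

sql, rows = answer_question(conn, "How many readings are stored?", fake_llm)
print(sql, rows)
```

Note that the generated SQL runs with whatever privileges the connection has, which is why the diagram carries the data-exposure warning.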

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10–15 minutes

Possibly related PRs

Suggested labels

new content, guidance

Suggested reviewers

  • karynzv
  • kneth

Poem

I hopped through docs and left a trail,
A hidden tree, a tutorial tale,
LangChain links and sample code,
I carrot-marked the tidy road 🥕
Ask in plain words — the agent will stroll.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The title "LangChain: Starter tutorial" accurately and concisely summarizes the primary change in this PR (adding a LangChain starter tutorial and related documentation under docs/integrate/langchain); it is specific, relevant, and free of noise. It is clear enough for a teammate scanning history to understand the main change without listing files or extraneous details. |
| Description Check | ✅ Passed | The PR description states the intent (adding community tutorials), provides a preview link and a reference to the related issue, and directly relates to the changeset; it therefore satisfies the lenient description check. The level of detail required by this check is low and the provided description is on-topic. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |
✨ Finishing touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch langchain

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 337a1f1 and da79228.

📒 Files selected for processing (3)
  • docs/integrate/index.md (1 hunks)
  • docs/integrate/langchain/index.md (2 hunks)
  • docs/integrate/langchain/tutorial.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • docs/integrate/index.md
  • docs/integrate/langchain/tutorial.md
  • docs/integrate/langchain/index.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build docs

@amotl
Member Author

amotl commented Sep 16, 2025

@coderabbitai review. It looks like this has been written in active voice already? Great. Otherwise, please suggest adjustments.

@coderabbitai

coderabbitai bot commented Sep 16, 2025

🧩 Analysis chain

Based on the information gathered, I can see this PR adds a LangChain tutorial but I need to examine the actual content to review the voice usage. Let me try to access the files mentioned in the PR.

Let me try to get the actual file content from the PR:

@amotl I'll review the PR for active voice usage. Let me trigger a comprehensive review of the changes to check for voice and provide specific suggestions if needed.


🏁 Script executed:

#!/bin/bash
# Get the list of files changed in this PR
gh pr view 292 --repo crate/cratedb-guide --json files --jq '.files[].path'

Length of output: 94


🏁 Script executed:

#!/bin/bash
# Get the diff for the PR to see the actual changes
gh pr diff 292 --repo crate/cratedb-guide

Length of output: 5366


🏁 Script executed:

#!/bin/bash
# Look for the main tutorial file mentioned in the preview URL
fd -t f -e md -e rst | grep -E "(langchain|tutorial)" | head -10

Length of output: 354


🏁 Script executed:

#!/bin/bash
# Check for any documentation files that might contain the LangChain tutorial
fd -t f -e md -e rst | grep -i langchain

Length of output: 70


🏁 Script executed:

#!/bin/bash
# Search for files in integrate directory
fd -t f . docs/integrate/ 2>/dev/null || fd -t f . integrate/ 2>/dev/null || echo "No integrate directory found, searching broadly"

Length of output: 2668


🏁 Script executed:

#!/bin/bash
# Search more broadly for any files that might contain LangChain content
rg -l -i "langchain" --type=md --type=rst

Length of output: 291


🏁 Script executed:

#!/bin/bash
# Get the PR details to understand what files are being changed
gh pr view 292 --repo crate/cratedb-guide

Length of output: 449


🏁 Script executed:

#!/bin/bash
# List all files in the repository to understand the structure
find . -name "*.md" -o -name "*.rst" | head -20

Length of output: 673


🏁 Script executed:

#!/bin/bash
# Try to find files that were added or modified in recent commits
git log --name-only --oneline -10 | grep -E "\.(md|rst)$"

Length of output: 7779


🏁 Script executed:

#!/bin/bash
# Check if there are any files with "tutorial" in the name
fd tutorial

Length of output: 604


🏁 Script executed:

#!/bin/bash
# Search for any Python files that might be part of the tutorial
fd -e py | head -10

Length of output: 53


🏁 Script executed:

#!/bin/bash
# Look at the repository structure to understand the organization
ls -la
find . -type d -name "*integrate*" -o -name "*langchain*" 2>/dev/null

Length of output: 832


🏁 Script executed:

#!/bash/bin
# Check the current branch and see what files have been modified
git status
git diff --name-only HEAD~5..HEAD

Length of output: 255


🏁 Script executed:

#!/bin/bash
# Fix the shebang and check git status
git status
git diff --name-only HEAD~5..HEAD 2>/dev/null || git log --name-only --oneline -5

Length of output: 13117


🏁 Script executed:

#!/bin/bash
# Get more details about the PR changes with more context
gh pr view 292 --repo crate/cratedb-guide --json body,title,additions,deletions

Length of output: 296


🏁 Script executed:

#!/bin/bash
# Try to see the actual file content that was added in this PR
gh api repos/crate/cratedb-guide/pulls/292/files --jq '.[].filename'

Length of output: 94


🏁 Script executed:

#!/bin/bash
# Get the raw patch data from the PR
gh api repos/crate/cratedb-guide/pulls/292/files --jq '.[].patch'

Length of output: 4824


🏁 Script executed:

#!/bin/bash
# Look for files that might have been added based on the preview URL structure
find . -path "*/integrate/langchain*" -type f 2>/dev/null || find . -name "*langchain*" -type f 2>/dev/null

Length of output: 74


🏁 Script executed:

#!/bin/bash
# Check if there are any source files in the docs directory
find docs -name "*.md" -o -name "*.rst" 2>/dev/null | head -10

Length of output: 347


🏁 Script executed:

#!/bin/bash
# Try a different approach - look at the repository structure
tree -L 3 2>/dev/null || find . -type d | head -20

Length of output: 3440


🏁 Script executed:

#!/bin/bash
# Get the content of files changed in the PR
gh api repos/crate/cratedb-guide/pulls/292/files --jq '.[] | {filename: .filename, status: .status, patch: .patch}'

Length of output: 5250


@amotl I've reviewed the PR for active voice usage. You're absolutely right that most of the content is already well-written in active voice! The tutorial uses clear, direct instructions and imperative sentences effectively. However, I found a few opportunities to strengthen the active voice further:

Specific suggestions for improved active voice:

Line 6 - Current:

"For this tutorial, we are going to use it to interact with CrateDB using only natural language without writing any SQL."

Suggested (more direct active voice):

"This tutorial uses LangChain to interact with CrateDB using only natural language without writing any SQL."

Line 38 - Current:

"Once installed, import every component that will be used as follows."

Suggested:

"Once installed, import the required components:"

Line 71 - Current:

"This tutorial covered the use of LangChain to interact with CrateDB by simply writing questions in English."

Suggested:

"This tutorial demonstrates how to use LangChain to interact with CrateDB by writing questions in English."

Overall Assessment

The tutorial is already written predominantly in active voice with clear, actionable instructions. The imperative mood used throughout ("check the deployment options", "create and populate the table", "install the required libraries") is excellent for tutorial content. The suggestions above would eliminate the remaining first-person references and make the language even more direct and active.

The content follows best practices for technical documentation voice, so these are minor refinements rather than major issues.


🧠 Learnings used
Learnt from: amotl
PR: crate/cratedb-guide#0
File: :0-0
Timestamp: 2025-09-15T17:42:25.449Z
Learning: When reviewing documentation, use active voice instead of first-person or passive voice, and emit corresponding suggestions.
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@amotl amotl marked this pull request as ready for review September 16, 2025 10:53
@amotl
Member Author

amotl commented Sep 16, 2025

@coderabbitai review

@coderabbitai

coderabbitai bot commented Sep 16, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (3)
docs/integrate/langchain/tutorial.md (3)

56-60: Switch to ChatOpenAI and make the CrateDB URI explicit.

Use a chat model and show a realistic CrateDB URI (local or Cloud).

-llm=OpenAI(temperature=0) #play around with this parameter
-#change the URI below to match your CrateDB instance
-db = SQLDatabase.from_uri("crate://")
+llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
+# Change the URI below to match your CrateDB instance
+# e.g., local: "crate://localhost:4200"
+# e.g., Cloud (with TLS): "crate://<user>:<password>@<cluster>.cratedb.net:4200?ssl=true"
+db = SQLDatabase.from_uri("crate://localhost:4200")
  • Chat models are the recommended default; OpenAI integration comes from langchain-openai. (python.langchain.com)
  • CrateDB SQLAlchemy URIs and Cloud ?ssl=true option. (cratedb.com)
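
To make the suggestion above concrete, here is a hedged sketch of how the URI and the chat model could be wired together. The helper names `cratedb_uri` and `build_sql_agent` are hypothetical; the `?ssl=true` option and the `gpt-4o-mini` model name are taken from the suggestion, and the LangChain imports assume the `langchain-community` and `langchain-openai` packages are installed. `build_sql_agent` is defined but not called, because it needs a reachable CrateDB instance and an `OPENAI_API_KEY`.

```python
from typing import Optional
from urllib.parse import quote


def cratedb_uri(
    host: str = "localhost",
    port: int = 4200,
    user: Optional[str] = None,
    password: Optional[str] = None,
    ssl: bool = False,
) -> str:
    """Build a crate:// SQLAlchemy URI, percent-encoding the credentials."""
    credentials = ""
    if user:
        credentials = quote(user, safe="")
        if password:
            credentials += ":" + quote(password, safe="")
        credentials += "@"
    query = "?ssl=true" if ssl else ""
    return f"crate://{credentials}{host}:{port}{query}"


def build_sql_agent(uri: str):
    """Create a SQL agent for the given URI (requires network access and an API key)."""
    # Imported lazily so the URI helper stays usable without LangChain installed.
    from langchain_community.agent_toolkits import SQLDatabaseToolkit, create_sql_agent
    from langchain_community.utilities import SQLDatabase
    from langchain_openai import ChatOpenAI

    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
    db = SQLDatabase.from_uri(uri)
    toolkit = SQLDatabaseToolkit(db=db, llm=llm)
    return create_sql_agent(llm=llm, toolkit=toolkit, verbose=True)


print(cratedb_uri())
print(cratedb_uri(host="example.cratedb.net", user="admin", password="p@ss/word", ssl=True))
```

Percent-encoding the credentials keeps characters such as `@` or `/` in a password from being parsed as part of the host.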

70-76: Tighten safety guidance (least-privilege DB user).

Add a short note to use a least-privileged database user when connecting LLM agents to production data, and avoid sending sensitive data to third-party providers unless contractually approved.

-Keep in mind, that by connecting to CrateDB with the OpenAI API, you are giving it access to perform a variety of queries on your data, so avoid using it with your production environment or use a specific user with limited permissions.
+Keep in mind that connecting an LLM agent to CrateDB enables it to run queries on your data. For production, use a least‑privileged database user and avoid sending sensitive data to external providers unless contractually approved.
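
As an illustration of the least-privilege advice, the sketch below generates read-only grants for a dedicated agent user. It assumes CrateDB's `GRANT DQL ON TABLE ... TO ...` privilege syntax (worth verifying against the CrateDB documentation for the version in use); the user name `langchain_reader`, the table `doc.sensor_readings`, and the helper `read_only_grants` are hypothetical.

```python
import re
from typing import List

# Accept only plain lowercase identifiers, optionally schema-qualified,
# so that nothing unexpected can be spliced into the statement text.
_IDENTIFIER = re.compile(r"[a-z_][a-z0-9_]*(\.[a-z_][a-z0-9_]*)?")


def _checked(name: str) -> str:
    """Return the name unchanged if it is a plain identifier, else raise."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unexpected identifier: {name!r}")
    return name


def read_only_grants(user: str, tables: List[str]) -> List[str]:
    """Return one query-only (DQL) grant statement per table for the given user."""
    user = _checked(user)
    return [f"GRANT DQL ON TABLE {_checked(table)} TO {user}" for table in tables]


for statement in read_only_grants("langchain_reader", ["doc.sensor_readings"]):
    print(statement)
```

The agent would then connect as that user, so a generated `INSERT`, `UPDATE`, or `DROP` fails at the database rather than relying on the LLM to behave.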

80-81: Use the new anchor label for consistency.

Since (integrations)= now exists, point the ref to it.

-... have a look at the {ref}`integrations section <integrate>` ...
+... have a look at the {ref}`integrations section <integrations>` ...

Ensure both labels are intended to coexist; current index defines both. (python.langchain.com)

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bddc9d0 and 337a1f1.

📒 Files selected for processing (3)
  • docs/integrate/index.md (1 hunks)
  • docs/integrate/langchain/index.md (2 hunks)
  • docs/integrate/langchain/tutorial.md (1 hunks)
🔇 Additional comments (3)
docs/integrate/langchain/index.md (2)

74-74: LGTM!

The replacement of the external badge navigation with an internal Sphinx cross-reference {ref}`langchain-tutorial` provides better integration with the documentation structure and maintains consistency with Sphinx conventions.


207-211: LGTM!

The hidden toctree properly integrates the tutorial into the documentation structure while keeping the navigation clean. The maxdepth: 1 and hidden options of the toctree directive are appropriate for this use case.

docs/integrate/index.md (1)

2-2: Alias label looks good for backward-compatible refs.

Adding (integrations)= while keeping (integrate)= preserves existing cross-refs and enables the new label. No action needed.

Comment on lines 37 to 38
pip install 'langchain[openai]' 'sqlalchemy-cratedb'
```

⚠️ Potential issue

Update install instructions to current LangChain packaging.

Use separate integration packages; langchain[openai] is outdated. Recommend:

-pip install 'langchain[openai]' 'sqlalchemy-cratedb'
+pip install -U langchain-community langchain-openai sqlalchemy-cratedb
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pip install 'langchain[openai]' 'sqlalchemy-cratedb'
```
pip install -U langchain-community langchain-openai sqlalchemy-cratedb
🤖 Prompt for AI Agents
In docs/integrate/langchain/tutorial.md around lines 37-38, the pip install
guidance uses the outdated monolithic specifier `langchain[openai]`. Update the
instructions to install the new integration packages instead: replace the
single install line with a command that installs `langchain-openai` and
`langchain-community`, keep the required database library `sqlalchemy-cratedb`,
and update or add the two links to the LangChain docs for OpenAI
and community integrations so readers use the current packaging.
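
After switching to the split packages, a quick check can confirm what is actually installed. This is a small sketch using only the standard library; the package list mirrors the suggested install command, and the helper name `installed_versions` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional

# Distribution names from the suggested `pip install` command.
REQUIRED = ("langchain-community", "langchain-openai", "sqlalchemy-cratedb")


def installed_versions(packages: Iterable[str] = REQUIRED) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if missing."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```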
