docs/integrate/llamaindex/tutorial.md
13 additions & 12 deletions
@@ -3,20 +3,23 @@
## Introduction
-
[LlamaIndex](https://www.llamaindex.ai/) is a data framework for Large Language Models (LLMs). It comes with pre-trained models on massive public datasets such as [GPT-4](https://openai.com/index/gpt-4/) or [Llama 4](https://www.llama.com/models/llama-4/) and provides an interface to external data sources allowing for natural language querying on your private data.
+
[LlamaIndex](https://www.llamaindex.ai/) is a data framework for Large Language Models (LLMs).
+
It integrates with models such as [GPT‑4](https://openai.com/index/gpt-4/) or [Llama 4](https://www.llama.com/models/llama-4/) and provides interfaces to external data sources for natural‑language querying of your private data.
-
[Azure Open AI Service](https://azure.microsoft.com/en-us/products/ai-services/openai-service) is a fully managed service that runs on the Azure global infrastructure and allows developers to integrate OpenAI models into their applications. Through Azure Open AI API one can easily access a wide range of AI models in a scalable and reliable way.
+
[Azure OpenAI Service](https://azure.microsoft.com/en-us/products/ai-services/openai-service) is a fully managed service on the Azure global infrastructure that lets developers integrate OpenAI models into applications. Through the Azure OpenAI API, you can access a wide range of AI models in a scalable and reliable way.
-
In this tutorial, we will illustrate how to augment existing LLMs with data stored in CrateDB through the LlamaIndex framework and Azure Open AI Service. By doing this, you will be able to use the power of generative AI models with your own data in just a few lines of code.
+
This tutorial shows how to augment LLMs with data stored in CrateDB using LlamaIndex and Azure OpenAI, enabling natural‑language queries over your data.
If you want to run this in your own environment, we've provided all of the code and supporting resources that you'll need in the [`cratedb-examples`](https://github.com/crate/cratedb-examples/tree/main/topic/machine-learning/llama-index) GitHub repository.
## Prerequisites
* Python 3.10 or higher
* Recent version of LlamaIndex, please follow the [installation instructions](https://gpt-index.readthedocs.io/en/latest/getting_started/installation.html)
+
* `sqlalchemy-cratedb`
+
* `SQLAlchemy` (if not pulled transitively)
* Running instance of [CrateDB](https://console.cratedb.cloud/)
-
* [An Azure subscription](https://azure.microsoft.com/en-gb/free/cognitive-services/) and [Azure OpenAI resource](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=web-portal) in your desired subscription
+
* [Azure subscription](https://azure.microsoft.com/en-gb/free/cognitive-services/) and [Azure OpenAI resource](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=web-portal)
## Deploy models in Azure OpenAI
@@ -28,9 +31,9 @@ To deploy the models required for this tutorial, follow these steps:
-
2. This will open Azure AI Studio. Azure AI Studio enables developers to build, run, and deploy AI applications. Click on the *Create new deployment* button to deploy the following models:
-
1. **GPT-35-turbo** for text generation tasks
-
2. **text-embedding-ada-002** for generating embeddings
+
2. This opens Azure AI Studio. Click *Create new deployment* and deploy:
+
1. A chat/completions model (e.g., **gpt-4o-mini**)
+
2. An embeddings model (e.g., **text-embedding-3-large**)
The basic deployment of each model is straightforward in Azure OpenAI Studio: You need to select the model you want to deploy and specify the unique name:
@@ -136,7 +139,6 @@ We use sqlalchemy, a popular SQL database toolkit, to connect to CrateDB and SQL
The value of `CRATEDB_SQLALCHEMY_URL` should be a URL format connection string containing the hostname, username and password for your CrateDB instance:
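As a rough illustration of the connection string described above, the following sketch assembles one from its parts. It assumes the `crate://` scheme used by the `sqlalchemy-cratedb` dialect, CrateDB's default HTTP port 4200, and an `ssl=true` query parameter for encrypted connections. The helper name `build_cratedb_url`, the host, and the credentials are hypothetical placeholders, not values from the tutorial.

```python
from urllib.parse import quote_plus


def build_cratedb_url(host, username, password, port=4200, ssl=True):
    """Assemble a SQLAlchemy URL for CrateDB (hypothetical helper).

    Assumes the `crate://` scheme of the sqlalchemy-cratedb dialect.
    """
    # Percent-encode credentials so characters like "@" or "/" do not break the URL.
    credentials = f"{quote_plus(username)}:{quote_plus(password)}"
    # Assumption: `ssl=true` requests an encrypted connection.
    query = "?ssl=true" if ssl else ""
    return f"crate://{credentials}@{host}:{port}{query}"


# Placeholder host and credentials; replace with those of your own instance.
CRATEDB_SQLALCHEMY_URL = build_cratedb_url("cratedb.example.com", "admin", "p@ss/word")
print(CRATEDB_SQLALCHEMY_URL)
```

Encoding the username and password separately keeps special characters in a password from being read as URL delimiters.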
@@ -153,8 +155,7 @@ sql_database = SQLDatabase(
)
```
-
Then use that to create am instance of `NLSQLTableQueryEngine`:
-
+
Then create an instance of `NLSQLTableQueryEngine`:
Often, we are also interested in the query that produces the output. This is included in the answer's metadata:
```python
-
print(answer.metadata))
+
print(answer.metadata)
# {'result': [(17.033333333333335,)], 'sql_query': 'SELECT AVG(value) FROM time_series_data WHERE sensor_id = 1'}
```
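To pull the generated SQL and the numeric result out of that metadata, a minimal sketch could look like the following. It assumes the metadata is a plain dictionary shaped like the printed example; the `metadata` literal below is copied from that output and stands in for `answer.metadata`, and the variable names are illustrative only.

```python
# Stand-in for `answer.metadata`, copied from the example output shown above.
metadata = {
    "result": [(17.033333333333335,)],
    "sql_query": "SELECT AVG(value) FROM time_series_data WHERE sensor_id = 1",
}

# The SQL statement that was generated from the natural-language question.
generated_sql = metadata["sql_query"]

# The result is a list of row tuples; take the first column of the first row.
average_value = metadata["result"][0][0]

print(generated_sql)
print(average_value)
```

Logging the generated SQL alongside each answer makes it easier to check that the model queried the intended table and filter.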
## Takeaway
In this tutorial, we've embarked on the journey of using a natural language interface to query CrateDB data. We've explored how to seamlessly connect your data to the power of LLM using LlamaIndex and the capabilities of Azure OpenAI.
-
This tutorial is just the beginning. You can expect further resources, documentation, and tutorials related to CrateDB and generative AI from us. Also, stay tuned for the CrateDB 5.5 release: we will soon announce the support for the vector store and search, allowing you to implement similarity-based data retrieval efficiently.
+
This tutorial is a starting point. Explore additional resources on CrateDB and generative AI as they become available.
If you want to try this out yourself, you can find the full example code and supporting resources in the [`cratedb-examples` GitHub repository](https://github.com/crate/cratedb-examples/tree/main/topic/machine-learning/llama-index).