
Commit 58549d8

Add code example
1 parent 92381c8 commit 58549d8


docs/integrations/braintrust.md

Lines changed: 50 additions & 1 deletion
@@ -5,8 +5,57 @@ title: Braintrust

[Braintrust](https://www.braintrustdata.com) is the enterprise-grade stack for building AI products. They provide tools including evaluations, a prompt playground, dataset management, and tracing.

It's easy to use Braintrust to evaluate AI retrieval apps built with Chroma. Braintrust provides TypeScript and Python libraries to run and log evaluations.

- [Tutorial: Evaluate Chroma Retrieval app w/ Braintrust](https://www.braintrustdata.com/docs/examples/rag)

Example evaluation script in Python (refer to the tutorial above to get the full implementation):

```python
from braintrust import Eval
from openai import OpenAI

# LevenshteinScorer is autoevals' string-distance scorer
from autoevals import LevenshteinScorer

PROJECT_NAME = "Chroma_Eval"

client = OpenAI()
leven_evaluator = LevenshteinScorer()

async def pipeline_a(input, hooks=None):
    # Get a relevant fact from Chroma
    # (`collection` is the Chroma collection built in the tutorial above)
    relevant = collection.query(
        query_texts=[input],
        n_results=1,
    )
    relevant_text = ','.join(relevant["documents"][0])
    prompt = """
    You are an assistant called BT. Help the user.
    Relevant information: {relevant}
    Question: {question}
    Answer:
    """.format(question=input, relevant=relevant_text)
    messages = [{"role": "system", "content": prompt}]
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=messages,
        temperature=0,
        max_tokens=100,
    )
    result = response.choices[0].message.content
    return result

# Run an evaluation and log to Braintrust
# (`await` requires an async context, e.g. a notebook cell)
await Eval(
    PROJECT_NAME,
    # define your test cases
    data=lambda: [{"input": "What is my eye color?", "expected": "Brown"}],
    # define your retrieval pipeline w/ Chroma above
    task=pipeline_a,
    # use a prebuilt scoring function or define your own :)
    scores=[leven_evaluator],
)
```
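
The pipeline above assumes a Chroma `collection` that already holds the relevant facts; the tutorial walks through building it. As a rough sketch (the in-memory client, collection name, and seed document here are illustrative assumptions, not part of the tutorial), it could look like this:

```python
import chromadb

# Hypothetical setup for the `collection` used in `pipeline_a`.
# An in-memory client is used here; a persistent client works the same way.
chroma_client = chromadb.Client()
collection = chroma_client.create_collection(name="eval_facts")

# Seed one fact so the example test case has something to retrieve.
collection.add(
    ids=["fact-1"],
    documents=["The user's eye color is brown."],
)
```
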
Learn more in their [docs](https://www.braintrustdata.com/docs).
