[MLOB-3525] add setup instructions for llm obs litellm integration (DataDog#20911)
* add setup instructions for llm obs litellm integration
* add apm section
* remove detailed llm obs and apm sections in favor of linking to llm obs public documentation
`litellm/README.md` (19 additions, 9 deletions)
# LiteLLM
## Overview
Monitor, troubleshoot, and evaluate your LLM-powered applications built using [LiteLLM][1]: a lightweight, open-source proxy and analytics layer for large language model (LLM) APIs. It enables unified access, observability, and cost control across multiple LLM providers.
Use LLM Observability to investigate the root cause of issues, monitor operational performance, and evaluate the quality, privacy, and safety of your LLM applications.
See the [LLM Observability tracing view video](https://imgix.datadoghq.com/video/products/llm-observability/expedite-troubleshooting.mp4?fm=webm&fit=max) for an example of how you can investigate a trace.
Get cost estimation, prompt and completion sampling, error tracking, performance metrics, and more out of [LiteLLM][1] Python library requests using Datadog metrics and APM.
Key metrics such as request/response counts, latency, error rates, token usage, and spend per provider or deployment are monitored. This data enables customers to track usage patterns, detect anomalies, control costs, and troubleshoot issues quickly, ensuring efficient and reliable LLM operations through LiteLLM's health check and Prometheus endpoints.
## Setup
### LLM Observability: Get end-to-end visibility into your LLM application using LiteLLM
See the [LiteLLM integration docs][12] for details on how to get started with LLM Observability for LiteLLM.
### Agent Check: LiteLLM
Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the [Autodiscovery Integration Templates][3] for guidance on applying these instructions.
#### Installation
Starting from Agent 7.68.0, the LiteLLM check is included in the [Datadog Agent][2] package. No additional installation is needed on your server.
#### Configuration
This integration collects metrics through the Prometheus endpoint exposed by the LiteLLM Proxy. This feature is only available for enterprise users of LiteLLM. By default, the metrics are exposed on the `/metrics` endpoint. If connecting locally, the default port is 4000. For more information, see the [LiteLLM Prometheus documentation][10].
Note: The listed metrics can only be collected if they are available. Some metrics are generated only after certain actions occur. For example, the `litellm.auth.failed_requests.count` metric might only be exposed after a failed authentication request has occurred.
##### Host-based
1. Edit the `litellm.d/conf.yaml` file in the `conf.d/` folder at the root of your Agent's configuration directory to start collecting your LiteLLM performance data. See the [sample litellm.d/conf.yaml][4] for all available configuration options. Example config:
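   The body of the example is elided from this diff. Below is a minimal sketch, assuming the check uses the standard OpenMetrics option names and the default LiteLLM Proxy port of 4000; see the [sample litellm.d/conf.yaml][4] for the authoritative options:

   ```yaml
   init_config:

   instances:
     # The Prometheus endpoint exposed by the LiteLLM Proxy.
     # Port 4000 is the LiteLLM default; adjust for your deployment.
     - openmetrics_endpoint: http://localhost:4000/metrics
   ```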
2. [Restart the Agent][5].
##### Kubernetes-based
For LiteLLM Proxy running on Kubernetes, you can configure the check through pod annotations. See the example below:
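The annotated pod spec is elided from this diff. A minimal sketch using Datadog Autodiscovery annotations, assuming the default LiteLLM Proxy port; the pod name, container name, image tag, and check options here are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: litellm
  annotations:
    # Autodiscovery annotation: Datadog applies this check config to the
    # container named "litellm" in this pod.
    ad.datadoghq.com/litellm.checks: |
      {
        "litellm": {
          "init_config": {},
          "instances": [
            {"openmetrics_endpoint": "http://%%host%%:4000/metrics"}
          ]
        }
      }
spec:
  containers:
    - name: litellm
      image: ghcr.io/berriai/litellm:main-latest  # illustrative image tag
      ports:
        - containerPort: 4000
```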
For more information and alternative ways to configure the check in Kubernetes-based environments, see the [Kubernetes Integration Setup documentation][3].
##### Logs
LiteLLM can send logs to Datadog through its callback system. You can configure various logging settings in LiteLLM to customize log formatting and delivery to Datadog for ingestion. For detailed configuration options and setup instructions, refer to the [LiteLLM Logging Documentation][11].
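As a sketch of what this can look like in the LiteLLM proxy config (option names follow LiteLLM's callback settings; verify them against the linked documentation, and set `DD_API_KEY` and `DD_SITE` in the proxy's environment):

```yaml
litellm_settings:
  # Forward request logs to Datadog through LiteLLM's callback system.
  success_callback: ["datadog"]
  failure_callback: ["datadog"]
```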
#### Validation
Run the Agent's status subcommand ([see documentation][6]) and look for `litellm` under the Checks section.
Need help? Contact [Datadog support][9].