Spike: Adding LlamaIndex LLM instrumentation #78
base: main
Conversation
- Add embedding event handlers (`_handle_embedding_start`, `_handle_embedding_end`)
- Extract model name, input texts, and dimension count from embedding events
- Create `vendor_detection.py` module with VendorRule-based provider detection (see the sketch below)
- Support 13+ embedding providers (OpenAI, Azure, AWS, Google, Cohere, etc.)
- Add `test_embedding_instrumentation.py` with single and batch embedding tests
- Update README with embedding documentation and provider list
- Tested successfully with the OpenAI embeddings API
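The VendorRule-based provider detection could look roughly like the sketch below. This is a minimal illustration assuming a rule type that matches embedding class-name prefixes; `VendorRule`, `matches`, `detect_vendor`, and the sample prefixes are assumptions, not the actual contents of `vendor_detection.py`.

```python
# Hypothetical sketch of VendorRule-style provider lookup (names assumed).
from dataclasses import dataclass, field
from typing import List


@dataclass
class VendorRule:
    vendor: str                       # e.g. "openai", "cohere"
    class_prefixes: List[str] = field(default_factory=list)

    def matches(self, class_name: str) -> bool:
        # A rule matches when the embedding class name starts with any prefix.
        return any(class_name.startswith(p) for p in self.class_prefixes)


_RULES = [
    VendorRule("azure.ai.openai", ["AzureOpenAIEmbedding"]),
    VendorRule("openai", ["OpenAIEmbedding"]),
    VendorRule("cohere", ["CohereEmbedding"]),
]


def detect_vendor(class_name: str) -> str:
    """Return the provider name for an embedding class, or 'unknown'."""
    for rule in _RULES:
        if rule.matches(class_name):
            return rule.vendor
    return "unknown"
```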
Should we change the description and remove/change the authors?
I am keeping this consistent with the description and authors used for the LangChain instrumentation.
    wrapper=_BaseCallbackManagerInitWrapper(llamaindexCallBackHandler),
)

def _uninstrument(self, **kwargs):
Should we implement this?
Implemented in #83
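For context, a minimal `_uninstrument` for this style of wrapper-based instrumentation typically just removes the wrapper that `_instrument` installed. The sketch below assumes the wrapped target is `CallbackManager.__init__` in `llama_index.core.callbacks.base`; the actual target and the implementation in #83 may differ.

```python
# Minimal sketch, assuming _instrument wrapped CallbackManager.__init__
# (module path and class name are assumptions, not taken from this PR).
import llama_index.core.callbacks.base as cb_base
from opentelemetry.instrumentation.utils import unwrap


def _uninstrument(self, **kwargs):
    # Restore the original constructor by removing the wrapt wrapper.
    unwrap(cb_base.CallbackManager, "__init__")
```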
...-instrumentation-llamaindex/src/opentelemetry/instrumentation/llamaindex/callback_handler.py
response = payload.get("response")

# Handle both dict and object types for response
if response:
Nitpick which might be unrealistic, but would it make sense to group the dictionary vs. non-dictionary use cases? It seems like there is a lot of isinstance(..., dict) checking.
Refactored and changed in #83
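One way to address the nitpick above is a single accessor that hides the dict-vs-object branching. This is only an illustrative helper under that assumption, not the refactor that landed in #83.

```python
# Hypothetical helper collapsing the repeated isinstance(..., dict) checks.
def _get_field(source, name, default=None):
    """Read `name` from a dict key or an object attribute, whichever applies."""
    if isinstance(source, dict):
        return source.get(name, default)
    return getattr(source, name, default)


# The handler can then read response fields uniformly, e.g.:
#   message = _get_field(response, "message")
#   usage = _get_field(_get_field(response, "raw"), "usage")
```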
Core Implementation (see the sketch below):
- `_handle_llm_start()`: captures model name and input messages, creates an `LLMInvocation` entity
- `_handle_llm_end()`: extracts response content and token usage, finalizes the span
- Handles LlamaIndex-specific message formats (`ChatMessage` objects, `blocks[0].text` structure)
- Supports both dict and object response types
- Token extraction from `response.raw.usage` (OpenAI format)
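A simplified sketch of how these two handlers could fit together is shown below. The payload keys (`"serialized"`, `"messages"`, `"response"`), the event-id bookkeeping, and the span name are assumptions for illustration, not a copy of `callback_handler.py`.

```python
# Simplified, assumption-laden sketch of the LLM start/end handlers.
from opentelemetry import trace


class _LLMSpanHandler:
    def __init__(self, tracer):
        self._tracer = tracer
        self._spans = {}  # event_id -> live span

    def _handle_llm_start(self, event_id, payload):
        span = self._tracer.start_span("chat", kind=trace.SpanKind.CLIENT)
        serialized = payload.get("serialized") or {}
        span.set_attribute("gen_ai.request.model", serialized.get("model", "unknown"))
        for i, message in enumerate(payload.get("messages") or []):
            # ChatMessage content may live in blocks[0].text in newer formats.
            span.set_attribute(f"gen_ai.prompt.{i}.content", str(message))
        self._spans[event_id] = span

    def _handle_llm_end(self, event_id, payload):
        span = self._spans.pop(event_id, None)
        if span is None:
            return
        response = payload.get("response")
        # Token counts come from response.raw.usage (OpenAI format).
        usage = getattr(getattr(response, "raw", None), "usage", None)
        if usage is not None:
            span.set_attribute("gen_ai.usage.input_tokens", usage.prompt_tokens)
            span.set_attribute("gen_ai.usage.output_tokens", usage.completion_tokens)
        span.end()
```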
Testing (outline below):
- `test_llm_instrumentation.py`: live OpenAI API integration test
- Validates span attributes, message content, and token counts
- Verifies metrics emission
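The span assertions in that test could be outlined roughly as follows, using the SDK's in-memory exporter; `run_llamaindex_chat` is a hypothetical fixture that drives one instrumented LLM call, and a live run needs `OPENAI_API_KEY` set.

```python
# Outline only: the fixture and attribute keys are assumptions.
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry.sdk.trace.export.in_memory_span_exporter import InMemorySpanExporter


def test_llm_span_attributes(run_llamaindex_chat):
    exporter = InMemorySpanExporter()
    provider = TracerProvider()
    provider.add_span_processor(SimpleSpanProcessor(exporter))

    run_llamaindex_chat(tracer_provider=provider)  # hypothetical helper/fixture

    spans = exporter.get_finished_spans()
    assert spans, "instrumentation should emit at least one span"
    attrs = spans[0].attributes
    assert "gen_ai.request.model" in attrs
    assert attrs.get("gen_ai.usage.input_tokens", 0) > 0
```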
Documentation:
- Quick start guide with code example (see the sketch below)
- Expected span attributes and metrics output
- Key differences vs. LangChain (event-based callbacks, `CBEventType` enum)
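The quick start presumably looks something like the snippet below; the instrumentor class name (`LlamaIndexInstrumentor`) and the model choice are assumptions, not confirmed from the README in this PR.

```python
# Sketch of a quick start; class name and model are assumptions.
from opentelemetry.instrumentation.llamaindex import LlamaIndexInstrumentor
from llama_index.llms.openai import OpenAI

LlamaIndexInstrumentor().instrument()

llm = OpenAI(model="gpt-4o-mini")
response = llm.complete("Say hello")  # spans and metrics are emitted automatically
print(response)
```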
Span Attributes (Gen AI Semantic Conventions)
Metrics (sketch below):
- `gen_ai.client.operation.duration` (histogram)
- `gen_ai.client.token.usage` (counter)
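For reference, these two instruments could be created with the OpenTelemetry metrics API roughly as follows. The instrument names and kinds come from the list above; the meter name, attribute keys, and example values are assumptions.

```python
# Sketch of instrument creation; only the instrument names/kinds come from the PR.
from opentelemetry import metrics

meter = metrics.get_meter("opentelemetry.instrumentation.llamaindex")

operation_duration = meter.create_histogram(
    name="gen_ai.client.operation.duration",
    unit="s",
    description="Duration of GenAI client operations",
)
token_usage = meter.create_counter(
    name="gen_ai.client.token.usage",
    unit="{token}",
    description="Number of input and output tokens used",
)

# Example recording after an LLM call (attribute keys assumed):
operation_duration.record(1.2, {"gen_ai.request.model": "gpt-4o-mini"})
token_usage.add(42, {"gen_ai.token.type": "input"})
```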