OpenAI Agents Tracing: Run span start/finish in the same context #1197
Open
tconley1428 wants to merge 5 commits into main from openai/context_tracing
+115
−34
Commits:
84e45aa Run span start/finish in the same context (tconley1428)
62241c8 Add child workflow test to openai tracing (tconley1428)
4097beb Isolate memory between tests (tconley1428)
24e9882 Merge branch 'main' into openai/context_tracing (tconley1428)
ad55d51 Merge remote-tracking branch 'origin/main' into openai/context_tracing (tconley1428)
I don't think you have to run these things in the copied context (and add stack trace layers and such) just so that the done callback can use a copied context. A quick glance at CPython shows that's what they do in `add_done_callback` when `context` is `None` (because `copy_context` is cheap and you only need the copy for it to work properly). You can just copy the context up front and pass it to `add_done_callback` without changing how these things are run. Granted, all of the span work should be done before the `copy_context` call. Also, in Python 3.12 there is actually a `handle.get_context()` you could use for `add_done_callback`, but understood that is newer than our oldest allowed Python version.
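A minimal sketch of the suggestion in the comment above, using only the standard library. The `current_span` variable and the span names are hypothetical stand-ins, not the SDK's or OpenAI's actual tracing state; the point is only that a context copied after the mutation can be handed to `add_done_callback` through its `context` keyword.

```python
import asyncio
import contextvars

# Hypothetical stand-in for a tracing library's "current span" variable.
current_span = contextvars.ContextVar("current_span", default=None)


async def main():
    seen = []

    def on_done(fut):
        # Runs inside `ctx`, so it observes the value captured by the copy.
        seen.append(current_span.get())

    current_span.set("span-1")        # mutate the context first...
    ctx = contextvars.copy_context()  # ...then take the snapshot
    current_span.set("span-2")        # later changes are not part of the snapshot

    task = asyncio.ensure_future(asyncio.sleep(0))
    task.add_done_callback(on_done, context=ctx)
    await task
    await asyncio.sleep(0)  # extra loop turn so the callback has certainly run
    return seen


result = asyncio.run(main())
```

The callback reads the value that was current when the copy was taken, which is why the comment asks for the span work to happen before `copy_context`.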
That doesn't appear to fix the problem. If the context is copied up front and the subsequent operations are not run inside it, it still fails to detach, because the copy and the original are not the same context.
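One way to read the reply above is that the start and the finish both have to execute inside the same `Context` object. The sketch below illustrates that idea with plain `contextvars`; `start_span` and `finish_span` are hypothetical helpers, not the PR's code, and they model a span only as a `ContextVar` set/reset pair.

```python
import asyncio
import contextvars

# Hypothetical stand-in for a tracing library's "current span" variable.
current_span = contextvars.ContextVar("current_span", default=None)


def start_span(name):
    # Returns the token needed to undo this set() later.
    return current_span.set(name)


def finish_span(token):
    # Raises ValueError if called in a Context other than the one that made the token.
    current_span.reset(token)


async def main():
    outcome = []
    ctx = contextvars.copy_context()
    token = ctx.run(start_span, "span-1")  # start inside ctx

    def on_done(fut):
        try:
            finish_span(token)
            outcome.append("reset ok")
        except ValueError:
            outcome.append("reset failed")

    task = asyncio.ensure_future(asyncio.sleep(0))
    task.add_done_callback(on_done, context=ctx)  # finish inside ctx as well
    await task
    await asyncio.sleep(0)  # extra loop turn so the callback has certainly run
    return outcome


outcome = asyncio.run(main())
```

Because the token was created by code running in `ctx` and the callback also runs in `ctx`, the reset succeeds.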
The context vars are the same though (assuming you copy the context after you mutate things on the context). Is there somewhere inside OpenAI that validates that a context is the exact instance? If you look at the CPython code for `add_done_callback` at https://github.com/python/cpython/blob/v3.14.0/Lib/asyncio/futures.py#L236-L237, the default also calls `copy_context`, because that's the normal way to "get the current context" to execute under. How does that work today when `context` is `None` for `add_done_callback`? Or are you saying `add_done_callback` does not work today for OpenAI's `span.finish` when using the default parameter for `context`?
This still occurs with your suggestion. I think that despite the contexts having the same values, they aren't the same. That's my best guess anyway.
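The guess above matches how `contextvars` tokens behave in the standard library: a token returned by `ContextVar.set` is tied to the `Context` it was created in, and `ContextVar.reset` raises `ValueError` when called from any other `Context`, even one holding identical values. The sketch below shows only that stdlib behaviour. It assumes, without confirmation from this thread, that the detach failure in the OTel-based instrumentor comes from a token reset of this kind, and `current_span` is a hypothetical name.

```python
import contextvars

# Hypothetical stand-in for a tracing library's "current span" variable.
current_span = contextvars.ContextVar("current_span", default=None)

token = current_span.set("span-1")   # token is tied to the current Context
copied = contextvars.copy_context()  # same values, different Context object


def reset_here():
    try:
        current_span.reset(token)
        return "reset ok"
    except ValueError as exc:
        return f"reset failed: {exc}"


value_in_copy = copied[current_span]  # the copy holds the same value
in_copy = copied.run(reset_here)      # fails: token belongs to another Context
in_original = reset_here()            # succeeds: same Context that made the token
```

So equal values are not enough for a reset; the reset has to happen in the very `Context` that produced the token.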
Hrmm, I wonder how `add_done_callback` worked with `span.finish` before this PR, since by default that calls `span.finish` on a copied context. Or did `span.finish` never work? I think this is one of those curiosities worth understanding. I can make my own small replications to develop understanding if necessary.
It never 'worked' in the sense that this issue was always here, but it only appears if you use a tracing provider other than the default. OpenAI's default handles it fine, but the instrumentor from the report uses OTel, which emits this context detach log.
Hrmm, OTel already won't work for `add_done_callback` for workflows since they are distributed. I think we need to either require being able to finish a span in a different context (and on a different machine) from where it was created, or take a different approach. (I am curious how OTel is used here anyway, given inherent OTel limitations concerning deterministic span/trace IDs and such. I understand that's a different topic, but we may need to suggest the start-and-stop-span-immediately approach for OTel-based OpenAI tracing that we use for our other OTel-based tracing.)