Propose a Change to the Browser Trace Model #20
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,98 @@ | ||
| * Start Date: 2022-09-28 | ||
| * RFC Type: decision | ||
| * RFC PR: <link> | ||
|
|
||
| # Summary | ||
|
|
||
| This RFC proposes to require a frontend SDK to retain a trace that it continues | ||
| until the SDK naturally ends the user session. This change is primarily to the | ||
| browser trace model to better support dynamic sampling and to create a more | ||
| coherent user experience. | ||
|
|
||
| # Motivation | ||
|
|
||
| Today a trace can start in one of two ways: on the backend and then continued on the | ||
| client, or directly on the client. In either case, client SDKs are likely to create | ||
| additional traces on page navigation or view changes, and those traces are disconnected | ||
| from the backend trace. As a result, the only sensible setup for dynamic sampling today | ||
| is to start traces uniformly in the frontend project. Otherwise the dynamic sampling | ||
| rules of both projects need to be modified, because dynamic sampling rules apply at the | ||
| head of the trace. | ||
|
|
||
| The secondary motivation is that creating new traces on navigation also wipes out the | ||
| causal relationship to what happened before. For instance, it is entirely possible that | ||
| the state of the application was corrupted before a client-side navigation, but we lose | ||
| that trace relationship and a user has to piece it back together manually, for instance | ||
| by listing all transactions created by a specific user ID. | ||
|
|
||
| # Background | ||
|
|
||
| We became aware of this problem in two ways recently: | ||
|
|
||
| 1. Users sometimes want to create a transaction within another transaction. Today there | ||
| is no way to link these together for the purpose of dynamic sampling. A separate RFC | ||
| [0014](https://github.com/getsentry/rfcs/pull/14) is proposed to add an explicit way | ||
| to carry forward the sampling context for a new transaction started after an already | ||
| existing one. It works by explicitly continuing the trace. This solves part of this | ||
| issue, but it leaves out the case where the sampling context naturally moves to another | ||
| Sentry project. | ||
|
|
||
| 2. We wanted to change our own tracing integration to start tracing on the server | ||
| ([Sentry PR #39349](https://github.com/getsentry/sentry/pull/39349)), but this would | ||
| require mirroring the sampling settings to another Sentry project and would also affect | ||
| API requests detached from user sessions. | ||
|
|
||
| # Supporting Data | ||
|
|
||
| The [Honeycomb whitepaper on front-end observability](https://www.honeycomb.io/wp-content/uploads/2022/03/Front-end-Observability-Whitepaper-1.pdf) | ||
| recommends continuing traces from the server until the natural end of the user session: | ||
|
|
||
| > To accomplish this task, you will use the first event (page load) as the start of your trace and | ||
| > connect that first event to additional spans to build a full trace of the user session. Each span | ||
| > will represent a single thing that you want to track, such as a server request or a user click. | ||
|
|
||
| # Options Considered | ||
|
|
||
| There are multiple ways in which this problem can be addressed. | ||
|
|
||
| ## Encouraged Root Trace Project and Session Long Traces | ||
|
|
||
| In the most trivial case the recommendation to customers would be to pick one project that | ||
| starts traces for real user sessions. This could be *either* the frontend or backend, but it | ||
| should attempt to be consistent about it. In either case the client SDK should *continue the | ||
| trace* until the browser tab naturally closes. | ||
|
Contributor
The issue here is with traces that can then be multiple hours long - and so then the value of analyzing a trace is completely lost. I'm strongly against this because I feel like it'll reduce the value of trace view, and make it harder for us to expand the tracing product.
Contributor
Author
That sounds a bit like a UX issue? I don't think that it's inherently problematic if a trace is very long; today we already do not draw much value out of the trace view. To me it feels at least like there is more value to the trace being connected than the trace being split into transactions on every navigation.
Member
User inactivity in a browser tab could be another natural way of ending a session, and beginning a new one. There could be users that leave tabs open for a very long time, and having large time gaps between transactions might seem weird.
||
|
|
||
| The consequence is that ``startTransaction()`` always anchors to the already open trace | ||
| and dynamic sampling context. There can be an extra flag to force the start of a new trace, | ||
| but using it would be strongly discouraged. | ||
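
A minimal TypeScript sketch of this behavior, under stated assumptions: `SessionTracer`, `TraceContext`, and the `forceNewTrace` flag are hypothetical names chosen for illustration and are not the Sentry SDK API. It only shows `startTransaction()` anchoring to the open trace unless a new trace is explicitly forced.

```typescript
// Hypothetical sketch; none of these names are the real Sentry SDK API.
interface TraceContext {
  traceId: string;
  samplingContext: Record<string, string>;
}

interface Transaction {
  name: string;
  traceId: string;
  samplingContext: Record<string, string>;
}

interface StartOptions {
  name: string;
  forceNewTrace?: boolean; // assumed escape hatch, discouraged by the RFC
}

// Generates a 32-character hex identifier for a new trace.
function newTraceId(): string {
  let id = "";
  for (let i = 0; i < 32; i++) {
    id += Math.floor(Math.random() * 16).toString(16);
  }
  return id;
}

class SessionTracer {
  private current: TraceContext | undefined;
  private readonly defaults: Record<string, string>;

  // `incoming` is the trace handed over by the server, if there is one.
  constructor(defaults: Record<string, string>, incoming?: TraceContext) {
    this.defaults = defaults;
    this.current = incoming;
  }

  startTransaction(options: StartOptions): Transaction {
    // Open a new trace only when none exists yet or when explicitly forced.
    if (this.current === undefined || options.forceNewTrace === true) {
      this.current = {
        traceId: newTraceId(),
        samplingContext: { ...this.defaults },
      };
    }
    // Every transaction inherits the open trace and its sampling context.
    return {
      name: options.name,
      traceId: this.current.traceId,
      samplingContext: { ...this.current.samplingContext },
    };
  }
}
```

With this shape, a `pageload` transaction and every later `navigation` transaction in the same tab share one trace ID and one sampling context until a new trace is forced.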
|
|
||
| ## Alternative A: Detaching Sampling Project from Root Project | ||
|
|
||
| An alternative approach would be to allow a transaction to start again on the client but to | ||
| continue with the sampling context that came from the server. In that case the root of the | ||
| trace is in fact the frontend for continued transactions after a page navigation; however, the | ||
| sampling context is reused from the original server side request. | ||
|
|
||
| In this case the relationship of root project setting the dynamic sampling context would be | ||
| broken up and instead a transaction can explicitly pick up the sampling context of another | ||
| project but still issue disconnected traces. | ||
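
As a rough TypeScript illustration of Alternative A, assuming hypothetical names throughout (`SamplingContext`, `startAfterNavigation`, and the counter-based `makeTraceId` placeholder are not part of any SDK): each navigation gets a fresh, disconnected trace rooted in the frontend project, while the sampling context is copied from the original server-side request.

```typescript
// Hypothetical types for illustration only; not the Sentry SDK API.
interface SamplingContext {
  originProject: string; // project whose sampling rules were applied
  sampleRate: number;
}

interface Transaction {
  name: string;
  traceId: string;
  rootProject: string; // project at the root of this (new) trace
  sampling: SamplingContext;
}

// Placeholder id generator; a real SDK would use random trace ids.
let traceCounter = 0;
function makeTraceId(): string {
  traceCounter += 1;
  return `trace-${traceCounter}`;
}

// Starts a new, disconnected trace on the client but reuses the sampling
// context that came from the server.
function startAfterNavigation(
  name: string,
  frontendProject: string,
  serverSampling: SamplingContext,
): Transaction {
  return {
    name,
    traceId: makeTraceId(),
    rootProject: frontendProject,
    sampling: { ...serverSampling },
  };
}
```

The trade-off is visible in the data: `rootProject` and `sampling.originProject` can differ, which is exactly the coupling this alternative breaks up.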
|
|
||
| ## Alternative B: Trace to Trace Relationships | ||
|
|
||
| A potential alternative would be to continue the current project but allow a trace to annotate | ||
|
Contributor
As an aside, in my eyes there are various correlation ID mechanisms that we chat about, and it would be nice to get an RFC going that establishes them all. Off the top of my head:
The most important relationships are as follows:
If we can establish these concepts consistently, it'll be a nice foundation for us to keep building off of.

big +1 here, I still feel like we are moving towards treating a trace more like a session, and it will get more and more confusing with our existing sessions/releases and session replay (which is closest to the standard definition of a session). The problem we are solving for is relating transactions across a user session; doing so by expanding a single trace to include more traces for sampling will, I feel, just create more fuzzy area and confusion for us and our users.
||
| itself as being the successor of another trace. Our data model currently does not have a | ||
| trace to trace relationship but such a desire has come up before with session replays. In that | ||
| case when a new trace starts on the client, it can annotate itself as the successor of a prior | ||
| trace and take over that sampling context. | ||
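
A small TypeScript sketch of what such a trace-to-trace link could look like. The `previousTraceId` field and both helper functions are assumptions made for illustration; as noted above, the current data model has no such relationship.

```typescript
// Hypothetical data shape; `previousTraceId` does not exist in the current model.
interface Trace {
  traceId: string;
  previousTraceId?: string; // the trace this one succeeds, if any
  samplingContext: Record<string, string>;
}

// Starts a new trace that marks itself as the successor of `prior`
// and takes over its sampling context.
function startSuccessorTrace(prior: Trace, traceId: string): Trace {
  return {
    traceId,
    previousTraceId: prior.traceId,
    samplingContext: { ...prior.samplingContext },
  };
}

// Walks the successor links backwards and returns trace ids oldest first.
function traceChain(latest: Trace, byId: Map<string, Trace>): string[] {
  const chain: string[] = [];
  const seen = new Set<string>(); // guards against accidental cycles
  let cur: Trace | undefined = latest;
  while (cur !== undefined && !seen.has(cur.traceId)) {
    seen.add(cur.traceId);
    chain.unshift(cur.traceId);
    cur = cur.previousTraceId === undefined ? undefined : byId.get(cur.previousTraceId);
  }
  return chain;
}
```

Traces stay short and separate here, and the causal history of a session is recovered by following the links rather than by sharing one trace ID.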
|
|
||
| # Drawbacks | ||
|
|
||
| Not addressing this issue might result in user confusion later as front-end user sessions | ||
| are likely to originate in different projects with different sampling rules. However future | ||
| direction assumes that sampling will eventually happen adaptively in which case the user | ||
| confusion is less likely to be an issue. | ||
|
|
||
| # Unresolved questions | ||
|
|
||
| * This RFC does not attempt to address the issue of traces vs sessions for replays or other | ||
| RUM-like situations. | ||
I believe this affects mobile as well since the concept is similar to the browser trace model, right?
Yes it will also affect mobile - by making ui.load and resulting interaction and navigation transactions linked.