Initial Hybrid Agent Trace implementation #1587
base: develop-hybrid-core-tracing
❌ MegaLinter analysis: Error
Detailed issues: ❌ PYTHON / ruff - 1 error
See detailed reports in MegaLinter artifacts
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## develop-hybrid-core-tracing #1587 +/- ##
==============================================================
Coverage ? 79.96%
==============================================================
Files ? 210
Lines ? 24482
Branches ? 3883
==============================================================
Hits ? 19576
Misses ? 3539
Partials ? 1367
13df939 to
b45a7c8
hmstepanek
left a comment
Just reviewed this one class so far but wanted to get you the feedback now rather than waiting until next week.
    # but for debug purposes, we will raise an error
    _logger.debug("Otel span and NR trace do not match nor correspond to a remote span")
    _logger.debug("otel span: %s\nnewrelic trace: %s", self.otel_parent, current_nr_trace)
    raise ValueError("Unexpected span parent scenario encountered")
Is this error caught somewhere?
In this case, I cannot think of a scenario where this would happen, so I wanted to keep this here for debug purposes.
    _logger.debug("otel span: %s\nnewrelic trace: %s", self.otel_parent, current_nr_trace)
    raise ValueError("Unexpected span parent scenario encountered")

    if nr_trace_type == FunctionTrace:
I have a lot of questions about these. Mainly: shouldn't we be pulling all the attributes out of the OTEL attr list and mapping them here into the appropriate NR equivalents, unless there isn't a spot for them, in which case they are supposed to become user attributes according to the spec?
That part will come in a separate PR. For now, the attributes are added as custom attributes, but the actual attribute mapping per SpanKind will come in separate PRs. Of course, most of the time, the attributes are not added during initialization, but rather throughout the span's operation, so I will be adding logic to add these when the span is ending.
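The mapping discussed in this thread could be sketched roughly as follows, applied when the span ends so that attributes set during the span's operation are included. This is only an illustration under stated assumptions: `OTEL_TO_NR` and `split_attributes` are hypothetical names, and the single entry shown is the `db.name` to `target` correspondence raised elsewhere in this review, not the agent's actual mapping table.

```python
# Hypothetical lookup table: OTel attribute name -> NR trace keyword.
# Only the db.name -> target pair comes from this review; it is illustrative.
OTEL_TO_NR = {"db.name": "target"}


def split_attributes(otel_attributes):
    """Split OTel attributes into mapped NR kwargs and leftover custom attributes."""
    mapped, custom = {}, {}
    for key, value in otel_attributes.items():
        nr_key = OTEL_TO_NR.get(key)
        if nr_key is not None:
            mapped[nr_key] = value
        else:
            # No NR equivalent: keep it as a custom (user) attribute.
            custom[key] = value
    return mapped, custom


mapped, custom = split_attributes({"db.name": "inventory", "peer.region": "us-east"})
```

Here `mapped` holds the values that have an NR slot, and `custom` holds everything else.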
    elif nr_trace_type == DatastoreTrace:
        trace_kwargs = {
            "product": self.instrumenting_module,
            "target": None,
It looks like target maps to OTEL's db.name.
It does! The issue with the OTel attributes is that the db.name attribute is usually not set during the creation of the span, but rather later on in the span's operation.
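One way to picture the timing problem: if the target is read at span creation it will usually be `None`, so it has to be resolved when the span ends. A minimal sketch, assuming a hypothetical `DeferredTargetSpan` class that stands in for the real span wrapper (it is not the agent's class, and the real `end()` does considerably more):

```python
class DeferredTargetSpan:
    """Hypothetical stand-in for the span wrapper, for illustration only."""

    def __init__(self, product):
        self.attributes = {}
        # At creation time db.name is usually unknown, so target starts as None.
        self.trace_kwargs = {"product": product, "target": None}

    def set_attribute(self, key, value):
        # Instrumentation typically calls this after the span has started.
        self.attributes[key] = value

    def end(self):
        # Resolve the target only now, once db.name has had a chance to be set.
        self.trace_kwargs["target"] = self.attributes.get("db.name")


span = DeferredTargetSpan(product="Redis")
target_at_creation = span.trace_kwargs["target"]
span.set_attribute("db.name", "inventory")
span.end()
target_at_end = span.trace_kwargs["target"]
```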
newrelic/api/opentelemetry.py
Outdated
    def record_exception(self, exception, attributes=None, timestamp=None, escaped=False):
        if not hasattr(self, "nr_trace"):
            if exception:
                notice_error((type(exception), exception, exception.__traceback__))
Should we be passing attributes here too?
newrelic/api/opentelemetry.py
Outdated
            else:
                notice_error(sys.exc_info(), attributes=attributes)
        else:
            self.nr_trace.notice_error((type(exception), exception, exception.__traceback__), attributes=attributes)
What if there's no exception here?
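A possible way to handle the missing-exception case uniformly in both branches is to build the `exc_info` tuple in one place and fall back to the exception currently being handled. The helper below is a hypothetical sketch (`exc_info_for` is not part of the PR), and whether the agent should skip reporting when nothing is available is an assumption:

```python
import sys


def exc_info_for(exception=None):
    """Return a (type, value, traceback) tuple, or None if there is nothing to report."""
    if exception is not None:
        return (type(exception), exception, exception.__traceback__)
    # No explicit exception: fall back to the one currently being handled, if any.
    exc_info = sys.exc_info()
    if exc_info[0] is None:
        return None
    return exc_info


try:
    raise ValueError("Test exception message")
except ValueError as exc:
    explicit = exc_info_for(exc)   # built from the argument
    implicit = exc_info_for()      # built from sys.exc_info()
```

Both branches of `record_exception` could then pass the same tuple to their respective `notice_error` call, and return early when the helper yields `None`.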
        # TODO: not implemented yet
        pass

    def record_exception(self, exception, attributes=None, timestamp=None, escaped=False):
Should we be doing something with timestamp and escaped here?
It does not seem like we will do anything with the escaped argument, and the timestamp argument will be used for the events/log events (which I have not implemented yet).
    # We will ignore the end_time parameter and use NR's end_time

    # Check to see if New Relic trace ever existed or,
    # if it does, that trace has already ended
Maybe store this value and simplify the logic rather than calling hasattr a bunch:
    nr_trace = getattr(self, "nr_trace", None)
    if not nr_trace or getattr(nr_trace, "end_time", None):
        return
Co-authored-by: Hannah Stepanek <hstepanek@newrelic.com>
hmstepanek
left a comment
In general looks good; had a couple of suggestions.
newrelic/api/opentelemetry.py
Outdated
        self.nr_trace.__enter__()

    def _is_sampled(self):
Are these _is_sampled and _is_remote functions part of the otel api? I don't see them listed, and if they aren't, we should make these names more Pythonic (i.e. sampled, remote). Prefixing with "is" comes from JavaScript (maybe other languages do this too, but generally not Python).
newrelic/api/opentelemetry.py
Outdated
    # "db": Database/Datastore
    # "message": Message Queue
    #
    INSTRUMENTING_MODULE_TYPE = {
Checking my understanding: this is a mapping of modules to span/transaction types for modules in OTEL that don't have attributes added on span creation that would otherwise help us distinguish what type of instrumentation it is.
Exactly. I am hoping to eventually do away with this dictionary once I complete the configuration setup portion of this. Sadly, that will be a little bit down the road, so this mapping placeholder will fulfill that purpose.
Actually, I realized that with some logic rewriting, I don't need to have that at all. REMOVED.
newrelic/api/opentelemetry.py
Outdated
    parent_span_context = None

    nr_trace_type = FunctionTrace
    transaction = current_transaction()
Wondering if it would be wise to use this transaction to make sure we don't have one that already exists and is active before we create a new one? I see one section uses the transaction and the other section doesn't; maybe I'm just having trouble following the logic here.
The agent creates a trace (in the form of a Sentinel) if a transaction already exists and will not create a new transaction, so I did not create separate logic for that part of it.
newrelic/api/opentelemetry.py
Outdated
            *args,
            **kwargs,
        ):
            return Tracer(resource=self._resource, instrumentation_library=instrumenting_module_name)
Should we be passing args and kwargs here?
    # OpenTelemetry API can record errors
    @validate_transaction_metrics(name="Foo", background_task=True)
    @validate_error_event_attributes(
        exact_attrs={"agent": {}, "intrinsic": {"error.message": "Test exception message"}, "user": {}}
I think there should be an error.class attribute on here too right?
For these cross agent tests, I just used what was in the JSON files, but once I work more on the status functionality, I will have more tests for errors as well.
    try:
        raise Exception("Test exception message")
    except Exception as e:
        otel_api_trace.get_current_span().record_exception(e)
Do you have a test anywhere for making sure we don't do anything if hybrid agent is disabled?
I was going to include that in the PR where I change the setting names, but I can do that in this one too (since it's literally just one test).
newrelic/api/opentelemetry.py
Outdated
    # "message": Message Queue
    #
    INSTRUMENTING_MODULE_TYPE = {
        "Redis": "db",
Maybe change db to ds (datastore).
Co-authored-by: Hannah Stepanek <hstepanek@newrelic.com>

This PR contains a barebones tracing implementation. Scaffolding exists for many features, but SpanKind.INTERNAL/BackgroundTasks/FunctionTrace are the only implementations tested in this branch.
Future PRs include (and will likely be in this order):