Contributor

@lrafeei lrafeei commented Nov 19, 2025

This PR contains a barebones tracing implementation. Scaffolding exists for many features, but SpanKind.INTERNAL/BackgroundTasks/FunctionTrace are the only implementations tested in this branch.

Future PRs include (and will likely be in this order):

  • Distributed tracing/context management
  • Change enabled setting name
  • WSGI PR
  • ASGI PR
  • Scope attributes for instrumentation frameworks
  • Tracer include/exclude settings
  • Datastore PR
  • MessageQueues PR
  • gRPC PR
  • Status implementation
  • Events/logging
  • Links
  • Metrics update
  • Configuration setup
  • Sampler integration between NR and OTel

@mergify mergify bot added the merge-conflicts Merge conflicts detected. label Nov 19, 2025
@mergify mergify bot removed the merge-conflicts Merge conflicts detected. label Nov 19, 2025
github-actions bot commented Nov 19, 2025

MegaLinter analysis: Error

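| Descriptor | Linter | Files | Fixed | Errors | Warnings | Elapsed time |
|---|---|---|---|---|---|---|
| ✅ ACTION | actionlint | 7 | | 0 | 0 | 0.82s |
| ✅ MARKDOWN | markdownlint | 7 | 0 | 0 | 0 | 1.31s |
| ❌ PYTHON | ruff | 956 | 5 | 1 | 0 | 0.98s |
| ✅ PYTHON | ruff-format | 956 | 5 | 0 | 0 | 0.38s |
| ✅ YAML | prettier | 15 | 0 | 0 | 0 | 1.52s |
| ✅ YAML | v8r | 15 | | 0 | 0 | 5.38s |
| ✅ YAML | yamllint | 15 | | 0 | 0 | 0.69s |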

Detailed Issues

❌ PYTHON / ruff - 1 error
::error title=Ruff (B026),file=newrelic/api/opentelemetry.py,line=439,col=99,endLine=439,endColumn=104::newrelic/api/opentelemetry.py:439:99: B026 Star-arg unpacking after a keyword argument is strongly discouraged

See detailed reports in MegaLinter artifacts
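
For context on the B026 finding above: Python accepts `*args` after a keyword argument in a call, but the unpacked values are still bound positionally before the keyword, which can hide collisions. The following is a minimal sketch using a hypothetical `connect` function (it is not code from this PR) to show why the ordering is discouraged.

```python
def connect(host, port=5432, timeout=10):
    """Hypothetical function used only to illustrate argument binding."""
    return (host, port, timeout)


args = ("db.local", 6543)

# Pattern flagged by B026: the keyword is written first, but the values
# from *args are still bound positionally before it.
flagged = connect(timeout=5, *args)  # noqa: B026

# Preferred ordering: positional unpacking first, keywords afterwards.
preferred = connect(*args, timeout=5)

# Both calls bind identically; only the readability differs.
same_binding = flagged == preferred

# The flagged ordering can hide a collision: "port" is supplied both by
# *args and by keyword, which raises TypeError at call time.
try:
    connect(port=1, *args)  # noqa: B026
    collided = False
except TypeError:
    collided = True
```

Moving the unpacking ahead of the keyword arguments at the reported location would likely satisfy the linter without changing behavior, assuming the call has no such collision.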

MegaLinter is graciously provided by OX Security

@mergify mergify bot added the tests-failing Tests failing in CI. label Nov 19, 2025
codecov-commenter commented Nov 19, 2025

Codecov Report

❌ Patch coverage is 54.58015% with 119 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop-hybrid-core-tracing@afbc9f2). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| newrelic/api/opentelemetry.py | 51.13% | 74 Missing and 12 partials ⚠️ |
| newrelic/hooks/hybridagent_opentelemetry.py | 56.57% | 25 Missing and 8 partials ⚠️ |

Additional details and impacted files
@@                      Coverage Diff                       @@
##             develop-hybrid-core-tracing    #1587   +/-   ##
==============================================================
  Coverage                               ?   79.96%           
==============================================================
  Files                                  ?      210           
  Lines                                  ?    24482           
  Branches                               ?     3883           
==============================================================
  Hits                                   ?    19576           
  Misses                                 ?     3539           
  Partials                               ?     1367           


@lrafeei lrafeei force-pushed the hybrid-agent-trace-base branch from 13df939 to b45a7c8 Compare November 19, 2025 21:33
@lrafeei lrafeei marked this pull request as ready for review November 19, 2025 22:04
@lrafeei lrafeei requested a review from a team as a code owner November 19, 2025 22:04
Contributor

@hmstepanek hmstepanek left a comment

Just reviewed this one class so far but wanted to get you the feedback now rather than waiting until next week.

# but for debug purposes, we will raise an error
_logger.debug("Otel span and NR trace do not match nor correspond to a remote span")
_logger.debug("otel span: %s\nnewrelic trace: %s", self.otel_parent, current_nr_trace)
raise ValueError("Unexpected span parent scenario encountered")
Contributor

Is this error caught somewhere?

Contributor Author

In this case, I cannot think of a scenario where this would happen, so I wanted to keep this here for debug purposes.

_logger.debug("otel span: %s\nnewrelic trace: %s", self.otel_parent, current_nr_trace)
raise ValueError("Unexpected span parent scenario encountered")

if nr_trace_type == FunctionTrace:
Contributor

I have a lot of questions about these. Mainly: shouldn't we be pulling all the attributes out of the OTel attribute list and mapping them here to the appropriate NR equivalents, unless there isn't a spot for them, in which case they are supposed to become user attributes according to the spec?

Contributor Author

That part will come in a separate PR. For now, the attributes are added as custom attributes, but the actual attribute mapping per SpanKind will come in separate PRs. Of course, most of the time, the attributes are not added during initialization, but rather throughout the span's operation, so I will be adding logic to add these when the span is ending.
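
To make the mapping question concrete, here is a minimal sketch of splitting OTel attributes into mapped NR values and leftover custom attributes. The `OTEL_TO_NR` table and the `split_attributes` helper are hypothetical; in particular the `db.system` to `product` pairing is an assumption for illustration, and the real per-SpanKind mapping is deferred to later PRs.

```python
# Hypothetical mapping from OTel attribute keys to NR trace keyword names.
# Only "db.name" -> "target" is mentioned in this review; the other entry
# is an assumption made for the example.
OTEL_TO_NR = {"db.name": "target", "db.system": "product"}


def split_attributes(otel_attributes):
    """Return (mapped, custom): known keys renamed, the rest kept as custom attributes."""
    mapped, custom = {}, {}
    for key, value in (otel_attributes or {}).items():
        if key in OTEL_TO_NR:
            mapped[OTEL_TO_NR[key]] = value
        else:
            custom[key] = value
    return mapped, custom


mapped, custom = split_attributes(
    {"db.name": "orders", "db.system": "redis", "retries": 2}
)
```

Because attributes are often set during the span's operation rather than at creation, a split like this would most naturally run when the span ends.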

elif nr_trace_type == DatastoreTrace:
    trace_kwargs = {
        "product": self.instrumenting_module,
        "target": None,
Contributor

It looks like target maps to OTEL's db.name.

Contributor Author

It does! The issue with the OTel attributes is that the db.name attribute is usually not set during the creation of the span, but rather later on in the span's operation.
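
One way to handle attributes that arrive after span creation is to defer resolution until the span ends. The sketch below uses a hypothetical `DeferredSpan` class (not the PR's span implementation) to show `target` starting as `None` and being filled from `db.name` at end time.

```python
class DeferredSpan:
    """Hypothetical stand-in for a span; not the class from this PR."""

    def __init__(self, product):
        self.attributes = {}
        self.trace_kwargs = {"product": product, "target": None}
        self.ended = False

    def set_attribute(self, key, value):
        # Attributes may arrive at any point before the span ends.
        if not self.ended:
            self.attributes[key] = value

    def end(self):
        # Resolve late-arriving values just before the trace is finalized.
        if self.ended:
            return
        self.trace_kwargs["target"] = self.attributes.get("db.name")
        self.ended = True


span = DeferredSpan(product="Redis")
target_at_creation = span.trace_kwargs["target"]
span.set_attribute("db.name", "orders")
span.end()
target_at_end = span.trace_kwargs["target"]
```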

def record_exception(self, exception, attributes=None, timestamp=None, escaped=False):
    if not hasattr(self, "nr_trace"):
        if exception:
            notice_error((type(exception), exception, exception.__traceback__))
Contributor

Should we be passing attributes here too?

    else:
        notice_error(sys.exc_info(), attributes=attributes)
else:
    self.nr_trace.notice_error((type(exception), exception, exception.__traceback__), attributes=attributes)
Contributor

What if there's no exception here?
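
The two review questions on this method (passing `attributes` in every branch, and handling a missing exception) can be illustrated together. This is a minimal sketch under stated assumptions: `notice_error` here is a hypothetical stand-in that only records its arguments, and the free function `record_exception` is a simplified illustration rather than the PR's method.

```python
import sys

recorded = []


def notice_error(error, attributes=None):
    """Hypothetical stand-in that records calls; the agent's function is not used."""
    recorded.append((error, attributes))


def record_exception(exception=None, attributes=None):
    # Build the exc_info triple from the argument when one is given,
    # otherwise fall back to the exception currently being handled.
    if exception is not None:
        error = (type(exception), exception, exception.__traceback__)
    else:
        error = sys.exc_info()
    if error[0] is None:
        return  # Nothing to report.
    # Attributes are forwarded regardless of which branch built the triple.
    notice_error(error, attributes=attributes)


try:
    raise ValueError("boom")
except ValueError as exc:
    record_exception(exc, attributes={"stage": "parse"})

record_exception(None)  # No active exception: nothing is recorded.
```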

# TODO: not implemented yet
pass

def record_exception(self, exception, attributes=None, timestamp=None, escaped=False):
Contributor

Should we be doing something with timestamp and escaped here?

Contributor Author

It does not seem like we will do anything with the escaped argument, and the timestamp argument will be used for the events/log events (which I have not implemented yet).

# We will ignore the end_time parameter and use NR's end_time

# Check to see if New Relic trace ever existed or,
# if it does, that trace has already ended
Contributor

Maybe store this value and simplify the logic rather than calling hasattr a bunch:

# getattr (not hasattr) accepts a default; hasattr takes exactly two arguments.
nr_trace = getattr(self, "nr_trace", None)
if not nr_trace or getattr(nr_trace, "end_time", None):
    return
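
Note that `hasattr` accepts exactly two arguments, so storing the value requires `getattr` with a default. A minimal runnable sketch of the guard, using hypothetical `Span` and `FakeTrace` classes that exist only for this illustration:

```python
class FakeTrace:
    """Hypothetical trace object with an end_time that starts unset."""

    end_time = None


class Span:
    """Hypothetical minimal span used only to exercise the guard."""

    def __init__(self, nr_trace=None):
        if nr_trace is not None:
            self.nr_trace = nr_trace
        self.end_calls = 0

    def end(self):
        nr_trace = getattr(self, "nr_trace", None)
        # Return early if no trace exists or the trace has already ended.
        if nr_trace is None or getattr(nr_trace, "end_time", None):
            return
        self.end_calls += 1
        nr_trace.end_time = 1.0


no_trace = Span()
no_trace.end()

live = Span(FakeTrace())
live.end()
live.end()  # Second call is a no-op because end_time is now set.

# The three-argument hasattr form fails at call time.
try:
    hasattr(live, "nr_trace", None)
    three_arg_hasattr_fails = False
except TypeError:
    three_arg_hasattr_fails = True
```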

lrafeei and others added 9 commits November 21, 2025 15:52
Co-authored-by: Hannah Stepanek <hstepanek@newrelic.com>
@lrafeei lrafeei requested a review from hmstepanek November 22, 2025 00:45
lrafeei and others added 4 commits November 22, 2025 00:46
Co-authored-by: Hannah Stepanek <hstepanek@newrelic.com>
lrafeei and others added 4 commits November 21, 2025 16:55
Co-authored-by: Hannah Stepanek <hstepanek@newrelic.com>
Contributor

@hmstepanek hmstepanek left a comment

In general this looks good; I had a couple of suggestions.


self.nr_trace.__enter__()

def _is_sampled(self):
Contributor

Are these _is_sampled and _is_remote functions part of the OTel API? I don't see them listed, and if they aren't, we should make these names more Pythonic (i.e., sampled, remote). The "is" prefix comes from JavaScript (maybe other languages do this too, but generally not Python).
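
If _is_sampled and _is_remote turn out not to be required by the OTel API, one option is to expose the same information as read-only properties under the suggested names. The sketch below assumes a hypothetical `SpanState` holder; the only external fact it relies on is that the lowest bit of the W3C Trace Context trace-flags field marks a trace as sampled.

```python
SAMPLED_FLAG = 0x01  # W3C Trace Context: lowest trace-flags bit means "sampled".


class SpanState:
    """Hypothetical holder for span flags; not the class from this PR."""

    def __init__(self, trace_flags=0, remote=False):
        self._trace_flags = trace_flags
        self._remote = remote

    @property
    def sampled(self):
        # True when the sampled bit is set in the trace flags.
        return bool(self._trace_flags & SAMPLED_FLAG)

    @property
    def remote(self):
        # True when the span context was propagated from another process.
        return self._remote


state = SpanState(trace_flags=0x01)
```

Properties read as attributes at the call site (`state.sampled`), which is one reason this naming style is sometimes preferred over prefixed method names.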

# "db": Database/Datastore
# "message": Message Queue
#
INSTRUMENTING_MODULE_TYPE = {
Contributor

Checking my understanding: this is a mapping of modules to span/transaction types for modules in OTEL that don't have attributes added on span creation that would otherwise help us distinguish what type of instrumentation it is.

Contributor Author

Exactly. I am hoping to eventually do away with this dictionary once I complete the configuration setup portion of this. Sadly, that will be a little bit down the road, so this mapping placeholder will fulfill that purpose.

Contributor Author

Actually, I realized that with some logic rewriting, I don't need to have that at all. REMOVED.

parent_span_context = None

nr_trace_type = FunctionTrace
transaction = current_transaction()
Contributor

I'm wondering if it would be wise to use this transaction to make sure one doesn't already exist and is active before we create a new one. I see one section uses the transaction and the other section doesn't; maybe I'm just having trouble following the logic here.

Contributor Author

If a transaction already exists, the agent creates a trace (in the form of a Sentinel) and will not create a new transaction, so I did not add separate logic for that case.

    *args,
    **kwargs,
):
    return Tracer(resource=self._resource, instrumentation_library=instrumenting_module_name)
Contributor

Should we be passing args and kwargs here?

# OpenTelemetry API can record errors
@validate_transaction_metrics(name="Foo", background_task=True)
@validate_error_event_attributes(
    exact_attrs={"agent": {}, "intrinsic": {"error.message": "Test exception message"}, "user": {}}
Contributor

I think there should be an error.class attribute on here too, right?

Contributor Author

For these cross agent tests, I just used what was in the JSON files, but once I work more on the status functionality, I will have more tests for errors as well.

try:
    raise Exception("Test exception message")
except Exception as e:
    otel_api_trace.get_current_span().record_exception(e)
Contributor

Do you have a test anywhere that makes sure we don't do anything if the hybrid agent is disabled?

Contributor Author

I was going to include that in the PR where I change the setting names, but I can do that in this one too (since it's literally just one test).

# "message": Message Queue
#
INSTRUMENTING_MODULE_TYPE = {
    "Redis": "db",
Contributor

Maybe change db to ds (datastore).

lrafeei and others added 2 commits November 24, 2025 17:35
Co-authored-by: Hannah Stepanek <hstepanek@newrelic.com>
@lrafeei lrafeei requested a review from hmstepanek November 25, 2025 03:06