Closed
68 commits
1d8bab2
Add initial attempt at adding process related tags on trace payloads.…
wantsui Nov 7, 2025
58592a3
Add test for multiple calls to the formatter tags
wantsui Nov 10, 2025
7dc9184
Add tests for trace formatter spec to assert that the first span of t…
wantsui Nov 10, 2025
cad26a6
it turns out you cannot just pin things to rails 7 due to newer ruby …
wantsui Nov 10, 2025
f31440a
Update lib/datadog/core/environment/process.rb
wantsui Nov 10, 2025
cfec602
fix string and rename formatted_process_tags_k1_v1 to serialized
wantsui Nov 10, 2025
8dae705
remove unneeded line
wantsui Nov 10, 2025
055586f
remove server type for now until more research is done
wantsui Nov 10, 2025
cacb500
Add new tag normalizer logic following the trace agent.
wantsui Nov 11, 2025
7661a3f
lint fix
wantsui Nov 11, 2025
7825940
add missing files from prototype command
wantsui Nov 11, 2025
5de6efd
Add missing constants to ext rbs file
wantsui Nov 11, 2025
f5ca84a
jruby fix for the process spec
wantsui Nov 11, 2025
9ad5be5
remove the active record during rails creation because it caused a jr…
wantsui Nov 11, 2025
ceaf7d4
Swap out the existing headers normalization logic with the tag normal…
wantsui Nov 11, 2025
0848833
update the comments for the test
wantsui Nov 12, 2025
9fc4009
fix another comment
wantsui Nov 12, 2025
a66e635
Bring tag normalization to 1:1 parity with the Trace Agent
wantsui Nov 13, 2025
ec1e930
Add changes from code review around comments and add test for the new…
wantsui Nov 13, 2025
4073ab5
Merge branch 'master' into add-process-tags-to-tracing
wantsui Nov 13, 2025
22a3680
Remove the rails gem install from process_spec
wantsui Nov 14, 2025
5784833
Remove 1 sec delay.
wantsui Nov 14, 2025
2b705e3
Update sig/datadog/core/environment/ext.rbs
wantsui Nov 14, 2025
e3deb4c
Update lib/datadog/tracing/transport/trace_formatter.rb
wantsui Nov 14, 2025
4747259
Add improvements for long strings.
wantsui Nov 14, 2025
41bc6c0
small improvement to the whitespace removal.
wantsui Nov 14, 2025
c3605c0
Add upper bound to regex to avoid the polynomial regex on uncontrolle…
wantsui Nov 14, 2025
adfa416
Change untyped to string.
wantsui Nov 14, 2025
0dff545
Use possessive quantifiers in regex instead of limiting the upper bou…
wantsui Nov 14, 2025
7d8da40
Fix types for steep check command
wantsui Nov 14, 2025
31d9796
Remove unneeded Core prefix
wantsui Nov 14, 2025
3672a8a
lint fixes
wantsui Nov 14, 2025
23d9769
restructure folder lookup so it works on the macos ci tests
wantsui Nov 14, 2025
7615906
fixes for local mac development.
wantsui Nov 17, 2025
d4c6a91
Add missing trace agent test cases.
wantsui Nov 17, 2025
433b250
Fix lint
wantsui Nov 18, 2025
47efb90
Change methods to private. Also add comments with examples
wantsui Nov 18, 2025
a2643a6
Fix basedir logic and adjust tests (and also fix the private change)
wantsui Nov 18, 2025
ccd4971
Fix steepcheck error
wantsui Nov 18, 2025
be9587d
Add in byte logic to handle emojis with early backoff and allow start…
wantsui Nov 18, 2025
6042830
Move process tags only to the first span and adjust tests
wantsui Nov 18, 2025
4210d74
Add a special character into the test app name to show that it gets n…
wantsui Nov 19, 2025
f9af946
Update lib/datadog/core/normalizer.rb
wantsui Nov 19, 2025
381fbe2
Update lib/datadog/core/normalizer.rb
wantsui Nov 19, 2025
138dff8
Fixes for new constant names
wantsui Nov 19, 2025
2449153
Change to byteslice
wantsui Nov 19, 2025
5252259
fix lint.
wantsui Nov 19, 2025
0eaf302
remove process_spec from main rake task
wantsui Nov 19, 2025
fbfecfe
Update spec/datadog/core/normalizer_spec.rb
wantsui Nov 20, 2025
cc2225f
Update spec/datadog/tracing/transport/trace_formatter_spec.rb
wantsui Nov 20, 2025
62822ab
Merge branch 'master' into add-process-tags-to-tracing
wantsui Nov 20, 2025
a77e63b
Remove the unless check and replace with an assertion that the file e…
wantsui Nov 20, 2025
439e81a
Update spec/datadog/core/environment/process_spec.rb
wantsui Nov 20, 2025
9e45ade
fix lint
wantsui Nov 20, 2025
0ab4fef
Rename Normalizer to TagNormalizer.
wantsui Nov 20, 2025
6577d3f
Update lib/datadog/core/environment/process.rb
wantsui Nov 20, 2025
a336c66
Add api private comment to the tag normalizer and refactor away the e…
wantsui Nov 20, 2025
0d229de
Fix steep errors on the process rbs file
wantsui Nov 20, 2025
e83bc4a
Refactor the utils encode call so it can be used in the tag normalize…
wantsui Nov 21, 2025
ce1759f
Update Rakefile
wantsui Nov 21, 2025
8b978c6
Update lib/datadog/core/tag_normalizer.rb
wantsui Nov 21, 2025
f4c9d49
Add lint fixes and remove unneeded regex at the end.
wantsui Nov 21, 2025
eae4eb9
fix rbs file for deleted variable
wantsui Nov 21, 2025
f3f1480
remove unneeded conditional
wantsui Nov 21, 2025
b48d20d
Add a log if the process tags cannot be obtained
wantsui Nov 21, 2025
3d33291
Fix regex and reuse the same test cases to show that the leading digi…
wantsui Nov 21, 2025
4e3f8f4
Attempt to retrieve as many non empty string process tags as possible…
wantsui Nov 21, 2025
2f4d1c6
Merge branch 'add-process-tags-to-tracing' into replace-headers-tags-…
wantsui Nov 21, 2025
3 changes: 3 additions & 0 deletions Matrixfile
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,9 @@
'core_with_libdatadog_api' => {
'' => '✅ 2.5 / ✅ 2.6 / ✅ 2.7 / ✅ 3.0 / ✅ 3.1 / ✅ 3.2 / ✅ 3.3 / ✅ 3.4 / ✅ 3.5 / ❌ jruby',
},
'core_with_rails' => {
'rails8' => '❌ 2.5 / ❌ 2.6 / ❌ 2.7 / ❌ 3.0 / ❌ 3.1 / ✅ 3.2 / ✅ 3.3 / ✅ 3.4 / ✅ 3.5 / ❌ jruby',
},
'error_tracking' => {
'' => '❌ 2.5 / ❌ 2.6 / ✅ 2.7 / ✅ 3.0 / ✅ 3.1 / ✅ 3.2 / ✅ 3.3 / ✅ 3.4 / ✅ 3.5 / ❌ jruby',
},
Expand Down
10 changes: 8 additions & 2 deletions Rakefile
Original file line number Diff line number Diff line change
Expand Up @@ -86,13 +86,13 @@ namespace :spec do
:graphql, :graphql_unified_trace_patcher, :graphql_trace_patcher, :graphql_tracing_patcher,
:rails, :railsredis, :railsredis_activesupport, :railsactivejob,
:elasticsearch, :http, :redis, :sidekiq, :sinatra, :hanami, :hanami_autoinstrument,
:profiling, :core_with_libdatadog_api, :error_tracking, :open_feature]
:profiling, :core_with_libdatadog_api, :error_tracking, :open_feature, :core_with_rails]

desc '' # "Explicitly hiding from `rake -T`"
RSpec::Core::RakeTask.new(:main) do |t, args|
t.pattern = 'spec/**/*_spec.rb'
t.exclude_pattern = 'spec/**/{appsec/integration,contrib,benchmark,redis,auto_instrument,opentelemetry,open_feature,profiling,crashtracking,error_tracking,rubocop,data_streams}/**/*_spec.rb,' \
' spec/**/{auto_instrument,opentelemetry,process_discovery,stable_config,ddsketch,open_feature}_spec.rb,' \
' spec/**/{auto_instrument,opentelemetry,process_discovery,stable_config,ddsketch,open_feature,process}_spec.rb,' \
' spec/datadog/gem_packaging_spec.rb'
t.rspec_opts = args.to_a.join(' ')
end
Expand Down Expand Up @@ -233,6 +233,12 @@ namespace :spec do
end
# rubocop:enable Style/MultilineBlockChain

desc '' # "Explicitly hiding from `rake -T`"
RSpec::Core::RakeTask.new(:core_with_rails) do |t, args|
t.pattern = 'spec/datadog/core/environment/process_spec.rb'
t.rspec_opts = args.to_a.join(' ')
end

desc '' # "Explicitly hiding from `rake -T`"
RSpec::Core::RakeTask.new(:error_tracking) do |t, args|
t.pattern = 'spec/datadog/error_tracking/**/*_spec.rb'
Expand Down
10 changes: 10 additions & 0 deletions lib/datadog/core/configuration/settings.rb
Original file line number Diff line number Diff line change
Expand Up @@ -1003,6 +1003,16 @@ def initialize(*_)
end
end

# Enables experimental process tags propagation, so that payloads such as spans carry the process tags.
#
# @default `DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED` environment variable, otherwise `false`
# @return [Boolean]
option :experimental_propagate_process_tags_enabled do |o|
o.env 'DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED'
o.default false
o.type :bool
end

# Tracer specific configuration starting with APM (e.g. DD_APM_TRACING_ENABLED).
# @public_api
settings :apm do
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,7 @@ module Configuration
"DD_ERROR_TRACKING_HANDLED_ERRORS" => {version: ["A"]},
"DD_ERROR_TRACKING_HANDLED_ERRORS_INCLUDE" => {version: ["A"]},
"DD_EXPERIMENTAL_FLAGGING_PROVIDER_ENABLED" => {version: ["A"]},
"DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED" => {version: ["A"]},
"DD_GIT_COMMIT_SHA" => {version: ["A"]},
"DD_GIT_REPOSITORY_URL" => {version: ["A"]},
"DD_HEALTH_METRICS_ENABLED" => {version: ["A"]},
Expand Down
6 changes: 6 additions & 0 deletions lib/datadog/core/environment/ext.rb
Original file line number Diff line number Diff line change
Expand Up @@ -33,8 +33,14 @@ module Ext
LANG_INTERPRETER = "#{RUBY_ENGINE}-#{RUBY_PLATFORM}"
LANG_PLATFORM = RUBY_PLATFORM
LANG_VERSION = RUBY_VERSION
PROCESS_TYPE = 'script' # Out of the options [jar, script, class, executable], we consider Ruby to always be a script
RUBY_ENGINE = ::RUBY_ENGINE # e.g. 'ruby', 'jruby', 'truffleruby'
TAG_ENV = 'env'
TAG_ENTRYPOINT_BASEDIR = "entrypoint.basedir"
TAG_ENTRYPOINT_NAME = "entrypoint.name"
TAG_ENTRYPOINT_WORKDIR = "entrypoint.workdir"
TAG_ENTRYPOINT_TYPE = "entrypoint.type"
TAG_PROCESS_TAGS = "_dd.tags.process"
TAG_SERVICE = 'service'
TAG_VERSION = 'version'

Expand Down
70 changes: 70 additions & 0 deletions lib/datadog/core/environment/process.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,70 @@
# frozen_string_literal: true

require_relative 'ext'
require_relative '../tag_normalizer'

module Datadog
module Core
module Environment
# Retrieves process level information such that it can be attached to various payloads
#
# @api private
module Process
# Returns the process tags serialized as comma-separated key:value pairs, e.g. k1:v1,k2:v2,k3:v3
# The result is computed once and memoized
# @return [String] comma-separated normalized key:value pairs
def self.serialized
return @serialized if defined?(@serialized)
tags = []

begin
workdir = TagNormalizer.normalize(entrypoint_workdir.to_s, remove_digit_start_char: false)
tags << "#{Environment::Ext::TAG_ENTRYPOINT_WORKDIR}:#{workdir}" unless workdir.empty?

entry_name = TagNormalizer.normalize(entrypoint_name.to_s, remove_digit_start_char: false)
tags << "#{Environment::Ext::TAG_ENTRYPOINT_NAME}:#{entry_name}" unless entry_name.empty?

basedir = TagNormalizer.normalize(entrypoint_basedir.to_s, remove_digit_start_char: false)
tags << "#{Environment::Ext::TAG_ENTRYPOINT_BASEDIR}:#{basedir}" unless basedir.empty?

tags << "#{Environment::Ext::TAG_ENTRYPOINT_TYPE}:#{TagNormalizer.normalize(entrypoint_type, remove_digit_start_char: false)}"
rescue => e
Datadog.logger.debug("failed to get process_tags: #{e.message}")
end
@serialized = tags.join(',').freeze
end

# Returns the last segment of the working directory of the process
# Example: /app/myapp -> myapp
# @return [String] the last segment of the working directory
def self.entrypoint_workdir
File.basename(Dir.pwd)
end

# Returns the entrypoint type of the process
# In Ruby, the entrypoint type is always 'script'
# @return [String] the type of the process, which is fixed in Ruby
def self.entrypoint_type
Environment::Ext::PROCESS_TYPE
end

# Returns the file name of the entrypoint script of the process
# Example 1: /bin/mybin -> mybin
# Example 2: ruby /test/myapp.rb -> myapp.rb
# @return [String] the file name of the script, including any extension
def self.entrypoint_name
File.basename($0)
end

# Returns the last segment of the base directory of the process
# Example 1: /bin/mybin -> bin
# Example 2: ruby /test/myapp.rb -> test
# @return [String] the last segment of the base directory of the script
def self.entrypoint_basedir
File.basename(File.expand_path(File.dirname($0)))
end

private_class_method :entrypoint_workdir, :entrypoint_type, :entrypoint_name, :entrypoint_basedir
end
end
end
end
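
To make the entrypoint derivations above concrete, here is a minimal sketch that applies the same `File` calls to fixed inputs. The helper name `entrypoint_parts` and its arguments are hypothetical (the module above reads `$0` and `Dir.pwd` instead), and the values are shown before any tag normalization.

```ruby
# Hypothetical helper: mirrors the File calls used in Environment::Process,
# but takes the script path and working directory as arguments so the
# result does not depend on the current process.
def entrypoint_parts(script_path, workdir)
  {
    'entrypoint.workdir' => File.basename(workdir),
    'entrypoint.name' => File.basename(script_path),
    'entrypoint.basedir' => File.basename(File.expand_path(File.dirname(script_path))),
    'entrypoint.type' => 'script' # fixed for Ruby, see PROCESS_TYPE
  }
end

parts = entrypoint_parts('/test/myapp.rb', '/app/myapp')

# Join in the same k1:v1,k2:v2 shape that `serialized` produces.
puts parts.map { |key, value| "#{key}:#{value}" }.join(',')
# entrypoint.workdir:myapp,entrypoint.name:myapp.rb,entrypoint.basedir:test,entrypoint.type:script
```

Note that `File.basename` keeps the file extension, so the entrypoint name for `/test/myapp.rb` is `myapp.rb`.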
63 changes: 63 additions & 0 deletions lib/datadog/core/tag_normalizer.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
# frozen_string_literal: true

require_relative 'utils'

module Datadog
module Core
# @api private
module TagNormalizer
# Normalizes tag keys and values following the rules the Trace Agent applies to traces
# Useful for ensuring that tag keys and values are normalized consistently
# A current use case is Process Tags, which need to be sent across various intakes (profiling, tracing, etc.) consistently

module_function

INVALID_TAG_CHARACTERS = %r{[^\p{L}0-9_\-:./]}
LEADING_INVALID_CHARS_NO_DIGITS = %r{\A[^\p{L}:]++}
LEADING_INVALID_CHARS_WITH_DIGITS = %r{\A[^\p{L}0-9:./]++}
MAX_BYTE_SIZE = 200 # Represents the max tag length
VALID_ASCII_TAG = %r{\A[a-z:][a-z0-9:./-]*\z}

# Based on https://github.com/DataDog/datadog-agent/blob/45799c842bbd216bcda208737f9f11cade6fdd95/pkg/trace/traceutil/normalize.go#L131
# Specifically:
# - Must be valid UTF-8 (invalid byte sequences are replaced)
# - The value is lowercased
# - Invalid characters are replaced with an underscore
# - Leading invalid characters are removed; colons are kept, and digits, periods and slashes are also kept unless `remove_digit_start_char` is true
# - Trailing underscores are removed
# - Consecutive underscores are merged into a single underscore
# - Maximum length is 200 bytes
# Tag values may start with a digit, so leave `remove_digit_start_char` as false when normalizing values
# @param original_value [String] The original string
# @param remove_digit_start_char [Boolean] whether to strip leading digits (tag values may start with a digit, so pass false for them)
# @return [String] The normalized string
def self.normalize(original_value, remove_digit_start_char: false)
transformed_value = Utils.utf8_encode(original_value, replace_invalid: true)
transformed_value.strip!
return "" if transformed_value.empty?

return transformed_value if transformed_value.bytesize <= MAX_BYTE_SIZE &&
transformed_value.match?(VALID_ASCII_TAG)

normalized_value = transformed_value

if normalized_value.bytesize > MAX_BYTE_SIZE
normalized_value = normalized_value.byteslice(0, MAX_BYTE_SIZE)
normalized_value.scrub!("")
end

normalized_value.downcase!
normalized_value.gsub!(INVALID_TAG_CHARACTERS, '_')

# The Trace Agent allows tag values to start with a number so this logic is here too
leading_invalid_regex = remove_digit_start_char ? LEADING_INVALID_CHARS_NO_DIGITS : LEADING_INVALID_CHARS_WITH_DIGITS
normalized_value.sub!(leading_invalid_regex, "")

normalized_value.squeeze!('_') if normalized_value.include?('__')
normalized_value.delete_suffix!('_')

normalized_value
end
end
end
end
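
The normalization steps above can be illustrated with a simplified, standalone sketch. The module name `TagNormalizerSketch` is hypothetical, it uses `String#scrub` in place of `Utils.utf8_encode`, and it omits the ASCII fast path, so treat it as an approximation of the behavior rather than the library's implementation.

```ruby
# Simplified sketch of the normalization steps (hypothetical module name).
module TagNormalizerSketch
  INVALID_CHARS = %r{[^\p{L}0-9_\-:./]}
  LEADING_NO_DIGITS = /\A[^\p{L}:]++/
  LEADING_WITH_DIGITS = %r{\A[^\p{L}0-9:./]++}
  MAX_BYTES = 200

  def self.normalize(value, remove_digit_start_char: false)
    # Assumption: scrub stands in for the UTF-8 re-encoding step.
    text = value.to_s.scrub.strip
    # Truncate by bytes, then drop any multi-byte character cut in half.
    text = text.byteslice(0, MAX_BYTES).scrub('') if text.bytesize > MAX_BYTES
    text = text.downcase.gsub(INVALID_CHARS, '_')
    text = text.sub(remove_digit_start_char ? LEADING_NO_DIGITS : LEADING_WITH_DIGITS, '')
    # Merge runs of underscores, then remove a single trailing underscore.
    text.squeeze('_').delete_suffix('_')
  end
end

p TagNormalizerSketch.normalize('My App!')                                  # => "my_app"
p TagNormalizerSketch.normalize('  /Users/Dev/My  Project  ')               # => "/users/dev/my_project"
p TagNormalizerSketch.normalize('1password')                                # => "1password"
p TagNormalizerSketch.normalize('1password', remove_digit_start_char: true) # => "password"
```

Truncating with `byteslice` before the regex passes keeps the work bounded for very long inputs; the follow-up `scrub('')` removes a trailing partial character if the cut lands inside a multi-byte sequence.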
7 changes: 6 additions & 1 deletion lib/datadog/core/utils.rb
Original file line number Diff line number Diff line change
Expand Up @@ -42,15 +42,20 @@ def self.truncate(value, size, omission = '...')
# @param [String,#to_s] str object to be converted to a UTF-8 string
# @param [Boolean] binary whether to expect binary data in the `str` parameter
# @param [String] placeholder string to be returned when encoding fails
# @param [Boolean] replace_invalid whether to replace invalid characters (Trace Agent tags expectation)
# @return a UTF-8 string version of `str`
# @!visibility private
def self.utf8_encode(str, binary: false, placeholder: EMPTY_STRING)
def self.utf8_encode(str, binary: false, replace_invalid: false, placeholder: EMPTY_STRING)
str = str.to_s

if binary
# This option is useful for "gracefully" displaying binary data that
# often contains text such as marshalled objects
str.encode('UTF-8', 'binary', invalid: :replace, undef: :replace, replace: '')
elsif replace_invalid
# A non binary mode that replaces invalid characters
# Main use case is to be on par with the Trace Agent's encoding logic for tag normalization
str.encode('UTF-8', invalid: :replace, undef: :replace)
elsif str.encoding == ::Encoding::UTF_8
str
elsif str.empty?
Expand Down
1 change: 1 addition & 0 deletions lib/datadog/tracing/configuration/ext.rb
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ module Ext
ENV_NATIVE_SPAN_EVENTS = 'DD_TRACE_NATIVE_SPAN_EVENTS'
ENV_RESOURCE_RENAMING_ENABLED = 'DD_TRACE_RESOURCE_RENAMING_ENABLED'
ENV_RESOURCE_RENAMING_ALWAYS_SIMPLIFIED_ENDPOINT = 'DD_TRACE_RESOURCE_RENAMING_ALWAYS_SIMPLIFIED_ENDPOINT'
ENV_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED = 'DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED'

# @public_api
module SpanAttributeSchema
Expand Down
16 changes: 2 additions & 14 deletions lib/datadog/tracing/metadata/ext.rb
Original file line number Diff line number Diff line change
Expand Up @@ -120,20 +120,8 @@ module Headers
# By default, tags cannot create nested span tag levels:
# `allow_nested` allows you to override this behavior.
def to_tag(name, allow_nested: false)
# Tag normalization based on: https://docs.datadoghq.com/getting_started/tagging/#defining-tags.
#
# Only the following characters are accepted.
# * Alphanumerics
# * Underscores
# * Minuses
# * Colons
# * Periods
# * Slashes
#
# All other characters are replaced with an underscore.
tag = name.to_s.strip
tag.downcase!
tag.gsub!(INVALID_TAG_CHARACTERS, '_')
# Reuse the tag normalization logic from the Core::TagNormalizer module
tag = Core::TagNormalizer.normalize(name)

# Additional HTTP header normalization.
#
Expand Down
11 changes: 11 additions & 0 deletions lib/datadog/tracing/transport/trace_formatter.rb
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
# frozen_string_literal: true

require_relative '../../core/environment/identity'
require_relative '../../core/environment/process'
require_relative '../../core/environment/socket'
require_relative '../../core/environment/git'
require_relative '../../core/git/ext'
Expand Down Expand Up @@ -62,6 +63,7 @@ def format!
tag_apm_tracing_disabled!

if first_span
tag_process_tags!
tag_git_repository_url!
tag_git_commit_sha!
end
Expand Down Expand Up @@ -215,6 +217,15 @@ def tag_git_commit_sha!
first_span.set_tag(Core::Git::Ext::TAG_COMMIT_SHA, git_commit_sha)
end

def tag_process_tags!
return unless Datadog.configuration.experimental_propagate_process_tags_enabled

first_span.set_tag(
Core::Environment::Ext::TAG_PROCESS_TAGS,
Core::Environment::Process.serialized
)
end

private

def partial?
Expand Down
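
For readers consuming the `_dd.tags.process` tag set by `tag_process_tags!`, the following sketch shows one way to turn the serialized string back into a Hash. The sample value is hypothetical; real values depend on the process that emitted the trace.

```ruby
# Hypothetical example value of the `_dd.tags.process` span tag.
serialized = 'entrypoint.workdir:myapp,entrypoint.name:rails,entrypoint.basedir:bin,entrypoint.type:script'

# Split pairs on commas, then split each pair on the first colon only,
# because a normalized value may itself contain colons.
process_tags = serialized.split(',').map { |pair| pair.split(':', 2) }.to_h

puts process_tags['entrypoint.name'] # rails
puts process_tags['entrypoint.type'] # script
```

Splitting on commas is safe here because the normalizer replaces commas with underscores, so a comma can only appear as a pair separator.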
12 changes: 12 additions & 0 deletions sig/datadog/core/environment/ext.rbs
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,18 @@ module Datadog
TAG_SERVICE: String

TAG_VERSION: String

PROCESS_TYPE: ::String

TAG_ENTRYPOINT_BASEDIR: ::String

TAG_ENTRYPOINT_NAME: ::String

TAG_ENTRYPOINT_WORKDIR: ::String

TAG_ENTRYPOINT_TYPE: ::String

TAG_PROCESS_TAGS: ::String
end
end
end
Expand Down
21 changes: 21 additions & 0 deletions sig/datadog/core/environment/process.rbs
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
module Datadog
module Core
module Environment
module Process
@serialized: ::String

def self.serialized: () -> ::String

private

def self.entrypoint_workdir: () -> ::String

def self.entrypoint_type: () -> ::String

def self.entrypoint_name: () -> ::String

def self.entrypoint_basedir: () -> ::String
end
end
end
end
14 changes: 14 additions & 0 deletions sig/datadog/core/tag_normalizer.rbs
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
module Datadog
module Core
module TagNormalizer
INVALID_TAG_CHARACTERS: ::Regexp
LEADING_INVALID_CHARS_NO_DIGITS: ::Regexp
LEADING_INVALID_CHARS_WITH_DIGITS: ::Regexp
MAX_BYTE_SIZE: ::Integer
VALID_ASCII_TAG: ::Regexp

def self.normalize: (untyped original_value, ?remove_digit_start_char: bool) -> ::String
end
end
end

2 changes: 1 addition & 1 deletion sig/datadog/core/utils.rbs
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ module Datadog

EMPTY_STRING: untyped
def self.truncate: (untyped value, untyped size, ?::String omission) -> untyped
def self.utf8_encode: (untyped str, ?binary: bool, ?placeholder: untyped) -> untyped
def self.utf8_encode: (untyped str, ?binary: bool, ?replace_invalid: bool, ?placeholder: untyped) -> untyped

def self.encode_tags: (untyped hash) -> untyped
def self.without_warnings: () { () -> untyped } -> untyped
Expand Down