feat: add process tags to traces #5033

wantsui · 2025-11-07T22:46:53Z

What does this PR do?

The goal of AIDM-253 is to add process tags to the trace payloads.

After this gets merged, the next step is to add it for the other products.

To run the tests in Docker:

docker compose run --rm tracer-3.3 /bin/bash

Main tests:

bundle install --gemfile /app/gemfiles/ruby_3.3_rails8.gemfile
bundle exec rake compile
bundle exec rake test:core_with_rails
TEST_DATADOG_INTEGRATION=1 BUNDLE_GEMFILE=/app/gemfiles/ruby_3.3_rails8.gemfile bundle exec rspec spec/datadog/core/environment/process_spec.rb


bundle exec rspec spec/datadog/tracing/transport/trace_formatter_spec.rb
bundle exec rspec spec/datadog/core/tag_normalizer_spec.rb
bundle exec rspec spec/datadog/core/configuration/settings_spec.rb

Motivation:

We're trying to add process tags to various payloads so they can be used in different use cases.

Note: I still want to try adding the server type, but I'll have to tackle that in a separate PR.

Change log entry

Adds process tags to trace payloads with the new environment variable: DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED.

Additional Notes:

How to test the change?

… This is still missing memoization and additional tests.

github-actions · 2025-11-07T22:47:06Z

👋 Hey @DataDog/ruby-guild, please fill in the "Change log entry" section in the pull request description.

If changes need to be present in CHANGELOG.md you can state it this way

**Change log entry**

Yes. A brief summary to be placed into the CHANGELOG.md

(possible answers Yes/Yep/Yeah)

Or you can opt out like that

**Change log entry**

None.

(possible answers No/Nope/None)


datadog-official · 2025-11-07T22:51:11Z

✅ Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

This comment will be updated automatically if new data arrives.

Commit SHA: e83bc4a

lib/datadog/core/environment/process.rb

lib/datadog/tracing/configuration/settings.rb

marcotc · 2025-11-10T21:12:26Z

lib/datadog/tracing/transport/trace_formatter.rb

+        def tag_process_tags!
+          return unless trace.experimental_propagate_process_tags_enabled
+          process_tags = Core::Environment::Process.formatted_process_tags_k1_v1
+          return if process_tags.empty?


This is impossible, right? If so, we can remove it, since it would give us a false sense of uncertainty here.

I think I fixed it in 8dae705 by just removing the check in process tags, but let me know if you spot issues with it!
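The behavior under discussion, tagging only when the feature flag is on, can be sketched in isolation. This is a minimal illustration, not the formatter from the PR: `FakeTrace`, `FakeSpan`, and the sample tag value are hypothetical stand-ins, and only the `_dd.tags.process` key and the flag name come from the diff and specs quoted in this thread.

```ruby
# Hypothetical stand-ins for illustration; these are not dd-trace-rb classes.
FakeSpan = Struct.new(:meta)
FakeTrace = Struct.new(:experimental_propagate_process_tags_enabled)

# Key taken from the spec excerpt in this PR.
PROCESS_TAGS_KEY = '_dd.tags.process'

# Attach the serialized process tags to the first span, but only when the
# feature flag on the trace is enabled. There is no emptiness check here,
# mirroring the review outcome of removing it.
def tag_process_tags!(trace, first_span, serialized_tags)
  return unless trace.experimental_propagate_process_tags_enabled

  first_span.meta[PROCESS_TAGS_KEY] = serialized_tags
end

# 'entrypoint.name:rails' is a made-up value for the demo.
enabled_span = FakeSpan.new({})
tag_process_tags!(FakeTrace.new(true), enabled_span, 'entrypoint.name:rails')

disabled_span = FakeSpan.new({})
tag_process_tags!(FakeTrace.new(false), disabled_span, 'entrypoint.name:rails')
```

With the flag off, `disabled_span.meta` stays empty, which is the property the formatter spec is meant to pin down.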

…he payload has the process tag only when the feature is enabled.

…versions so this fixes that.

Co-authored-by: Marco Costa <[email protected]>

pr-commenter · 2025-11-10T22:17:05Z

Benchmarks

Benchmark execution time: 2025-11-21 00:35:39

Comparing candidate commit e83bc4a in PR branch add-process-tags-to-tracing with baseline commit 2c84fc0 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 44 metrics, 2 unstable metrics.

wantsui · 2025-11-11T16:29:31Z

spec/datadog/tracing/transport/trace_formatter_spec.rb

+          format!
+          expect(first_span.meta).to include('_dd.tags.process')
+          expect(first_span.meta['_dd.tags.process']).to eq(Datadog::Core::Environment::Process.serialized)
+          # TODO figure out if we need an assertion for the value, ie


@marcotc - do you think there's value in asserting for the values of the tag? Or is the test in process_spec enough?

What you are doing with expect(first_span.meta['_dd.tags.process']).to eq(Datadog::Core::Environment::Process.serialized) seems good to me.

I wouldn't test realistic values.

The main thing to test here is that it respects the configuration option, which you did.

Thanks! In that case, it doesn't seem like I need to make any changes to the assertions?

github-actions · 2025-11-11T18:43:19Z

Typing analysis

Note: Ignored files are excluded from the next sections.

Untyped methods

This PR introduces 2 partially typed methods, and clears 1 partially typed method. It increases the percentage of typed methods from 54.48% to 54.58% (+0.1%).

Partially typed methods (+2-1)

❌ Introduced:

sig/datadog/core/tag_normalizer.rbs:11
└── def self.normalize: (untyped original_value, ?remove_digit_start_char: bool) -> ::String
sig/datadog/core/utils.rbs:8
└── def self.utf8_encode: (untyped str, ?binary: bool, ?replace_invalid: bool, ?placeholder: untyped) -> untyped

✅ Cleared:

sig/datadog/core/utils.rbs:8
└── def self.utf8_encode: (untyped str, ?binary: bool, ?placeholder: untyped) -> untyped

If you believe a method or an attribute is rightfully untyped or partially typed, you can add # untyped:accept to the end of the line to remove it from the stats.

…uby conflict with sqlite and it is not needed for this test

…ormalized

lib/datadog/core/normalizer.rb

Co-authored-by: Marco Costa <[email protected]>

lib/datadog/core/normalizer.rb

p-datadog · 2025-11-20T18:13:14Z

This PR needs to be merged/rebased on master to fix the CI failures.

lib/datadog/core/normalizer.rb

lib/datadog/core/tag_normalizer.rb

spec/datadog/core/environment/process_spec.rb

spec/datadog/core/normalizer_spec.rb

spec/datadog/tracing/transport/trace_formatter_spec.rb

spec/datadog/core/environment/process_spec.rb

Co-authored-by: Oleg Pudeyev <[email protected]>

…xist, small fixes, and add comments to the normalizer.rb explaining the expected usage

Co-authored-by: Oleg Pudeyev <[email protected]>

lib/datadog/core/environment/process.rb

lib/datadog/core/normalizer.rb

p-datadog · 2025-11-20T21:35:18Z

lib/datadog/core/environment/process.rb

+    module Environment
+      # Retrieves process level information such that it can be attached to various payloads
+      module Process
+        extend self


We had some debates about module_function, which is a different tactic for what extend self accomplishes here.

If you change the methods to def self.serialized and so on, does everything still work?

The idea is to have the definitions in only one scope. extend self makes them exist in two scopes: as instance methods in any class that includes the module, and on the module namespace itself.

Locally, this refactor still works and I've pushed this commit: a336c66, so now to wait for CI!
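The scoping difference between the two styles can be checked with a small standalone sketch. The module and class names below are hypothetical and exist only for this illustration; they are not the PR's `Process` module.

```ruby
# Hypothetical modules for illustration only.
module WithExtendSelf
  extend self

  # Defined as an instance method, then also exposed on the module itself.
  def serialized
    'from extend self'
  end
end

module WithSingletonMethods
  # Defined only on the module object.
  def self.serialized
    'from def self'
  end
end

class IncludesBoth
  include WithExtendSelf
  include WithSingletonMethods
end

instance = IncludesBoth.new

# The extend self variant leaks a public instance method into includers.
leaked = instance.respond_to?(:serialized)
owner = instance.method(:serialized).owner

# The def self variant contributes no instance methods at all.
singleton_instance_methods = WithSingletonMethods.instance_methods(false)
```

Here `leaked` is true and `owner` is `WithExtendSelf`, while `singleton_instance_methods` is empty. `module_function` sits in between: it also copies the method onto the module, but the instance-level copy becomes private.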

Co-authored-by: Oleg Pudeyev <[email protected]>

Strech · 2025-11-20T22:15:11Z

lib/datadog/core/tag_normalizer.rb

+      # @param remove_digit_start_char [Boolean] - whether to remove the leading digit (currently only used for tag values)
+      # @return [String] The normalized string
+      def self.normalize(original_value, remove_digit_start_char: false)
+        transformed_value = original_value.to_s.encode('UTF-8', invalid: :replace, undef: :replace)


I think there is a function for this: Core::Utils.utf8_encode

Strech · 2025-11-20T22:21:57Z

lib/datadog/core/tag_normalizer.rb

+        normalized_value.sub!(leading_invalid_regex, "")
+
+        normalized_value.squeeze!('_') if normalized_value.include?('__')
+        normalized_value.sub!(TRAILING_UNDERSCORES, "")


Technically, after squeeze we can't have more than one _ at the end. Do we really need a regex to ditch a single char?
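One regex-free way to act on this observation is sketched below. `collapse_underscores` is a hypothetical helper, not the PR's `TagNormalizer`, and it assumes Ruby 2.5 or newer for `String#delete_suffix`.

```ruby
# Hypothetical helper for illustration; not part of the PR.
def collapse_underscores(value)
  # Collapse every run of underscores into a single one.
  squeezed = value.squeeze('_')

  # After squeezing, at most one trailing underscore can remain,
  # so removing a fixed suffix is enough.
  squeezed.delete_suffix('_')
end

collapsed = collapse_underscores('a__b___')
only_underscores = collapse_underscores('___')
untouched = collapse_underscores('plain')
```

Both calls return new strings, so the input is left unmodified; the bang variants would avoid the extra allocations but return `nil` when nothing changes.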

Strech · 2025-11-20T22:24:13Z

lib/datadog/core/tag_normalizer.rb

+        normalized_value = if transformed_value.ascii_only? && transformed_value.length <= MAX_BYTE_SIZE
+          transformed_value
+        else
+          transformed_value.byteslice(0, MAX_BYTE_SIZE).scrub("")
+        end


I think we can streamline this a bit and save one allocation. You can go further and drop the different names, because they don't matter; just use one name for the variable.

Suggested change

-        normalized_value = if transformed_value.ascii_only? && transformed_value.length <= MAX_BYTE_SIZE
-          transformed_value
-        else
-          transformed_value.byteslice(0, MAX_BYTE_SIZE).scrub("")
-        end
+        normalized_value = transformed_value
+        if normalized_value.ascii_only? && normalized_value.length <= MAX_BYTE_SIZE
+          normalized_value = normalized_value.byteslice(0, MAX_BYTE_SIZE)
+          normalized_value.scrub!("")
+        end
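The truncation step from the original diff can be tried on its own with the sketch below. `truncate_to_bytes` is a hypothetical helper, and the limit of 8 bytes is chosen only to keep the demo short; the PR defines its own `MAX_BYTE_SIZE`. As I read it, the original code returns the value unchanged when it is ASCII-only and within the limit, so an in-place variant would need to truncate in the opposite case to keep the same behavior.

```ruby
# Hypothetical limit for the demo; the PR defines its own constant.
MAX_BYTE_SIZE = 8

# Hypothetical helper mirroring the branch structure of the original diff.
def truncate_to_bytes(value)
  # ASCII-only strings use one byte per character, so length equals bytesize.
  return value if value.ascii_only? && value.length <= MAX_BYTE_SIZE

  # byteslice may cut a multi-byte character in half; scrub('') drops
  # the dangling fragment so the result is valid UTF-8 again.
  value.byteslice(0, MAX_BYTE_SIZE).scrub('')
end

short_ascii = truncate_to_bytes('short')
long_ascii = truncate_to_bytes('abcdefghij')
# Seven ASCII bytes followed by a two-byte character: nine bytes in total.
split_char = truncate_to_bytes("abcdefg\u00e9")
```

In the last case the slice ends in the middle of the two-byte character, so the fragment is discarded and seven bytes remain.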

Strech · 2025-11-20T22:27:42Z

lib/datadog/core/environment/process.rb

+        def serialized
+          return @serialized if defined?(@serialized)
+          tags = []
+          tags << "#{Environment::Ext::TAG_ENTRYPOINT_WORKDIR}:#{TagNormalizer.normalize(entrypoint_workdir, remove_digit_start_char: false)}" if entrypoint_workdir


Not sure, but when will it be nil?

Strech · 2025-11-20T22:28:57Z

lib/datadog/core/environment/process.rb

+          tags << "#{Environment::Ext::TAG_ENTRYPOINT_WORKDIR}:#{TagNormalizer.normalize(entrypoint_workdir, remove_digit_start_char: false)}" if entrypoint_workdir
+          tags << "#{Environment::Ext::TAG_ENTRYPOINT_NAME}:#{TagNormalizer.normalize(entrypoint_name, remove_digit_start_char: false)}" if entrypoint_name
+          tags << "#{Environment::Ext::TAG_ENTRYPOINT_BASEDIR}:#{TagNormalizer.normalize(entrypoint_basedir, remove_digit_start_char: false)}" if entrypoint_basedir
+          tags << "#{Environment::Ext::TAG_ENTRYPOINT_TYPE}:#{TagNormalizer.normalize(entrypoint_type, remove_digit_start_char: false)}" if entrypoint_type


Same here, it's literally reading the constant ... it's not nil at all

Strech · 2025-11-20T22:31:24Z

Rakefile

    t.pattern = 'spec/**/*_spec.rb'
    t.exclude_pattern = 'spec/**/{appsec/integration,contrib,benchmark,redis,auto_instrument,opentelemetry,open_feature,profiling,crashtracking,error_tracking,rubocop,data_streams}/**/*_spec.rb,' \
-                        ' spec/**/{auto_instrument,opentelemetry,process_discovery,stable_config,ddsketch,open_feature}_spec.rb,' \
+                        ' spec/**/{auto_instrument,opentelemetry,process_discovery,stable_config,ddsketch,open_feature}_spec.rb,spec/datadog/core/environment/process_spec.rb,' \


Well, if you read the beginning of the line, it has some globbing that suits you

Suggested change

-                        ' spec/**/{auto_instrument,opentelemetry,process_discovery,stable_config,ddsketch,open_feature}_spec.rb,spec/datadog/core/environment/process_spec.rb,' \
+                        ' spec/**/{auto_instrument,opentelemetry,process_discovery,stable_config,ddsketch,open_feature,process}_spec.rb' \

or at least core/process
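The reach of the brace pattern can be approximated with `File.fnmatch`. This is only a sketch: it assumes the exclude pattern behaves like a shell glob with brace expansion, uses a shortened alternative list, and the second path is hypothetical.

```ruby
# FNM_PATHNAME makes "**/" match directories recursively;
# FNM_EXTGLOB enables the {a,b} brace alternatives.
FLAGS = File::FNM_PATHNAME | File::FNM_EXTGLOB

# Shortened version of the suggested pattern, for readability.
pattern = 'spec/**/{ddsketch,open_feature,process}_spec.rb'

# The spec this PR wants to exclude.
target = File.fnmatch(pattern, 'spec/datadog/core/environment/process_spec.rb', FLAGS)

# A hypothetical process_spec.rb elsewhere in the tree is caught as well.
other = File.fnmatch(pattern, 'spec/datadog/some/other/process_spec.rb', FLAGS)

# Specs with other names are unaffected.
unrelated = File.fnmatch(pattern, 'spec/datadog/core/tag_normalizer_spec.rb', FLAGS)
```

Under these assumptions `target` and `other` are both true and `unrelated` is false, so the shorter pattern is broader than the explicit path: it excludes every `process_spec.rb` under `spec/`, which is likely why a narrower `core/process` form is offered as an alternative.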

…xtend self on process.rb

…r and update tests to show some new assertions

Add initial attempt at adding process related tags on trace payloads.…

1d8bab2

… This is still missing memoization and additional tests.

github-actions bot added the core and tracing labels Nov 7, 2025

wantsui added the AI Generated label Nov 7, 2025

Add test for multiple calls to the formatter tags

58592a3