Skip to content

feat(process-tags): signal service name source via svc.user/svc.auto#3921

Draft
Leiyks wants to merge 6 commits into
masterfrom
leiyks/svc-source-process-tags
Draft

feat(process-tags): signal service name source via svc.user/svc.auto#3921
Leiyks wants to merge 6 commits into
masterfrom
leiyks/svc-source-process-tags

Conversation

@Leiyks
Copy link
Copy Markdown
Contributor

@Leiyks Leiyks commented May 27, 2026

Summary

Implements the PHP-tracer side of RFC Signal Service Name Source via Process Tags.

Surfaces one of two mutually-exclusive process tags so the Datadog backend can distinguish user-set vs tracer-auto-resolved service names:

  • svc.user:true — when DD_SERVICE is non-empty (env var, INI, OTEL fallback, remote config)
  • svc.auto:<default_service_name> — when DD_SERVICE is empty and the tracer auto-resolved the name (e.g. web.request, script basename, cli.command)

Per the RFC caveats, no conclusions are drawn from the absence of both tags.

Architecture

Process tags are per-process; the active service name in PHP is per-request (mutable via ini_set, OTEL fallbacks, remote-config-driven INI updates). The earlier approach of baking svc.user/svc.auto into the static process_tags string leaked the latest request's override into subsequent FPM requests served by the same worker — addressed by the senior review.

Two cooperating paths now:

1. Per-span emission (traces) — ext/serializer.c

In ddtrace_serialize_span_to_rust_span, after attaching the static _dd.tags.process string, the per-span svc.user:true / svc.auto:<normalized-default> tag is appended based on the request-active state (get_DD_SERVICE() at serialization time, normalized via ddog_normalize_process_tag_value). Each span sees its own request's state — cross-request leak is impossible by construction.

2. Sidecar (telemetry / remote config / runtime info) — ext/sidecar.c

ddtrace_sidecar_update_process_tags now also calls the new libdatadog FFI ddog_sidecar_session_set_default_service_name(transport, default_service_name):

  • Empty default_service_name → sidecar treats it as user-defined and injects svc.user:true
  • Non-empty default_service_name (pre-normalized) → sidecar injects svc.auto:<default>

The sidecar performs the injection at payload emission time (via the new SessionInfo::process_tags_with_svc_source() helper), so all of telemetry / remote config / runtime_info / stats payloads see consistent svc.* tagging without the tracer having to bake it into the static string.

The libdatadog half lives in DataDog/libdatadog#2053; the submodule pointer in this PR is bumped to that branch tip and must be moved to the merged SHA before this PR merges.

Behavior

Source of service name Span _dd.tags.process Sidecar-emitted process_tags
DD_SERVICE=... env var svc.user:true svc.user:true
datadog.service=... system INI svc.user:true svc.user:true
OTEL_SERVICE_NAME=... (fallback) svc.user:true svc.user:true
OTEL_RESOURCE_ATTRIBUTES=service.name=... (fallback) svc.user:true svc.user:true
ini_set('datadog.service', ...) (per-request) svc.user:true (this span only) unchanged (process-level)
No DD_SERVICE set svc.auto:<basename>.php (CLI) / svc.auto:web.request (web) same

Companion PR

DataDog/libdatadog#2053 — exposes ddog_sidecar_session_set_default_service_name and the sidecar-side injection logic. This PR's CI will be red on the submodule pointer until #2053 lands.

@datadog-prod-us1-3
Copy link
Copy Markdown

datadog-prod-us1-3 Bot commented May 27, 2026

Pipelines  Tests

Fix all issues with BitsAI or with Cursor

⚠️ Warnings

🚦 11 Pipeline jobs failed

DataDog/apm-reliability/dd-trace-php | test_extension_ci: [7.4]   View in Datadog   GitLab

🔧 Fix in code (Fix with Cursor). 1 failed test. Crashtracker collects all threads when collect_all_threads is enabled in ext/crashtracker_collect_all_threads.phpt.

DataDog/apm-reliability/dd-trace-php | test_extension_ci: [8.0]   View in Datadog   GitLab

🔧 Fix in code (Fix with Cursor). 1 failed test due to unexpected behavior: Crashtracker collects all threads when collect_all_threads is enabled in crashtracker_collect_all_threads.phpt.

DataDog/apm-reliability/dd-trace-php | test_extension_ci: [8.1]   View in Datadog   GitLab

🔧 Fix in code (Fix with Cursor). 1 failed test. Crashtracker collects all threads when collect_all_threads is enabled at tmp/build_extension/tests/ext/crashtracker_collect_all_threads.phpt.

View all 11 failed jobs.

❄️ 1 New flaky test detected

testSvcTagDoesNotLeakBetweenRequests from custom-framework-autoloading-test.DDTrace\Tests\Integrations\Custom\Autoloaded\ProcessTagsWebTest   View in Datadog (Fix with Cursor)
DDTrace\Tests\Integrations\Custom\Autoloaded\ProcessTagsWebTest::testSvcTagDoesNotLeakBetweenRequests
Error: Call to undefined method DDTrace\Tests\Integrations\Custom\Autoloaded\ProcessTagsWebTest::assertStringContainsString()

tests/Integrations/Custom/Autoloaded/ProcessTagsWebTest.php:82
tests/Common/RetryTraitVersionSpecific70.php:28
phpvfscomposer://tests/vendor/phpunit/phpunit/phpunit:52

New test introduced in this PR is flaky.

View in Flaky Test Management

ℹ️ Info

No other issues found (see more)

🧪 All tests passed

🔄 Datadog auto-retried 1 job - 1 passed on retry View in Datadog

🎯 Code Coverage (details)
Patch Coverage: 100.00%
Overall Coverage: 54.12% (-0.04%)

Useful? React with 👍 / 👎

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 84b004e | Docs | Datadog PR Page | Give us feedback!

@pr-commenter
Copy link
Copy Markdown

pr-commenter Bot commented May 27, 2026

Benchmarks [ tracer ]

Benchmark execution time: 2026-06-02 15:21:16

Comparing candidate commit 84b004e in PR branch leiyks/svc-source-process-tags with baseline commit cd4e9ba in branch master.

Found 0 performance improvements and 6 performance regressions! Performance is the same for 188 metrics, 0 unstable metrics.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

  • 🟩 = significantly better candidate vs. baseline
  • 🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because often changes are not that big:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'

scenario:EmptyFileBench/benchEmptyFileDdprof

  • 🟥 execution_time [+87.830µs; +277.330µs] or [+2.383%; +7.526%]

scenario:PHPRedisBench/benchRedisOverhead

  • 🟥 execution_time [+44.727µs; +76.804µs] or [+4.693%; +8.058%]

scenario:SamplingRuleMatchingBench/benchRegexMatching1

  • 🟥 execution_time [+41.615ns; +122.585ns] or [+2.855%; +8.409%]

scenario:SamplingRuleMatchingBench/benchRegexMatching2

  • 🟥 execution_time [+78.496ns; +143.904ns] or [+5.402%; +9.904%]

scenario:SamplingRuleMatchingBench/benchRegexMatching3

  • 🟥 execution_time [+69.319ns; +138.281ns] or [+4.802%; +9.578%]

scenario:SamplingRuleMatchingBench/benchRegexMatching4

  • 🟥 execution_time [+59.516ns; +149.284ns] or [+4.087%; +10.251%]

Leiyks added 4 commits June 2, 2026 12:38
Emit one of two mutually-exclusive process tags so the Datadog backend
can distinguish user-set vs tracer-resolved service names:

- `svc.user:true` when DD_SERVICE is non-empty (env, INI, OTEL fallback,
  remote config)
- `svc.auto:<default_service_name>` when DD_SERVICE is empty and the
  tracer auto-resolved the name

Runtime mutations of `datadog.service` (via `ini_set` or remote config)
trigger a recompute. The new `ddtrace_process_tags_reload_with_service`
override threads the ZAI `new_str` through because the value is not yet
committed to `get_DD_SERVICE()` at the time the `ini_change` callback
fires.

Implements: RFC "Signal Service Name Source via Process Tags"
https://docs.google.com/document/d/1c47iSTWxIOHMHfZTF2nT9xfyQaIBP9KJvI9sRn5SvpM
Addresses senior review on the prior PR commit. Process tags are
per-process (set once, propagated by the sidecar), but the active
service name in PHP is request-local (mutable via `ini_set` and OTEL/RC
fallbacks). Baking `svc.user`/`svc.auto` into the static process_tags
string leaked the latest request's override into subsequent FPM requests.

Two cooperating paths now:

1. **Per-span** (`ext/serializer.c::ddtrace_serialize_span_to_rust_span`):
   computes svc.user/svc.auto from `get_DD_SERVICE()` at serialization
   time and appends to that span's `_dd.tags.process`. Each span sees
   exactly its own request's state — no cross-request leak.

2. **Sidecar** (`ext/sidecar.c::ddtrace_sidecar_update_process_tags`):
   sends the process-level svc source to libdatadog via the new
   `ddog_sidecar_session_set_default_service_name` FFI. The sidecar
   injects svc.user/svc.auto into outgoing telemetry/RC/runtime_info
   payloads at emission time, eliminating the static-string conflict.

The libdatadog half is in DataDog/libdatadog#2053; the submodule is
bumped here to that commit.

Reverts the static svc.* emission and `ddtrace_alter_dd_service` reload
hook from 5a55f2d.

Tests:
- 5 new `.phpt` tests (CLI per-span correctness incl. ini_set + ini_restore)
- New PHPUnit `testSvcTagDoesNotLeakBetweenRequests` against the FPM
  weblog: two sequential requests on the same worker prove svc.* reflects
  per-request state with no leak.

Implements: RFC "Signal Service Name Source via Process Tags"
https://docs.google.com/document/d/1c47iSTWxIOHMHfZTF2nT9xfyQaIBP9KJvI9sRn5SvpM
Addresses review feedback on PR:
- Wrap `ddog_sidecar_session_set_default_service_name` calls in
  `ddtrace_ffi_try` so transport errors surface in the trace log
  instead of being silently dropped.
- Use `DDOG_CHARSLICE_C("")` instead of hand-rolled CharSlice struct
  literal for the user-defined case (matches the rest of sidecar.c).
- Call `ddtrace_sidecar_update_process_tags()` at the end of
  `ddtrace_sidecar_handle_fork` so the child's fresh sidecar session
  re-learns the svc.* source after fork; without this, child
  telemetry/RC/stats payloads would drop the svc.* tag entirely
  until the next external trigger.

Submodule bump picks up the companion stats-payload fix in
DataDog/libdatadog#2053.
@Leiyks Leiyks force-pushed the leiyks/svc-source-process-tags branch from cfaa401 to a210c1c Compare June 2, 2026 10:39
…ection

The sidecar now appends `svc.auto:<default_service_name>` to outgoing
process_tags payloads (telemetry, remote config, live debugger) when
DD_SERVICE is unset. Update the three expectation strings to include
the new tag.

Verified telemetry test locally; the RC and live-debugger tests use
the same injection mechanism and will be exercised in CI.
Comment thread ext/serializer.c Outdated
}
if (has_user_service || svc_auto_normalized) {
smart_str combined = {0};
smart_str_appendl(&combined, ZSTR_VAL(process_tags), ZSTR_LEN(process_tags));
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
smart_str_appendl(&combined, ZSTR_VAL(process_tags), ZSTR_LEN(process_tags));
smart_str_append(&combined, process_tags);

simpler

Comment thread ext/sidecar.c
Comment on lines +154 to +155
zend_string *dd_service = get_DD_SERVICE();
if (dd_service && ZSTR_LEN(dd_service) > 0) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DD_SERVICE is dynamic per request, not per session.

Some request may have it, another not.

You probably can store ddtrace_default_service_name() on the sidecar as session bound (at which point setting that one via ddog_sidecar_session_set_config is enough), but the DD_SERVICE may or may not exist for any particular request, and change between requests. You may have to submit that one alongside with ddog_sidecar_set_universal_service_tags().

... let's talk about that?!

Addresses CI failures in dynamic_config_* and debugger_* tests. The
sidecar was injecting svc.auto into the Vec<Tag> stored in the RC
Target, but the tracer's RC read side (sidecar.c:808) uses bare
process_tags — Target hashes diverged → SHM lookups missed → no RC
configs delivered to userland.

Companion libdatadog commit reverts the sidecar-side injection on the
RC path. Trace per-span svc.* and telemetry svc.* are unaffected.

Also addresses bwoebi review comment: simplify smart_str_appendl to
smart_str_append in ext/serializer.c.

RC/live-debugger payloads will need a libdatadog follow-up (Target
identity vs. payload metadata split, or custom Hash/Eq) before they
can carry svc.* per the RFC.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants