Skip to content

Conversation

@vlad-coman-hs
Copy link

This PR addresses the need of adding dynamic attributes on the auto-instrumented metrics from otelgrpc. It potentially addresses #6026 without giving access to the payload since it is not available during every phase which adds complexity when trying to add the labels to all available metrics. It might also generate more confusion to the end user by exposing internal implementation details.

Initial draft that implemented the Labeler pattern from otelhttp (it also explains why it was closed): #8135

@vlad-coman-hs vlad-coman-hs requested review from a team and dashpole as code owners November 19, 2025 09:37
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Nov 19, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: vlad-coman-hs / name: Vlad Coman (2b6bf59, 33a4501)
  • ✅ login: vlad-coman-hs / name: vlad-coman-hs (86a334a)

@dashpole
Copy link
Contributor

Can you add documentation or an example_test.go to demonstrate how someone could use this to add attributes to metrics?

@codecov
Copy link

codecov bot commented Nov 19, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.1%. Comparing base (76c48fd) to head (86a334a).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@          Coverage Diff          @@
##            main   #8191   +/-   ##
=====================================
  Coverage   82.0%   82.1%           
=====================================
  Files        195     195           
  Lines      13628   13639   +11     
=====================================
+ Hits       11188   11199   +11     
  Misses      2038    2038           
  Partials     402     402           
Files with missing lines Coverage Δ
...entation/google.golang.org/grpc/otelgrpc/config.go 87.3% <100.0%> (+1.0%) ⬆️
...n/google.golang.org/grpc/otelgrpc/stats_handler.go 98.5% <100.0%> (+<0.1%) ⬆️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@vlad-coman-hs
Copy link
Author

Hello @dashpole,
Just wanted to check in on this PR. Please let me know if there's anything I should adjust or clarify.
Thank you for your time!

// should be centralized, example only
type retryAttemptKey struct{}

// a middleware must populate that key with the actual retry attempt, e.g.,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be done with an interceptor? Can you expand on this with a more detailed example?

Could users alternatively wrap our stats handler to insert things into the context? IIRC stats handlers have much better performance than interceptors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe they could just implement the TagRPC function in stats.Handler to set things in the context? https://pkg.go.dev/google.golang.org/grpc/stats#Handler

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be done with an interceptor?

Typically yes, but it's really just an example, the consumer could populate that key depending on preferences/existing patterns in the codebase.

Could users alternatively wrap our stats handler to insert things into the context?

Yes, both approaches could be used to populate values into the context, at least for the client-side approach. If anyone (for whatever reason) would want to communicate such values through the context on the server-side, your approach would probably be the only one that works (we can enter into more details about why this is if it is relevant). The problem with providing such examples on the server side is that I can already see unorthodox usages of such an approach, e.g., adding business logic into the stats handler. For server-side, the current example already demonstrates the recommended pattern for server-side dynamic attributes.

Nevertheless, I'm waiting for your input on this. I could add examples showcasing both approaches for both client- and server-side or even replace the interceptor example altogether on client-side.
That said, I think that the interceptor approach is a bit simpler and possibly better suited for an example as it avoids encouraging complex logic in stats handlers on either side.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you expand on this with a more detailed example?

That particular example only shows how we could populate metric attributes with values obtained from the context and the retry attempt was the only thing that popped to me at that time. Do you mean expanding into the actual logic of how we compute and populate the particular values?

The relevant part should be that we fetch the previously populated retry attempt (maybe not the greatest example) from the context and add it as a metric attribute.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't thought much about how to put things in a gRPC context such that a stats handler can extract them. Do interceptors run before the stats handler? I'm not sure which approach is better, but we should have a recommendation for users if we are introducing this as a feature. Do you know what the benefits of each are?

I'm also not very familiar with the intended use-cases for this feature for clients. Is there anything more useful than retry attempt we could add as an example? Maybe something from w3c baggage?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to showcase both approaches. Ideally we should do the work to figure out which approach is recommended.

I like the current server-side example. I'm mainly asking about the client-side example

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do interceptors run before the stats handler?

For the client, yes. -> https://github.com/grpc/grpc-go/blob/master/stream.go#L170 and L426 respectively.
For the server, it's the other way around, that's why your wrapper approach is the only one between the two that would work here, although the particular usage might not be a recommended one as specified above. -> https://github.com/grpc/grpc-go/blob/master/server.go#L1815 and L1832 respectively.
This was also validated through some tests.

Do you know what the benefits of each are?

I think that the interceptor approach is simpler, more familiar to most consumers and somehow a bit cleaner. However, a disadvantage is that it only works client side (but I also discourage the particular usage on the server side).

On the other hand, I think that the wrapped handler approach has the advantage that it works on both sides (for symmetry, even if the other approach is discouraged). The disadvantages of this method could be the fact that it introduces more boilerplate and could encourage business logic in the stats handler. It also does not feel like a familiar pattern (at least to me) in the same way the interceptors are.

To conclude, I tend towards giving the example as is (using the interceptor approach for adding values to the context, but left abstracted in the way it already is) due to the issues mentioned above. That said, I think that, outside the example, it's on the API consumer to decide how they inject these values in a manner that they will be accessible to add them as labels. I also think that it's an advantage that we can do this with the wrapped handler approach as well (although more verbose).

Is there anything more useful than retry attempt we could add as an example? Maybe something from w3c baggage?

This is a great suggestion, what are you thinking of? Something like:

otelgrpc.WithMetricAttributesFn(func(ctx context.Context) []attribute.KeyValue {
    bag := baggage.FromContext(ctx)
    if member := bag.Member("traffic.type"); member.Value() != "" {
        return []attribute.KeyValue{
            attribute.String("traffic.type", member.Value()), // maybe traffic.type is not the best example here, open to suggestions
        }
    }
    return nil
})

Should I add such an example next to the existing client one or should we only keep this one? Waiting for your input here. From my side, I think they complement each other. The baggage approach should work on both sides since the baggage is extracted from incoming headers (just like metadata) if I remember correctly.

I tried to come up with reasonable examples on the client-side, though I admit that we needed this feature for the server-side and I may have a clearer example there.

I'd also like to mention that no matter the approach chosen for the examples, I'll add corresponding "~regression" tests in metric_attributes_test.go for each approach (I did not initially think of the wrapper approach).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To also address this:

IIRC stats handlers have much better performance than interceptors.

In this particular scenario, according to the benchmarks I ran there is no performance difference between the two approaches.

goos: darwin
goarch: arm64
pkg: go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc
cpu: Apple M1 Pro
BenchmarkMetricAttributesFn_Interceptor-10                 17338             68307 ns/op           12025 B/op        178 allocs/op
BenchmarkMetricAttributesFn_Interceptor-10                 18042             67800 ns/op           12028 B/op        178 allocs/op
BenchmarkMetricAttributesFn_Interceptor-10                 17979             67799 ns/op           12027 B/op        178 allocs/op
BenchmarkMetricAttributesFn_Interceptor-10                 17793             67460 ns/op           12027 B/op        178 allocs/op
BenchmarkMetricAttributesFn_Interceptor-10                 17786             68142 ns/op           12028 B/op        178 allocs/op
BenchmarkMetricAttributesFn_WrappedHandler-10              17941             68586 ns/op           12029 B/op        178 allocs/op
BenchmarkMetricAttributesFn_WrappedHandler-10              17810             66865 ns/op           12028 B/op        178 allocs/op
BenchmarkMetricAttributesFn_WrappedHandler-10              18169             69433 ns/op           12028 B/op        178 allocs/op
BenchmarkMetricAttributesFn_WrappedHandler-10              16158             70838 ns/op           12031 B/op        178 allocs/op
BenchmarkMetricAttributesFn_WrappedHandler-10              18102             68636 ns/op           12028 B/op        178 allocs/op
PASS
ok      go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc     19.772s

I can follow-up on this with more technical details if necessary, but I think we can disregard the performance considerations and only focus on the other mentioned advantages/disadvantages.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I add such an example next to the existing client one or should we only keep this one?

I'm OK with an additional example. Lets do that then: One example using an interceptor, and another for baggage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants