-
Notifications
You must be signed in to change notification settings - Fork 709
otelgrpc: add MetricAttributesFn option to allow adding dynamic attributes on metrics #8191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Can you add documentation or an example_test.go to demonstrate how someone could use this to add attributes to metrics? |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #8191 +/- ##
=====================================
Coverage 82.0% 82.1%
=====================================
Files 195 195
Lines 13628 13639 +11
=====================================
+ Hits 11188 11199 +11
Misses 2038 2038
Partials 402 402
🚀 New features to boost your workflow:
|
|
Hello @dashpole, |
| // should be centralized, example only | ||
| type retryAttemptKey struct{} | ||
|
|
||
| // a middleware must populate that key with the actual retry attempt, e.g., |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this be done with an interceptor? Can you expand on this with a more detailed example?
Could users alternatively wrap our stats handler to insert things into the context? IIRC stats handlers have much better performance than interceptors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe they could just implement the TagRPC function in stats.Handler to set things in the context? https://pkg.go.dev/google.golang.org/grpc/stats#Handler
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this be done with an interceptor?
Typically yes, but it's really just an example, the consumer could populate that key depending on preferences/existing patterns in the codebase.
Could users alternatively wrap our stats handler to insert things into the context?
Yes, both approaches could be used to populate values into the context, at least for the client-side approach. If anyone (for whatever reason) would want to communicate such values through the context on the server-side, your approach would probably be the only one that works (we can enter into more details about why this is if it is relevant). The problem with providing such examples on the server side is that I can already see unorthodox usages of such an approach, e.g., adding business logic into the stats handler. For server-side, the current example already demonstrates the recommended pattern for server-side dynamic attributes.
Nevertheless, I'm waiting for your input on this. I could add examples showcasing both approaches for both client- and server-side or even replace the interceptor example altogether on client-side.
That said, I think that the interceptor approach is a bit simpler and possibly better suited for an example as it avoids encouraging complex logic in stats handlers on either side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you expand on this with a more detailed example?
That particular example only shows how we could populate metric attributes with values obtained from the context and the retry attempt was the only thing that popped to me at that time. Do you mean expanding into the actual logic of how we compute and populate the particular values?
The relevant part should be that we fetch the previously populated retry attempt (maybe not the greatest example) from the context and add it as a metric attribute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't thought much about how to put things in a gRPC context such that a stats handler can extract them. Do interceptors run before the stats handler? I'm not sure which approach is better, but we should have a recommendation for users if we are introducing this as a feature. Do you know what the benefits of each are?
I'm also not very familiar with the intended use-cases for this feature for clients. Is there anything more useful than retry attempt we could add as an example? Maybe something from w3c baggage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want to showcase both approaches. Ideally we should do the work to figure out which approach is recommended.
I like the current server-side example. I'm mainly asking about the client-side example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do interceptors run before the stats handler?
For the client, yes. -> https://github.com/grpc/grpc-go/blob/master/stream.go#L170 and L426 respectively.
For the server, it's the other way around, that's why your wrapper approach is the only one between the two that would work here, although the particular usage might not be a recommended one as specified above. -> https://github.com/grpc/grpc-go/blob/master/server.go#L1815 and L1832 respectively.
This was also validated through some tests.
Do you know what the benefits of each are?
I think that the interceptor approach is simpler, more familiar to most consumers and somehow a bit cleaner. However, a disadvantage is that it only works client side (but I also discourage the particular usage on the server side).
On the other hand, I think that the wrapped handler approach has the advantage that it works on both sides (for symmetry, even if the other approach is discouraged). The disadvantages of this method could be the fact that it introduces more boilerplate and could encourage business logic in the stats handler. It also does not feel like a familiar pattern (at least to me) in the same way the interceptors are.
To conclude, I tend towards giving the example as is (using the interceptor approach for adding values to the context, but left abstracted in the way it already is) due to the issues mentioned above. That said, I think that, outside the example, it's on the API consumer to decide how they inject these values in a manner that they will be accessible to add them as labels. I also think that it's an advantage that we can do this with the wrapped handler approach as well (although more verbose).
Is there anything more useful than retry attempt we could add as an example? Maybe something from w3c baggage?
This is a great suggestion, what are you thinking of? Something like:
otelgrpc.WithMetricAttributesFn(func(ctx context.Context) []attribute.KeyValue {
bag := baggage.FromContext(ctx)
if member := bag.Member("traffic.type"); member.Value() != "" {
return []attribute.KeyValue{
attribute.String("traffic.type", member.Value()), // maybe traffic.type is not the best example here, open to suggestions
}
}
return nil
})Should I add such an example next to the existing client one or should we only keep this one? Waiting for your input here. From my side, I think they complement each other. The baggage approach should work on both sides since the baggage is extracted from incoming headers (just like metadata) if I remember correctly.
I tried to come up with reasonable examples on the client-side, though I admit that we needed this feature for the server-side and I may have a clearer example there.
I'd also like to mention that no matter the approach chosen for the examples, I'll add corresponding "~regression" tests in metric_attributes_test.go for each approach (I did not initially think of the wrapper approach).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To also address this:
IIRC stats handlers have much better performance than interceptors.
In this particular scenario, according to the benchmarks I ran there is no performance difference between the two approaches.
goos: darwin
goarch: arm64
pkg: go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc
cpu: Apple M1 Pro
BenchmarkMetricAttributesFn_Interceptor-10 17338 68307 ns/op 12025 B/op 178 allocs/op
BenchmarkMetricAttributesFn_Interceptor-10 18042 67800 ns/op 12028 B/op 178 allocs/op
BenchmarkMetricAttributesFn_Interceptor-10 17979 67799 ns/op 12027 B/op 178 allocs/op
BenchmarkMetricAttributesFn_Interceptor-10 17793 67460 ns/op 12027 B/op 178 allocs/op
BenchmarkMetricAttributesFn_Interceptor-10 17786 68142 ns/op 12028 B/op 178 allocs/op
BenchmarkMetricAttributesFn_WrappedHandler-10 17941 68586 ns/op 12029 B/op 178 allocs/op
BenchmarkMetricAttributesFn_WrappedHandler-10 17810 66865 ns/op 12028 B/op 178 allocs/op
BenchmarkMetricAttributesFn_WrappedHandler-10 18169 69433 ns/op 12028 B/op 178 allocs/op
BenchmarkMetricAttributesFn_WrappedHandler-10 16158 70838 ns/op 12031 B/op 178 allocs/op
BenchmarkMetricAttributesFn_WrappedHandler-10 18102 68636 ns/op 12028 B/op 178 allocs/op
PASS
ok go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc 19.772sI can follow-up on this with more technical details if necessary, but I think we can disregard the performance considerations and only focus on the other mentioned advantages/disadvantages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I add such an example next to the existing client one or should we only keep this one?
I'm OK with an additional example. Lets do that then: One example using an interceptor, and another for baggage.
This PR addresses the need of adding dynamic attributes on the auto-instrumented metrics from otelgrpc. It potentially addresses #6026 without giving access to the payload since it is not available during every phase which adds complexity when trying to add the labels to all available metrics. It might also generate more confusion to the end user by exposing internal implementation details.
Initial draft that implemented the Labeler pattern from otelhttp (it also explains why it was closed): #8135