feat: Implement TracingChannel Proposal #3195
Conversation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Hi, I'm Jit, a friendly security platform designed to help developers build secure applications from day zero with an MVS (Minimal Viable Security) mindset. If there are security findings, they will be communicated to you as a comment inside the PR. Hope you'll enjoy using Jit. Questions? Comments? Want to learn more? Get in touch with us.
Pull request overview
Implements Node.js diagnostics_channel TracingChannel hooks to expose trace events for Redis command execution and client connect operations, enabling APM instrumentation without monkey-patching.
Changes:
- Added `traceCommand`/`traceConnect` helpers and trace context types for `node-redis:command` and `node-redis:connect`.
- Wrapped `connect()`, `sendCommand()`, pipeline, and MULTI execution paths with tracing.
- Added specs validating emitted tracing events and batch metadata (MULTI / pipeline).
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| packages/client/lib/client/tracing.ts | Introduces tracing channel setup and helper wrappers for promise-based spans. |
| packages/client/lib/client/index.ts | Instruments connect + command execution paths, and adds trace context construction. |
| packages/client/lib/client/tracing.spec.ts | Adds tests that subscribe to tracing sub-channels and assert emitted contexts. |
```ts
const client = RedisClient.create({
  socket: {
    port: 6379,
    host: '127.0.0.1'
  }
});
```
This test also hard-codes 127.0.0.1:6379. In this repo, Redis connection details are typically provided by testUtils/GLOBAL.SERVERS.*; using a fixed port/host is likely to break when tests run against the dockerized server. Consider structuring this as a testUtils.testWithClient with disableClientSetup: true, so you can create the client with the test server’s socket options and still assert connect() tracing events.
packages/client/lib/client/index.ts (Outdated)
```ts
const socketOptions = this._self.#options.socket;
await traceConnect(
  () => this._self.#socket.connect(),
  {
    serverAddress: (socketOptions as any)?.host ?? 'localhost',
    serverPort: (socketOptions as any)?.port ?? 6379
  }
```
serverAddress/serverPort are derived from socketOptions.host/socketOptions.port with any casts and defaults. RedisSocketOptions also supports IPC sockets (e.g. path) where host/port are not meaningful; emitting localhost:6379 in those cases would be incorrect. Consider narrowing the socket options type (e.g. handle 'path' in socketOptions separately) and making the emitted port optional/nullable when connecting via IPC.
```ts
return {
  command: String(args[0]).toUpperCase(),
  args,
```
#commandTraceContext assigns args by reference. Some call paths reuse and mutate the same args array (e.g. script fallback that rewrites args[0] from EVALSHA to EVAL), which can make the tracing payload reflect the mutated args instead of what was actually sent for the traced command. To keep trace metadata accurate, consider snapshotting the args (shallow copy) for the tracing context or ensuring callers don’t mutate arrays after passing them into sendCommand.
```diff
-return {
-  command: String(args[0]).toUpperCase(),
-  args,
+const argsSnapshot = Array.from(args);
+return {
+  command: String(argsSnapshot[0]).toUpperCase(),
+  args: argsSnapshot,
```
Might not be the case here, but it's actually intentional that diagnostics_channel be able to modify the given data in some cases. The main use case is adding distributed tracing headers to requests, but may be desired for other things too. Unless you have a compelling reason to do so, you can generally pass mutable data directly into it.
Don't have a strong opinion here; most of the tracing-channel instrumentations I worked on didn't require mutable contexts, but it could be good to have.
EDIT: Misunderstood the review 🤦‍♂️
packages/client/lib/client/index.ts (Outdated)
```ts
    this._self.#scheduleWrite();
    return promise;
  },
  this._self.#commandTraceContext(args)
```
this._self.#commandTraceContext(args) is evaluated eagerly, so the context object (and the String(args[0]).toUpperCase() work) is created for every command even when there are no tracing subscribers. This undercuts the “zero-cost when no subscribers are registered” goal. Consider making traceCommand accept a lazy context factory, and only building the context when tracing is actually enabled (e.g. when the tracing channel’s sub-channels have subscribers).
```diff
-this._self.#commandTraceContext(args)
+() => this._self.#commandTraceContext(args)
```
Yes, you may want to pass a context builder in so you can use hasSubscribers to only build the context when it would be published.
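A sketch of that suggestion, with illustrative helper names (not the PR's final API): the context factory runs only when the channel's `start` sub-channel has subscribers, so the no-APM path never builds a context object.

```typescript
import * as dc from 'node:diagnostics_channel';

const commandChannel = dc.tracingChannel('node-redis:command');

function traceCommand<T>(
  fn: () => Promise<T>,
  contextFactory: () => object
): Promise<T> {
  // Zero-cost path: with no subscribers, skip both the context build
  // and the publish entirely
  if (!commandChannel.start.hasSubscribers) return fn();
  return commandChannel.tracePromise(fn, contextFactory());
}
```

Callers then pass `() => this._self.#commandTraceContext(args)` rather than an eagerly built object.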
Yep, addressed it. Now the trace helpers accept a factory for context building.
FYI @Qard
Context objects are now only built when there are active subscribers, ensuring true zero-cost when no APM is listening. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
IPC sockets emit the path as serverAddress with undefined serverPort, instead of misleadingly reporting localhost:6379. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace hardcoded 127.0.0.1:6379 with testUtils.testWithClient and disableClientSetup so tests work in CI with Docker-assigned ports. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Qard left a comment
LGTM. Copilot already caught most of what I would have had to say about it. Might want to modify the traceConnect and traceCommand functions to take builder functions to the context objects and then use if (channel.hasSubscribers) { ... } to skip the context build and publish when there are no subscribers.
packages/client/lib/client/index.ts (Outdated)
```ts
{
  ...this._self.#commandTraceContext(args),
  batchMode: 'PIPELINE',
  batchSize
}
```
Might want a context builder function for this too.
packages/client/lib/client/index.ts (Outdated)
```ts
{
  ...this._self.#commandTraceContext(args),
  batchMode: 'MULTI',
  batchSize
}
```
Thanks, I addressed that in latest commits.
Yep, looks like you were addressing it while I was reviewing the prior code at the same time. 😅
I've reviewed again. All looks good to me now. 🙂
Feel free to port it there too. It's valuable to have diagnostics_channel built in to all the libraries. Thanks for working on this! ❤️
Exports CommandTraceContext, BatchCommandTraceContext, and ConnectTraceContext from @redis/client so APM libraries can type their channel subscribers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
```ts
  assert.equal(errors[0].serverPort, 1);
} finally {
  dc.unsubscribe('tracing:node-redis:connect:error', onError);
  await badClient.destroy();
```
Test destroys already-closed client in finally block
Low Severity
In the connection error test, `await badClient.destroy()` in the finally block will throw ClientClosedError because the client is already closed after `connect()` fails with `reconnectStrategy: false`. The socket's `#isOpen` is set to false by `#shouldReconnect` when the strategy returns false, so calling `destroy()` on an already-closed client throws. The other connect test correctly guards with `if (client.isOpen)` before destroying, but this test does not.
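The guard can be sketched with a stub client (illustrative, standing in for the real `RedisClient`; `StubClient` and `safeTeardown` are not names from this PR):

```typescript
// Stub mimicking the relevant RedisClient behavior: destroying an
// already-closed client throws
class StubClient {
  isOpen = false;
  async destroy(): Promise<void> {
    if (!this.isOpen) throw new Error('ClientClosedError');
    this.isOpen = false;
  }
}

// Teardown that only destroys the client while it is still open,
// mirroring the `if (client.isOpen)` guard in the other connect test
async function safeTeardown(client: StubClient): Promise<boolean> {
  if (client.isOpen) {
    await client.destroy();
    return true;  // destroy was actually invoked
  }
  return false;   // already closed; skip destroy to avoid the throw
}
```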
```ts
    ? (process as any).getBuiltinModule('node:diagnostics_channel')
    : require('node:diagnostics_channel');

const hasTracingChannel = typeof dc.tracingChannel === 'function';
```
You could use the dc-polyfill package from npm. We built it to backport modern diagnostics channel functionality to older versions of Node.js. That may alleviate the checks you have for the presence of tracing channel. But of course that adds a new dependency that node-redis might not want.
@tlhunter zero dependency is what makes this attractive for 3rd party authors I think.
Not sure if it works, but we could potentially override `getBuiltinModule` and intercept it, kind of like IITM except it wouldn't need module hooks and could live on the APM SDK's side.
Something like this:

```js
// Capture the original before overriding it
const originalGetBuiltin = process.getBuiltinModule;

process.getBuiltinModule = function (id) {
  // Check for both the prefixed and unprefixed module ID
  if (id === 'node:diagnostics_channel' || id === 'diagnostics_channel') {
    return dcPolyfill;
  }
  // Fall back to original behavior for all other modules
  return originalGetBuiltin.call(process, id);
};
```

Would that work? It kind of forces them to use `getBuiltinModule` to load tracing channels, but I found that to be common across several libraries like this one, due to compatibility concerns with both ESM and older Node releases; `pg` is one example.
I think what you currently have is better
Gracefully handle environments where diagnostics_channel is unavailable instead of throwing at module load. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
On older Node.js versions that support TracingChannel but lack the hasSubscribers property, the previous truthiness check would silently skip tracing. Now we check explicitly for `false` so that tracing is performed unconditionally when hasSubscribers is unavailable. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Hi @logaretm & all, thank you for the PR. Have you had a chance to look at the OTel metrics PR? It should be merged into master very soon. We are currently considering adding native OpenTelemetry tracing support as well, ideally building on top of the same OTel integration. From a maintenance perspective, maintaining separate implementations for metrics and tracing would increase the complexity of the driver. For that reason, our current preference would be to extend the OTel integration so that tracing builds on top of the upcoming metrics work, rather than introducing a separate diagnostics_channel–based approach. If native OTel tracing support were available, would that address your use case? As far as we know, most APM platforms already support OTLP.
Hi there, creator of diagnostics_channel (and subsequently TracingChannel) here. 👋🏻 It was actually created specifically for this purpose. Many libraries, and Node.js core itself, did not want to pull in a large dependency on OpenTelemetry or to be tied to its API model. I created diagnostics_channel specifically to decouple the sharing of observability data from specific consumer APIs, so libraries could easily add tracing without additional dependencies. While much of the market has adopted OpenTelemetry on hype and the resulting market pressure, diagnostics_channel has also been adopted by most Node.js vendors, without any hype pushing it along artificially. It's built to be extremely low-cost, being zero-cost unless there are actually active subscribers, while the OpenTelemetry API often has quite a bit of overhead. 🙈 A lot of performance-focused libraries like fastify and undici have gone for diagnostics_channel-based instrumentations instead, building OpenTelemetry plugins on top of that, as it performs a lot better and is a much less invasive change.
Hey @PavelPashov, thanks for the response. I totally understand the concern about maintaining two separate systems. Building on what @Qard said above, I wanted to show how TracingChannel doesn't compete with OTel; it actually reduces the OTel maintenance burden on your team. The command duration measurement in #3110 already follows a start/end pattern that TracingChannel handles natively:

```js
// Current approach in #3110 — manual start/end wired into every callsite:
const recordOperation = OTelMetrics.instance.commandMetrics
  .createRecordOperationDuration(args, clientId);
// ...execute...
recordOperation(err);

// TracingChannel — single instrumentation point:
tracePromise(commandChannel, { command, args, serverAddress, ... }, promise);
// start/asyncEnd/error lifecycle managed by Node.js
```

If node-redis emits lifecycle events via TracingChannel: …

This is the same architecture … Worth noting: OTel JS SDK v3 is planning to raise the minimum Node.js version to 22. Users who want to actually use node-redis's OTel metrics or tracing need SDK packages (…).

I'm happy to rework this PR to sit underneath the metrics layer and power it, and show how it can be used to set up the tracing integration for the future. What do you think?


This PR implements tracing channels for `command` and `connect` operations. This allows APM libraries (OTel, Datadog, Sentry) to instrument Redis commands without monkey-patching. It also enables users to set up their own custom instrumentations and analytics.
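As an illustration of the custom-instrumentation angle, here is a hedged sketch of a subscriber on the proposed `node-redis:command` channel (the channel name comes from this PR; the five handlers are TracingChannel's standard sub-channels, and the `command` context field follows this PR's proposed payload):

```typescript
import * as dc from 'node:diagnostics_channel';

const channel = dc.tracingChannel('node-redis:command');
const seen: string[] = [];

channel.subscribe({
  start(ctx: any) { seen.push(`start:${ctx.command}`); },       // span opens
  end(_ctx: any) {},                                            // sync portion done
  asyncStart(_ctx: any) {},                                     // continuation begins
  asyncEnd(ctx: any) { seen.push(`asyncEnd:${ctx.command}`); }, // promise settled
  error(ctx: any) { seen.push(`error:${ctx.command}`); }        // failure recorded
});
```

An APM would open a span in `start`, close it in `asyncEnd`, and record `error` payloads, with no monkey-patching of node-redis itself.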
Tracing Channels
All channels in this PR use the Node.js `TracingChannel` API, which provides `start`, `end`, `asyncStart`, `asyncEnd`, and `error` sub-channels automatically.

| Channel | Context |
|---|---|
| `node-redis:command` | `command`, `args`, `database`, `serverAddress`, `serverPort`, and optionally `batchMode`, `batchSize` for MULTI/pipeline commands |
| `node-redis:connect` | `serverAddress`, `serverPort` |

Feel free to suggest different names, but most conventions follow the lib name, or they append a namespace before it as well, like `redis:node-redis:connect`, so LMK which one feels suitable.
Context Properties
I have closely followed what OTEL currently extracts as attributes in their instrumentations and made it available in the event message payload.
The payload sent in the trace events includes the following:

| Property | Value | OTel attribute |
|---|---|---|
| `command` | `args[0]` (uppercased) | `db.operation.name` |
| `args` | `RedisArgument[]` | `db.query.text` (APM serializes/sanitizes) |
| `database` | `this.#selectedDB` | `db.namespace` |
| `serverAddress` | | `server.address` |
| `serverPort` | | `server.port` |

For batch commands we include these properties:

| Property | Value | OTel attribute |
|---|---|---|
| `batchMode` | `'MULTI'` or `'PIPELINE'` | `db.operation.name` prefix |
| `batchSize` | `commands.length` | `db.operation.batch.size` |

Backward Compatibility
Zero-cost when no subscribers are registered. Silently skipped on Node.js versions where `TracingChannel` is unavailable (e.g. Node 18). This is possible with `process.getBuiltinModule`.

closes #2590
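A sketch of how this feature gate could look (assumed shape and names, not the PR's exact code): resolve `diagnostics_channel` via `getBuiltinModule` when available, and fall back to untraced execution when the `TracingChannel` API is missing.

```typescript
// Resolve diagnostics_channel without a static import, so module load
// never throws on runtimes that lack it
const dc: any =
  typeof (process as any).getBuiltinModule === 'function'
    ? (process as any).getBuiltinModule('node:diagnostics_channel')
    : undefined;

const hasTracingChannel =
  dc !== undefined && typeof dc.tracingChannel === 'function';

// Illustrative helper: trace when supported, otherwise just run
function traceOrRun<T>(channelName: string, fn: () => Promise<T>): Promise<T> {
  if (!hasTracingChannel) return fn(); // e.g. older Node: silently skip tracing
  return dc.tracingChannel(channelName).tracePromise(fn);
}
```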
Note
Medium Risk
Touches core client command/connection paths by wrapping `connect`, `sendCommand`, `MULTI`, and pipeline execution in tracing helpers; while intended to be a no-op without subscribers, it could still affect performance or edge-case behavior across Node versions.
Overview
Adds Node.js `diagnostics_channel` `TracingChannel` instrumentation for Redis client connect and command execution. `connect()` and all command enqueue paths (standalone `sendCommand`, `MULTI`, and pipeline execution) are wrapped to emit `node-redis:connect` and `node-redis:command` tracing events with context (`command`, `args`, `database`, `serverAddress`, `serverPort`, plus `batchMode`/`batchSize` for batches), and the trace context types are exported from the package entrypoint.

Introduces `lib/client/tracing.ts` (feature-gated when `TracingChannel` is unavailable and optimized to skip work when there are no subscribers) and adds coverage in `tracing.spec.ts` for success, error, and batch scenarios.

Written by Cursor Bugbot for commit ce03e84. This will update automatically on new commits.