Skip to content

Bind vault signed OCR responses to gateway request ID via signed payload#22799

Open
prashantkumar1982 wants to merge 10 commits into
developfrom
fix/vault-signature-payload-request-id-validation
Open

Bind vault signed OCR responses to gateway request ID via signed payload#22799
prashantkumar1982 wants to merge 10 commits into
developfrom
fix/vault-signature-payload-request-id-validation

Conversation

@prashantkumar1982

@prashantkumar1982 prashantkumar1982 commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Summary

This change closes a security gap in the vault gateway aggregator where OCR signature validation did not bind a node response to the pending gateway request. JSON-RPC id and SignedOCRResponse wrapper fields are node-controlled and outside the signed bytes, so they cannot be trusted for request correlation.
The fix introduces an authoritative request_id inside the signed OCR payload for vault create/update/delete/list responses, sets it on the vault OCR plugin side, and validates it in the gateway aggregator after signature verification.
Depends on: chainlink-common#2143

Problem

vaulttypes.ValidateSignatures hashes only SignedOCRResponse.Payload + Context. Previously, signed create/update/delete/list payloads contained no gateway request identifier. A node could return a signature-valid response for a different pending request, and the gateway had no way to detect the mismatch using trusted (signed) data.

Solution

chainlink-common (dependency bump)

Bump github.com/smartcontractkit/chainlink-common to v0.11.2-0.20260610205450-f1cc365ddb71, adding request_id to:

  • CreateSecretsResponse
  • UpdateSecretsResponse
  • DeleteSecretsResponse
  • ListSecretIdentifiersResponse

Vault OCR plugin (plugin.go)

When generating JSON OCR reports for create/update/delete/list, clone the outcome response and set RequestId = o.Id before canonical JSON signing. o.Id is the gateway-prefixed pending queue ID (authorizedOwner::userRequestID).

Gateway aggregator (aggregator.go)

  • Signature validation only for vault.secrets.create, vault.secrets.update, vault.secrets.delete, and vault.secrets.list.
  • vault.publicKey.get uses quorum only — it does not go through OCR signed-response validation.
  • After ValidateSignatures, unmarshal the signed payload and compare embedded requestId to the pending gateway request ID (ar.req.ID, also owner-prefixed).
  • Reject when requestId is present and mismatched (log + gateway_vault_request_internal_error metric with signed_payload_request_id_mismatch).
  • Temporarily tolerate missing requestId during rolling vault node upgrades; comment notes this should be tightened once all vault nodes are upgraded.
  • Removed prior checks against JSON-RPC resp.ID / wrapper fields for request binding.

Handler wiring (handler.go)

Pass pending ar.req.ID into Aggregate() so signature validation compares against the gateway-internal prefixed request ID.

Tests

  • Unit tests for signature validation, request ID mismatch rejection, missing request ID tolerance, and GetPublicKey skipping signature validation.
  • Dynamic signing test helpers with requestId embedded in canonical JSON payload.
  • Plugin report tests updated to expect requestId in signed JSON.
  • E2E smoke helpers assert signed payload requestId matches authorizedOwner::userRequestID (not just JSON-RPC response id).

Upgrade / rollout notes

  • Gateway can be deployed before all vault nodes; missing signed requestId is tolerated until vault nodes pick up the OCR plugin change.
  • Once vault nodes are fully upgraded, gateway should start rejecting signed responses with empty requestId.
  • GetPublicKey intentionally remains quorum-based because that path does not use OCR signed reports.

@github-actions

github-actions Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

✅ No conflicts with other open PRs targeting develop

@prashantkumar1982 prashantkumar1982 changed the title Stricter validation of ID in ocr signed responses Bind vault signed OCR responses to gateway request ID via signed payload Jun 10, 2026
@github-actions

Copy link
Copy Markdown
Contributor

I see you updated files related to core. Please run make gocs in the root directory to add a changeset as well as in the text include at least one of the following tags:

  • #added For any new functionality added.
  • #breaking_change For any functionality that requires manual action for the node to boot.
  • #bugfix For bug fixes.
  • #changed For any change to the existing functionality.
  • #db_update For any feature that introduces updates to database schema.
  • #deprecation_notice For any upcoming deprecation functionality.
  • #internal For changesets that need to be excluded from the final changelog.
  • #nops For any feature that is NOP facing and needs to be in the official Release Notes for the release.
  • #removed For any functionality/config that is removed.
  • #updated For any functionality that is updated.
  • #wip For any change that is not ready yet and external communication about it should be held off till it is feature complete.

@trunk-io

trunk-io Bot commented Jun 10, 2026

Copy link
Copy Markdown

Static BadgeStatic BadgeStatic BadgeStatic Badge

Failed Test Failure Summary Logs
TestScripts/admin/profile/test-seconds The test failed because the server could not be connected to, causing commands that depend on it to fail. Logs ↗︎
TestScripts/admin/profile The test failed without providing specific error details, indicating an unspecified failure in the test script. Logs ↗︎

View Full Report ↗︎Docs

@prashantkumar1982 prashantkumar1982 marked this pull request as ready for review June 10, 2026 21:23
@prashantkumar1982 prashantkumar1982 requested review from a team as code owners June 10, 2026 21:23

func (a *baseAggregator) validateUsingSignatures(don capabilities.DON, nodes []capabilities.Node, resp *jsonrpc.Response[json.RawMessage]) (*jsonrpc.Response[json.RawMessage], error) {
func (a *baseAggregator) recordSignedPayloadRequestIDMismatch(ctx context.Context) {
if a.metrics == nil {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AI smell :( Can we remove this? metrics should not be nil in production, I think we should try to enforce that invariant

rep, err := r.generateJSONReport(o.Id, o.RequestType, o.GetCreateSecretsResponse())
createResp := proto.Clone(o.GetCreateSecretsResponse()).(*vaultcommon.CreateSecretsResponse)
createResp.RequestId = o.Id
rep, err := r.generateJSONReport(o.Id, o.RequestType, createResp)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prashantkumar1982 What's the plan for rolling this out? Don't we need an atomic feature flag here otherwise nodes will generate different outputs?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these are just direct Gateway -> Vault DON calls, they don't go through don2don.
And since these are ocr signed, the gateway doesn't need quorum. Just any signed valid response is ok.
So I figured no need of flags here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we decided to add gate

}

func sendVaultSignedOCRRequestToGateway(t *testing.T, gatewayURL string, jsonRequest jsonrpc.Request[json.RawMessage]) jsonrpc.Response[vaulttypes.SignedOCRResponse] {
func signedPayloadRequestIDFromPayload(t *testing.T, method string, payload json.RawMessage) string {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty similar to the function introduced in aggregator

Is it worth having function to marshal/unmarshal messages based on the method name as a utility? This switch pattern comes back a lot in relation to unmarshaling

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved it to a util function.

I didn't yet do the marshal/unmarshal, because it will affect much more files, better to do separately.
Plus not sure how the refactor will look like.

@prashantkumar1982 prashantkumar1982 requested review from a team as code owners June 11, 2026 18:17
@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

CORA - Pending Reviewers

Codeowners Entry Overall Num Files Owners
* 💬 7 @smartcontractkit/foundations, @smartcontractkit/core
/core/capabilities/ 💬 2 @smartcontractkit/keystone, @smartcontractkit/capabilities-team
/core/services/ocr*/ 💬 3 @smartcontractkit/foundations, @smartcontractkit/core
go.mod 💬 6 @smartcontractkit/core, @smartcontractkit/foundations
go.sum 💬 6 @smartcontractkit/core, @smartcontractkit/foundations
integration-tests/go.mod 💬 1 @smartcontractkit/core, @smartcontractkit/devex-tooling, @smartcontractkit/foundations
integration-tests/go.sum 💬 1 @smartcontractkit/core, @smartcontractkit/devex-tooling, @smartcontractkit/foundations

Legend: ✅ Approved | ❌ Changes Requested | 💬 Commented | 🚫 Dismissed | ⏳ Pending | ❓ Unknown

For more details, see the full review summary.

russell-stern
russell-stern previously approved these changes Jun 11, 2026
… into fix/vault-signature-payload-request-id-validation
@cl-sonarqube-production

Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants