refactor(driver): trim compute capability response#1402

Open
elezar wants to merge 1 commit into main from cleanup-driver-capabilities-en

@elezar elezar commented May 15, 2026

Summary

Remove unused GPU availability fields from the internal compute driver capability response. GPU admission remains driver-local and is still validated when a sandbox create request asks for GPU resources.

This intentionally breaks wire compatibility for the internal compute-driver capability message: the fields are deleted outright, and neither their field numbers nor their names are reserved.
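To make the compatibility break concrete: a protobuf wire key carries only the field number and wire type, never the field name. If a later revision reuses an unreserved number, bytes written by an older driver would be decoded as the new field. The sketch below hand-encodes what an older driver might have sent and decodes it by number only. It assumes the removed fields were `supports_gpu = 4` and `gpu_count = 5` (inferred from the `reserved 4, 5;` lines in the review thread, not confirmed by the proto itself), and the helper names are illustrative.

```rust
/// Append a protobuf base-128 varint to `out`.
fn write_varint(out: &mut Vec<u8>, mut value: u64) {
    while value >= 0x80 {
        out.push((value as u8 & 0x7f) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

/// Decode a varint from the start of `buf`.
/// Returns the value and the number of bytes consumed.
fn read_varint(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in buf.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

/// Encode one varint-typed field: key = (field_number << 3) | wire type 0.
fn encode_varint_field(field_number: u32, value: u64) -> Vec<u8> {
    let mut out = Vec::new();
    write_varint(&mut out, u64::from(field_number) << 3);
    write_varint(&mut out, value);
    out
}

/// Collect (field_number, value) pairs for varint fields.
/// Only wire type 0 is handled; anything else stops parsing.
fn decode_varint_fields(mut buf: &[u8]) -> Vec<(u32, u64)> {
    let mut fields = Vec::new();
    while !buf.is_empty() {
        let Some((key, n)) = read_varint(buf) else { break };
        buf = &buf[n..];
        if key & 0x7 != 0 {
            break;
        }
        let Some((value, n)) = read_varint(buf) else { break };
        buf = &buf[n..];
        fields.push(((key >> 3) as u32, value));
    }
    fields
}

/// Assumed payload from an older driver: supports_gpu = true, gpu_count = 2.
fn old_driver_payload() -> Vec<u8> {
    let mut bytes = encode_varint_field(4, 1);
    bytes.extend(encode_varint_field(5, 2));
    bytes
}
```

A receiver built from the trimmed proto treats fields 4 and 5 as unknown and skips them, which is harmless. The risk only materializes if a future field is assigned number 4 or 5, because the decoder above has no way to tell the old meaning from the new one.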

Related Issue

None.

Changes

  • Removed the compute-driver capability proto fields for GPU support and GPU count without reserving their tags or names.
  • Stopped populating GPU capability fields from Docker, Kubernetes, Podman, VM, and test drivers.
  • Documented that the capability RPC reports driver identity, version, and default image only.
  • Addressed clippy fallout from the now-sync Kubernetes capability path.
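The shape these changes leave behind can be sketched as follows. This is not the code from the PR: `DriverCapabilities`, `CreateSandboxRequest`, `AdmissionError`, `ExampleDriver`, and the image name are hypothetical stand-ins, chosen only to show a capability call that reports identity, version, and default image, with the GPU check living in the driver's create path instead.

```rust
/// Hypothetical trimmed capability response: identity, version, default image.
#[derive(Debug, Clone, PartialEq, Eq)]
struct DriverCapabilities {
    driver_name: String,
    driver_version: String,
    default_image: String,
}

/// Hypothetical create request; a `gpu_count` of zero means no GPU was requested.
#[derive(Debug, Clone)]
struct CreateSandboxRequest {
    image: Option<String>,
    gpu_count: u32,
}

#[derive(Debug, PartialEq, Eq)]
enum AdmissionError {
    GpuUnavailable { requested: u32, available: u32 },
}

/// Stand-in for a concrete driver (illustrative only).
struct ExampleDriver {
    local_gpus: u32,
}

impl ExampleDriver {
    /// Synchronous: with no GPU probing left, nothing here needs to await.
    fn capabilities(&self) -> DriverCapabilities {
        DriverCapabilities {
            driver_name: "example".to_string(),
            driver_version: "0.0.0".to_string(),
            default_image: "example/sandbox:latest".to_string(),
        }
    }

    /// GPU admission is decided here, at create time, from driver-local state.
    /// Returns the image the sandbox would run on success.
    fn admit(&self, req: &CreateSandboxRequest) -> Result<String, AdmissionError> {
        if req.gpu_count > self.local_gpus {
            return Err(AdmissionError::GpuUnavailable {
                requested: req.gpu_count,
                available: self.local_gpus,
            });
        }
        Ok(req
            .image
            .clone()
            .unwrap_or_else(|| self.capabilities().default_image))
    }
}
```

Under this arrangement a caller learns about missing GPU capacity from the create call's error instead of from an advertised count, so nothing in the capability message can go stale.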

Testing

  • cargo check -p openshell-core -p openshell-driver-docker -p openshell-driver-kubernetes -p openshell-driver-podman -p openshell-driver-vm -p openshell-server
  • cargo test -p openshell-server -p openshell-driver-docker -p openshell-driver-kubernetes -p openshell-driver-podman -p openshell-driver-vm --no-run
  • mise run pre-commit passes
  • Unit tests not applicable; no behavior change beyond removing unused capability fields
  • E2E tests not applicable; no sandbox runtime behavior change

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)

@elezar elezar requested review from a team, derekwaynecarr, maxamillion and mrunalp as code owners May 15, 2026 15:02
Comment thread proto/compute_driver.proto Outdated
Comment on lines +48 to +49
reserved 4, 5;
reserved "supports_gpu", "gpu_count";
Member Author

@drew would a breaking API change (simply removing the fields) be better in this case? I think we can add additional information as part of the resource requirement changes.

Collaborator

I'm not drew, but I think we are still in a state where we can accept some breaking changes while we are getting to the right shape of the API contract. This feels okay to me to do a breaking change here.

Member Author

That was my understanding too. Let me update this to remove the fields outright as a breaking change. It should make some of the changes in the RFC proposed in #1360 a little cleaner to implement.

Member Author

Updated with the breaking proto changes.

@elezar elezar mentioned this pull request May 20, 2026
5 tasks
Signed-off-by: Evan Lezar <elezar@nvidia.com>
@elezar elezar force-pushed the cleanup-driver-capabilities-en branch from f76f3f4 to cd6c738 Compare May 20, 2026 14:20