testserver: 404 on permissions GET when V2 parent is gone #5186

Merged
janniklasrose merged 1 commit into main from janniklasrose/vsendpoint-integ
May 6, 2026

@janniklasrose janniklasrose commented May 6, 2026

Changes

testserver.GetPermissions now returns 404 when the parent object backing
a permissions request is gone, defaulting to V2 permissions API behavior.
The check is wired up for vector-search-endpoints only; other resource
types fall through to the existing "empty ACL on miss" branch.
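The branching described above can be sketched roughly as follows. This is a minimal illustration, not the testserver's actual code: the `fakeServer` type, its fields, the `getPermissions` helper, and the two-value signature of `permissionsParentExists` are all assumptions made for the example. Only the function name `permissionsParentExists` and the resource type `vector-search-endpoints` come from this PR.

```go
package main

import (
	"fmt"
	"net/http"
)

// fakeServer is a hypothetical stand-in for the testserver's in-memory state.
type fakeServer struct {
	vectorSearchEndpoints map[string]bool     // endpoint id -> exists
	permissions           map[string][]string // "type/id" -> ACL entries
}

// permissionsParentExists reports whether the parent object is present.
// checked is false when no parent lookup is wired up for the type
// (assumed signature; the real helper may differ).
func (s *fakeServer) permissionsParentExists(objectType, objectID string) (exists, checked bool) {
	switch objectType {
	case "vector-search-endpoints":
		return s.vectorSearchEndpoints[objectID], true
	default:
		return false, false
	}
}

// getPermissions returns an HTTP status and the ACL for the object.
func (s *fakeServer) getPermissions(objectType, objectID string) (int, []string) {
	// V2-style behavior: a missing parent means 404.
	if exists, checked := s.permissionsParentExists(objectType, objectID); checked && !exists {
		return http.StatusNotFound, nil
	}
	// Existing fall-through: empty ACL on miss.
	acl := s.permissions[objectType+"/"+objectID]
	if acl == nil {
		acl = []string{}
	}
	return http.StatusOK, acl
}

func main() {
	s := &fakeServer{
		vectorSearchEndpoints: map[string]bool{},
		permissions:           map[string][]string{},
	}
	status, _ := s.getPermissions("vector-search-endpoints", "my_endpoint")
	fmt.Println(status) // 404
	status, acl := s.getPermissions("jobs", "123")
	fmt.Println(status, len(acl)) // 200 0
}
```

In this sketch, a resource type with no lookup wired up keeps the old behavior simply because `checked` is false.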

The acceptance test
bundle/resources/vector_search_endpoints/drift/recreated_same_name
now asserts create (instead of update) for the permissions resource
when the parent endpoint is recreated remotely with a different UUID, and
the recorded output.txt is regenerated to match.

Why

The integration variant of the test was failing with:

recreate vector_search_endpoints.my_endpoint
-update vector_search_endpoints.my_endpoint.permissions
+create vector_search_endpoints.my_endpoint.permissions
-Plan: 1 to add, 1 to change, 1 to delete, 0 unchanged
+Plan: 2 to add, 0 to change, 1 to delete, 0 unchanged

I confirmed the cloud behavior end-to-end against dogfood-aws:

| Resource | After delete: GET permissions returns |
|---|---|
| Jobs | 200 with full ACL data (incl. IS_OWNER) |
| Pipelines | 200 with full ACL data |
| Vector search endpoints | 404 "not found" |
| Experiments | 404 "does not exist" |

There is a known inconsistency in how the cloud permissions API handles
deletion across resource types: V2 resources (vector search, experiments)
cascade-delete ACLs immediately and return 404 on subsequent GETs, while
V1 resources (jobs, pipelines) retain ACL data after the parent is deleted
via async/soft delete. The testserver previously matched neither
behavior — it returned an empty ACL for any unknown object id. The new
default is V2; V1 resources keep their existing fall-through.

When more V2 resources gain coverage that exercises this path, they
should add a case to permissionsParentExists.

Tests

  • go test ./acceptance -run 'TestAccept/bundle/resources/vector_search_endpoints' — green.
  • go test ./acceptance -run 'TestAccept/bundle/resources/permissions' — green.
  • go test ./libs/testserver/... — green.
  • ./task lint and ./task fmt — clean.

This PR was written by Claude Code.

The recreated_same_name drift test asserted "update permissions" because
the testserver returned an empty ACL for any object_id with no entry. The
real cloud returns 404 for V2 permissions resources (verified for
vector_search_endpoints and experiments), so the planner sees
`remoteState == nil` and emits "create" instead. V1 resources (jobs,
pipelines) retain ACL data after delete via async/soft delete, which the
existing "empty ACL on miss" branch approximates well enough.
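To illustrate why a 404 changes the planned action, here is a simplified sketch of the decision. It assumes a hypothetical `planAction` function and `remoteState` type; the real planner is more involved, and the `skip` action name is invented for the example. Only the `create` and `update` actions and the `remoteState == nil` condition come from this PR.

```go
package main

import "fmt"

// remoteState is a hypothetical snapshot of what the server returned.
// A nil pointer models a 404 on the permissions GET.
type remoteState struct {
	acl []string
}

// sameACL compares two ACL lists element by element.
func sameACL(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// planAction picks the action for a permissions resource.
func planAction(remote *remoteState, desired []string) string {
	if remote == nil {
		return "create" // nothing exists remotely (404)
	}
	if sameACL(remote.acl, desired) {
		return "skip" // hypothetical name for "no change"
	}
	return "update" // something exists but differs, e.g. an empty ACL
}

func main() {
	desired := []string{"users:CAN_USE"}
	// Old testserver behavior: empty ACL on miss.
	fmt.Println(planAction(&remoteState{acl: []string{}}, desired)) // update
	// New behavior: 404, modeled as nil remote state.
	fmt.Println(planAction(nil, desired)) // create
}
```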

Make GetPermissions check parent existence by default; for now only
vector-search-endpoints has a parent lookup wired up. Update the test
assertion and output to expect "create".

Co-authored-by: Isaac
@janniklasrose janniklasrose merged commit 2156629 into main May 6, 2026
24 of 25 checks passed
@janniklasrose janniklasrose deleted the janniklasrose/vsendpoint-integ branch May 6, 2026 10:48
janniklasrose added a commit that referenced this pull request May 6, 2026
## Summary

- Reverts #5127 (`Persist endpoint UUID for vector_search_endpoints
drift detection`) and the follow-up changelog entry from #5192.
- The badness #5127 was meant to fix — bundle silently rebinding
permissions to a different backing endpoint after an out-of-band
recreate — was actually addressed by the testserver fix in #5186
(`testserver: 404 on permissions GET when V2 parent is gone`). With the
testserver matching real V2 cloud behavior, bundle correctly observes
that the new endpoint has no permissions and creates them, with no
permanent drift afterwards. UUID persistence in state is no longer
necessary.
- Reworks the `drift/recreated_same_name` acceptance test: keeps
endpoint permissions in `databricks.yml`, drops the obsolete "recreate
detected" assertion, and adds a post-deploy `bundle plan` to confirm
there is no permanent drift.

## Test plan

- [x] `./task build` clean.
- [x] `go test ./acceptance -run
'TestAccept/bundle/resources/vector_search_endpoints/drift'` — all green
(terraform + direct).
- [x] `go test ./bundle/direct/dresources/...` — green.
- [x] `./task lint-q` — clean.
- [x] Verified post-deploy plan shows `Plan: 0 to add, 0 to change, 0 to
delete, 2 unchanged` after an out-of-band endpoint recreate, so
permissions don't end up in permanent drift even without UUID-based
recreate detection.
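
The "no permanent drift" check in the last item comes down to reading the plan summary line. A minimal sketch of such a check follows, assuming the summary format shown above; `planSummary`, `parsePlanSummary`, and `hasDrift` are hypothetical names, not part of the CLI or the acceptance test harness.

```go
package main

import "fmt"

// planSummary holds the four counters from a "Plan: ..." line.
type planSummary struct {
	add, change, del, unchanged int
}

// parsePlanSummary extracts the counters from a summary line such as
// "Plan: 0 to add, 0 to change, 0 to delete, 2 unchanged".
func parsePlanSummary(line string) (planSummary, error) {
	var p planSummary
	_, err := fmt.Sscanf(line,
		"Plan: %d to add, %d to change, %d to delete, %d unchanged",
		&p.add, &p.change, &p.del, &p.unchanged)
	return p, err
}

// hasDrift reports whether the plan would modify anything.
func (p planSummary) hasDrift() bool {
	return p.add+p.change+p.del > 0
}

func main() {
	p, err := parsePlanSummary("Plan: 0 to add, 0 to change, 0 to delete, 2 unchanged")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(p.hasDrift()) // false
}
```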

This pull request and its description were written by Isaac.