[SPARK-55964] Cache coherence: clear function registry on DROP DATABASE #54781

Closed
srielau wants to merge 6 commits into apache:master from srielau:SPARK-55982-cache-coherence

@srielau srielau commented Mar 13, 2026

  • Add dropFunctionsInDatabase(db) to FunctionRegistryBase and implementations
  • SessionCatalog.dropDatabase clears scalar and table function registry cache for the dropped database so resolution does not see stale entries
  • Add SessionCatalogSuite tests for cache coherence

What changes were proposed in this pull request?

When a schema is dropped within the session, we now remove its functions from the session function registry.
design-cache-coherence.md

Why are the changes needed?

Without this change, the session could keep resolving functions from a schema it had dropped.

Does this PR introduce any user-facing change?

Yes. It fixes an observable bug: after a schema is dropped, the session no longer resolves functions that belonged to it.

How was this patch tested?

Added new test cases to SessionCatalogSuite.

Was this patch authored or co-authored using generative AI tooling?

Claude Opus 4.6


@cloud-fan cloud-fan left a comment

Review Summary

Prior state and problem: When a database is dropped via SessionCatalog.dropDatabase, the session-level function registries (functionRegistry and tableFunctionRegistry) can retain stale entries for functions that belonged to the dropped database. Subsequent function resolution could then resolve these phantom functions.

Design approach: Add a dropFunctionsInDatabase method to FunctionRegistryBase, with a scan-and-remove implementation in SimpleFunctionRegistryBase and a no-op in EmptyFunctionRegistryBase. Call both registries from SessionCatalog.dropDatabase.
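The design described above can be sketched as follows. This is a minimal illustration, not the code from the PR: `FuncId`, `RegistrySketch`, `SimpleRegistrySketch`, and `EmptyRegistrySketch` are hypothetical stand-ins for Spark's `FunctionIdentifier`, `FunctionRegistryBase`, `SimpleFunctionRegistryBase`, and `EmptyFunctionRegistryBase`, and the real classes carry more state and stricter name normalization.

```scala
import scala.collection.mutable

// Hypothetical stand-in for Spark's FunctionIdentifier; not the real class.
final case class FuncId(name: String, database: Option[String] = None)

// Simplified stand-in for FunctionRegistryBase, reduced to what the sketch needs.
trait RegistrySketch[T] {
  def register(id: FuncId, fn: T): Unit
  def lookup(id: FuncId): Option[T]
  def dropFunctionsInDatabase(db: String): Unit
}

// Scan-and-remove: find every key qualified with `db`, then delete those keys.
class SimpleRegistrySketch[T] extends RegistrySketch[T] {
  private val functions = mutable.HashMap.empty[FuncId, T]

  override def register(id: FuncId, fn: T): Unit = synchronized {
    functions.put(id, fn)
  }

  override def lookup(id: FuncId): Option[T] = synchronized {
    functions.get(id)
  }

  override def dropFunctionsInDatabase(db: String): Unit = synchronized {
    // Collect keys first so the map is not mutated while it is being traversed.
    val stale = functions.keys.filter(_.database.contains(db)).toList
    stale.foreach(key => functions.remove(key))
  }
}

// An empty registry holds nothing, so dropping is a no-op.
class EmptyRegistrySketch[T] extends RegistrySketch[T] {
  override def register(id: FuncId, fn: T): Unit =
    throw new UnsupportedOperationException("empty registry")
  override def lookup(id: FuncId): Option[T] = None
  override def dropFunctionsInDatabase(db: String): Unit = ()
}

object RegistrySketchDemo {
  def main(args: Array[String]): Unit = {
    val registry = new SimpleRegistrySketch[String]
    registry.register(FuncId("f", Some("db1")), "builder for db1.f")
    registry.register(FuncId("g", Some("db2")), "builder for db2.g")
    registry.register(FuncId("upper"), "built-in")

    registry.dropFunctionsInDatabase("db1")

    assert(registry.lookup(FuncId("f", Some("db1"))).isEmpty)   // stale entry gone
    assert(registry.lookup(FuncId("g", Some("db2"))).isDefined) // other database untouched
    assert(registry.lookup(FuncId("upper")).isDefined)          // unqualified entry untouched
  }
}
```

Only entries qualified with the dropped database are removed; unqualified entries such as built-ins stay in place.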

Implementation: Changes span FunctionRegistry.scala (trait + two implementations) and SessionCatalog.scala (5-line addition in dropDatabase). Two new tests in SessionCatalogSuite verify the clearing for scalar and table function registries.
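To make the shape of the change and its tests concrete, here is a rough, self-contained sketch. `CatalogSketch` is a hypothetical and heavily simplified model: the field names echo `functionRegistry` and `tableFunctionRegistry` from the summary, but the plain maps and string values are assumptions made for illustration and do not reflect how SessionCatalog or SessionCatalogSuite are actually written.

```scala
import scala.collection.mutable

// Hypothetical, heavily simplified catalog; names mirror the PR but this is not Spark code.
class CatalogSketch {
  // Key: (database, function name). Value: a placeholder for the function builder.
  val functionRegistry = mutable.Map.empty[(String, String), String]
  val tableFunctionRegistry = mutable.Map.empty[(String, String), String]
  private val databases = mutable.Set.empty[String]

  def createDatabase(db: String): Unit = {
    databases += db
  }

  def dropDatabase(db: String): Unit = {
    databases -= db
    // Clear both caches; skipping either one would leave stale entries resolvable.
    Seq(functionRegistry, tableFunctionRegistry).foreach { registry =>
      val stale = registry.keys.filter { case (database, _) => database == db }.toList
      stale.foreach(key => registry.remove(key))
    }
  }
}

object CatalogSketchDemo {
  def main(args: Array[String]): Unit = {
    val catalog = new CatalogSketch
    catalog.createDatabase("db1")
    catalog.functionRegistry.put(("db1", "scalar_fn"), "scalar builder")
    catalog.tableFunctionRegistry.put(("db1", "table_fn"), "table builder")
    catalog.functionRegistry.put(("db2", "other_fn"), "scalar builder")

    catalog.dropDatabase("db1")

    // One assertion per registry, in the spirit of the two new tests.
    assert(!catalog.functionRegistry.contains(("db1", "scalar_fn")))
    assert(!catalog.tableFunctionRegistry.contains(("db1", "table_fn")))
    // Entries for other databases are left alone.
    assert(catalog.functionRegistry.contains(("db2", "other_fn")))
  }
}
```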

srielau and others added 2 commits March 16, 2026 18:02
…ysis/FunctionRegistry.scala

Co-authored-by: Wenchen Fan <cloud0fan@gmail.com>
@cloud-fan

The test failure seems to be real

invalidateCachedTable(QualifiedTableName(SESSION_CATALOG_NAME, dbName, t.table))
}
// Clear cached functions in this database so the cache stays coherent on drop.
// normalizeFuncName stores entries with catalog=None, so the filter must match that.

This is a bug I've fixed in srielau#3; we can update the code here later.
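The code comment under review says that `normalizeFuncName` stores entries with catalog=None and that the filter must match that. A small sketch of why this matters, assuming a hypothetical three-part identifier `QualifiedFuncId` and an assumed normalization that drops the catalog (the real `FunctionIdentifier` and `normalizeFuncName` may differ, and the catalog name used here is only illustrative):

```scala
import java.util.Locale

// Hypothetical three-part identifier; Spark's real FunctionIdentifier differs in detail.
final case class QualifiedFuncId(
    funcName: String,
    database: Option[String],
    catalog: Option[String])

object FilterSketch {
  // Assumed normalization, following the code comment: stored keys carry catalog = None.
  def normalize(id: QualifiedFuncId): QualifiedFuncId =
    id.copy(funcName = id.funcName.toLowerCase(Locale.ROOT), catalog = None)

  def main(args: Array[String]): Unit = {
    val stored = Set(
      normalize(QualifiedFuncId("F", Some("db1"), Some("spark_catalog"))),
      normalize(QualifiedFuncId("g", Some("db2"), Some("spark_catalog"))))

    // A filter that also demands the catalog name never matches a stored key.
    val tooStrict = stored.filter { id =>
      id.catalog.contains("spark_catalog") && id.database.contains("db1")
    }
    assert(tooStrict.isEmpty)

    // Matching the stored shape (catalog = None) finds the entries to remove.
    val matching = stored.filter(id => id.catalog.isEmpty && id.database.contains("db1"))
    assert(matching == Set(QualifiedFuncId("f", Some("db1"), None)))
  }
}
```

Under these assumptions, a filter written against the fully qualified form would silently remove nothing, which is the kind of mismatch the comment warns about.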

@cloud-fan

SQL tests all passed, thanks, merging to master!

@cloud-fan cloud-fan closed this in 6e05916 Mar 17, 2026