Skip to content

WIP - test(search): entity search-index edge-case canary suite#28865

Open
pmbrull wants to merge 19 commits into
mainfrom
pmbrull/entity-index-edge-case-tests
Open

WIP - test(search): entity search-index edge-case canary suite#28865
pmbrull wants to merge 19 commits into
mainfrom
pmbrull/entity-index-edge-case-tests

Conversation

@pmbrull

@pmbrull pmbrull commented Jun 9, 2026

Copy link
Copy Markdown
Member

Summary

A registry-driven integration-test suite that asserts every schema-valid entity shape can be indexed and queried in Elasticsearch/OpenSearch. Any shape the JSON schema allows but the index refuses or silently degrades fails red — the suite is a living punch-list of indexing gaps. Test-only; no production code.

A gap is tolerated only by an explicit opt-in in AcceptedLimits (per entityType × dimension × rung + the tolerated Outcome + a reason). Not listed ⇒ must round-trip (OK) or it's red.

How it works

  • One path (ShapeCanary): build an in-memory entity → real buildSearchIndexDoc() → PUT to a per-case shadow index cloned from the entity's real mapping (dropped after; shared indices never touched) → query back → return a ShapeResult.
  • Outcomes are deliberately coarse and honest:
    • OK — indexed + retrievable + (probed) searchable.
    • DEGRADED_UNSEARCHABLE — indexed and in _source, but the value was dropped from the term index (e.g. a keyword field's ignore_above) → not findable by exact match.
    • REJECTED — the index refused the document (the PUT failed, nothing indexed). The cause is not classified by us; the raw engine error (size / total_fields / nested_objects / depth / parse / anything else) is carried verbatim in ShapeResult.detail() and printed in the test failure.
  • Data-driven: 6 shared ShapeMutations × 20 EntityShapeProfiles → EntityShapeRegistry. Add an entity = one profile file + one registry line. Driver: EntityShapeIT; EntityShapeBaselineIT is the zero-stress control proving a minimal entity round-trips.

⚠️ This PR is intentionally RED

EntityShapeIT fails on every schema-valid shape that doesn't round-trip. The gaps cluster in three dimensions (the per-entity failure prints the raw engine error):

Gap (dimension / rung) Outcome Real bug?
customProperties.breadth / 2k REJECTED (total_fields) YES — affects entities that declare an extension schema property but whose *_index_mapping.json lacks "extension":{"type":"flattened"} (e.g. glossaryTerm, metric, chart, glossary). 2k distinct keys become 2k dynamic fields and exceed total_fields.limit (1000). The fix is the missing mapping block.
owners.count / 12k REJECTED (nested_objects) Inherent ES nested_objects.limit (10000) — accept, or raise the limit. Universal across entities.
keyword.overIgnoreAbove / 300chars DEGRADED_UNSEARCHABLE Intentional ignore_above:256 on displayName.keyword — accept. Universal across entities.

Everything else round-trips fine on self-hosted ES (16MB description, 100k columns, depth 25, 50k tags/charts/fields/tasks/mlFeatures all OK — 100MB content limit + flattening).

Coverage

20 indexed entity types: table, container, dashboard, topic, glossaryTerm, query, storedProcedure, metric, dashboardDataModel, pipeline, mlmodel, searchIndex, apiEndpoint, database, databaseSchema, chart, dataProduct, domain, glossary, apiCollection. Entity-specific unbounded collections (columns / tasks / mlFeatures / fields / schemaFields) get count ladders; the rest get the shared-dimension sweep.

Test Plan

  • EntityShapeIT is red by design — gap cases fail until accepted (AcceptedLimits) or fixed (that IS the deliverable)
  • EntityShapeBaselineIT (minimal entity round-trips) green
  • No production code changed (all under src/test/java)
  • To reach green: opt-in the inherent ES limits in AcceptedLimits, and add the missing flattened extension mapping for the affected entities

🤖 Generated with Claude Code

pmbrull and others added 9 commits June 9, 2026 12:12
…ShapeClassifier

Adds the single-path ShapeCanary helper (build real search-index doc from an
in-memory POJO, PUT to the live ES/OS test container, query back, classify the
outcome) plus the Outcome vocabulary, FieldProbe, and the ShapeClassifier that
maps engine error messages to outcomes. EntityShapeSpikeIT de-risks the path: a
bare in-memory Table builds, PUTs, and retrieves (Outcome.OK). ShapeClassifier
is unit-tested for the size/fields/nested/depth/parse/other buckets.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Core types: Rung, ShapeContext, ShapeMutation, EntityShapeProfile, PlannedCase,
and the EntityCases builder for entity-specific ladders. Six shared mutations
(description.size, tags.count, owners.count, followers.count,
customProperties.breadth, keyword.overIgnoreAbove) each declare a ladder plus
predicted per-engine outcomes. All six drive setters confirmed present on
EntityInterface (setDescription/setTags/setOwners/setFollowers/setExtension/
setDisplayName).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
EntityShapeRegistry flattens profiles x shared mutations into PlannedCases (the
profiles list is intentionally empty for now; entity profiles land in a later
unit). EntityShapeSweepIT is the @ParameterizedTest driver (assert mode by
default, -Dshape.record=true logs observed outcomes for discovery).
EntityShapeLineMapIT + LineMapWriter render the registry's predictions to a
committed line-map. Sweep/line-map are not run yet (zero cases until profiles
exist).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
First entity profile for the search-indexing edge-case canary. Adds
TableShapeProfile (minimal id/name/fqn/one-column Table + columns.count
and column.depth ladders) and registers it in EntityShapeRegistry, so the
6 shared mutations run against a real entity for the first time.

Discovery run (Elasticsearch 9.3) reconciled three predictions to the
observed line — each is a finding about where the real boundary sits, not
a test fix:
- description.size 16MB: predicted REJECT_SIZE -> observed OK (16MB body
  indexes fine; no 413/size cap hit at this rung).
- columns.count 100k: predicted REJECT_SIZE -> observed OK (100k flat
  columns neither trip total_fields nor a doc-size limit).
- column.depth 25: predicted REJECT_DEPTH -> observed OK (nested STRUCT
  columns do not trip mapping depth [20]; the table column field is not
  mapped in a way that enforces the depth limit).

Already-correct predictions confirmed: owners.count 12k -> REJECT_NESTED
(nested_objects.limit 10000 enforced), keyword.overIgnoreAbove 300chars ->
DEGRADED_UNSEARCHABLE (displayName.keyword carries ignore_above:256), and
all remaining 15 cases OK. Assert run green (20/20). Line-map written to
.context/entity-index-line-map.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…yTerm/query/storedProcedure) + line-map

Adds entity shape profiles for Container (dataModelColumns), Dashboard
(charts), Topic (schemaFields), GlossaryTerm (synonyms), Query (queryText)
and StoredProcedure (code), registered in EntityShapeRegistry alongside the
existing Table profile. Predictions reconciled to the discovery run on
Elasticsearch 9.3.

Findings reconciled to observed:
- container/dataModelColumns.count/50k: REJECT_SIZE -> OK
- topic/schemaFields.count/50k: REJECT_SIZE -> OK
- query/queryText.size/16MB: REJECT_SIZE -> OK
- storedProcedure/code.size/16MB: REJECT_SIZE -> OK
  (ES content limit is 100MB and OM flattens collections, so the entity
   size/field rungs do not trip at these magnitudes.)

Cross-entity divergence on the shared customProperties.breadth dimension:
glossaryTerm/2k -> REJECT_FIELDS while all 6 other entities -> OK. Root cause:
the glossary_term index mapping has no explicit "extension" field, so distinct
custom-property keys are dynamically mapped and exceed the total_fields limit;
table et al. map extension as "flattened". Scoped CustomPropertiesBreadthMutation
to exclude GlossaryTerm and codified the real per-entity outcome as an explicit
glossaryTerm ladder (100 -> OK, 2k -> REJECT_FIELDS).

Universal shared findings hold across all 7 entities: owners.count/12k ->
REJECT_NESTED (nested_objects.limit 10000), keyword.overIgnoreAbove/300chars ->
DEGRADED_UNSEARCHABLE (displayName.keyword ignore_above:256).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… + re-discovery

Each canary case now provisions a fresh index cloned from the entity's real
mapping (ShadowIndex) and drops it afterward, instead of PUTting into the
shared real <entity>_search_index. The shared index accumulated dynamic
mapping fields from customProperties.breadth that doc-delete cleanup could not
remove, polluting cross-IT state and making outcomes order-dependent.

The ShapeMutation.expected signature gains an entityType param so per-entity
outcomes are expressible directly; the GlossaryTerm appliesTo carve-out and
the duplicated customProperties ladder in GlossaryTermShapeProfile are removed.

Re-discovery against isolated indices is deterministic: only glossaryTerm
rejects customProperties.breadth/2k (REJECT_FIELDS); query (and every other
entity) indexes OK at 2k. The prior query/glossaryTerm divergence was a
shared-index pollution artifact.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…olish

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 9, 2026 12:18

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.

@github-actions

github-actions Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

❌ PR checklist incomplete

This PR cannot be merged until the following are addressed on its linked issue:

  • No GitHub issue is linked. Link an issue in the Development section of the PR (or add Fixes #12345 to the description). For a same-org cross-repo issue, add Fixes open-metadata/<repo>#123 to the description.

The fields live on the linked issue in the Shipping project (open the issue → right sidebar → Projects). After you set them, re-run this check (or push a commit) — issue/project changes do not re-trigger it automatically.

Maintainers can bypass this check by adding the skip-pr-checks label.

@github-actions github-actions Bot added Ingestion safe to test Add this label to run secure Github workflows on PRs labels Jun 9, 2026
@pmbrull pmbrull changed the title test(search): entity search-index edge-case canary suite WIP - test(search): entity search-index edge-case canary suite Jun 9, 2026
Copilot AI review requested due to automatic review settings June 9, 2026 13:03

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.

@github-actions

github-actions Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

🟡 Playwright Results — all passed (12 flaky)

✅ 4272 passed · ❌ 0 failed · 🟡 12 flaky · ⏭️ 88 skipped

Shard Passed Failed Flaky Skipped
🟡 Shard 1 300 0 1 4
🟡 Shard 2 802 0 4 9
🟡 Shard 3 807 0 1 8
🟡 Shard 4 842 0 1 12
🟡 Shard 5 720 0 1 47
🟡 Shard 6 801 0 4 8
🟡 12 flaky test(s) (passed on retry)
  • Pages/UserCreationWithPersona.spec.ts › Create user with persona and verify on profile (shard 1, 1 retry)
  • Features/AdvancedSearch.spec.ts › Verify Rule functionality for field Database Schema with OR operator (shard 2, 1 retry)
  • Features/ColumnBulkOperations.spec.ts › should filter by metadata status and verify API param (shard 2, 1 retry)
  • Features/ColumnBulkOperations.spec.ts › should restore filters from URL on page load (shard 2, 1 retry)
  • Features/Glossary/GlossaryWorkflow.spec.ts › should start term as Draft when glossary has reviewers (shard 2, 2 retries)
  • Features/RTL.spec.ts › Verify Following widget functionality (shard 3, 1 retry)
  • Pages/CustomProperties.spec.ts › Sql Query (shard 4, 1 retry)
  • Pages/ExplorePageRightPanel_KnowledgeCenter.spec.ts › Should remove user owner for knowledgeCenter (shard 5, 1 retry)
  • Pages/Glossary.spec.ts › Change glossary term hierarchy using menu options across glossary (shard 6, 1 retry)
  • Pages/Lineage/LineageFilters.spec.ts › Verify lineage schema filter selection (shard 6, 1 retry)
  • Pages/Lineage/PlatformLineage.spec.ts › Verify domain platform view (shard 6, 1 retry)
  • Pages/ODCSImportExport.spec.ts › Multi-object ODCS contract - object selector shows all schema objects (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

…e error

The specific REJECT_SIZE/FIELDS/NESTED/DEPTH/PARSE buckets were inferred by string-matching
the engine error message — fragile and can mis-attribute (a PUT can fail for reasons that don't
match the patterns). Replace them with a single REJECTED ('the index refused the doc') and carry
the raw engine error verbatim in ShapeResult.detail(), surfaced in the test failure message.
Removes ShapeClassifier + ShapeClassifierTest (no more message bucketing).
@github-actions

Copy link
Copy Markdown
Contributor

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

Copilot AI review requested due to automatic review settings June 10, 2026 08:01

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.

REJECTED now captures the full exception cause-chain (not just getMessage(), which can
be terse or hide the root ES reason in getCause()) and LOG.warn()s the throwable with stack.
ERROR_OTHER (PUT returned but get-by-id finds nothing) now includes the index _count — which
disambiguates a silent no-op PUT (count 0) from a doc written elsewhere (count > 0) — plus the
raw get response, and is logged. verify() returns a ShapeResult so DEGRADED/ERROR_OTHER carry
detail too.
… (review)

Addresses PR review: _count reads the near-real-time search view, so without a refresh a
just-written doc could read 0 and the 'nothing written' hint would be wrong. Force a _refresh
first (get-by-id is realtime; _count is not) so the count is ground truth before we infer
no-op-vs-written-elsewhere.
Copilot AI review requested due to automatic review settings June 10, 2026 08:26

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.

@gitar-bot

gitar-bot Bot commented Jun 10, 2026

Copy link
Copy Markdown
Code Review ✅ Approved 4 resolved / 4 findings

Adds a registry-driven integration test suite for entity indexing edge cases, resolving issues with shadow index mapping overrides, innerSource fallbacks, and diagnostic timing. No open issues remain.

✅ 4 resolved
Quality: Shadow index silently drops custom index.mapping.* limits

📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/search/shape/ShadowIndex.java:88-102
buildCreateBody reconstructs the shadow index settings from scratch, copying only max_ngram_diff, analysis, and forcing number_of_replicas:0. It does NOT copy any index.mapping.* overrides (e.g. total_fields.limit, nested_objects.limit, depth.limit) from the source index settings. The entire validity of this canary rests on the shadow index applying the SAME limits as production. Today this is fine because no *_index_mapping.json overrides those limits (defaults 1000/10000 apply), but if a future entity's mapping raises total_fields.limit, the shadow clone would silently fall back to the default and report REJECT_FIELDS where production would index fine — a false regression with no warning. Consider copying the whole settings.index.mapping subtree (or at minimum asserting it is absent) so the clone cannot silently diverge from the real index limits.

Edge Case: innerSource fallback throws NoSuchElementException on empty body

📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/search/shape/ShadowIndex.java:78-86
When the GET /realIndex response does not contain the expected key, innerSource falls back to response.elements().next(). If the response body is an empty object/array (no elements), this throws a bare NoSuchElementException with no context, which is harder to triage than a descriptive failure. In practice a missing index returns HTTP 404 (caught earlier by SearchClient.execute), so this path is rarely hit, but guarding it with hasNext() and throwing a message that includes realIndex would make any future mapping-response shape change diagnosable.

Edge Case: AcceptedLimits has no engine dimension; OpenSearch CI may fail red

📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/search/shape/AcceptedLimits.java:16-30 📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/EntityShapeSweepIT.java:60-73
The recent commit removes the engine-aware expectation model (Function<SearchType, Outcome>) and replaces it with a strict rule: every case must produce Outcome.OK unless listed in AcceptedLimits. However AcceptedLimits.Accepted is keyed only by (entityType, dimension, rung) — there is no SearchType/engine field — and AcceptedLimits.find(...) matches on those three alone.

The nightly workflow runs the integration suite under BOTH engines (-DsearchType=elasticsearch and -DsearchType=opensearch, .github/workflows/java-playwright-nightly.yml). The PR's own summary states that AWS-managed/OpenSearch's 10MB http.max_content_length would reject the >=16MB cases (queryText.size 16MB, code.size 16MB, description size 16MB) that index fine on self-hosted ES. Under the old model these differences were expressible per engine; now they cannot be.

Consequences: (1) Under OpenSearch, the 16MB cases would observe REJECT_SIZE, not be in the (currently empty) AcceptedLimits, and fail() — turning the canary red in CI. (2) If you add those cases to AcceptedLimits to keep OpenSearch green, you simultaneously suppress them on Elasticsearch (where they should be OK), losing the regression signal there.

Suggested fix: add an optional SearchType engine to AcceptedLimits.Accepted (null/Optional = applies to all engines) and thread the active engine into AcceptedLimits.find(...) so acceptances can be scoped per engine, restoring the engine differentiation that was removed. Given this is WIP/test-only the severity is bounded, but it should be resolved before the suite is enabled in the dual-engine nightly.

Edge Case: _count diagnostic can mislead due to non-realtime search

📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/search/shape/ShapeCanary.java:87-95
notRetrievableDetail uses GET /<index>/_count to distinguish "nothing written (silent no-op)" from "written elsewhere". However, _count runs against the search view, which in ES/OpenSearch is only near-real-time and reflects writes after a refresh. A GET-by-id (/_doc/<id>) is realtime (served from the translog), but _count is not. So immediately after a successful PUT, _count can legitimately return 0 even though the doc was written, producing the misleading message "nothing written — silent no-op PUT". Since this is test failure diagnostics, the impact is limited to confusing debugging output, not test correctness. Consider issuing a ?refresh=true on the PUT path (or a _refresh before _count) so the count reflects reality, or soften the wording to note the count may lag without a refresh.

Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Ingestion safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants