WIP - test(search): entity search-index edge-case canary suite #28865
pmbrull wants to merge 19 commits into
…ShapeClassifier Adds the single-path ShapeCanary helper (build real search-index doc from an in-memory POJO, PUT to the live ES/OS test container, query back, classify the outcome) plus the Outcome vocabulary, FieldProbe, and the ShapeClassifier that maps engine error messages to outcomes. EntityShapeSpikeIT de-risks the path: a bare in-memory Table builds, PUTs, and retrieves (Outcome.OK). ShapeClassifier is unit-tested for the size/fields/nested/depth/parse/other buckets. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Core types: Rung, ShapeContext, ShapeMutation, EntityShapeProfile, PlannedCase, and the EntityCases builder for entity-specific ladders. Six shared mutations (description.size, tags.count, owners.count, followers.count, customProperties.breadth, keyword.overIgnoreAbove) each declare a ladder plus predicted per-engine outcomes. All six drive setters confirmed present on EntityInterface (setDescription/setTags/setOwners/setFollowers/setExtension/ setDisplayName). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
EntityShapeRegistry flattens profiles x shared mutations into PlannedCases (the profiles list is intentionally empty for now; entity profiles land in a later unit). EntityShapeSweepIT is the @ParameterizedTest driver (assert mode by default, -Dshape.record=true logs observed outcomes for discovery). EntityShapeLineMapIT + LineMapWriter render the registry's predictions to a committed line-map. Sweep/line-map are not run yet (zero cases until profiles exist). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
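The flattening step described in this commit can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the type names echo the commit message, but every field, the `Sketch` suffixes, and the `id()` format are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shapes: the PR names Rung, ShapeMutation and PlannedCase,
// but the fields below are assumed for illustration only.
record Rung(String label, int magnitude) {}

record MutationSketch(String dimension, List<Rung> ladder) {}

record PlannedCaseSketch(String entityType, String dimension, Rung rung) {
  // Assumed id format, modelled on strings like "topic/schemaFields.count/50k".
  String id() {
    return entityType + "/" + dimension + "/" + rung.label();
  }
}

final class RegistrySketch {
  private RegistrySketch() {}

  // Cross every entity type with every rung of every shared mutation.
  static List<PlannedCaseSketch> plan(List<String> entityTypes, List<MutationSketch> shared) {
    List<PlannedCaseSketch> cases = new ArrayList<>();
    for (String entityType : entityTypes) {
      for (MutationSketch mutation : shared) {
        for (Rung rung : mutation.ladder()) {
          cases.add(new PlannedCaseSketch(entityType, mutation.dimension(), rung));
        }
      }
    }
    return cases;
  }

  public static void main(String[] args) {
    List<MutationSketch> shared =
        List.of(new MutationSketch("owners.count", List.of(new Rung("100", 100), new Rung("12k", 12_000))));
    plan(List.of("table", "topic"), shared).forEach(c -> System.out.println(c.id()));
  }
}
```

With an empty entity list the plan is empty, which matches the note that the sweep has zero cases until profiles exist.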
First entity profile for the search-indexing edge-case canary.

Adds TableShapeProfile (minimal id/name/fqn/one-column Table + columns.count and column.depth ladders) and registers it in EntityShapeRegistry, so the 6 shared mutations run against a real entity for the first time.

Discovery run (Elasticsearch 9.3) reconciled three predictions to the observed line; each is a finding about where the real boundary sits, not a test fix:
- description.size 16MB: predicted REJECT_SIZE -> observed OK (16MB body indexes fine; no 413/size cap hit at this rung).
- columns.count 100k: predicted REJECT_SIZE -> observed OK (100k flat columns neither trip total_fields nor a doc-size limit).
- column.depth 25: predicted REJECT_DEPTH -> observed OK (nested STRUCT columns do not trip mapping depth [20]; the table column field is not mapped in a way that enforces the depth limit).

Already-correct predictions confirmed: owners.count 12k -> REJECT_NESTED (nested_objects.limit 10000 enforced), keyword.overIgnoreAbove 300chars -> DEGRADED_UNSEARCHABLE (displayName.keyword carries ignore_above:256), and all remaining 15 cases OK. Assert run green (20/20). Line-map written to .context/entity-index-line-map.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…yTerm/query/storedProcedure) + line-map

Adds entity shape profiles for Container (dataModelColumns), Dashboard (charts), Topic (schemaFields), GlossaryTerm (synonyms), Query (queryText) and StoredProcedure (code), registered in EntityShapeRegistry alongside the existing Table profile. Predictions reconciled to the discovery run on Elasticsearch 9.3.

Findings reconciled to observed:
- container/dataModelColumns.count/50k: REJECT_SIZE -> OK
- topic/schemaFields.count/50k: REJECT_SIZE -> OK
- query/queryText.size/16MB: REJECT_SIZE -> OK
- storedProcedure/code.size/16MB: REJECT_SIZE -> OK
(ES content limit is 100MB and OM flattens collections, so the entity size/field rungs do not trip at these magnitudes.)

Cross-entity divergence on the shared customProperties.breadth dimension: glossaryTerm/2k -> REJECT_FIELDS while all 6 other entities -> OK. Root cause: the glossary_term index mapping has no explicit "extension" field, so distinct custom-property keys are dynamically mapped and exceed the total_fields limit; table et al. map extension as "flattened". Scoped CustomPropertiesBreadthMutation to exclude GlossaryTerm and codified the real per-entity outcome as an explicit glossaryTerm ladder (100 -> OK, 2k -> REJECT_FIELDS).

Universal shared findings hold across all 7 entities: owners.count/12k -> REJECT_NESTED (nested_objects.limit 10000), keyword.overIgnoreAbove/300chars -> DEGRADED_UNSEARCHABLE (displayName.keyword ignore_above:256).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… + re-discovery Each canary case now provisions a fresh index cloned from the entity's real mapping (ShadowIndex) and drops it afterward, instead of PUTting into the shared real <entity>_search_index. The shared index accumulated dynamic mapping fields from customProperties.breadth that doc-delete cleanup could not remove, polluting cross-IT state and making outcomes order-dependent. The ShapeMutation.expected signature gains an entityType param so per-entity outcomes are expressible directly; the GlossaryTerm appliesTo carve-out and the duplicated customProperties ladder in GlossaryTermShapeProfile are removed. Re-discovery against isolated indices is deterministic: only glossaryTerm rejects customProperties.breadth/2k (REJECT_FIELDS); query (and every other entity) indexes OK at 2k. The prior query/glossaryTerm divergence was a shared-index pollution artifact. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…olish Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ntional broad catches
❌ PR checklist incomplete
This PR cannot be merged until the following are addressed on its linked issue:
The fields live on the linked issue in the Shipping project (open the issue → right sidebar → Projects). After you set them, re-run this check (or push a commit); issue/project changes do not re-trigger it automatically. Maintainers can bypass this check by adding the
…ntityShapeSweepIT assertions
🟡 Playwright Results: all passed (12 flaky)
✅ 4272 passed · ❌ 0 failed · 🟡 12 flaky · ⏭️ 88 skipped
🟡 12 flaky test(s) (passed on retry)
How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip # view trace
…e error
The specific REJECT_SIZE/FIELDS/NESTED/DEPTH/PARSE buckets were inferred by string-matching
the engine error message — fragile and can mis-attribute (a PUT can fail for reasons that don't
match the patterns). Replace them with a single REJECTED ('the index refused the doc') and carry
the raw engine error verbatim in ShapeResult.detail(), surfaced in the test failure message.
Removes ShapeClassifier + ShapeClassifierTest (no more message bucketing).
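A minimal sketch of what the collapsed outcome vocabulary might look like after this commit. Only the names `REJECTED`, `ShapeResult`, and `detail()` come from the commit text; the `Sketch` suffixes, the factory methods, the other enum constants' placement, and the failure-message format are assumptions for illustration.

```java
// Hypothetical sketch: a single REJECTED outcome replaces the per-cause buckets,
// and the engine's own words travel with the result instead of being classified.
enum OutcomeSketch {
  OK,
  DEGRADED_UNSEARCHABLE,
  REJECTED,
  ERROR_OTHER
}

record ShapeResultSketch(OutcomeSketch outcome, String detail) {

  static ShapeResultSketch ok() {
    return new ShapeResultSketch(OutcomeSketch.OK, "");
  }

  // No string matching on the message: it is stored verbatim.
  static ShapeResultSketch rejected(Throwable engineError) {
    return new ShapeResultSketch(OutcomeSketch.REJECTED, String.valueOf(engineError.getMessage()));
  }

  // Assumed format for what a test failure message could print.
  String failureMessage(String caseId) {
    return caseId + " -> " + outcome + (detail.isEmpty() ? "" : ": " + detail);
  }

  public static void main(String[] args) {
    ShapeResultSketch result = rejected(new IllegalStateException("illustrative engine error"));
    System.out.println(result.failureMessage("table/owners.count/12k"));
  }
}
```

The design trade-off is that the test no longer claims to know why the index refused a document; whoever reads the failure gets the raw error and decides.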
The Java checkstyle failed. Please run You can install the pre-commit hooks with
REJECTED now captures the full exception cause-chain (not just getMessage(), which can be terse or hide the root ES reason in getCause()) and LOG.warn()s the throwable with stack. ERROR_OTHER (PUT returned but get-by-id finds nothing) now includes the index _count — which disambiguates a silent no-op PUT (count 0) from a doc written elsewhere (count > 0) — plus the raw get response, and is logged. verify() returns a ShapeResult so DEGRADED/ERROR_OTHER carry detail too.
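Capturing the full cause chain rather than only `getMessage()` could look something like the helper below. This is an illustrative sketch using only the JDK; the class name, method name, and the separator string are assumptions, not the PR's actual code.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical helper: flattens an exception and all of its causes into one line,
// so a root reason hidden in getCause() is not lost.
final class CauseChainSketch {
  private CauseChainSketch() {}

  static String describe(Throwable error) {
    StringBuilder out = new StringBuilder();
    // Identity-based set guards against a cyclic cause chain.
    Set<Throwable> seen = Collections.newSetFromMap(new IdentityHashMap<>());
    for (Throwable t = error; t != null && seen.add(t); t = t.getCause()) {
      if (out.length() > 0) {
        out.append(" <- caused by: ");
      }
      out.append(t.getClass().getSimpleName()).append(": ").append(t.getMessage());
    }
    return out.toString();
  }

  public static void main(String[] args) {
    Throwable wrapped = new RuntimeException("index failed", new IllegalArgumentException("root reason"));
    System.out.println(describe(wrapped));
  }
}
```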
… (review) Addresses PR review: _count reads the near-real-time search view, so without a refresh a just-written doc could read 0 and the 'nothing written' hint would be wrong. Force a _refresh first (get-by-id is realtime; _count is not) so the count is ground truth before we infer no-op-vs-written-elsewhere.
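The ordering this commit enforces (refresh first, then count, then infer) can be sketched against a small seam. `IndexProbe` and `MissingDocDiagnosis` are hypothetical names invented for this example; the real suite talks to the ES/OS client directly, and its hint wording may differ.

```java
// Hypothetical seam over the search engine, so the ordering logic can be shown
// without a live cluster.
interface IndexProbe {
  void refresh(String index);

  long count(String index);
}

final class MissingDocDiagnosis {
  private MissingDocDiagnosis() {}

  // Called when the PUT returned but get-by-id found nothing.
  static String diagnose(IndexProbe probe, String index) {
    // _count reads the near-real-time search view, so refresh before trusting it.
    probe.refresh(index);
    long count = probe.count(index);
    return count == 0
        ? "nothing written (count=0)"
        : "doc written elsewhere (count=" + count + ")";
  }
}
```

Without the `refresh` call, a probe whose count lags behind recent writes would report 0 and the "nothing written" hint would be wrong, which is the case the review flagged.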
Code Review ✅ Approved 4 resolved / 4 findings
Adds a registry-driven integration test suite for entity indexing edge cases, resolving issues with shadow index mapping overrides, innerSource fallbacks, and diagnostic timing. No open issues remain.
✅ 4 resolved
✅ Quality: Shadow index silently drops custom index.mapping.* limits
✅ Edge Case: innerSource fallback throws NoSuchElementException on empty body
✅ Edge Case: AcceptedLimits has no engine dimension; OpenSearch CI may fail red
✅ Edge Case: _count diagnostic can mislead due to non-realtime search
Summary
A registry-driven integration-test suite that asserts every schema-valid entity shape can be indexed and queried in Elasticsearch/OpenSearch. Any shape the JSON schema allows but the index refuses or silently degrades fails red — the suite is a living punch-list of indexing gaps. Test-only; no production code.
A gap is tolerated only by an explicit opt-in in `AcceptedLimits` (per `entityType` × `dimension` × `rung` + the tolerated `Outcome` + a reason). Not listed ⇒ must round-trip (`OK`) or it's red.

How it works
- `ShapeCanary`: build an in-memory entity → real `buildSearchIndexDoc()` → PUT to a per-case shadow index cloned from the entity's real mapping (dropped after; shared indices never touched) → query back → return a `ShapeResult`.
- `OK`: indexed + retrievable + (probed) searchable.
- `DEGRADED_UNSEARCHABLE`: indexed and in `_source`, but the value was dropped from the term index (e.g. a keyword field's `ignore_above`), so it is not findable by exact match.
- `REJECTED`: the index refused the document (the PUT failed, nothing indexed). The cause is not classified by us; the raw engine error (size / total_fields / nested_objects / depth / parse / anything else) is carried verbatim in `ShapeResult.detail()` and printed in the test failure.
- `ShapeMutation`s × 20 `EntityShapeProfile`s → `EntityShapeRegistry`. Add an entity = one profile file + one registry line. Driver: `EntityShapeIT`; `EntityShapeBaselineIT` is the zero-stress control proving a minimal entity round-trips.

`EntityShapeIT` fails on every schema-valid shape that doesn't round-trip. The gaps cluster in three dimensions (the per-entity failure prints the raw engine error):

| Dimension / rung | Outcome | Cause |
|---|---|---|
| `customProperties.breadth / 2k` | `REJECTED` (total_fields) | Entities with an `extension` schema property but whose `*_index_mapping.json` lacks `"extension":{"type":"flattened"}` (e.g. glossaryTerm, metric, chart, glossary). 2k distinct keys become 2k dynamic fields and exceed `total_fields.limit` (1000). The fix is the missing mapping block. |
| `owners.count / 12k` | `REJECTED` (nested_objects) | `nested_objects.limit` (10000): accept, or raise the limit. Universal across entities. |
| `keyword.overIgnoreAbove / 300chars` | `DEGRADED_UNSEARCHABLE` | `ignore_above:256` on `displayName.keyword`: accept. Universal across entities. |

Everything else round-trips fine on self-hosted ES (16MB description, 100k columns, depth 25, 50k tags/charts/fields/tasks/mlFeatures all OK, owing to the 100MB content limit + flattening).
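The opt-in rule from the Summary (a non-OK outcome passes only if it is explicitly listed) could be modelled roughly like this. The class is a hypothetical sketch: the key fields follow the entityType × dimension × rung description, but the constructor, the `passes` method, and the omission of any per-engine dimension are assumptions, and the reason string is illustrative.

```java
import java.util.Map;

// Hypothetical model of the AcceptedLimits lookup; not the PR's implementation.
final class AcceptedLimitsSketch {
  enum Outcome {
    OK,
    DEGRADED_UNSEARCHABLE,
    REJECTED
  }

  record Key(String entityType, String dimension, String rung) {}

  record Accepted(Outcome tolerated, String reason) {}

  private final Map<Key, Accepted> accepted;

  AcceptedLimitsSketch(Map<Key, Accepted> accepted) {
    this.accepted = Map.copyOf(accepted);
  }

  // OK always passes; anything else must match an explicit entry exactly.
  boolean passes(Key key, Outcome observed) {
    if (observed == Outcome.OK) {
      return true;
    }
    Accepted entry = accepted.get(key);
    return entry != null && entry.tolerated() == observed;
  }

  public static void main(String[] args) {
    Key owners = new Key("table", "owners.count", "12k");
    AcceptedLimitsSketch limits =
        new AcceptedLimitsSketch(Map.of(owners, new Accepted(Outcome.REJECTED, "nested_objects.limit is 10000")));
    System.out.println(limits.passes(owners, Outcome.REJECTED));
    System.out.println(limits.passes(new Key("topic", "owners.count", "12k"), Outcome.REJECTED));
  }
}
```

Matching on the tolerated outcome as well as the key means an accepted `REJECTED` does not silently excuse a later `DEGRADED_UNSEARCHABLE` at the same rung.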
Coverage
20 indexed entity types: table, container, dashboard, topic, glossaryTerm, query, storedProcedure, metric, dashboardDataModel, pipeline, mlmodel, searchIndex, apiEndpoint, database, databaseSchema, chart, dataProduct, domain, glossary, apiCollection. Entity-specific unbounded collections (columns / tasks / mlFeatures / fields / schemaFields) get count ladders; the rest get the shared-dimension sweep.
Test Plan
- `EntityShapeIT` is red by design: gap cases fail until accepted (`AcceptedLimits`) or fixed (that IS the deliverable)
- `EntityShapeBaselineIT` (minimal entity round-trips) green
- … `src/test/java`)
- … `AcceptedLimits`, and add the missing `flattened` `extension` mapping for the affected entities

🤖 Generated with Claude Code