Skip to content

Add schema aliases. Use aliases in rename-only deprecations.#6310

Draft
mzient wants to merge 8 commits intoNVIDIA:mainfrom
mzient:schema_aliases
Draft

Add schema aliases. Use aliases in rename-only deprecations.#6310
mzient wants to merge 8 commits intoNVIDIA:mainfrom
mzient:schema_aliases

Conversation

@mzient
Copy link
Copy Markdown
Contributor

@mzient mzient commented Apr 22, 2026

Category:

New feature (non-breaking change which adds functionality)
Refactoring (Redesign of existing code that doesn't affect functionality)

Description:

This change introduces a new mechanism of schema aliases. This mechanism causes the target schema to be selected, removing the need to add parent and risking incomplete schema inheritance. It also makes the code shorter and helps with serialized pipelines - newly serialized pipelines will store the actual operator name, opening the door for future removal of deprecated operators.

Additional information:

Affected modules and functionalities:

Deprecated operators.

Key points relevant for the review:

Tests:

There's no explicit test for aliases. Tests involving deprecated operators will suffice.

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: DALI-4666

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 22, 2026

Greptile Summary

This PR introduces a DALI_SCHEMA_ALIAS macro that cleanly maps deprecated operator names to their canonical successors via a new alias_map alongside the existing schema_map. It replaces repetitive boilerplate AddParent + NumInput/NumOutput declarations across ~10 operator files, and updates OperatorRegistry::Create to transparently resolve aliases when the requested name is absent from the device registry.

All previously identified bugs (stale it after alias resolution, sentinel insertion before cycle detection, non-advancing lookup_name in the factory loop, infinite loop on registered targets) appear to have been addressed in the current revision.

Confidence Score: 5/5

Safe to merge — all previously identified logic bugs have been addressed and no new P0/P1 issues found

All bugs flagged in prior review rounds (stale iterator after alias resolution, dangling sentinel insertion, non-advancing lookup_name in factory loop, infinite loop on registered canonical names) are correctly resolved in this revision. The alias resolution path in TryGetSchema, AddAlias cycle detection, and operator_factory.h Create() loop all terminate correctly across the reachable cases. Remaining observations are P2 style points that do not affect correctness.

No files require special attention

Important Files Changed

Filename Overview
dali/pipeline/operator/op_schema.cc Adds alias registry with correct cycle detection and path-compression; TryGetSchema now transparently resolves aliases by re-searching schema_map after alias lookup
dali/pipeline/operator/op_schema.h Adds AliasFor setter/getter, alias_for_ member, DALI_SCHEMA_ALIAS macro, AddAlias public API, and deletes SchemaRegistry constructor correctly
dali/pipeline/operator/operator_factory.h Create() now resolves aliases via TryGetSchema; loop terminates correctly — exits when it != end(), breaks when schema->name()==lookup_name (no further resolution possible), or when schema is null
include/dali/core/string_map.h New header providing transparent-hash unordered_string_map and string_map aliases; heterogeneous lookup is correctly wired via is_transparent and equal_to<>
dali/pipeline/operator/operator.cc Call-sites updated from pointer-based devName to value-based std::string; lifetime is safe since Create() is synchronous and the local string outlives the call
dali/operators/reader/tfrecord_reader_op.cc Schema replaced with DALI_SCHEMA_ALIAS; no DALI_REGISTER_OPERATOR for the alias name, alias resolution in Create() will walk through to readers__TFRecord
dali/operators/audio/resample.cc experimental__AudioResample switched to DALI_SCHEMA_ALIAS; no DALI_REGISTER_OPERATOR needed since alias resolution handles it; typo in deprecation message also fixed

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["DALI_SCHEMA_ALIAS(AliasName, OpName)"]
    A --> B["DALI_SCHEMA(AliasName) → RegisterSchema(AliasName)\nInserts AliasName into schema_map\nwith name_=AliasName"]
    B --> C[".AliasFor(OpName)\nSets alias_for_=OpName\nCalls SchemaRegistry::AddAlias(AliasName, OpName)"]
    C --> D["alias_map[AliasName] = OpName\n(after cycle detection & path-compression)"]

    E["TryGetSchema(name)"]
    E --> F{"schema_map.find(name)?\nAnd AliasFor empty?"}
    F -->|"Found & not alias"| G["Return that schema ✓"]
    F -->|"Not found OR is alias"| H["alias_map.find(name)"]
    H -->|"Found"| I["Resolve to target\nschema_map.find(target)"]
    I --> G
    H -->|"Not found"| J["Return nullptr"]

    K["OperatorRegistry::Create(name, ...)"]
    K --> L{"registry_.find(name)?"}
    L -->|"Found"| M["Invoke creator ✓"]
    L -->|"Not found"| N["TryGetSchema(lookup_name)"]
    N -->|"Returns target schema\nname != lookup_name"| O["lookup_name = schema->name()\nregistry_.find(new name)"]
    O --> L
    N -->|"Null or name==lookup_name"| P["DALI_ENFORCE fails"]
Loading

Reviews (8): Last reviewed commit: "Add missing include <optional>" | Re-trigger Greptile

Comment thread dali/pipeline/operator/op_schema.cc
Comment thread dali/pipeline/operator/op_schema.cc Outdated
Comment on lines +84 to +98
auto &alias_map = aliases();
auto &actual = alias_map[std::string(alias_name)];
if (!actual.empty())
throw std::invalid_argument(make_string("\"", alias_name,
"\" is already used as a schema alias name for \"", actual, "\""));

for (;;) {
auto redir = alias_map.find(actual_name);
if (redir == alias_map.end())
break;
if (redir->second == alias_name)
throw std::invalid_argument("Cycle detected while adding schema alias.");
actual_name = redir->second;
}
actual = actual_name;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 AddAlias inserts empty sentinel before validating the chain

alias_map[std::string(alias_name)] inserts alias_name with an empty string value before the chain-following loop runs. If the loop later throws (e.g., cycle detected), the map is left with a dangling alias_name → "" entry. A subsequent call to AddAlias(alias_name, ...) would bypass the "already registered" guard (if (!actual.empty())) and silently overwrite the target. Consider inserting into the map only after the loop successfully completes.

Comment on lines +90 to +97
for (;;) {
auto redir = alias_map.find(actual_name);
if (redir == alias_map.end())
break;
if (redir->second == alias_name)
throw std::invalid_argument("Cycle detected while adding schema alias.");
actual_name = redir->second;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Alias path-compression only works for aliases added after their targets

The for (;;) chain-following loop in AddAlias flattens chains at insertion time: AddAlias(A, B) when B → C already exists stores A → C directly. However, if A → B is registered before B → C, then A stays pointing to B and is never updated when B is later aliased. With TryGetSchema fixed to re-look up via schema_map.find(alias_it->second), an alias-to-alias chain would result in nullptr for the intermediate name if that intermediate is itself only an alias (not a real schema entry).

This is not a problem today because DALI_SCHEMA_ALIAS is always used to point to a real schema (not to another alias), but it is worth documenting or asserting that actual_name must be a real schema name, not itself an alias.

Copy link
Copy Markdown
Contributor Author

@mzient mzient Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@greptileai Path compression might fail - but will cycle detection still work?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, cycle detection still works correctly regardless of insertion order.

The loop traverses alias_map directly at each step using the current state of the map, so it always follows the full chain:

Consider the failure case for path compression:

  1. AddAlias(A, B) — stored as A → B (B→C doesn't exist yet)
  2. AddAlias(B, C) — stored as B → C
  3. AddAlias(C, A) — cycle attempt

In step 3, the loop does:

  • find(A)A → B, so actual_name = B; check B == C? No
  • find(B)B → C, so actual_name = C; check C == C? Yes → throws

The loop doesn't rely on pre-compressed paths — it re-walks alias_map live each time. So even when path compression is incomplete (due to out-of-order registration), the cycle check is exhaustive and correct.

The only real outstanding issue is that TryGetSchema doesn't re-look up it in schema_map after updating name. That's a separate bug from cycle detection.

mzient added 2 commits April 22, 2026 21:36
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
@NVIDIA NVIDIA deleted a comment from dali-automaton Apr 22, 2026
@NVIDIA NVIDIA deleted a comment from dali-automaton Apr 22, 2026
Comment thread dali/pipeline/operator/operator_factory.h Outdated
Comment thread dali/pipeline/operator/operator_factory.h
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Comment thread dali/pipeline/operator/op_schema.cc
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Comment thread dali/pipeline/operator/operator_factory.h Outdated
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Comment thread dali/pipeline/operator/operator_factory.h
…lias insertions.

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [49252356]: BUILD STARTED

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [49252707]: BUILD STARTED

@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [49252707]: BUILD FAILED

@mzient mzient marked this pull request as draft April 23, 2026 06:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants