[AINode]: Integrate toto as a builtin forecasting model#17322

Draft
graceli02 wants to merge 8 commits into apache:master from graceli02:toto-ainode-integration

Conversation


@graceli02 graceli02 commented Mar 20, 2026

Summary

This PR integrates Toto (Time Series Optimized Transformer for Observability) by Datadog as a built-in forecasting model in AINode, following the same pattern as existing built-in HuggingFace models (sundial, chronos2, moirai2, timer_xl).

Toto is installed as an optional dependency via pip install toto-ts and loaded lazily at runtime, so it does not affect AINode's startup time or core dependencies when not in use.

Changes

  • toto/configuration_toto.py — TotoConfig(PretrainedConfig) defining Toto's architecture parameters
  • toto/modeling_toto.py — TotoForPrediction wrapper around toto-ts's Toto class, bridging ModelHubMixin-based loading with AINode's load_model_from_transformers mechanism
  • toto/pipeline_toto.py — TotoPipeline(ForecastPipeline) implementing preprocess / forecast / postprocess with lazy imports guarded by a clear install message
  • toto/__init__.py — package init with Apache 2.0 license header
  • model_info.py — registered toto in BUILTIN_HF_TRANSFORMERS_MODEL_MAP with repo_id="Datadog/Toto-Open-Base-1.0" and auto_map pointing to the above classes
  • AINodeTestUtils.java — added toto to BUILTIN_LTSM_MAP with expected state "active"

Design Notes

Toto uses huggingface_hub.ModelHubMixin rather than transformers.PreTrainedModel. TotoForPrediction.from_pretrained() delegates directly to Toto.from_pretrained() from the toto-ts package, then exposes a .backbone property (TotoBackbone) consumed by TotoForecaster in the pipeline. This avoids wrapping Toto in a standard HuggingFace PreTrainedModel while remaining fully compatible with AINode's existing loading infrastructure.
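A minimal sketch of that bridge, assuming the toto-ts surface described above (a Toto.from_pretrained classmethod and a backbone held on the loaded model); the attribute name on the underlying object and the module path are assumptions for illustration:

```python
class TotoForPrediction:
    """Adapter between toto-ts's ModelHubMixin loading and AINode's
    load_model_from_transformers entry point (sketch, not the PR's code)."""

    def __init__(self, toto_model):
        self._toto = toto_model

    @classmethod
    def from_pretrained(cls, repo_id, **kwargs):
        # Delegate directly to toto-ts rather than wrapping Toto in a
        # standard HuggingFace PreTrainedModel.
        from toto.model.toto import Toto  # lazy, optional dependency
        return cls(Toto.from_pretrained(repo_id, **kwargs))

    @property
    def backbone(self):
        # TotoForecaster in the pipeline consumes this TotoBackbone.
        return self._toto.model
```

The adapter keeps AINode's loader unchanged: it only needs an object with a from_pretrained classmethod, and the pipeline only needs .backbone.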

Setup Required

The Cluster IT - 1C1D1A / AINode test verifies that all built-in models report state "active" in SHOW MODELS. Since toto is newly added, its weights (~605 MB) must be pre-downloaded to the runner cache before the test will pass reliably.

Please run the following once on the self-hosted GPU runner before merging:

from huggingface_hub import hf_hub_download

hf_hub_download(repo_id="Datadog/Toto-Open-Base-1.0", filename="model.safetensors")
hf_hub_download(repo_id="Datadog/Toto-Open-Base-1.0", filename="config.json")

No code changes are needed.


This PR has:

  • been self-reviewed.
    • concurrent read
    • concurrent write
    • concurrent read and write
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods.
  • added or updated version, license, or notice information
  • added comments explaining the "why" and the intent of the code wherever it would not be
    obvious to an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the code
    coverage threshold is met.
  • added integration tests.
  • been tested in a test IoTDB cluster.

Key changed/added classes (or packages if there are too many classes) in this PR

@graceli02 changed the title feat(ainode): Integrate toto as a builtin forecast model [AINode]: Integrate toto as a builtin forecast model Mar 20, 2026
@graceli02 changed the title [AINode]: Integrate toto as a builtin forecast model [AINode]: Integrate toto as a builtin forecasting model Mar 20, 2026
@graceli02 marked this pull request as draft March 20, 2026 21:47
Integrate Datadog's Toto-Open-Base-1.0 into AINode's builtin model
registry as an optional lazy dependency.

- Add TotoConfig (PretrainedConfig) with Toto architecture params
- Add TotoForPrediction wrapper loaded via ModelHubMixin.from_pretrained
- Add TotoPipeline (ForecastPipeline) with lazy toto-ts import and
  clear installation instructions if the package is missing
- Register 'toto' in BUILTIN_HF_TRANSFORMERS_MODEL_MAP
- Add 'toto' entry to AINodeTestUtils.BUILTIN_LTSM_MAP

toto-ts is optional: no changes to pyproject.toml or poetry.lock
@graceli02 force-pushed the toto-ainode-integration branch from 17d4276 to 7740dae on March 21, 2026 07:21
- Add Apache 2.0 license header to __init__.py and pipeline_toto.py
- Fix pipeline_toto.py: replace broken local import with lazy toto-ts
  import via _import_toto() helper; fix merge conflict in model_info.py
Contributor

@CRZbulabula left a comment

Hi Grace, this is your first PR (pull request) for the Apache IoTDB repository; our community highly appreciates your contribution!

Next, let's talk about this PR. When integrating time-series foundation models into AINode, we generally introduce their source code and then declare their open-source license in the LICENSE file, which you can find in the project root directory.

Although installing the released package and then invoking the corresponding forecaster seems more convenient, this is usually not feasible in system engineering projects. To elaborate, different Python projects often depend on different versions of the same dependency, resulting in package conflicts. For instance, the transformers project changed its KV-cache component significantly from v4.4x to v4.5x, so models built on v4.4x cannot share the same dependency with models built on v4.5x.

To improve this PR, I suggest:

  1. To trace the entire process of model forecasting, dive into the forecast example scripts of the Toto-1.0 model.
  2. Integrate the Toto model by introducing its source code.
  3. Package AINode, then verify your integration via SHOW MODELS and SELECT * FROM FORECAST.

In addition, you might encounter some problems in the model-loading phase. This is because the Toto-1.0 model is loaded through the ModelHubMixin interface, while the current version of AINode only accepts the PreTrainedModel format. Feel free to take on this challenge; we are integrating the PyTorchModelHubMixin interface, and the corresponding PR can be referred to soon.

graceli02 and others added 5 commits March 22, 2026 20:17
…/pipeline

- Fix build_binary.py: poetry lock → poetry install --no-root; remove
  capture_output=True so errors are visible in CI
- Vendor toto source (DataDog/toto, Apache-2.0) into model/toto/:
    model/{attention,backbone,distribution,embedding,feed_forward,
           fusion,rope,scaler,transformer,toto,util}.py
    data/util/dataset.py
    inference/forecaster.py
  Eliminates toto-ts pip dependency and all gluonts transitive deps.
  gluonts replaced with pure PyTorch (TransformedDistribution/AffineTransform,
  torch.nn.Module Scaler base).
- Rewrite modeling_toto.py: TotoForPrediction now inherits PreTrainedModel
  (required by model_loader); backbone stored as self.model so safetensors
  keys (model.*) map directly; custom from_pretrained applies
  _map_state_dict_keys for SwiGLU remapping before loading weights.
- Rewrite pipeline_toto.py: import directly from local source;
  TotoForecaster created lazily inside _get_forecaster() — not at __init__
  time — fixing ImportError at pipeline construction in CI.
- pyproject.toml: add rotary-embedding-torch>=0.8.0 (only new dep)
- .gitignore: un-ignore toto data/ package (Python source, not data files)
- Add toto/NOTICE with Datadog attribution per Apache policy

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
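The gluonts replacement described in this commit can be sketched in pure PyTorch: un-standardising a predictive distribution with torch's TransformedDistribution and AffineTransform. The function name and call shape below are illustrative; only the two torch.distributions classes are taken from the commit message.

```python
import torch
from torch.distributions import Normal, TransformedDistribution
from torch.distributions.transforms import AffineTransform

def rescale_distribution(base, loc, scale):
    """Map a standardized predictive distribution back to the original
    data scale (x -> loc + scale * x) without depending on gluonts."""
    return TransformedDistribution(base, [AffineTransform(loc=loc, scale=scale)])
```

Applied to a standard Normal base, this yields the same log-probabilities as Normal(loc, scale), which is exactly the behavior the removed gluonts wrapper provided.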
Apache RAT flagged the standalone NOTICE file inside the toto Python
package because the project's RAT config does not exclude plain NOTICE
files.  Moved the Datadog/toto attribution to the standard location
(project root NOTICE) and removed the inner NOTICE file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…n_kwargs

- scaler.py: add short-name aliases ("per_variate", "per_variate_causal",
  "per_variate_causal_patch") to scaler_types dict so that config.json
  string values work without KeyError at backbone init time.
- backbone.py: recognise "per_variate_causal_patch" string in the
  CausalPatchStdMeanScaler branch (alongside the legacy class-path string).
- configuration_toto.py: add output_distribution_kwargs parameter with
  default {"k_components": 5} matching Datadog/Toto-Open-Base-1.0.
- modeling_toto.py: pass output_distribution_kwargs from config to
  TotoBackbone so MixtureOfStudentTsOutput receives k_components.

Fixes: KeyError 'per_variate_causal' in scaler_types lookup.
Fixes: MixtureOfStudentTsOutput missing required positional arg k_components.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
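The alias fix described in the commit above can be sketched as a lookup table that accepts both key styles. The scaler class names and the legacy class-path keys here are illustrative stand-ins; only the three short names come from Datadog/Toto-Open-Base-1.0's config.json as quoted in the commit message.

```python
# Illustrative scaler classes (stand-ins for the vendored implementations).
class StdMeanScaler: pass
class CausalStdMeanScaler: pass
class CausalPatchStdMeanScaler: pass

scaler_types = {
    # legacy fully-qualified class-path strings (paths are illustrative)
    "toto.model.scaler.StdMeanScaler": StdMeanScaler,
    "toto.model.scaler.CausalStdMeanScaler": CausalStdMeanScaler,
    "toto.model.scaler.CausalPatchStdMeanScaler": CausalPatchStdMeanScaler,
    # short-name aliases so config.json values resolve without a KeyError
    "per_variate": StdMeanScaler,
    "per_variate_causal": CausalStdMeanScaler,
    "per_variate_causal_patch": CausalPatchStdMeanScaler,
}
```

With both spellings mapped to the same classes, backbone init can resolve whichever string a given checkpoint's config.json carries.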