feat(throttle transform): Multi-threshold rate limiting with dropped output port#24702
Open
szibis wants to merge 13 commits into vectordotdev:master from
Conversation
cadda68 to 7fde598
…output port

Add multi-dimensional rate limiting to the throttle transform with independent thresholds for event count, estimated JSON byte size, and custom VRL token expressions. Events are dropped when any configured threshold is exceeded.

New capabilities:

- `threshold.events` — maximum events per window (backward compatible with `threshold: N`)
- `threshold.json_bytes` — estimated JSON byte size via EstimatedJsonEncodedSizeOf
- `threshold.tokens` — VRL expression for custom cost (e.g. `strlen(string!(.message))`)
- `reroute_dropped` — routes throttled events to a named `.dropped` output port
- Per-key per-threshold observability metrics (opt-in via `emit_detailed_metrics`)

The legacy `threshold: <number>` syntax remains fully backward compatible.

Closes vectordotdev#11854
7fde598 to 10eeee3
hey @szibis thank you for the PR! i've made an editorial review card for a docs team member to take a look: https://datadoghq.atlassian.net/browse/DOCS-13474
urseberry (Contributor) approved these changes on Feb 20, 2026 and left a comment:
Left non-blocking suggestions to replace "e.g." with "for example" per the Datadog public documentation guidelines.
Co-authored-by: Ursula Chen <58821586+urseberry@users.noreply.github.com>
fb8400a to 8f67480
Add overflow guards for VRL token cost (i64 → u32) and json_bytes (usize → u32) conversions. Values exceeding u32::MAX are now clamped instead of silently truncating.
When check_thresholds short-circuits (e.g., events limiter denies), subsequent limiters never consume tokens from the governor. Update utilization tracking to only count consumption for limiters that were actually checked, preventing drift between reported utilization and actual governor bucket state. Also clarify that tokens_threshold intentionally uses json_bytes as its budget, with a comment explaining the coupling.
- Warn when event cost exceeds governor burst capacity (check_key_n)
- Defer VRL evaluate_tokens until after the events limiter passes, avoiding expensive event.clone() on already-rejected events
- Sample gauge emissions every 100 events instead of per-event (gauges overwrite, so less frequent emission is equivalent)
- Bound utilization HashMap to 10K keys to prevent unbounded memory growth from high-cardinality key fields
- Reduce String allocations: avoid HashMap key clone when entry exists; allocate key_str once in process() for all metric emissions
- Inline threshold checking into process() to enable early exits
Summary
Add multi-dimensional rate limiting to the `throttle` transform with independent thresholds for event count, estimated JSON byte size, and custom VRL token expressions. Events are dropped when any configured threshold is exceeded.

Target release: 0.54.0
- `threshold.events` — maximum events per window (backward compatible with `threshold: N`)
- `threshold.json_bytes` — estimated JSON byte size via `EstimatedJsonEncodedSizeOf` (zero serialization overhead)
- `threshold.tokens` — VRL expression evaluated per event for custom cost (e.g. `strlen(string!(.message))`)
- `reroute_dropped` — routes throttled events to a named `.dropped` output port for dead-letter routing
- `internal_metrics.emit_detailed_metrics` with bounded-cardinality defaults

Motivation
The current `throttle` transform only rate limits by event count. This falls short in real-world scenarios.

Architecture
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'lineColor': '#000000', 'primaryTextColor': '#000000'}}}%%
flowchart LR
  subgraph INPUT
    E[Event]
  end
  subgraph THROTTLE["Throttle Transform"]
    direction TB
    EX{Exclude?}
    EL[Events Limiter<br/>GCRA]
    BL[Bytes Limiter<br/>GCRA]
    TL[Tokens Limiter<br/>VRL + GCRA]
    CHK{Any exceeded?}
  end
  subgraph OUTPUT
    P[Primary Output]
    D[Dropped Output<br/>reroute_dropped]
  end
  E --> EX
  EX -->|excluded| P
  EX -->|check| EL
  EL --> BL
  BL --> TL
  TL --> CHK
  CHK -->|all pass| P
  CHK -->|any exceeded| D
  style EX fill:#ffffff,stroke:#000000,stroke-width:2px
  style EL fill:#e8e8e8,stroke:#000000,stroke-width:2px
  style BL fill:#e8e8e8,stroke:#000000,stroke-width:2px
  style TL fill:#cccccc,stroke:#000000,stroke-width:3px
  style CHK fill:#ffffff,stroke:#000000,stroke-width:2px
  style P fill:#ffffff,stroke:#000000,stroke-width:2px
  style D fill:#dddddd,stroke:#000000,stroke-width:2px
```

Full backward compatibility
The `threshold` field uses `#[serde(untagged)]` enum deserialization to accept both the old integer syntax and the new object syntax. This means:
- `threshold: 100` deserializes identically to `threshold: { events: 100 }`
- `component_discarded_events_total` is still emitted; `throttle_threshold_discarded_total` is added (bounded, 1 series for events-only configs)
- `emit_events_discarded_per_key` keeps its existing behavior
- `json_bytes`, `tokens`, `reroute_dropped`, and `emit_detailed_metrics` are new and optional

Zero migration needed. Every existing `throttle` config continues to work without changes. The only observable difference is one new bounded metric (`throttle_threshold_discarded_total{threshold_type="events"}`), which is limited to 1 series for existing configs.

All new features are purely additive and default to disabled.
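As a sketch, the two accepted syntaxes side by side (YAML shape inferred from the field names in this PR; the final docs may differ):

```yaml
transforms:
  throttle_legacy:           # old integer syntax, unchanged
    type: throttle
    inputs: ["in"]
    window_secs: 1
    threshold: 100

  throttle_object:           # new object syntax, deserializes identically
    type: throttle
    inputs: ["in"]
    window_secs: 1
    threshold:
      events: 100
```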
Combined event rate + byte throughput limiting
A single `threshold` block can enforce both an event rate cap and a byte throughput cap simultaneously. Each type runs its own independent GCRA limiter. An event is dropped the moment any limiter is exceeded. This covers two distinct failure modes in a single transform:
Neither threshold alone is sufficient. A service could bypass a byte-only limit by sending millions of tiny events, or bypass an event-only limit by sending a few massive events. With both enforced, both attack vectors are covered.
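A sketch of such a combined config, using the field names described in this PR (exact YAML shape may differ):

```yaml
transforms:
  throttle_combined:
    type: throttle
    inputs: ["in"]
    window_secs: 1
    threshold:
      events: 1000           # caps floods of tiny events
      json_bytes: 3000000    # caps a few massive events (estimated JSON bytes)
```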
A third dimension — a custom VRL token cost — can be added in the same definition:
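A hedged sketch of all three thresholds together (whether `tokens` takes a plain VRL string or a nested object is an assumption here; the values are illustrative):

```yaml
transforms:
  throttle_all:
    type: throttle
    inputs: ["in"]
    key_field: "{{ service }}"
    window_secs: 1
    threshold:
      events: 1000
      json_bytes: 3000000
      tokens: "strlen(string!(.message))"   # per-event VRL cost expression
```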
All three in one definition, one transform, one `key_field`, with three independent limiters checked per event. Performance overhead for events+bytes combined is +71% vs the events-only baseline, still processing ~2.80M events/sec.

Configuration examples
Old syntax (still works)
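For example (sketch; this is the existing throttle syntax, unchanged):

```yaml
transforms:
  throttle:
    type: throttle
    inputs: ["in"]
    window_secs: 1
    threshold: 100    # integer syntax, fully preserved
```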
Multi-threshold with per-tenant keys
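A sketch of a per-tenant multi-threshold config (field names from this PR; `tenant_id` is a hypothetical event field):

```yaml
transforms:
  throttle_per_tenant:
    type: throttle
    inputs: ["in"]
    key_field: "{{ tenant_id }}"   # independent limiters per tenant
    window_secs: 1
    threshold:
      events: 500
      json_bytes: 1048576          # ~1 MiB per tenant per window
```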
Dropped output port (dead-letter routing)
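A sketch of wiring the dropped port to a sink; the `<transform_id>.dropped` input naming follows the convention this PR describes (same pattern as `remap`), and the sink details are illustrative:

```yaml
transforms:
  throttle:
    type: throttle
    inputs: ["in"]
    window_secs: 1
    threshold:
      events: 100
    reroute_dropped: true

sinks:
  dead_letter:
    type: file
    inputs: ["throttle.dropped"]   # consumes the named dropped port
    path: "/var/lib/vector/throttled.log"
    encoding:
      codec: json
```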
Key design decisions
SyncTransform (not TaskTransform)
The original throttle uses `TaskTransform` (async `Stream`). This PR rewrites it to `SyncTransform` because:

- named outputs via `TransformOutputsBuf` (required for `reroute_dropped`)
- same pattern as `remap` with its dropped port
- the `DynClone` requirement is solved via a `ThrottleSyncTransform` wrapper with lazy state initialization

Separate GCRA limiter per threshold type
Each threshold type gets its own independent governor `RateLimiter`. An event is dropped when any limiter is exceeded. Governor's `check_key_n()` consumes N tokens atomically, so byte-cost and token-cost events interact correctly with the GCRA algorithm.

EstimatedJsonEncodedSizeOf reuse
For `json_bytes`, we reuse Vector's existing `EstimatedJsonEncodedSizeOf` trait (already implemented for `Event`, `LogEvent`, and all `Value` types, with quickcheck tests). Zero allocation, zero serialization — just arithmetic over the in-memory value tree.

Deferred key string allocation
`key_str` is only materialized (`to_owned()`) when a metric actually needs to be emitted. On the happy path (events pass through, no metrics enabled), no `String` allocation occurs per event. This produced a measurable 5-7% throughput improvement.

Metrics: three-tier cardinality control
Tier 1: Always emitted (bounded cardinality — max 4 series total)
These are emitted regardless of configuration. Safe for any deployment.
| Metric | Tags |
| --- | --- |
| `component_discarded_events_total` | `component_id`, `intentional=true` |
| `throttle_threshold_discarded_total` | `threshold_type` (`events`, `json_bytes`, or `tokens`) |

Tier 2: Legacy opt-in (`emit_events_discarded_per_key: true`)

Backward compatible with existing behavior. Cardinality = O(unique keys).
| Metric | Tags |
| --- | --- |
| `events_discarded_total` | `key` |

Tier 3: New detailed metrics (`emit_detailed_metrics: true`)

Full per-key per-threshold observability. Cardinality = O(keys × threshold_types).
| Metric | Tags |
| --- | --- |
| `throttle_events_discarded_total` | `key`, `threshold_type` |
| `throttle_events_processed_total` | `key` |
| `throttle_bytes_processed_total` | `key` |
| `throttle_tokens_processed_total` | `key` |
| `throttle_utilization_ratio` | `key`, `threshold_type` |

Metrics impact by configuration
| Configuration | Metrics emitted |
| --- | --- |
| Both flags `false` (default) | `component_discarded_events_total` + `throttle_threshold_discarded_total{threshold_type}` |
| `emit_events_discarded_per_key: true` only | adds `events_discarded_total{key}` |
| `emit_detailed_metrics: true` only | adds the Tier 3 metrics |
| Both `true` | all of the above |

Both flags default to `false`, so out of the box the transform emits only 4 bounded-cardinality metric series with zero overhead, regardless of how many unique keys exist.
All benchmarks: Criterion, 200 samples, 30s measurement, 5s warmup, 100K resamples, 1024 events/iteration.
A. Throughput by threshold type (no metrics)
- `events_only/under_limit` — `threshold: N` configs take this path; 25% faster than the initial SyncTransform (hot-path optimizations: inlined threshold checks, deferred VRL eval, sampled gauge emission)
- `json_bytes_only` — `EstimatedJsonEncodedSizeOf` per event (~109ns/event); no allocation, just arithmetic over the in-memory value tree
- `events_and_bytes`
- `vrl_tokens` — `Runtime::resolve()` dominates (~328ns/event); expected for interpreted eval
- `all_three_thresholds`
- `events_only/over_limit` — `component_discarded_events_total` (mandatory) + debug log
- `with_dropped_port` — routes to the `.dropped` output
- `high_cardinality_keys` — 100 unique `key_field` values

B. Metrics overhead (100 keys, events-only threshold)
- `metrics_both_off`
- `metrics_legacy_only` — `events_discarded_total{key}`; negligible
- `metrics_detailed_only`
- `metrics_both_on`
- `metrics_detailed_high_cardinality` (10K keys)
- `metrics_detailed_all_thresholds`

C. Key cardinality scaling (no metrics)
Measured for `events_only`, `events+bytes`, and `all_three`. Scaling is sublinear — 100× more keys only causes a 1.25-1.45× slowdown (DashMap O(1) amortized lookup).
D. Memory footprint per key
Even 10K tenants × 3 thresholds + detailed metrics uses under 5 MB — negligible vs Vector's baseline RSS (50-100 MB).
Impact assessment
What existing users get (zero config changes needed)
- `threshold: 100` continues to work unchanged
- `component_discarded_events_total` still emitted
- `throttle_threshold_discarded_total` added (bounded, max 3 series)

What new users can opt into
- `threshold.json_bytes`
- `threshold.tokens`
- `reroute_dropped`
- `emit_detailed_metrics`

All new features are additive and opt-in. No existing behavior changes.
Real-world usage scenarios
Each feature below is opt-in and independent. Pick only what you need — overhead is only paid for features you enable.
`threshold.json_bytes` — byte-aware rate limiting (+52% overhead)

Problem: Event-count throttling treats a 50-byte healthcheck and a 100 KB stack trace identically. Downstream services don't.
Scenario 1: Loki per-stream byte rate limits
Loki enforces a default `per_stream_rate_limit` of 3 MB/s. When a service emits a burst of large log events, event-count throttling won't prevent 429 rejections: 100 events of 100 KB each is 10 MB, far exceeding the 3 MB limit, even if you set `threshold: 100`. A byte threshold catches the burst before it reaches Loki, avoiding 429 cascades and the retry storms that follow.
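A sketch of such a config (the `stream` key field and sink details are illustrative, not from the PR; the 3 MB figure mirrors Loki's default per-stream limit mentioned above):

```yaml
transforms:
  throttle_loki:
    type: throttle
    inputs: ["app_logs"]
    key_field: "{{ stream }}"   # hypothetical field matching your Loki stream label
    window_secs: 1
    threshold:
      json_bytes: 3000000       # stay under Loki's 3 MB/s per-stream limit

sinks:
  loki:
    type: loki
    inputs: ["throttle_loki"]
    endpoint: "http://loki:3100"
    labels:
      stream: "{{ stream }}"
    encoding:
      codec: json
```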
Why +52% is worth it: The overhead comes from `EstimatedJsonEncodedSizeOf` — a fast recursive walk over the in-memory event value tree (~109ns/event, no serialization, no allocation). The percentage is relative to the highly optimized events-only baseline (214µs for 1024 events); in absolute terms, json_bytes still processes 3.14M events/sec. For any pipeline where downstream charges by bytes or enforces byte limits, this prevents far more expensive outcomes: 429 retry storms, Loki stream lockouts, or unexpected cloud billing spikes.

Scenario 2: Edge/IoT bandwidth-aware throttling
On edge devices with limited uplink (e.g., 1 Mbps satellite link), you need to throttle by actual payload size, not event count. A heartbeat event (50 bytes) and a firmware diagnostic dump (500 KB) should not be treated equally:
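A sketch for the uplink case (the budget value is illustrative arithmetic: 1 Mbps is 1,000,000 bits/s, or 125,000 bytes/s):

```yaml
transforms:
  throttle_uplink:
    type: throttle
    inputs: ["device_events"]
    window_secs: 1
    threshold:
      json_bytes: 125000   # ~1 Mbps uplink budget in bytes per window
```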
`reroute_dropped` — dead-letter routing (+3% on drop path only)

Problem: Today, throttled events are silently discarded. You have no way to replay them, audit what was lost, or route them to cheaper storage.
When `reroute_dropped: true` is set, throttled events are sent to a named `.dropped` output port instead of being discarded. The +3% overhead only applies to events that are actually being dropped (the happy path — events passing through — has zero overhead from this flag).

Scenario 1: Dead-letter queue for replay
Route throttled events to a file or S3 sink for later replay during off-peak hours:
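A sketch of the S3 replay buffer (bucket name and prefix are hypothetical):

```yaml
transforms:
  throttle:
    type: throttle
    inputs: ["in"]
    window_secs: 1
    threshold:
      json_bytes: 3000000
    reroute_dropped: true

sinks:
  replay_buffer:
    type: aws_s3
    inputs: ["throttle.dropped"]   # only throttled events land here
    bucket: "my-dropped-events"    # hypothetical bucket
    key_prefix: "throttled/%F/"
    encoding:
      codec: json
```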
During off-peak windows, replay from S3 back through Vector to recover dropped data — zero data loss, guaranteed byte-rate compliance.
Scenario 2: Audit trail for compliance
In regulated environments, you may need to prove what was dropped and why. Route dropped events to a local file with metadata:
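A sketch of the audit sink (path and date templating are illustrative):

```yaml
sinks:
  audit_trail:
    type: file
    inputs: ["throttle.dropped"]
    path: "/var/log/vector/throttled-%Y-%m-%d.log"
    encoding:
      codec: json   # events are preserved verbatim as JSON lines
```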
Every dropped event is preserved verbatim (byte-identical to the input — the throttle transform never modifies events). Compliance teams can verify exactly what was rate-limited.
Scenario 3: Overflow to cheaper storage tier
Route excess traffic to a cheaper destination instead of dropping entirely:
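A sketch of the tiering pattern (sink types and endpoints are illustrative; any pair of sinks works):

```yaml
sinks:
  primary:
    type: elasticsearch            # expensive hot tier
    inputs: ["throttle"]
    endpoints: ["http://es:9200"]

  overflow:
    type: aws_s3                   # cheap cold tier for throttled excess
    inputs: ["throttle.dropped"]
    bucket: "logs-overflow"        # hypothetical
    encoding:
      codec: json
```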
Why +3% is negligible: The overhead only fires on the drop path, and it's just routing an event to a second output buffer. Compared to the value of not losing data, this is effectively free.
`emit_detailed_metrics` — per-tenant observability (+75% overhead)

Problem: Without per-key metrics, you know that something is being throttled but not which tenant, which threshold, or how close other tenants are to their limits.
When `emit_detailed_metrics: true` is set, the transform emits:

- `throttle_events_discarded_total{key, threshold_type}` — which tenant hit which limit
- `throttle_events_processed_total{key}` — total events per tenant (passed + dropped)
- `throttle_bytes_processed_total{key}` — total byte volume per tenant
- `throttle_tokens_processed_total{key}` — total custom token cost per tenant
- `throttle_utilization_ratio{key, threshold_type}` — current usage / threshold (0.0-1.0+)

Scenario 1: Per-tenant dashboard for multi-tenant SaaS
You run a multi-tenant platform where each service has a logging quota. With detailed metrics piped to Prometheus + Grafana:
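A sketch of the wiring (the `internal_metrics` source and `prometheus_exporter` sink are standard Vector components; the `service` key field is illustrative):

```yaml
sources:
  vector_metrics:
    type: internal_metrics        # exposes Vector's own metrics as events

transforms:
  throttle_per_service:
    type: throttle
    inputs: ["in"]
    key_field: "{{ service }}"
    window_secs: 1
    threshold:
      events: 1000
    internal_metrics:
      emit_detailed_metrics: true

sinks:
  prometheus:
    type: prometheus_exporter
    inputs: ["vector_metrics"]
    address: "0.0.0.0:9598"
```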
Now you can build Grafana panels showing per-service:
- event rate vs quota (`throttle_events_processed_total` / threshold)
- byte volume (`throttle_bytes_processed_total`)
- drop counts (`throttle_events_discarded_total`)
- live utilization (`throttle_utilization_ratio`)

Scenario 2: Proactive alerting at 80% utilization
The `throttle_utilization_ratio` gauge lets you alert before throttling kicks in. Operators get advance warning to contact tenants, adjust quotas, or investigate runaway services — instead of finding out after data is already being dropped.
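A Prometheus alerting rule along these lines would fire at 80% utilization (rule and alert names are illustrative):

```yaml
groups:
  - name: vector-throttle
    rules:
      - alert: TenantNearThrottleQuota
        expr: throttle_utilization_ratio > 0.8   # 80% of any threshold
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Key {{ $labels.key }} is above 80% of its {{ $labels.threshold_type }} quota"
```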
Scenario 3: Cost attribution and chargeback
For platforms that bill tenants for logging usage, `throttle_bytes_processed_total{key}` provides per-tenant byte volume that maps directly to cloud logging cost.

Why +75% can be worth it: The overhead comes from updating 3-6 metric counters per event (each with a `key` tag requiring hash lookups in the metrics registry). This is significant, but:

- it replaces an external `log_to_metric` + `aggregate` chain that costs more than 75%
- cardinality is bounded by your `key_field` — if you have 50 services, that's ~300 metric series (50 × 6 metrics); if you have 10K keys, consider whether you actually need per-key visibility for all of them, or if you can use a higher-level grouping

When NOT to enable `emit_detailed_metrics`:

- `key_field` produces <500 unique values
- `key_field` produces 500-10K values
- `key_field` produces >10K values or is unbounded (e.g., user IDs) — use `throttle_threshold_discarded_total{threshold_type}` for aggregate visibility instead
- no `key_field` configured
Each feature adds its overhead independently. Here's the combined cost for common configurations:
- `threshold: 100` (existing)
- `json_bytes` only
- `json_bytes` + `reroute_dropped`
- `json_bytes` + `reroute_dropped` + `emit_detailed_metrics`
- `events` + `json_bytes` + `tokens` (all thresholds, no metrics)

The typical production config — `json_bytes` with `reroute_dropped` — adds ~52-55% relative overhead vs the optimized events-only baseline, while still processing 2.80M+ events/sec and solving real problems that event-count throttling cannot address.

How did you test this PR?
Unit Tests (22 tests)
Run with `cargo test -p vector --lib --features transforms-throttle -- transforms::throttle`. Tests cover: backward compat, all threshold types, dropped port routing, key independence, exclude condition bypass, VRL expression errors (defaults to cost 1), data integrity (events unmodified through throttle), completeness (no events lost/duplicated), metrics emission with correct tags, utilization tracking across windows, key cardinality scaling (10/100/1K keys), memory footprint measurement.
Integration Tests (7 tests)
Run with `cargo test --test integration --features throttle-integration-tests -- throttle`. These exercise a real `vector` binary via `assert_cmd`: config validation, stdin→stdout event flow, dropped port routing to a separate output, multi-threshold with `key_field`, backward compat simple threshold, exclude bypasses limit, data integrity verification.

Benchmarks (23 benchmarks)
Three groups: throughput (8), metrics overhead (6), key cardinality scaling (9). Criterion with 200 samples, 30s measurement, statistical significance testing.
E2E Tests
Run with `cargo vdev e2e test throttle-transform`. Docker Compose with 3 config variants (events-only, bytes, multi-threshold).
Static Analysis
`cargo clippy -p vector --features transforms-throttle -- -D warnings` runs clean.

Change Type
Is this a breaking change?
The legacy `threshold: <number>` syntax is fully preserved. The only observable change for existing configs is one new bounded metric (`throttle_threshold_discarded_total`).

Does this PR include user facing changes?
Yes. A changelog entry is added at `changelog.d/11854_throttle_multi_threshold.feature.md`.

References
Closes #11854