bench | timeseries: JSON API, Grafana datasource plugin, ci-bench-timeseries profile by Russoul · Pull Request #6562 · IntersectMBO/cardano-node

Russoul · 2026-05-06T20:24:39Z

Summary

Query language improvements

epoch parser bug fixed — epoch now correctly produces Timestamp 0 (Unix epoch); previously it parsed identically to now, making epoch + Nms evaluate to ~year 2082 instead of the intended absolute timestamp
Metric-name awareness in the elaborator — unknown variable names now produce Undefined name: <x> errors at elaboration time instead of confusing downstream type-mismatch errors; St carries availableMetrics :: Set MetricIdentifier populated from the store
"Did you mean?" suggestions in Undefined-name errors — the elaborator computes Levenshtein distance (via edit-distance) to known metric names and local bindings, appending the closest match to the error message
Local name shadowing — let x = 1 in let x = 2 in x now correctly returns 2; previously the elaborator rejected re-use of any name
RangeVector/Scalar arithmetic — +, -, *, / between a range vector and a scalar are now supported in both elaboration and interpretation
Unit and List types — Unit, Nil, and Cons constructors added to Value and the semantic Expr; metrics expression returns the metric list as a Cons-list of Text values
sum_over_time elab bug fixed — Surface.SumOverTime was incorrectly elaborated to Semantic.AvgOverTime (copy-paste error)
Noncanonical arithmetic rules — four new bidirectional rules let the elaborator infer types in previously-stuck expressions:
- Duration + Duration : ? → result is Duration (e.g. 1s + 2s now elaborates standalone)
- ? - Duration : ? → lhs must be Timestamp (e.g. \x -> x - 1s infers x : Timestamp)
- ? + ? : Duration → both args must be Duration (e.g. \x -> \y -> m[now; now : x + y] infers both lambda args)
- ? - ? : Timestamp → lhs must be Timestamp, rhs must be Duration (e.g. \x -> \y -> m[x - y; x] infers arg types from the range start position)

Test suite

Comprehensive interpretation tests — 250 tests covering scalar arithmetic, boolean logic, timestamp/duration arithmetic, instant-vector operations, range-vector aggregations (avg_over_time, sum_over_time, rate, increase, quantile_over_time), label filtering, metrics query, shadowing, and error cases
Parser test suite — full coverage of the surface expression parser
Elaboration test suite — ~115 cases covering all elaborator paths: literals, duration literals, arithmetic (well-typed and ill-typed), comparisons, boolean operators, let/lambda, pairs/projections, to_scalar, abs/round, the metrics builtin, range vectors, instant-vector operations, and metric-name resolution (including "Did you mean?" suggestions)

Grafana datasource plugin (`bench/grafana-datasource/`)

$__from / $__to support — the plugin pre-processes queries before template substitution, rewriting $__from and $__to to epoch + Nms expressions the query language understands; dashboard panels now use [$__from; $__to] ranges driven by the Grafana time picker
Prometheus-compatible query API — POST /timeseries/query now accepts { "query": "...", "time": <unix_s> } and returns { "status": "success", "data": <Value> } mirroring Prometheus conventions; errors return { "status": "error", "errorType": "...", "error": "..." }
Error handling — server-side errors are surfaced as Grafana DataQueryError with the first line as the panel message and the full elaboration/interpretation trace in the inspector
Provisioned RTView dashboard — 25 panels covering Resources, Blockchain, Leadership, and Transactions sections, all pre-wired to the datasource
Metric name sanitisation — . and - in EKG metric names are replaced with _ so names match the cardano_node_metrics_* convention used in queries

`ci-bench-timeseries` workbench profile

Genesis funds fixed — the original funds_balance (10M ADA) was insufficient for the tx-generator's fund-splitting phase; a new fundsTimeseries vocabulary sets 22.5B ADA; tx_count reduced to 108,000 (2-hour run at 15 tps) so the required split fits comfortably within available funds
Profile definition — 2-node loopback cluster, Conway era, tracer.timeseries = true, tracer.rtview = true, no shutdown condition

…eseries profile **cardano-tracer / cardano-timeseries-io** - Add `Cardano.Timeseries.JSON` with `ToJSON` instances for `Value`, `Instant`, `Timeseries` and `SeriesIdentifier` (orphan module, imported for side-effects) - Switch `POST /timeseries/query` response from plain text to `application/json` **cardano-profile** - Add `timeseries :: Bool` field to `Tracer` - Add `tracerTimeseries` primitive - Add `ci-bench-timeseries` profile: 2-node local cluster with timeseries endpoint enabled, no shutdown condition, generator runs for 100 000 epochs (effectively indefinite) — intended for interactive exploration via Grafana - Regenerate `all-profiles-coay.json` and `wb_profiles.mk`; update all test fixtures **Nix** - `cardano-tracer-service-workbench.nix`: add `timeseries` option → `hasTimeseries` - `cardano-tracer-service.nix`: add `timeseriesEnable/Host/Port` NixOS options - `tracer.nix`: wire `profile.tracer.timeseries` → `{epHost, epPort = 3400}` - `supervisor.sh`: replace hanging `netstat -pltn` with `lsof -nP -iTCP:9001 -sTCP:LISTEN` for macOS compatibility **bench/grafana-datasource** (new) - TypeScript Grafana datasource plugin (`iog-cardanotimeseries-datasource`) - Sends `POST /timeseries/query` via Grafana server-side proxy (avoids CORS) - Converts `Value` tagged-union JSON to Grafana `DataFrame[]` - `docker-compose.yaml` for local development; datasource auto-provisioned at `http://host.docker.internal:3400`; Colima-compatible (`extra_hosts`)

…heus exposition

- Parse/eval errors from the server are now surfaced as DataQueryError: the banner shows the first line (summary); the full multi-line message (source location + caret) is in data.message for the inspector - Requests go via Grafana's server-side proxy (instanceSettings.url) to avoid CORS and host-resolution issues in the browser - Add Unit value: renders as an empty frame (no data to display)

- POST /timeseries/query now accepts JSON body {"query": "...", "time": <optional Unix seconds>} - Success response wrapped in {"status":"success","data":...} envelope - Error responses use {"status":"error","errorType":"...","error":"..."} format - errorType mirrors Prometheus: "parse", "bad_data", "execution" - Plugin updated to send application/json body and unwrap response envelope - Remove ConfigEditor.tsx (Grafana's built-in URL field suffices)

Introduces Nil and Cons constructors to both the expression language and the value domain. The primary motivation is to give the built-in Metrics expression a well-typed return value: it now returns a proper linked list of Text values (metric names) rather than a right-nested Pair tuple terminated by Unit, making it possible to assign it the type List Text in the elaborator. - Interp: Metrics now builds a Nil/Cons spine instead of a Pair/Unit tuple - JSON: ToJSON instances for Nil and Cons (tag/head/tail encoding) - Show: show Nil = "[]", show (Cons h t) = "[h|t]" - Resolve: structural pass-throughs for Nil and Cons - Grafana plugin: Nil/Cons variants in the TypeScript Value union; Cons case walks the spine into a string table frame, Text items render their actual string value

…ew in ci-bench-timeseries - Add provisioned Grafana dashboard mimicking RTView's four sections: Resources, Blockchain, Leadership, Transactions (26 panels, row-based collapsible sections, byte units for memory/mempool panels) - Pin datasource uid in provisioning so dashboard references are stable - Mount dashboards directory in docker-compose - Enable RTView (hasRTView, port 3300) in ci-bench-timeseries profile alongside the existing timeseries endpoint

… inconsistencies Implements pointwise arithmetic (add, sub, mul, div) between RangeVector Scalar and Scalar, following the existing InstantVector/Scalar pattern. This enables queries like 'metric[now - 1h; now] / 1048576' in Grafana panels. Fixes ten inconsistencies found in typing.txt: missing List type in grammar, filter_by_label argument order, A->B type formation rule conclusion, := vs ≔ notation, 'Syntax hugar' typo, map variable name mismatch, RangeVector missing type parameter in quantile_over_time, four missing instant_vector_scalar relation rules, mul_instant_vector_scalar missing arguments, and stale metrics return type (Text -> List Text). Adds a precondition note to 'interp' that the input expression must be well-typed.

…mar docs Add a cardano-timeseries-test suite (tasty/tasty-hunit) mirroring the cardano-recon-framework layout, with three suites: - Elab.Expr.Parser.Suite: comprehensive parser coverage across all constructs, operator precedence checks, and wrong-arity error cases (152 tests total) - Elab.Suite: well-typed / ill-typed elaboration checks - Interp.Suite: end-to-end execution via API.execute against an empty store Fix parser bug: the `let` RHS was parsed as `exprOr` instead of `exprUniverse`, preventing unparenthesised lambdas and nested lets from appearing as let-RHS. The `in` keyword acts as a natural terminator so no ambiguity arises. Update grammar documentation: - Elab/Expr.hs inline grammar: `let x = t{> universe}` -> `t{≥ universe}` - docs/elab.txt: fix `<int>min` -> `<int>m`; add missing `round t`, `earliest x`, `latest x`

… elab bug The test suite covers scalar/bool arithmetic and comparisons, duration and timestamp literals, type conversions (to_scalar, abs, round), pairs, let/lambda, InstantVector lookup/aggregation/filter/map/label-filter/unless/join, InstantVector-Scalar arithmetic and relations, RangeVector aggregations (avg_over_time, sum_over_time, rate, increase, quantile_over_time), and metrics. Writing the tests surfaced a bug in Elab.hs where Surface.SumOverTime was elaborated to Semantic.AvgOverTime instead of Semantic.SumOverTime, causing sum_over_time queries to silently return the mean. Fixed.

Drop the checkFresh guard so duplicate names are no longer rejected. Change variable lookup to search the context right-to-left (Seq.reverse) so that the innermost binding wins, as scoping requires.

St now carries availableMetrics :: Set MetricIdentifier, populated at the call site via metrics store (or Set.empty in metric-free contexts like the elab test suite). The fallback Variable case — previously an unconditional metric assumption — now checks membership in availableMetrics and throws "Undefined name: <v>" when the name appears neither in the local context nor in the store, replacing the confusing downstream type-mismatch error that occurred before.

…tamp 0 The `epoch` keyword was mapped to `Now` in the parser, making it identical to `now` (i.e. current server time). It now produces `Epoch`, which the interpreter evaluates to `Timestamp 0` (Unix epoch), so expressions like `epoch + 1778499300385ms` yield the intended absolute timestamp rather than ~year 2082.

The ci-bench-timeseries profile has ~900M transactions (100 000 epochs × ~9k txs/epoch) but inherited fundsDefault (10 000 ADA) from base, which covers only a handful of transactions before the generator exits with "insufficient funds". Add fundsTimeseries (25 000 000 ADA = 25 × 10^15 lovelace) to Vocabulary and a baseTimeseries variant of base that uses it. Switch ciTimeseries02Value to baseTimeseries so the ci-bench-timeseries profile has enough genesis funds to sustain a long-running workload.

100 000 epochs needed ~23B ADA in the generator wallet (split grows exponentially with tx_count via unfoldSplitSequence) but genesis could only supply 22.5B, causing an immediate "insufficient funds" crash. 12 epochs at 15 tps / timescaleCompressed = ~108 000 transactions = 2 hours. The required split is now ~2.3M ADA, well within the 22.5B available.

When a variable is not found in either the local context or the metric store, the elaborator now appends a "Did you mean: ..." hint by ranking all candidate names (locals + metrics) by Levenshtein distance and showing the closest ones (up to 5) within threshold max(1, len/3). Uses the `edit-distance` library (the canonical Haskell implementation, also used by GHC for its own "did you mean" diagnostics).

Replace hardcoded [now - 1h; now] in all 25 dashboard panels with [$__from; $__to] so the Grafana time picker controls the window. Pre-process $__from/$__to in the datasource plugin before getTemplateSrv() sees them — Grafana treats these as built-in variables and ignores scopedVars overrides — converting them to epoch + Nms expressions that the query language interprets as absolute timestamps.

Replaces the 7-test skeleton with a comprehensive suite covering all elaborator code paths: literals, duration literals, arithmetic (well-typed and ill-typed), comparisons, boolean operators, let/lambda, pairs/projections, to_scalar, abs/round, metrics builtin, range vectors, instant-vector operations, and metric-name resolution (known metric, undefined name, Levenshtein suggestions). Notes two elaborator bidirectionality constraints discovered via test failures: - `Duration + Duration` requires the result type to be driven by a containing expression (tested via `now + (1s + 2s)`). - `\x -> x + 1` leaves `x`'s type unconstrained; replaced with `\x -> now + x` which forces `x : Duration` via the Timestamp+? noncanonical rule.

When both operands are known to be Duration but the result type is still a hole, force the hole to Duration and re-queue as the canonical Duration + Duration : Duration problem. This allows standalone expressions like `1s + 2s` to elaborate without needing an outer context to drive the result type.

All three rules fire only when the known type information uniquely determines the outcome: ? - Duration : ? -> Timestamp - Duration : Timestamp Only Timestamp - Duration exists with Duration on the Sub rhs. Enables: \x -> x - 1s (infer x : Timestamp) ? + ? : Duration -> Duration + Duration : Duration Only Duration + Duration produces Duration. Enables: \x -> \y -> m [now; now : x + y] (infer both : Duration) ? - ? : Timestamp -> Timestamp - Duration : Timestamp Only Timestamp - Duration produces Timestamp via Sub. Enables: \x -> \y -> m [x - y; x] (infer x : Timestamp, y : Duration) Ordering: A (? - Duration) before C (? - ? : Timestamp) so the more specific rhsTy=Duration match takes priority. B (? + ? : Duration) after the existing Duration + Duration : ? rule so the known-both-sides case is tried first.

Russoul added 20 commits May 7, 2026 00:16

cardano-tracer | timeseries: Sanitise metric name to match our Promet…

8c469ef

…heus exposition

Add Unit type to timeseries

3a912d2

timeseries: support local name shadowing in the elaborator

29aaab2

Drop the checkFresh guard so duplicate names are no longer rejected. Change variable lookup to search the context right-to-left (Seq.reverse) so that the innermost binding wins, as scoping requires.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

bench | timeseries: JSON API, Grafana datasource plugin, ci-bench-timeseries profile#6562

bench | timeseries: JSON API, Grafana datasource plugin, ci-bench-timeseries profile#6562
Russoul wants to merge 20 commits into
masterfrom
russoul/timeseries-align-http-api-with-prometheus

Russoul commented May 6, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Russoul commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Query language improvements

Test suite

Grafana datasource plugin (bench/grafana-datasource/)

ci-bench-timeseries workbench profile

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Russoul commented May 6, 2026 •

edited

Loading

Grafana datasource plugin (`bench/grafana-datasource/`)

`ci-bench-timeseries` workbench profile