refactor(agent-service): redesign sync-execution result and error model#5927
refactor(agent-service): redesign sync-execution result and error model#5927bobbai00 wants to merge 12 commits into
Conversation
Automated Reviewer SuggestionsBased on the
|
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #5927 +/- ##
============================================
+ Coverage 56.61% 57.14% +0.53%
Complexity 3004 3004
============================================
Files 1124 1121 -3
Lines 43298 43074 -224
Branches 4667 4661 -6
============================================
+ Hits 24511 24613 +102
+ Misses 17359 17033 -326
Partials 1428 1428
*This pull request uses carry forward flags. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
| config | throughput | MB/s | latency | max Δ latest / 7d | |
|---|---|---|---|---|---|
| 🔴 | bs=10 sw=10 sl=64 | 371 | 0.226 | 26,098/36,839/36,839 us | 🔴 +35.3% / 🔴 +140.3% |
| 🟢 | bs=100 sw=10 sl=64 | 794 | 0.485 | 124,892/146,980/146,980 us | 🟢 -6.6% / 🔴 +32.9% |
| ⚪ | bs=1000 sw=10 sl=64 | 919 | 0.561 | 1,091,299/1,138,112/1,138,112 us | ⚪ within ±5% / 🔴 +7.0% |
Baseline details
Latest main 4d05ab2 from same runner
| config | metric | PR | latest main | 7d avg | Δ latest | Δ 7d |
|---|---|---|---|---|---|---|
| bs=10 sw=10 sl=64 | throughput | 371 tuples/sec | 455 tuples/sec | 757.16 tuples/sec | -18.5% | -51.0% |
| bs=10 sw=10 sl=64 | MB/s | 0.226 MB/s | 0.278 MB/s | 0.462 MB/s | -18.7% | -51.1% |
| bs=10 sw=10 sl=64 | p50 | 26,098 us | 20,493 us | 12,971 us | +27.4% | +101.2% |
| bs=10 sw=10 sl=64 | p95 | 36,839 us | 27,223 us | 15,333 us | +35.3% | +140.3% |
| bs=10 sw=10 sl=64 | p99 | 36,839 us | 27,223 us | 18,732 us | +35.3% | +96.7% |
| bs=100 sw=10 sl=64 | throughput | 794 tuples/sec | 802 tuples/sec | 957.66 tuples/sec | -1.0% | -17.1% |
| bs=100 sw=10 sl=64 | MB/s | 0.485 MB/s | 0.49 MB/s | 0.585 MB/s | -1.0% | -17.0% |
| bs=100 sw=10 sl=64 | p50 | 124,892 us | 122,887 us | 103,982 us | +1.6% | +20.1% |
| bs=100 sw=10 sl=64 | p95 | 146,980 us | 157,373 us | 110,583 us | -6.6% | +32.9% |
| bs=100 sw=10 sl=64 | p99 | 146,980 us | 157,373 us | 118,369 us | -6.6% | +24.2% |
| bs=1000 sw=10 sl=64 | throughput | 919 tuples/sec | 926 tuples/sec | 979.6 tuples/sec | -0.8% | -6.2% |
| bs=1000 sw=10 sl=64 | MB/s | 0.561 MB/s | 0.565 MB/s | 0.598 MB/s | -0.7% | -6.2% |
| bs=1000 sw=10 sl=64 | p50 | 1,091,299 us | 1,084,586 us | 1,024,553 us | +0.6% | +6.5% |
| bs=1000 sw=10 sl=64 | p95 | 1,138,112 us | 1,117,970 us | 1,063,789 us | +1.8% | +7.0% |
| bs=1000 sw=10 sl=64 | p99 | 1,138,112 us | 1,117,970 us | 1,096,239 us | +1.8% | +3.8% |
Raw CSV
config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,539.71,200,128000,371,0.226,26097.63,36838.89,36838.89
1,100,10,64,20,2519.03,2000,1280000,794,0.485,124891.65,146980.48,146980.48
2,1000,10,64,20,21771.76,20000,12800000,919,0.561,1091299.39,1138111.99,1138111.99…#5930) ### What changes were proposed in this PR? Removes the step-checkout feature from agent-service, which was wired end-to-end but unreachable — nothing in the product ever invokes the frontend `checkoutStep()`, so the backend `/agents/:id/checkout` endpoint and its `headChange` WebSocket broadcast were never triggered. Removing the dead vertical slice: - `agent-service/src/server.ts` — drop the `POST /:id/checkout` route and the `headChange` / `workflowContent` members of the outgoing WebSocket-message type. - `agent-service/src/agent/texera-agent.ts` — drop `TexeraAgent.checkout()`. - `frontend/.../agent/agent.service.ts` — drop the `case "headChange"` WS handler and the unused `checkoutStep()` method. HEAD *tracking* is retained: `init`/`step` messages still carry `headId`, and the chat panel still walks the ancestor path from HEAD to render the visible step chain. Only the backward HEAD move (checkout) is removed. This branches from `main` and is independent of the type-refactor stack (apache#5751 → apache#5928/apache#5927). Note apache#5751 currently lists `headChange` as a server message; whichever merges second will drop it from the union — a trivial reconciliation. ### Any related issues, documentation, discussions? Closes apache#5942 Part of apache#5747. Independent of the stack (branches from `main`). ### How was this PR tested? agent-service: `tsc --noEmit` clean, 93/93 `bun test` pass, prettier clean. Frontend: the removed `checkoutStep()` had zero callers and the `headChange` handler is now unreachable; verified no remaining `checkout`/`headChange` references and no orphaned imports. ### Was this PR authored or co-authored using generative AI tooling? Generated-by: Claude Opus 4.8 (1M context) Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…module Foundation slice of the agent-service reorganization (no runtime behavior change): - Add src/types/api.ts (wire DTOs) and src/types/metadata.ts (operator metadata types extracted out of api/backend-api.ts); export both from the types barrel. - Add src/config/endpoints.ts exposing getServiceEndpoints(). - Move src/api/auth-api.ts -> src/auth/jwt.ts (content unchanged) and add src/auth/jwt.test.ts; update auth import paths. - Keep api/backend-api.ts transitional: it now imports/re-exports the metadata types from ../types/metadata and retains getBackendConfig/fetchOperatorMetadata, which the follow-up clients PR relocates. The types/agent.ts and types/workflow.ts reshaping is deferred to the PR that also updates server.ts/workflow-state.ts, since those renames are coupled.
Split types/api.ts into types/wire.ts (backend wire DTOs) and types/ws.ts
(this service's own WebSocket frames), and consume them at the real call
sites instead of the duplicate definitions that previously lived inline:
- server.ts imports WsMessage/WsOutgoingMessage/OperatorResultSummaryWs from
types/ws (local copies removed).
- api/workflow-api.ts imports Workflow/WorkflowPersistRequest from types/wire.
- api/compile-api.ts, agent/util/context-utils.ts, and agent/texera-agent.ts
import WorkflowFatalError/WorkflowCompilationResponse from types/wire.
Correct WorkflowFatalError to match the backend proto (workflowruntimestate.proto):
`type` is the FatalErrorType enum name (string), with details/operatorId/
workerId/timestamp optional, replacing the inaccurate `type: { name }` /
all-required shape. Type-level only; consumers read only `.message`, so there
is no runtime behavior change.
tsc --noEmit, prettier --check, and bun test (101 pass / 0 fail) all green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Encode test JWT segments as base64url (not base64) to match real tokens, including a case whose payload contains `-`/`_` to pin url-safe decoding. - Add extractBearerToken coverage: valid bearer, case-insensitive scheme, non-Bearer scheme, missing token, absent header. Addresses the Copilot review comments on src/auth/jwt.test.ts. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Pure file rename plus import-path updates; no type or behavior changes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…hema from types/metadata workflow-system-metadata.ts defined its own copies of these two interfaces, identical to the canonical ones in types/metadata.ts. Import the canonical types and delete the duplicates so the centralized definitions are actually used. Type-level only; no behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…initions Remove all `any` from src/types/*.ts: - opaque/object JSON blobs -> unknown / Record<string, unknown> (jsonSchema, operatorProperties, result rows, sampleRecords, ReActStep tool input/output, physicalPlan, OperatorSchemaInfo/CompactOperatorSchema). - WsOutgoingMessage.workflowContent -> WorkflowContent (it is assigned from WorkflowState.getWorkflowContent()). Type-level only; tsc, prettier, and tests pass with no consumer changes needed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…o *.spec.ts - Rename all *.test.ts to *.spec.ts (bun discovers .spec by default; no config change). - Add workflow-system-metadata.spec.ts: schema compaction ($ref inlining, key filtering), getAllSchemasAsJson, Ajv property validation, and the formatting helpers. - Add workflow-api.spec.ts and compile-api.spec.ts: persist/retrieve/compile over a mocked fetch, exercising the proto-accurate WorkflowFatalError shape. 126 pass / 0 fail; tsc and prettier green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ReActStep used `unknown` for fields whose shapes we actually know: - inputMessages -> ModelMessage[] (the prepared AI SDK messages) - toolCalls[].input -> Record<string, unknown> (tool args are JSON objects) - toolResults[].output -> string (every tool returns via createToolResult/createErrorResult) The AI SDK types tc.input / tr.output as `unknown` for dynamically registered tools, so narrow at the single construction boundary in texera-agent.ts. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ted unions Replace the flat src/types/ws.ts with a types/ws/ directory: - client.ts: WsClientRequest = WsClientRequestPrompt | WsClientRequestStopCommand (each extending a base; orthogonal fields instead of one all-optional interface) - server.ts: WsServerMessage union (init/step/state/complete/error/headChange), each declaring only the fields it sends; OperatorResultSummaryWs moves here - index.ts: barrel so types/index.ts's `export * from "./ws"` resolves to the dir The client->server wire discriminator changes from "message"/"stop" to "prompt"/"command" (stop becomes commandType:"stop"); server->client `type` values are unchanged. server.ts parses WsClientRequest and switches on the new shapes, and the frontend agent.service.ts WS sends are updated in lockstep so the protocol stays consistent end-to-end. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ndation PR The execution-result redesign (OperatorExecutionSummary etc.) lives in a follow-up PR; revert the lone any->unknown tweak so this PR leaves types/execution.ts byte-identical to main.
Restructure the per-operator summary the sync-execution backend returns and the
agent-service/frontend consume, for a leaner and consistent wire contract:
- Replace flat OperatorInfo with OperatorExecutionSummary: state, errorMessages,
resultSummary?, consoleLogsSummary? (orthogonal sub-summaries; no shape stats).
- Rename SyncExecutionResult -> WorkflowExecutionSummary; drop compilationErrors
(folded into errors). errors and per-op errorMessages are non-optional (empty
means none).
- OperatorResultSummary.sampleTuples is now List[SampleRow] ({rowIndex, tuple})
instead of a JSON array with an embedded __row_index__. Drop the table-shape
types (TableShape/InputPortTableShape): the agent derives input-port shapes
from the DAG + each upstream's output shape; output shape comes from the result
summary.
- Reuse the engine's WorkflowFatalError for per-operator errors (the same type
the compiling service returns for compilation errors), replacing the bespoke
OperatorError so compile and execution errors share one wire shape.
- Collapse console messages onto one type; derive warnings from WARNING-titled
messages rather than a separate field.
- Replace OperatorResultSummaryWs: the WS operatorResults payload now carries the
canonical OperatorExecutionSummary; the frontend maps it to its flat display
type (re-flattening sampleTuples to keep the display components unchanged).
Touches the Scala producer (SyncExecutionResource), the agent-service consumers
(result-formatting, workflow-execution-tools, workflow-result-state, server) and
the frontend WS mapping. Representation/type-level change; behavior preserved,
except input-port shape lines are now derived rather than explicitly rendered.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_012qFkyrpTd5PrkNBPcBeo4Q
Reconciliation for rebasing the agent-service-execution-model stack onto origin/main, which had independently merged apache#5751 (extract WS types into types/ws as classes) and apache#5930 (remove the dead checkout/headChange path). Our stack redesigns the same WS layer as interface discriminated unions, so the two diverge structurally. Changes: - Restore the execution-model redesign in types/execution.ts: the 3-way merge re-appended main's apache#5751 flat `OperatorResultSummary` (which references the removed `PortShape` and duplicates our redesigned type), so drop it; our redesign's `OperatorResultSummary`/`OperatorExecutionSummary` supersede it. - Adopt apache#5751 in server.ts: the TexeraAgent websocket surface was renamed `websockets` -> `clients` (getClients/addClient/removeClient) by the auto-merge in texera-agent.ts; align the server call sites, restore the `_getAgentForTests` test accessor, and reject unknown ws message types with an error frame (matching main's handler) instead of silently ignoring them. - Adopt apache#5930: drop the checkout/headChange dead path that our stack still carried — the POST /:id/checkout route + headChange broadcast and the startup-banner token in server.ts, `WsServerHeadChangeMessage` (and its now orphaned WorkflowContent import) in types/ws/server.ts, and the `case "headChange"` handler + `checkoutStep()` in the frontend agent.service. (TexeraAgent.checkout() was already removed by the auto-merge.) - Reconcile tests to the new discriminated-union ws protocol: rewrite server.ws.spec.ts (init/step/state/complete/error frames; prompt/command client frames; operatorResults streamed on init/complete), update the server.spec.ts operator-results test to the OperatorExecutionSummary shape, and fix the frontend stopGeneration assertion to the {type:"command", commandType:"stop"} wire format. agent-service: tsc --noEmit clean, 143/143 bun test pass, prettier clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
3a1c181 to
1a2bbb2
Compare
|
Superseded by #6009 — a focused re-do cut directly from |
### What changes were proposed in this PR? Restructures the per-operator summary the sync-execution backend returns and the agent-service / frontend consume, for a leaner, consistent wire contract. This is a focused re-do of apache#5927 cut directly from `main` (no foundation stack): it changes only the execution result/error model and its consumers. - Replace the flat `OperatorInfo` with `OperatorExecutionSummary` (orthogonal sub-summaries: `state`, `errorMessages`, `resultSummary?`, `consoleLogsSummary?`); rename `SyncExecutionResult` → `WorkflowExecutionSummary`. - `resultSummary.sampleTuples` is now `SampleRow[]` (`{ rowIndex, tuple }`) instead of JSON rows with an embedded `__row_index__`; drop the table-shape types (the agent derives input-port shapes from the DAG). - Move `WorkflowFatalError` into `types/execution.ts` and reuse it for per-operator errors — the same type the workflow-compiling service returns for compilation errors, so compile and execution errors share one wire shape; `api/compile-api.ts` re-exports it so its existing importers are unchanged. - `errorMessages` / `errors` are non-optional (empty = none); drop `compilationErrors`; collapse the console-message types and derive warnings from `WARNING:`-titled messages. - Operator results are still pulled on demand via `GET /agents/:id/operator-results` (transport unchanged); that REST payload now carries the canonical `OperatorExecutionSummary`, and the frontend maps it to its flat display type (re-flattening `sampleTuples` so the display components are unchanged). Touches the Scala producer (`SyncExecutionResource`), the agent-service consumers (`result-formatting`, `workflow-execution-tools`, `workflow-result-state`, `server`), and the frontend mapping. Representation/type-level; behavior preserved (input-port shape lines are now derived rather than explicitly rendered). ### Any related issues, documentation, discussions? Closes apache#5750 Part of apache#5747. Supersedes apache#5927. ### How was this PR tested? - agent-service: `tsc --noEmit` clean, `bun test` 110/110 pass, `prettier --check` clean. - The Scala producer (`SyncExecutionResource`) is unchanged from apache#5927, which verified it via `sbt WorkflowExecutionService/compile` and a full-stack end-to-end run (a Claude Haiku 4.5 agent built and executed a CSV workflow; `/operator-results` returned the new shape — `resultSummary.sampleTuples: [{ rowIndex, tuple }]`, `errorMessages: []`). ### Was this PR authored or co-authored using generative AI tooling? Generated-by: Claude Opus 4.8 (1M context)
What changes were proposed in this PR?
Stacked on #5751 (foundation). This is the second of the split: it carries the sync-execution result/error model redesign (no overlap with the foundation PR —
types/execution.tsis untouched there).OperatorInfowithOperatorExecutionSummary(orthogonal sub-summaries:state,errorMessages,resultSummary?,consoleLogsSummary?); renameSyncExecutionResult→WorkflowExecutionSummary.resultSummary.sampleTuplesis nowSampleRow[]({ rowIndex, tuple }) instead of JSON rows with an embedded__row_index__; drop the table-shape types (the agent derives input-port shapes from the DAG).WorkflowFatalErrorfor per-operator errors (the same type the compiling service returns), replacing the bespokeOperatorErrorso compile and execution errors share one wire shape.errorMessages/errorsare non-optional (empty = none); dropcompilationErrors; collapse the console-message types and derive warnings fromWARNING:-titled messages.operatorResultspayload carries the canonicalOperatorExecutionSummary; the frontend maps it to its flat display type.Touches the Scala producer (
SyncExecutionResource), the agent-service consumers (result-formatting,workflow-execution-tools,workflow-result-state,server), and the frontend WS mapping. Representation/type-level; behavior preserved (input-port shape lines are now derived rather than explicitly rendered).Any related issues, documentation, discussions?
Closes #5750
Part of #5747.
How was this PR tested?
bunx tsc --noEmit,bun test(121 pass / 0 fail),prettier --checkinagent-service;sbt WorkflowExecutionService/compilefor amber./operator-resultsreturned the new shape —resultSummary.sampleTuples: [{ rowIndex, tuple }],errorMessages: []— and the agent rendered the rows correctly.Was this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Opus 4.8 (1M context)