refactor(agent-service): redesign sync-execution result and error model by bobbai00 · Pull Request #5927 · apache/texera

bobbai00 · 2026-06-24T06:59:56Z

What changes were proposed in this PR?

Stacked on #5751 (foundation). This is the second of the split: it carries the sync-execution result/error model redesign (no overlap with the foundation PR — types/execution.ts is untouched there).

Replace the flat OperatorInfo with OperatorExecutionSummary (orthogonal sub-summaries: state, errorMessages, resultSummary?, consoleLogsSummary?); rename SyncExecutionResult → WorkflowExecutionSummary.
resultSummary.sampleTuples is now SampleRow[] ({ rowIndex, tuple }) instead of JSON rows with an embedded __row_index__; drop the table-shape types (the agent derives input-port shapes from the DAG).
Reuse the engine's WorkflowFatalError for per-operator errors (the same type the compiling service returns), replacing the bespoke OperatorError so compile and execution errors share one wire shape.
errorMessages / errors are non-optional (empty = none); drop compilationErrors; collapse the console-message types and derive warnings from WARNING:-titled messages.
The WS operatorResults payload carries the canonical OperatorExecutionSummary; the frontend maps it to its flat display type.

Touches the Scala producer (SyncExecutionResource), the agent-service consumers (result-formatting, workflow-execution-tools, workflow-result-state, server), and the frontend WS mapping. Representation/type-level; behavior preserved (input-port shape lines are now derived rather than explicitly rendered).

Any related issues, documentation, discussions?

Closes #5750
Part of #5747.

How was this PR tested?

bunx tsc --noEmit, bun test (121 pass / 0 fail), prettier --check in agent-service; sbt WorkflowExecutionService/compile for amber.
End-to-end on the full local stack with a Claude Haiku 4.5 agent: it built and executed a CSV workflow; /operator-results returned the new shape — resultSummary.sampleTuples: [{ rowIndex, tuple }], errorMessages: [] — and the agent rendered the rows correctly.

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.8 (1M context)

github-actions · 2026-06-24T07:00:18Z

Automated Reviewer Suggestions

Based on the git blame history of the changed files, we recommend the following reviewers:

Contributors with relevant context: @Ma77Ball
You can notify them by mentioning @Ma77Ball in a comment.

codecov-commenter · 2026-06-24T07:01:22Z

Codecov Report

❌ Patch coverage is 72.18045% with 37 lines in your changes missing coverage. Please review.
✅ Project coverage is 57.14%. Comparing base (dc76abe) to head (1a2bbb2).
⚠️ Report is 6 commits behind head on main.

Files with missing lines	Patch %	Lines
...ervice/src/agent/tools/workflow-execution-tools.ts	6.66%	28 Missing ⚠️
agent-service/src/agent/texera-agent.ts	14.28%	6 Missing ⚠️
agent-service/src/server.ts	96.96%	2 Missing ⚠️
agent-service/src/agent/workflow-result-state.ts	50.00%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #5927      +/-   ##
============================================
+ Coverage     56.61%   57.14%   +0.53%     
  Complexity     3004     3004              
============================================
  Files          1124     1121       -3     
  Lines         43298    43074     -224     
  Branches       4667     4661       -6     
============================================
+ Hits          24511    24613     +102     
+ Misses        17359    17033     -326     
  Partials       1428     1428

Flag	Coverage Δ		*Carryforward flag
access-control-service	`70.00% <ø> (ø)`		Carriedforward from dc76abe
agent-service	`49.71% <72.18%> (+5.11%)`	⬆️
amber	`58.79% <ø> (+0.34%)`	⬆️	Carriedforward from dc76abe
computing-unit-managing-service	`0.00% <ø> (ø)`		Carriedforward from dc76abe
config-service	`52.30% <ø> (ø)`		Carriedforward from dc76abe
file-service	`62.81% <ø> (ø)`		Carriedforward from dc76abe
frontend	`49.44% <ø> (-0.01%)`	⬇️	Carriedforward from dc76abe
notebook-migration-service	`78.57% <ø> (ø)`		Carriedforward from dc76abe
pyamber	`90.20% <ø> (ø)`		Carriedforward from dc76abe
python	`90.76% <ø> (ø)`		Carriedforward from dc76abe
workflow-compiling-service	`55.14% <ø> (ø)`		Carriedforward from dc76abe

*This pull request uses carry forward flags. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

github-actions · 2026-06-24T07:05:48Z

⚠️ Benchmark changes need a look

🟢 2 better · 🔴 5 worse · ⚪ 8 noise (<±5%) · 0 without baseline

Compared against main 4d05ab2 benchmarked on this same runner, so the delta is largely free of cross-runner hardware noise. The "7d avg" column still reflects the gh-pages dashboard. Treat <±5% as noise unless repeated.

Dashboard · Run

	config	throughput	MB/s	latency	max Δ latest / 7d
🔴	bs=10 sw=10 sl=64	371	0.226	26,098/36,839/36,839 us	🔴 +35.3% / 🔴 +140.3%
🟢	bs=100 sw=10 sl=64	794	0.485	124,892/146,980/146,980 us	🟢 -6.6% / 🔴 +32.9%
⚪	bs=1000 sw=10 sl=64	919	0.561	1,091,299/1,138,112/1,138,112 us	⚪ within ±5% / 🔴 +7.0%

Baseline details

Latest main 4d05ab2 from same runner

config	metric	PR	latest main	7d avg	Δ latest	Δ 7d
bs=10 sw=10 sl=64	throughput	371 tuples/sec	455 tuples/sec	757.16 tuples/sec	-18.5%	-51.0%
bs=10 sw=10 sl=64	MB/s	0.226 MB/s	0.278 MB/s	0.462 MB/s	-18.7%	-51.1%
bs=10 sw=10 sl=64	p50	26,098 us	20,493 us	12,971 us	+27.4%	+101.2%
bs=10 sw=10 sl=64	p95	36,839 us	27,223 us	15,333 us	+35.3%	+140.3%
bs=10 sw=10 sl=64	p99	36,839 us	27,223 us	18,732 us	+35.3%	+96.7%
bs=100 sw=10 sl=64	throughput	794 tuples/sec	802 tuples/sec	957.66 tuples/sec	-1.0%	-17.1%
bs=100 sw=10 sl=64	MB/s	0.485 MB/s	0.49 MB/s	0.585 MB/s	-1.0%	-17.0%
bs=100 sw=10 sl=64	p50	124,892 us	122,887 us	103,982 us	+1.6%	+20.1%
bs=100 sw=10 sl=64	p95	146,980 us	157,373 us	110,583 us	-6.6%	+32.9%
bs=100 sw=10 sl=64	p99	146,980 us	157,373 us	118,369 us	-6.6%	+24.2%
bs=1000 sw=10 sl=64	throughput	919 tuples/sec	926 tuples/sec	979.6 tuples/sec	-0.8%	-6.2%
bs=1000 sw=10 sl=64	MB/s	0.561 MB/s	0.565 MB/s	0.598 MB/s	-0.7%	-6.2%
bs=1000 sw=10 sl=64	p50	1,091,299 us	1,084,586 us	1,024,553 us	+0.6%	+6.5%
bs=1000 sw=10 sl=64	p95	1,138,112 us	1,117,970 us	1,063,789 us	+1.8%	+7.0%
bs=1000 sw=10 sl=64	p99	1,138,112 us	1,117,970 us	1,096,239 us	+1.8%	+3.8%

Raw CSV

config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,539.71,200,128000,371,0.226,26097.63,36838.89,36838.89
1,100,10,64,20,2519.03,2000,1280000,794,0.485,124891.65,146980.48,146980.48
2,1000,10,64,20,21771.76,20000,12800000,919,0.561,1091299.39,1138111.99,1138111.99

…#5930) ### What changes were proposed in this PR? Removes the step-checkout feature from agent-service, which was wired end-to-end but unreachable — nothing in the product ever invokes the frontend `checkoutStep()`, so the backend `/agents/:id/checkout` endpoint and its `headChange` WebSocket broadcast were never triggered. Removing the dead vertical slice: - `agent-service/src/server.ts` — drop the `POST /:id/checkout` route and the `headChange` / `workflowContent` members of the outgoing WebSocket-message type. - `agent-service/src/agent/texera-agent.ts` — drop `TexeraAgent.checkout()`. - `frontend/.../agent/agent.service.ts` — drop the `case "headChange"` WS handler and the unused `checkoutStep()` method. HEAD *tracking* is retained: `init`/`step` messages still carry `headId`, and the chat panel still walks the ancestor path from HEAD to render the visible step chain. Only the backward HEAD move (checkout) is removed. This branches from `main` and is independent of the type-refactor stack (apache#5751 → apache#5928/apache#5927). Note apache#5751 currently lists `headChange` as a server message; whichever merges second will drop it from the union — a trivial reconciliation. ### Any related issues, documentation, discussions? Closes apache#5942 Part of apache#5747. Independent of the stack (branches from `main`). ### How was this PR tested? agent-service: `tsc --noEmit` clean, 93/93 `bun test` pass, prettier clean. Frontend: the removed `checkoutStep()` had zero callers and the `headChange` handler is now unreachable; verified no remaining `checkout`/`headChange` references and no orphaned imports. ### Was this PR authored or co-authored using generative AI tooling? Generated-by: Claude Opus 4.8 (1M context) Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…module Foundation slice of the agent-service reorganization (no runtime behavior change): - Add src/types/api.ts (wire DTOs) and src/types/metadata.ts (operator metadata types extracted out of api/backend-api.ts); export both from the types barrel. - Add src/config/endpoints.ts exposing getServiceEndpoints(). - Move src/api/auth-api.ts -> src/auth/jwt.ts (content unchanged) and add src/auth/jwt.test.ts; update auth import paths. - Keep api/backend-api.ts transitional: it now imports/re-exports the metadata types from ../types/metadata and retains getBackendConfig/fetchOperatorMetadata, which the follow-up clients PR relocates. The types/agent.ts and types/workflow.ts reshaping is deferred to the PR that also updates server.ts/workflow-state.ts, since those renames are coupled.

Split types/api.ts into types/wire.ts (backend wire DTOs) and types/ws.ts (this service's own WebSocket frames), and consume them at the real call sites instead of the duplicate definitions that previously lived inline: - server.ts imports WsMessage/WsOutgoingMessage/OperatorResultSummaryWs from types/ws (local copies removed). - api/workflow-api.ts imports Workflow/WorkflowPersistRequest from types/wire. - api/compile-api.ts, agent/util/context-utils.ts, and agent/texera-agent.ts import WorkflowFatalError/WorkflowCompilationResponse from types/wire. Correct WorkflowFatalError to match the backend proto (workflowruntimestate.proto): `type` is the FatalErrorType enum name (string), with details/operatorId/ workerId/timestamp optional, replacing the inaccurate `type: { name }` / all-required shape. Type-level only; consumers read only `.message`, so there is no runtime behavior change. tsc --noEmit, prettier --check, and bun test (101 pass / 0 fail) all green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- Encode test JWT segments as base64url (not base64) to match real tokens, including a case whose payload contains `-`/`_` to pin url-safe decoding. - Add extractBearerToken coverage: valid bearer, case-insensitive scheme, non-Bearer scheme, missing token, absent header. Addresses the Copilot review comments on src/auth/jwt.test.ts. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Pure file rename plus import-path updates; no type or behavior changes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…hema from types/metadata workflow-system-metadata.ts defined its own copies of these two interfaces, identical to the canonical ones in types/metadata.ts. Import the canonical types and delete the duplicates so the centralized definitions are actually used. Type-level only; no behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…initions Remove all `any` from src/types/*.ts: - opaque/object JSON blobs -> unknown / Record<string, unknown> (jsonSchema, operatorProperties, result rows, sampleRecords, ReActStep tool input/output, physicalPlan, OperatorSchemaInfo/CompactOperatorSchema). - WsOutgoingMessage.workflowContent -> WorkflowContent (it is assigned from WorkflowState.getWorkflowContent()). Type-level only; tsc, prettier, and tests pass with no consumer changes needed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…o *.spec.ts - Rename all *.test.ts to *.spec.ts (bun discovers .spec by default; no config change). - Add workflow-system-metadata.spec.ts: schema compaction ($ref inlining, key filtering), getAllSchemasAsJson, Ajv property validation, and the formatting helpers. - Add workflow-api.spec.ts and compile-api.spec.ts: persist/retrieve/compile over a mocked fetch, exercising the proto-accurate WorkflowFatalError shape. 126 pass / 0 fail; tsc and prettier green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

ReActStep used `unknown` for fields whose shapes we actually know: - inputMessages -> ModelMessage[] (the prepared AI SDK messages) - toolCalls[].input -> Record<string, unknown> (tool args are JSON objects) - toolResults[].output -> string (every tool returns via createToolResult/createErrorResult) The AI SDK types tc.input / tr.output as `unknown` for dynamically registered tools, so narrow at the single construction boundary in texera-agent.ts. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ted unions Replace the flat src/types/ws.ts with a types/ws/ directory: - client.ts: WsClientRequest = WsClientRequestPrompt | WsClientRequestStopCommand (each extending a base; orthogonal fields instead of one all-optional interface) - server.ts: WsServerMessage union (init/step/state/complete/error/headChange), each declaring only the fields it sends; OperatorResultSummaryWs moves here - index.ts: barrel so types/index.ts's `export * from "./ws"` resolves to the dir The client->server wire discriminator changes from "message"/"stop" to "prompt"/"command" (stop becomes commandType:"stop"); server->client `type` values are unchanged. server.ts parses WsClientRequest and switches on the new shapes, and the frontend agent.service.ts WS sends are updated in lockstep so the protocol stays consistent end-to-end. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ndation PR The execution-result redesign (OperatorExecutionSummary etc.) lives in a follow-up PR; revert the lone any->unknown tweak so this PR leaves types/execution.ts byte-identical to main.

Restructure the per-operator summary the sync-execution backend returns and the agent-service/frontend consume, for a leaner and consistent wire contract: - Replace flat OperatorInfo with OperatorExecutionSummary: state, errorMessages, resultSummary?, consoleLogsSummary? (orthogonal sub-summaries; no shape stats). - Rename SyncExecutionResult -> WorkflowExecutionSummary; drop compilationErrors (folded into errors). errors and per-op errorMessages are non-optional (empty means none). - OperatorResultSummary.sampleTuples is now List[SampleRow] ({rowIndex, tuple}) instead of a JSON array with an embedded __row_index__. Drop the table-shape types (TableShape/InputPortTableShape): the agent derives input-port shapes from the DAG + each upstream's output shape; output shape comes from the result summary. - Reuse the engine's WorkflowFatalError for per-operator errors (the same type the compiling service returns for compilation errors), replacing the bespoke OperatorError so compile and execution errors share one wire shape. - Collapse console messages onto one type; derive warnings from WARNING-titled messages rather than a separate field. - Replace OperatorResultSummaryWs: the WS operatorResults payload now carries the canonical OperatorExecutionSummary; the frontend maps it to its flat display type (re-flattening sampleTuples to keep the display components unchanged). Touches the Scala producer (SyncExecutionResource), the agent-service consumers (result-formatting, workflow-execution-tools, workflow-result-state, server) and the frontend WS mapping. Representation/type-level change; behavior preserved, except input-port shape lines are now derived rather than explicitly rendered. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_012qFkyrpTd5PrkNBPcBeo4Q

Reconciliation for rebasing the agent-service-execution-model stack onto origin/main, which had independently merged apache#5751 (extract WS types into types/ws as classes) and apache#5930 (remove the dead checkout/headChange path). Our stack redesigns the same WS layer as interface discriminated unions, so the two diverge structurally. Changes: - Restore the execution-model redesign in types/execution.ts: the 3-way merge re-appended main's apache#5751 flat `OperatorResultSummary` (which references the removed `PortShape` and duplicates our redesigned type), so drop it; our redesign's `OperatorResultSummary`/`OperatorExecutionSummary` supersede it. - Adopt apache#5751 in server.ts: the TexeraAgent websocket surface was renamed `websockets` -> `clients` (getClients/addClient/removeClient) by the auto-merge in texera-agent.ts; align the server call sites, restore the `_getAgentForTests` test accessor, and reject unknown ws message types with an error frame (matching main's handler) instead of silently ignoring them. - Adopt apache#5930: drop the checkout/headChange dead path that our stack still carried — the POST /:id/checkout route + headChange broadcast and the startup-banner token in server.ts, `WsServerHeadChangeMessage` (and its now orphaned WorkflowContent import) in types/ws/server.ts, and the `case "headChange"` handler + `checkoutStep()` in the frontend agent.service. (TexeraAgent.checkout() was already removed by the auto-merge.) - Reconcile tests to the new discriminated-union ws protocol: rewrite server.ws.spec.ts (init/step/state/complete/error frames; prompt/command client frames; operatorResults streamed on init/complete), update the server.spec.ts operator-results test to the OperatorExecutionSummary shape, and fix the frontend stopGeneration assertion to the {type:"command", commandType:"stop"} wire format. agent-service: tsc --noEmit clean, 143/143 bun test pass, prettier clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

bobbai00 · 2026-06-29T00:41:41Z

Superseded by #6009 — a focused re-do cut directly from main (the execution result/error model only, no foundation stack). Closing this one in favor of the smaller PR.

### What changes were proposed in this PR? Restructures the per-operator summary the sync-execution backend returns and the agent-service / frontend consume, for a leaner, consistent wire contract. This is a focused re-do of apache#5927 cut directly from `main` (no foundation stack): it changes only the execution result/error model and its consumers. - Replace the flat `OperatorInfo` with `OperatorExecutionSummary` (orthogonal sub-summaries: `state`, `errorMessages`, `resultSummary?`, `consoleLogsSummary?`); rename `SyncExecutionResult` → `WorkflowExecutionSummary`. - `resultSummary.sampleTuples` is now `SampleRow[]` (`{ rowIndex, tuple }`) instead of JSON rows with an embedded `__row_index__`; drop the table-shape types (the agent derives input-port shapes from the DAG). - Move `WorkflowFatalError` into `types/execution.ts` and reuse it for per-operator errors — the same type the workflow-compiling service returns for compilation errors, so compile and execution errors share one wire shape; `api/compile-api.ts` re-exports it so its existing importers are unchanged. - `errorMessages` / `errors` are non-optional (empty = none); drop `compilationErrors`; collapse the console-message types and derive warnings from `WARNING:`-titled messages. - Operator results are still pulled on demand via `GET /agents/:id/operator-results` (transport unchanged); that REST payload now carries the canonical `OperatorExecutionSummary`, and the frontend maps it to its flat display type (re-flattening `sampleTuples` so the display components are unchanged). Touches the Scala producer (`SyncExecutionResource`), the agent-service consumers (`result-formatting`, `workflow-execution-tools`, `workflow-result-state`, `server`), and the frontend mapping. Representation/type-level; behavior preserved (input-port shape lines are now derived rather than explicitly rendered). ### Any related issues, documentation, discussions? Closes apache#5750 Part of apache#5747. Supersedes apache#5927. ### How was this PR tested? - agent-service: `tsc --noEmit` clean, `bun test` 110/110 pass, `prettier --check` clean. - The Scala producer (`SyncExecutionResource`) is unchanged from apache#5927, which verified it via `sbt WorkflowExecutionService/compile` and a full-stack end-to-end run (a Claude Haiku 4.5 agent built and executed a CSV workflow; `/operator-results` returned the new shape — `resultSummary.sampleTuples: [{ rowIndex, tuple }]`, `errorMessages: []`). ### Was this PR authored or co-authored using generative AI tooling? Generated-by: Claude Opus 4.8 (1M context)

github-actions Bot assigned bobbai00 Jun 24, 2026

github-actions Bot added engine refactor Refactor the code frontend Changes related to the frontend GUI agent-service labels Jun 24, 2026

bobbai00 mentioned this pull request Jun 24, 2026

refactor(agent-service): redesign sync-execution result and error model bobbai00/texera#7

Closed

bobbai00 and others added 12 commits June 28, 2026 16:17

refactor(agent-service): rename types/wire.ts to types/dto.ts

98b021a

Pure file rename plus import-path updates; no type or behavior changes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

refactor(agent-service): keep types/execution.ts unchanged in the fou…

2a67909

…ndation PR The execution-result redesign (OperatorExecutionSummary etc.) lives in a follow-up PR; revert the lone any->unknown tweak so this PR leaves types/execution.ts byte-identical to main.

bobbai00 force-pushed the refactor/agent-service-execution-model branch from 3a1c181 to 1a2bbb2 Compare June 29, 2026 00:07

bobbai00 mentioned this pull request Jun 29, 2026

refactor(agent-service): redesign sync-execution result and error model #6009

Open

bobbai00 closed this Jun 29, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

refactor(agent-service): redesign sync-execution result and error model#5927

refactor(agent-service): redesign sync-execution result and error model#5927
bobbai00 wants to merge 12 commits into
apache:mainfrom
bobbai00:refactor/agent-service-execution-model

bobbai00 commented Jun 24, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 24, 2026 •

edited

Loading

Uh oh!

codecov-commenter commented Jun 24, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 24, 2026 •

edited

Loading

Uh oh!

bobbai00 commented Jun 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

bobbai00 commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What changes were proposed in this PR?

Any related issues, documentation, discussions?

How was this PR tested?

Was this PR authored or co-authored using generative AI tooling?

Uh oh!

github-actions Bot commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Automated Reviewer Suggestions

Uh oh!

codecov-commenter commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

github-actions Bot commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ Benchmark changes need a look

Uh oh!

bobbai00 commented Jun 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

bobbai00 commented Jun 24, 2026 •

edited

Loading

github-actions Bot commented Jun 24, 2026 •

edited

Loading

codecov-commenter commented Jun 24, 2026 •

edited

Loading

github-actions Bot commented Jun 24, 2026 •

edited

Loading