Skip to content

refactor(agent-service): redesign sync-execution result and error model#5927

Closed
bobbai00 wants to merge 12 commits into
apache:mainfrom
bobbai00:refactor/agent-service-execution-model
Closed

refactor(agent-service): redesign sync-execution result and error model#5927
bobbai00 wants to merge 12 commits into
apache:mainfrom
bobbai00:refactor/agent-service-execution-model

Conversation

@bobbai00

@bobbai00 bobbai00 commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

What changes were proposed in this PR?

Stacked on #5751 (foundation). This is the second of the split: it carries the sync-execution result/error model redesign (no overlap with the foundation PR — types/execution.ts is untouched there).

  • Replace the flat OperatorInfo with OperatorExecutionSummary (orthogonal sub-summaries: state, errorMessages, resultSummary?, consoleLogsSummary?); rename SyncExecutionResultWorkflowExecutionSummary.
  • resultSummary.sampleTuples is now SampleRow[] ({ rowIndex, tuple }) instead of JSON rows with an embedded __row_index__; drop the table-shape types (the agent derives input-port shapes from the DAG).
  • Reuse the engine's WorkflowFatalError for per-operator errors (the same type the compiling service returns), replacing the bespoke OperatorError so compile and execution errors share one wire shape.
  • errorMessages / errors are non-optional (empty = none); drop compilationErrors; collapse the console-message types and derive warnings from WARNING:-titled messages.
  • The WS operatorResults payload carries the canonical OperatorExecutionSummary; the frontend maps it to its flat display type.

Touches the Scala producer (SyncExecutionResource), the agent-service consumers (result-formatting, workflow-execution-tools, workflow-result-state, server), and the frontend WS mapping. Representation/type-level; behavior preserved (input-port shape lines are now derived rather than explicitly rendered).

Any related issues, documentation, discussions?

Closes #5750
Part of #5747.

How was this PR tested?

  • bunx tsc --noEmit, bun test (121 pass / 0 fail), prettier --check in agent-service; sbt WorkflowExecutionService/compile for amber.
  • End-to-end on the full local stack with a Claude Haiku 4.5 agent: it built and executed a CSV workflow; /operator-results returned the new shape — resultSummary.sampleTuples: [{ rowIndex, tuple }], errorMessages: [] — and the agent rendered the rows correctly.

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.8 (1M context)

@github-actions

github-actions Bot commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

Automated Reviewer Suggestions

Based on the git blame history of the changed files, we recommend the following reviewers:

  • Contributors with relevant context: @Ma77Ball
    You can notify them by mentioning @Ma77Ball in a comment.

@codecov-commenter

codecov-commenter commented Jun 24, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 72.18045% with 37 lines in your changes missing coverage. Please review.
✅ Project coverage is 57.14%. Comparing base (dc76abe) to head (1a2bbb2).
⚠️ Report is 6 commits behind head on main.

Files with missing lines Patch % Lines
...ervice/src/agent/tools/workflow-execution-tools.ts 6.66% 28 Missing ⚠️
agent-service/src/agent/texera-agent.ts 14.28% 6 Missing ⚠️
agent-service/src/server.ts 96.96% 2 Missing ⚠️
agent-service/src/agent/workflow-result-state.ts 50.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #5927      +/-   ##
============================================
+ Coverage     56.61%   57.14%   +0.53%     
  Complexity     3004     3004              
============================================
  Files          1124     1121       -3     
  Lines         43298    43074     -224     
  Branches       4667     4661       -6     
============================================
+ Hits          24511    24613     +102     
+ Misses        17359    17033     -326     
  Partials       1428     1428              
Flag Coverage Δ *Carryforward flag
access-control-service 70.00% <ø> (ø) Carriedforward from dc76abe
agent-service 49.71% <72.18%> (+5.11%) ⬆️
amber 58.79% <ø> (+0.34%) ⬆️ Carriedforward from dc76abe
computing-unit-managing-service 0.00% <ø> (ø) Carriedforward from dc76abe
config-service 52.30% <ø> (ø) Carriedforward from dc76abe
file-service 62.81% <ø> (ø) Carriedforward from dc76abe
frontend 49.44% <ø> (-0.01%) ⬇️ Carriedforward from dc76abe
notebook-migration-service 78.57% <ø> (ø) Carriedforward from dc76abe
pyamber 90.20% <ø> (ø) Carriedforward from dc76abe
python 90.76% <ø> (ø) Carriedforward from dc76abe
workflow-compiling-service 55.14% <ø> (ø) Carriedforward from dc76abe

*This pull request uses carry forward flags. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions

github-actions Bot commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

⚠️ Benchmark changes need a look

🟢 2 better · 🔴 5 worse · ⚪ 8 noise (<±5%) · 0 without baseline

Compared against main 4d05ab2 benchmarked on this same runner, so the delta is largely free of cross-runner hardware noise. The "7d avg" column still reflects the gh-pages dashboard. Treat <±5% as noise unless repeated.

Dashboard · Run

config throughput MB/s latency max Δ latest / 7d
🔴 bs=10 sw=10 sl=64 371 0.226 26,098/36,839/36,839 us 🔴 +35.3% / 🔴 +140.3%
🟢 bs=100 sw=10 sl=64 794 0.485 124,892/146,980/146,980 us 🟢 -6.6% / 🔴 +32.9%
bs=1000 sw=10 sl=64 919 0.561 1,091,299/1,138,112/1,138,112 us ⚪ within ±5% / 🔴 +7.0%
Baseline details

Latest main 4d05ab2 from same runner

config metric PR latest main 7d avg Δ latest Δ 7d
bs=10 sw=10 sl=64 throughput 371 tuples/sec 455 tuples/sec 757.16 tuples/sec -18.5% -51.0%
bs=10 sw=10 sl=64 MB/s 0.226 MB/s 0.278 MB/s 0.462 MB/s -18.7% -51.1%
bs=10 sw=10 sl=64 p50 26,098 us 20,493 us 12,971 us +27.4% +101.2%
bs=10 sw=10 sl=64 p95 36,839 us 27,223 us 15,333 us +35.3% +140.3%
bs=10 sw=10 sl=64 p99 36,839 us 27,223 us 18,732 us +35.3% +96.7%
bs=100 sw=10 sl=64 throughput 794 tuples/sec 802 tuples/sec 957.66 tuples/sec -1.0% -17.1%
bs=100 sw=10 sl=64 MB/s 0.485 MB/s 0.49 MB/s 0.585 MB/s -1.0% -17.0%
bs=100 sw=10 sl=64 p50 124,892 us 122,887 us 103,982 us +1.6% +20.1%
bs=100 sw=10 sl=64 p95 146,980 us 157,373 us 110,583 us -6.6% +32.9%
bs=100 sw=10 sl=64 p99 146,980 us 157,373 us 118,369 us -6.6% +24.2%
bs=1000 sw=10 sl=64 throughput 919 tuples/sec 926 tuples/sec 979.6 tuples/sec -0.8% -6.2%
bs=1000 sw=10 sl=64 MB/s 0.561 MB/s 0.565 MB/s 0.598 MB/s -0.7% -6.2%
bs=1000 sw=10 sl=64 p50 1,091,299 us 1,084,586 us 1,024,553 us +0.6% +6.5%
bs=1000 sw=10 sl=64 p95 1,138,112 us 1,117,970 us 1,063,789 us +1.8% +7.0%
bs=1000 sw=10 sl=64 p99 1,138,112 us 1,117,970 us 1,096,239 us +1.8% +3.8%
Raw CSV
config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,539.71,200,128000,371,0.226,26097.63,36838.89,36838.89
1,100,10,64,20,2519.03,2000,1280000,794,0.485,124891.65,146980.48,146980.48
2,1000,10,64,20,21771.76,20000,12800000,919,0.561,1091299.39,1138111.99,1138111.99

bobbai00 added a commit to bobbai00/texera that referenced this pull request Jun 28, 2026
…#5930)

### What changes were proposed in this PR?

Removes the step-checkout feature from agent-service, which was wired
end-to-end but unreachable — nothing in the product ever invokes the
frontend `checkoutStep()`, so the backend `/agents/:id/checkout`
endpoint and its `headChange` WebSocket broadcast were never triggered.
Removing the dead vertical slice:

- `agent-service/src/server.ts` — drop the `POST /:id/checkout` route
and the `headChange` / `workflowContent` members of the outgoing
WebSocket-message type.
- `agent-service/src/agent/texera-agent.ts` — drop
`TexeraAgent.checkout()`.
- `frontend/.../agent/agent.service.ts` — drop the `case "headChange"`
WS handler and the unused `checkoutStep()` method.

HEAD *tracking* is retained: `init`/`step` messages still carry
`headId`, and the chat panel still walks the ancestor path from HEAD to
render the visible step chain. Only the backward HEAD move (checkout) is
removed.

This branches from `main` and is independent of the type-refactor stack
(apache#5751apache#5928/apache#5927). Note apache#5751 currently lists `headChange` as a
server message; whichever merges second will drop it from the union — a
trivial reconciliation.

### Any related issues, documentation, discussions?

Closes apache#5942
Part of apache#5747. Independent of the stack (branches from `main`).

### How was this PR tested?

agent-service: `tsc --noEmit` clean, 93/93 `bun test` pass, prettier
clean. Frontend: the removed `checkoutStep()` had zero callers and the
`headChange` handler is now unreachable; verified no remaining
`checkout`/`headChange` references and no orphaned imports.

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.8 (1M context)

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
bobbai00 and others added 12 commits June 28, 2026 16:17
…module

Foundation slice of the agent-service reorganization (no runtime behavior
change):

- Add src/types/api.ts (wire DTOs) and src/types/metadata.ts (operator
  metadata types extracted out of api/backend-api.ts); export both from the
  types barrel.
- Add src/config/endpoints.ts exposing getServiceEndpoints().
- Move src/api/auth-api.ts -> src/auth/jwt.ts (content unchanged) and add
  src/auth/jwt.test.ts; update auth import paths.
- Keep api/backend-api.ts transitional: it now imports/re-exports the metadata
  types from ../types/metadata and retains getBackendConfig/fetchOperatorMetadata,
  which the follow-up clients PR relocates.

The types/agent.ts and types/workflow.ts reshaping is deferred to the PR that
also updates server.ts/workflow-state.ts, since those renames are coupled.
Split types/api.ts into types/wire.ts (backend wire DTOs) and types/ws.ts
(this service's own WebSocket frames), and consume them at the real call
sites instead of the duplicate definitions that previously lived inline:

- server.ts imports WsMessage/WsOutgoingMessage/OperatorResultSummaryWs from
  types/ws (local copies removed).
- api/workflow-api.ts imports Workflow/WorkflowPersistRequest from types/wire.
- api/compile-api.ts, agent/util/context-utils.ts, and agent/texera-agent.ts
  import WorkflowFatalError/WorkflowCompilationResponse from types/wire.

Correct WorkflowFatalError to match the backend proto (workflowruntimestate.proto):
`type` is the FatalErrorType enum name (string), with details/operatorId/
workerId/timestamp optional, replacing the inaccurate `type: { name }` /
all-required shape. Type-level only; consumers read only `.message`, so there
is no runtime behavior change.

tsc --noEmit, prettier --check, and bun test (101 pass / 0 fail) all green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Encode test JWT segments as base64url (not base64) to match real tokens,
  including a case whose payload contains `-`/`_` to pin url-safe decoding.
- Add extractBearerToken coverage: valid bearer, case-insensitive scheme,
  non-Bearer scheme, missing token, absent header.

Addresses the Copilot review comments on src/auth/jwt.test.ts.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Pure file rename plus import-path updates; no type or behavior changes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…hema from types/metadata

workflow-system-metadata.ts defined its own copies of these two interfaces,
identical to the canonical ones in types/metadata.ts. Import the canonical
types and delete the duplicates so the centralized definitions are actually
used. Type-level only; no behavior change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…initions

Remove all `any` from src/types/*.ts:
- opaque/object JSON blobs -> unknown / Record<string, unknown> (jsonSchema,
  operatorProperties, result rows, sampleRecords, ReActStep tool input/output,
  physicalPlan, OperatorSchemaInfo/CompactOperatorSchema).
- WsOutgoingMessage.workflowContent -> WorkflowContent (it is assigned from
  WorkflowState.getWorkflowContent()).

Type-level only; tsc, prettier, and tests pass with no consumer changes needed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…o *.spec.ts

- Rename all *.test.ts to *.spec.ts (bun discovers .spec by default; no config change).
- Add workflow-system-metadata.spec.ts: schema compaction ($ref inlining, key
  filtering), getAllSchemasAsJson, Ajv property validation, and the formatting helpers.
- Add workflow-api.spec.ts and compile-api.spec.ts: persist/retrieve/compile over a
  mocked fetch, exercising the proto-accurate WorkflowFatalError shape.

126 pass / 0 fail; tsc and prettier green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ReActStep used `unknown` for fields whose shapes we actually know:
- inputMessages -> ModelMessage[] (the prepared AI SDK messages)
- toolCalls[].input -> Record<string, unknown> (tool args are JSON objects)
- toolResults[].output -> string (every tool returns via createToolResult/createErrorResult)

The AI SDK types tc.input / tr.output as `unknown` for dynamically registered
tools, so narrow at the single construction boundary in texera-agent.ts.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ted unions

Replace the flat src/types/ws.ts with a types/ws/ directory:
- client.ts: WsClientRequest = WsClientRequestPrompt | WsClientRequestStopCommand
  (each extending a base; orthogonal fields instead of one all-optional interface)
- server.ts: WsServerMessage union (init/step/state/complete/error/headChange),
  each declaring only the fields it sends; OperatorResultSummaryWs moves here
- index.ts: barrel so types/index.ts's `export * from "./ws"` resolves to the dir

The client->server wire discriminator changes from "message"/"stop" to
"prompt"/"command" (stop becomes commandType:"stop"); server->client `type`
values are unchanged. server.ts parses WsClientRequest and switches on the new
shapes, and the frontend agent.service.ts WS sends are updated in lockstep so
the protocol stays consistent end-to-end.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ndation PR

The execution-result redesign (OperatorExecutionSummary etc.) lives in a
follow-up PR; revert the lone any->unknown tweak so this PR leaves
types/execution.ts byte-identical to main.
Restructure the per-operator summary the sync-execution backend returns and the
agent-service/frontend consume, for a leaner and consistent wire contract:

- Replace flat OperatorInfo with OperatorExecutionSummary: state, errorMessages,
  resultSummary?, consoleLogsSummary? (orthogonal sub-summaries; no shape stats).
- Rename SyncExecutionResult -> WorkflowExecutionSummary; drop compilationErrors
  (folded into errors). errors and per-op errorMessages are non-optional (empty
  means none).
- OperatorResultSummary.sampleTuples is now List[SampleRow] ({rowIndex, tuple})
  instead of a JSON array with an embedded __row_index__. Drop the table-shape
  types (TableShape/InputPortTableShape): the agent derives input-port shapes
  from the DAG + each upstream's output shape; output shape comes from the result
  summary.
- Reuse the engine's WorkflowFatalError for per-operator errors (the same type
  the compiling service returns for compilation errors), replacing the bespoke
  OperatorError so compile and execution errors share one wire shape.
- Collapse console messages onto one type; derive warnings from WARNING-titled
  messages rather than a separate field.
- Replace OperatorResultSummaryWs: the WS operatorResults payload now carries the
  canonical OperatorExecutionSummary; the frontend maps it to its flat display
  type (re-flattening sampleTuples to keep the display components unchanged).

Touches the Scala producer (SyncExecutionResource), the agent-service consumers
(result-formatting, workflow-execution-tools, workflow-result-state, server) and
the frontend WS mapping. Representation/type-level change; behavior preserved,
except input-port shape lines are now derived rather than explicitly rendered.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_012qFkyrpTd5PrkNBPcBeo4Q
Reconciliation for rebasing the agent-service-execution-model stack onto
origin/main, which had independently merged apache#5751 (extract WS types into
types/ws as classes) and apache#5930 (remove the dead checkout/headChange path).
Our stack redesigns the same WS layer as interface discriminated unions, so
the two diverge structurally. Changes:

- Restore the execution-model redesign in types/execution.ts: the 3-way merge
  re-appended main's apache#5751 flat `OperatorResultSummary` (which references the
  removed `PortShape` and duplicates our redesigned type), so drop it; our
  redesign's `OperatorResultSummary`/`OperatorExecutionSummary` supersede it.
- Adopt apache#5751 in server.ts: the TexeraAgent websocket surface was renamed
  `websockets` -> `clients` (getClients/addClient/removeClient) by the
  auto-merge in texera-agent.ts; align the server call sites, restore the
  `_getAgentForTests` test accessor, and reject unknown ws message types with
  an error frame (matching main's handler) instead of silently ignoring them.
- Adopt apache#5930: drop the checkout/headChange dead path that our stack still
  carried — the POST /:id/checkout route + headChange broadcast and the
  startup-banner token in server.ts, `WsServerHeadChangeMessage` (and its now
  orphaned WorkflowContent import) in types/ws/server.ts, and the
  `case "headChange"` handler + `checkoutStep()` in the frontend agent.service.
  (TexeraAgent.checkout() was already removed by the auto-merge.)
- Reconcile tests to the new discriminated-union ws protocol: rewrite
  server.ws.spec.ts (init/step/state/complete/error frames; prompt/command
  client frames; operatorResults streamed on init/complete), update the
  server.spec.ts operator-results test to the OperatorExecutionSummary shape,
  and fix the frontend stopGeneration assertion to the {type:"command",
  commandType:"stop"} wire format.

agent-service: tsc --noEmit clean, 143/143 bun test pass, prettier clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@bobbai00

Copy link
Copy Markdown
Contributor Author

Superseded by #6009 — a focused re-do cut directly from main (the execution result/error model only, no foundation stack). Closing this one in favor of the smaller PR.

@bobbai00 bobbai00 closed this Jun 29, 2026
bobbai00 added a commit to bobbai00/texera that referenced this pull request Jun 29, 2026
### What changes were proposed in this PR?

Restructures the per-operator summary the sync-execution backend returns and the agent-service / frontend consume, for a leaner, consistent wire contract. This is a focused re-do of apache#5927 cut directly from `main` (no foundation stack): it changes only the execution result/error model and its consumers.

- Replace the flat `OperatorInfo` with `OperatorExecutionSummary` (orthogonal sub-summaries: `state`, `errorMessages`, `resultSummary?`, `consoleLogsSummary?`); rename `SyncExecutionResult` → `WorkflowExecutionSummary`.
- `resultSummary.sampleTuples` is now `SampleRow[]` (`{ rowIndex, tuple }`) instead of JSON rows with an embedded `__row_index__`; drop the table-shape types (the agent derives input-port shapes from the DAG).
- Move `WorkflowFatalError` into `types/execution.ts` and reuse it for per-operator errors — the same type the workflow-compiling service returns for compilation errors, so compile and execution errors share one wire shape; `api/compile-api.ts` re-exports it so its existing importers are unchanged.
- `errorMessages` / `errors` are non-optional (empty = none); drop `compilationErrors`; collapse the console-message types and derive warnings from `WARNING:`-titled messages.
- Operator results are still pulled on demand via `GET /agents/:id/operator-results` (transport unchanged); that REST payload now carries the canonical `OperatorExecutionSummary`, and the frontend maps it to its flat display type (re-flattening `sampleTuples` so the display components are unchanged).

Touches the Scala producer (`SyncExecutionResource`), the agent-service consumers (`result-formatting`, `workflow-execution-tools`, `workflow-result-state`, `server`), and the frontend mapping. Representation/type-level; behavior preserved (input-port shape lines are now derived rather than explicitly rendered).

### Any related issues, documentation, discussions?

Closes apache#5750
Part of apache#5747.
Supersedes apache#5927.

### How was this PR tested?

- agent-service: `tsc --noEmit` clean, `bun test` 110/110 pass, `prettier --check` clean.
- The Scala producer (`SyncExecutionResource`) is unchanged from apache#5927, which verified it via `sbt WorkflowExecutionService/compile` and a full-stack end-to-end run (a Claude Haiku 4.5 agent built and executed a CSV workflow; `/operator-results` returned the new shape — `resultSummary.sampleTuples: [{ rowIndex, tuple }]`, `errorMessages: []`).

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.8 (1M context)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agent-service engine frontend Changes related to the frontend GUI refactor Refactor the code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

refactor(agent-service): redesign the sync-execution result and error model

2 participants