feat(agents-mobile): schema-driven spawn args, model controls & image attachments (desktop parity)#4553

Merged
msfstef merged 11 commits into main from msfstef/enhance-spawn-args-mobile-desktop-parity
Jun 11, 2026

@msfstef

@msfstef msfstef commented Jun 10, 2026

Contributor

Summary

Brings agents-mobile to parity with the desktop agents app for four composer / spawn capabilities, building on #4533 (the native mobile slash-command composer):

  • Schema-driven new-session args — render an agent's creation_schema as native controls (feature 7)
  • Model / reasoning / speed controls at spawn (feature 8)
  • Image attachments in the in-session chat composer (feature 6)
  • Image attachments at spawn time (feature 9)

No server / runtime changes were required. Every wire contract already exists; this PR makes the mobile client produce the same payloads desktop does, plus a small set of shared-UI fixes that device testing surfaced.

Why

#4533 reached slash-command composer parity on mobile, but the composer was otherwise text-only. The four gaps above are all implemented on desktop in agents-server-ui; this closes them on mobile.

Key insight: 4 features → 2 capabilities, 0 server changes

  1. Features 7 + 8 are the same mechanism. Model / reasoning / speed are not special server concepts — they're ordinary enum properties of the entity type's creation_schema, detected by field-name convention. creation_schema is already synced to mobile and spawnEntity already forwards args, so this is purely additive native UI over data mobile already has.
  2. Features 6 + 9 share one path. The shared send action already threads attachments end-to-end; the only blocker was that uploadMessageAttachments used browser File / FormData.set, which React Native lacks.
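
To illustrate the first point, here is a minimal sketch of detecting model settings by field-name convention. The schema shape, the key names `model` and `speed`, and the function `splitModelSettings` are assumptions made for illustration (only `reasoningEffort` is named in this PR); the shared helpers in schemaProperties.ts may classify fields differently.

```typescript
// Hypothetical minimal shape of a JSON-Schema-style creation_schema.
type SchemaProperty = { type?: string; enum?: readonly unknown[]; default?: unknown };
type CreationSchema = {
  properties?: Record<string, SchemaProperty>;
  required?: readonly string[];
};

// Assumed naming convention; the real key list may differ.
const MODEL_SETTING_KEYS = ["model", "reasoningEffort", "speed"] as const;
type ModelSettingKey = (typeof MODEL_SETTING_KEYS)[number];

// Split a schema into "model settings" (enum fields with a conventional name)
// and everything else, which would render as ordinary spawn-arg controls.
function splitModelSettings(schema: CreationSchema): {
  modelSettings: Partial<Record<ModelSettingKey, SchemaProperty>>;
  otherKeys: string[];
} {
  const modelSettings: Partial<Record<ModelSettingKey, SchemaProperty>> = {};
  const otherKeys: string[] = [];
  for (const [key, prop] of Object.entries(schema.properties ?? {})) {
    const hasSettingName = (MODEL_SETTING_KEYS as readonly string[]).includes(key);
    if (hasSettingName && prop.enum !== undefined) {
      modelSettings[key as ModelSettingKey] = prop;
    } else {
      otherKeys.push(key);
    }
  }
  return { modelSettings, otherKeys };
}
```

Because the classification is pure data inspection, nothing on the server has to know that a field is "the model picker".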

What's included

Capability A — schema-driven spawn args + model/reasoning/speed (7, 8)

  • New mobile: lib/spawnArgs.ts, lib/lastPickedModel.ts, components/SchemaArgsControls.tsx, wired into NewSessionScreen.
  • Renders the synced creation_schema natively, full parity with desktop SchemaForm: enum → BottomSheet picker pill (model grouped by provider, last pick remembered in AsyncStorage), boolean → Switch, string/number → text field, string-array → comma field, object → JSON field. Field labels are humanized (reasoningEffort → "Reasoning Effort") and provider names mapped (openai → "OpenAI"). Required fields gate "Start session"; server still validates against the schema.
  • "Start session" is pinned to the bottom of the new-session screen (outside the scroll) so it stays reachable as the schema / model / sandbox sections stack up. The bar reports its height via onLayout and the scroll content is padded by that height so every section still scrolls clear; the bar owns its safe-area bottom inset (Screen leaves the bottom to screen-specific controls, matching the in-session composer).
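
The control mapping and label humanization described above can be sketched roughly as follows. The names `controlKindFor` and `humanizeLabel`, the `ControlKind` union, and the fallback to a JSON field for unrecognized types are illustrative assumptions, not the actual exports of lib/spawnArgs.ts or SchemaArgsControls.tsx.

```typescript
type ControlKind = "picker" | "switch" | "text" | "comma-list" | "json";

// Hypothetical subset of a creation_schema property.
type SchemaProperty = {
  type?: string;
  enum?: readonly unknown[];
  items?: { type?: string };
};

// Map a schema property to the kind of native control that would render it.
function controlKindFor(prop: SchemaProperty): ControlKind {
  if (prop.enum !== undefined) return "picker"; // enum -> BottomSheet picker pill
  if (prop.type === "boolean") return "switch"; // boolean -> Switch
  if (prop.type === "array" && prop.items?.type === "string") return "comma-list";
  if (prop.type === "string" || prop.type === "number" || prop.type === "integer") {
    return "text";
  }
  return "json"; // objects (and, by assumption, anything unrecognized) -> JSON field
}

// Turn a camelCase / snake_case key into a title-cased label.
function humanizeLabel(key: string): string {
  const spaced = key
    .replace(/[_-]+/g, " ")
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2")
    .trim();
  return spaced.replace(/\b[a-z]/g, (c) => c.toUpperCase());
}
```

For example, `humanizeLabel("reasoningEffort")` yields "Reasoning Effort", matching the label shown in the description.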

Capability B — image attachments (6, 9)

  • New mobile: lib/attachments.ts + attach button / thumbnail tray in NativeComposer; image/camera icons.
  • Deps: expo-image-picker (library + camera; iOS uses a native ActionSheetIOS), expo-image-manipulator (transcode to JPEG).
  • In-session reuses the shared createSendComposerInputAction({ attachments }). At spawn it mirrors desktop doSpawn (spawn without initialMessage, then send immediate with attachments). Gated on whether the session's model supports image input (schemaModelSupportsImageInput).
  • Display is handled by the existing desktop chat-log WebView embed — no native timeline rendering was needed; once a mobile-sent attachment syncs, the embedded renderer shows the thumbnail.
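
The spawn-time ordering (spawn without an initial message, then send the first message with attachments) might look like the sketch below. The dependency signatures are hypothetical stand-ins; the real spawnEntity and the shared send action take different, richer arguments.

```typescript
type Attachment = { uri: string; name: string; type: string };

// Hypothetical dependency shapes, injected so the ordering is easy to test.
type SpawnDeps = {
  spawnEntity: (args: Record<string, unknown>) => Promise<{ entityId: string }>;
  sendMessage: (
    entityId: string,
    input: { text: string; attachments: Attachment[] },
  ) => Promise<void>;
};

// Spawn first with args only, then deliver text + attachments as a normal
// message, so uploads go through the same path as an in-session send.
async function spawnWithAttachments(
  deps: SpawnDeps,
  args: Record<string, unknown>,
  text: string,
  attachments: Attachment[],
): Promise<string> {
  const { entityId } = await deps.spawnEntity(args); // no initialMessage here
  await deps.sendMessage(entityId, { text, attachments });
  return entityId;
}
```

Keeping the upload in the send step means spawn itself needs no attachment support.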

Shared / cross-cutting

  • agents-server-ui/lib/schemaProperties.ts (new): the DOM-free schema-classification helpers (inline props, model/reasoning/speed detection + grouping, string-array/JSON parsing, object-schema guards) extracted from SchemaForm/NewSessionView so desktop and the native mobile composer share one source of truth — now including the model-provider label map + provider:model id parsing that were otherwise mirrored on each platform. Behaviour-preserving for desktop.
  • agents-server-ui/lib/sendMessage.ts: uploadMessageAttachments accepts File | NativeFileDescriptor and branches FormData.append (RN) vs .set (web). No wire-format change.
  • Embed / timeline fixes (device testing): WorkspaceProvider around the embed router so the state-inspector view's useWorkspace resolves; image-preview dialog respects the composer inset; iOS timeline thumbnail containment.
  • agents/horton.ts: robust session-title generation for image/attachment messages.
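
As a rough illustration of the append-vs-set branch in sendMessage.ts, here is a sketch under stated assumptions. `FormLike` is a structural stand-in for the web and React Native FormData flavours, and this `addField` signature is a guess; the PR only names the helper and says it branches on the part shape.

```typescript
type NativeFileDescriptor = { uri: string; name: string; type: string };

// Structural stand-in: web FormData has both methods, RN's FormData lacks set.
type FormLike = {
  append: (name: string, value: unknown) => void;
  set?: (name: string, value: unknown) => void;
};

// A native part is a plain {uri, name, type} object rather than a File.
function isNativeFileDescriptor(part: unknown): part is NativeFileDescriptor {
  return (
    typeof part === "object" &&
    part !== null &&
    typeof (part as { uri?: unknown }).uri === "string"
  );
}

// Native descriptors (or any form without set) use append; web Files keep set.
function addField(form: FormLike, name: string, part: unknown): void {
  if (isNativeFileDescriptor(part) || typeof form.set !== "function") {
    form.append(name, part);
  } else {
    form.set(name, part);
  }
}
```

Either branch produces one multipart field under the same name, which is consistent with the "no wire-format change" claim.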

Key decisions (the "why", for future sessions)

  • Extract shared schema helpers rather than duplicate. Mobile reuses agents-server-ui pure logic + agents-runtime/client, and builds native UI itself (the established pattern). The desktop side of the refactor is mostly deletion.
  • Attachment display via the existing WebView embed, not native rendering — the desktop chat log already renders attachments; mobile just needs to send them.
  • iOS picker via ActionSheetIOS, not our BottomSheet. Presenting the native image picker over an RN Modal freezes / fails to open on iOS; the native action sheet has no such conflict. Android keeps the BottomSheet (intent-based picker, no conflict).
  • Transcode every picked image to JPEG. iOS returns HEIC, which Anthropic/OpenAI vision models reject (the agent run errored with finish_reason=error); expo-image-manipulator normalizes to JPEG.
  • Title generation hardening. The low-cost, text-only title model went conversational on image messages it couldn't see ("I'm sorry but no images were actually shared…") and that became the title — firmer prompt + a guard that rejects sentence-like responses and falls back to the local title.
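
A minimal sketch of the title guard follows, combining the 8-word cap and the punctuation check mentioned in this thread. The function bodies, the empty-string check, and `chooseTitle` are assumptions for illustration; the real looksLikeNonTitle in horton.ts may differ.

```typescript
const MAX_TITLE_WORDS = 8;

// Heuristic: a response is "not a title" if it is empty, reads like a
// sentence (too many words), or contains conversational punctuation.
function looksLikeNonTitle(candidate: string): boolean {
  const trimmed = candidate.trim();
  if (trimmed.length === 0) return true; // assumed: empty output is never a title
  const wordCount = trimmed.split(/\s+/).length;
  if (wordCount > MAX_TITLE_WORDS) return true;
  // Dots and hyphens stay legal so technical titles like "v2.1 auth-fix" pass.
  return /[!?,]/.test(trimmed);
}

// Fall back to the locally derived title whenever the model output is rejected.
function chooseTitle(modelTitle: string, localTitle: string): string {
  return looksLikeNonTitle(modelTitle) ? localTitle : modelTitle.trim();
}
```

A false positive only costs a slightly plainer title, since the fallback is always a usable local one.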

Known limitations / follow-ups

  • Android keyboard: the composer is position: absolute; bottom: 0 and relies on Android window resize to lift above the IME; the iOS-style translate is intentionally not applied on Android (doing so double-offsets — device testing showed a 2× lift, proving resize is active). If a device ever shows the composer covered under the keyboard, the deterministic fix is android:windowSoftInputMode="adjustNothing" (config plugin) + translate-only — that needs a native rebuild. Rationale is in the SessionScreen composer-transform comment.
  • Image-preview bottom gap is a tuned constant in AttachmentImagePreviewDialog.module.css (36px = 16 gap + 20 CHAT_COMPOSER_OVERLAP compensation); nudge if a device shows too much / too little gap.
  • Title fix only affects newly-created chats (existing bad titles don't change retroactively).
  • Attachment policy is image-only (desktop parity); arbitrary documents are out of scope.

Testing

  • Unit: schemaProperties (agents-server-ui), spawnArgs (agents-mobile), generate-title (agents) — all green. Typecheck clean across agents-server-ui, agents-mobile, agents-desktop, agents; expo-doctor 18/18.
  • Manual (iOS + Android dev builds): schema controls + model/reasoning pickers, required-field gating, attach from library/camera, spawn-with-attachments + in-session send, HEIC handling, image-input gating, timeline display, image-preview layering, keyboard behaviour.

Build / deploy notes

  • Native modules added (expo-image-picker, expo-image-manipulator) + an expo-image-picker config plugin (iOS photo/camera usage strings) → requires a native dev build (not Expo Go; a Metro reload is not enough).
  • Embed / CSS changes (agents-server-ui) ship in the WebView DOM bundle → Metro reload.
  • Title generation runs server-side in agents (the horton handler) → restart / redeploy the agents server / runner.

🤖 Generated with Claude Code

@msfstef msfstef added the claude label Jun 10, 2026
@github-actions

github-actions Bot commented Jun 10, 2026

Contributor

Electric Agents Desktop Builds

Build artifacts for commit 725a5bf.

| Platform | Status | Artifact |
| --- | --- | --- |
| macOS Apple Silicon | Passed | DMG |
| macOS Intel | Passed | DMG |
| Windows x64 | Passed | Installer |
| Linux x64 | Passed | AppImage / deb |

Workflow run

@codecov

codecov Bot commented Jun 10, 2026


Codecov Report

❌ Patch coverage is 79.34272% with 44 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.07%. Comparing base (b8875a2) to head (725a5bf).
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| packages/agents-server-ui/src/lib/sendMessage.ts | 0.00% | 13 Missing ⚠️ |
| packages/agents-mobile/src/lib/spawnArgs.ts | 90.21% | 8 Missing and 1 partial ⚠️ |
| packages/agents-server-ui/src/embed/EmbedApp.tsx | 0.00% | 7 Missing ⚠️ |
| ...kages/agents-server-ui/src/lib/schemaProperties.ts | 92.40% | 6 Missing ⚠️ |
| packages/agents-mobile/src/lib/lastPickedModel.ts | 63.63% | 3 Missing and 1 partial ⚠️ |
| ...es/agents-server-ui/src/components/UserMessage.tsx | 0.00% | 3 Missing ⚠️ |
| ...-server-ui/src/components/views/NewSessionView.tsx | 0.00% | 1 Missing ⚠️ |
| packages/agents/src/agents/horton.ts | 85.71% | 1 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #4553       +/-   ##
===========================================
- Coverage   72.77%   58.07%   -14.71%     
===========================================
  Files          86      369      +283     
  Lines        9779    40459    +30680     
  Branches     2982    11470     +8488     
===========================================
+ Hits         7117    23497    +16380     
- Misses       2608    16888    +14280     
- Partials       54       74       +20     
| Flag | Coverage Δ |
| --- | --- |
| packages/agents | 71.37% <85.71%> (?) |
| packages/agents-mcp | 77.54% <ø> (?) |
| packages/agents-mobile | 75.49% <87.37%> (?) |
| packages/agents-runtime | 82.13% <ø> (?) |
| packages/agents-server | 74.80% <ø> (-0.05%) ⬇️ |
| packages/agents-server-ui | 6.25% <70.87%> (?) |
| packages/electric-ax | 46.42% <ø> (ø) |
| packages/experimental | 87.73% <ø> (ø) |
| packages/react-hooks | 86.48% <ø> (ø) |
| packages/start | 82.83% <ø> (?) |
| packages/typescript-client | 91.83% <ø> (+0.11%) ⬆️ |
| packages/y-electric | 56.05% <ø> (ø) |
| typescript | 58.07% <79.34%> (-14.71%) ⬇️ |
| unit-tests | 58.07% <79.34%> (-14.71%) ⬇️ |

@github-actions

github-actions Bot commented Jun 10, 2026

Contributor

Electric Agents Mobile Build

Local mobile checks ran for commit 725a5bf.

The EAS Android preview build was skipped because the mobile-eas-build label is not present.
Add the mobile-eas-build label to this PR to produce an installable preview build.

Workflow run

@claude

claude Bot commented Jun 10, 2026


Claude Code Review

Summary

Iteration 6 is a real content update, not a rebase: commit 725a5bf6a ("review fixes for spawn args, attachments, titles") landed on top of the iteration-5 head and addresses several correctness edges in the spawn-args pipeline plus the title guard. The changes are small, well-targeted, and each ships with a test. I reviewed the diff d4bed46b1..725a5bf6a and find it clean — no new issues.

What's Working Well

  • Optional vs required enum seeding now matches desktop. buildInitialSpawnArgs only auto-seeds the first enum option for required enums (spawnArgs.ts:117), and SchemaArgsControls grows a clearable None item for optional ones (SchemaArgsControls.tsx:254). This stops mobile from silently sending a value the user never chose (e.g. reasoningEffort), which was a genuine parity gap with the desktop SchemaForm, where optional enums are clearable.
  • Numeric finalize reconciliation closes a real hole. coerceTextFieldValue deliberately keeps non-round-tripping input (0.50, 2., 1e3) as raw text so the field stays editable — but previously that raw string would have been sent as the spawn arg. finalizeSpawnArgs now re-coerces it to a number on submit (spawnArgs.ts:170-182), with unparseable input passing through for the server to reject. The test covers all three cases.
  • Required-field gating is now array-aware. hasMissingRequiredArgs treats empty arrays and separator-only string-array text (" , " parses to [], which finalize would drop) as missing (spawnArgs.ts:200-211). Good catch — without this the Start button would enable for a required tag list that resolves to nothing.
  • Permission-gate removal is the correct call. Dropping requestMediaLibraryPermissionsAsync for the library picker (attachments.ts:73-83) is right: launchImageLibraryAsync uses PHPicker / the Android Photo Picker, which need no media-library grant, so the old gate only dead-ended users who'd once denied it. ensureGranted is still used by the camera path, so no dead code.
  • Title guard hardening. Extending looksLikeNonTitle to reject the characters !, ? and , (horton.ts:179) catches short apologies that slip under the 8-word cap, while leaving dots/hyphens legal for technical titles. The fallback is the safe locally-derived title, so a rare title that legitimately contains a comma just degrades gracefully.
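
For reference, the array-aware gating described above could be sketched like this. The helper names, the `stringArrayKeys` parameter, and treating whitespace-only strings as missing are assumptions for illustration; the actual hasMissingRequiredArgs in spawnArgs.ts derives field kinds from the schema and may differ in detail.

```typescript
// Parse a comma field the way a string-array control might: split, trim,
// and drop empty entries, so " , " parses to [].
function parseStringArray(text: string): string[] {
  return text
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Decide whether a single value counts as "missing" for gating purposes.
function isMissing(value: unknown, isStringArrayField: boolean): boolean {
  if (value === undefined || value === null) return true;
  if (Array.isArray(value)) return value.length === 0;
  if (typeof value === "string") {
    // Raw comma-field text counts as missing if it parses to nothing.
    if (isStringArrayField) return parseStringArray(value).length === 0;
    return value.trim().length === 0; // assumed: whitespace-only is empty
  }
  return false;
}

// Hypothetical signature: required keys, which keys are string arrays, and
// the current (possibly raw-text) argument values.
function hasMissingRequiredArgs(
  required: readonly string[],
  stringArrayKeys: ReadonlySet<string>,
  args: Record<string, unknown>,
): boolean {
  return required.some((key) => isMissing(args[key], stringArrayKeys.has(key)));
}
```

With this shape, a required tag list containing only separators keeps the Start button disabled, while any other non-empty value still passes through to server-side validation.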

Issues Found

Critical (Must Fix)

None.

Important (Should Fix)

None.

Suggestions (Nice to Have)

  • Integer coercion via parseInt is lossy for a couple of exotic inputs. In finalizeSpawnArgs (spawnArgs.ts:177), an integer field with raw text like "2.5" becomes 2 and "1e3" becomes 1 (parseInt stops at ./e), so a clearly-invalid value is silently "fixed" rather than passed through for the server to reject — unlike the number branch, which round-trips 1e3 correctly via Number(). This is a genuine edge (the value only reaches finalize because it failed coerceTextFieldValue's round-trip check) and the server remains the source of truth, so it's not worth blocking on. If you want exactness, mirror the number branch (const n = Number(value)) and let Number.isInteger(n) decide.

  • ScrollView flex: 1 (carried from iteration 4 — still your call). NewSessionScreen.tsx:317 still sets only contentContainerStyle, diverging from the OnboardingScreen.tsx pinned-footer pattern. With the demo videos covering the tall-content case and this being a short-content / consistency nit at most, I'm leaving it as informational only.
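
To make the parseInt suggestion concrete, here is a small sketch of the Number()-based variant. `finalizeNumericText` is a hypothetical helper, not the code in finalizeSpawnArgs; it returns the raw string whenever the text is not a valid value of the requested kind, so the server can reject it.

```typescript
// Re-coerce raw numeric text on submit. Unparseable or non-integer input is
// passed through unchanged instead of being silently "fixed".
function finalizeNumericText(raw: string, kind: "number" | "integer"): number | string {
  const text = raw.trim();
  if (text === "") return raw; // Number("") is 0, so guard the empty case
  const n = Number(text);
  if (!Number.isFinite(n)) return raw;
  if (kind === "integer" && !Number.isInteger(n)) return raw;
  return n;
}

// For comparison, parseInt stops at the first non-digit character:
const truncatedDecimal = parseInt("2.5", 10); // 2
const truncatedExponent = parseInt("1e3", 10); // 1
```

Under this variant "2.5" in an integer field reaches the server as "2.5", while "1e3" becomes 1000 for both integer and number fields.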

Issue Conformance

No linked issue (consistent across all iterations) — soft warning per convention. No scope creep: every change in this commit traces to a review item or a parity gap with the desktop composer. The changeset wording was also tightened to call out the title hardening as the sole server-side behavior change.

Previous Review Status

  • ✅ All iteration 1–3 items remain resolved.
  • ✅ Iteration 6 adds substantive review-fix work (optional-enum seeding, numeric finalize, array gating, permission gate, title punctuation guard) — each with test coverage.
  • 🟡 The iteration-4 flex: 1 note is still open in code, now downgraded to informational given the demo videos.
  • No regressions or new issues introduced.

Review iteration: 6 | 2026-06-11

@msfstef

msfstef commented Jun 10, 2026

Contributor Author

Review follow-up (8446ef4)

Thanks for the review. Addressed both actionable items; documenting the two suggestions we intentionally left as-is.

Addressed

  • Changeset coverage for @electric-ax/agents — added the agents patch entry + a paragraph describing the title-generation hardening. The horton.ts change is a real behavioral change in a publishable package, so it now ships with a version bump and changelog entry instead of silently.
  • buildInitialSpawnArgs now respects omitKeys — it was the only function in the spawn-args pipeline that didn't. Threaded omitKeys through both seeding loops, passed SCHEMA_OMIT_KEYS at the call site, and added a test covering an inline enum and a defaulted text field. Harmless today (horton's workingDirectory has no default), but removes the latent leak you described.
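
A simplified sketch of the seeding behaviour discussed in this thread (omitKeys respected, defaults applied, first enum option seeded only for required enums) is shown below. It collapses the two seeding loops into one and assumes a minimal schema shape, so treat it as an illustration rather than the real buildInitialSpawnArgs.

```typescript
// Hypothetical minimal shape of a creation_schema.
type SchemaProperty = { type?: string; enum?: readonly unknown[]; default?: unknown };
type CreationSchema = {
  properties?: Record<string, SchemaProperty>;
  required?: readonly string[];
};

function buildInitialSpawnArgs(
  schema: CreationSchema,
  omitKeys: ReadonlySet<string> = new Set<string>(),
): Record<string, unknown> {
  const required = new Set(schema.required ?? []);
  const args: Record<string, unknown> = {};
  for (const [key, prop] of Object.entries(schema.properties ?? {})) {
    if (omitKeys.has(key)) continue; // never seed fields handled elsewhere
    if (prop.default !== undefined) {
      args[key] = prop.default;
      continue;
    }
    // Only required enums get an implicit first-option value; optional
    // enums start unset so the user's "None" choice is representable.
    if (prop.enum !== undefined && prop.enum.length > 0 && required.has(key)) {
      args[key] = prop.enum[0];
    }
  }
  return args;
}
```

The omit check comes first so that an omitted key with a default can never leak into the payload.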

Deliberately not changed

  • creationSchema omitted from the seeding useEffect deps: this is intentional, not an oversight. Keying the reseed on activeTypeName (not the schema object) is exactly what stops a types-collection re-sync from clobbering in-progress edits; adding creationSchema to the deps would reintroduce that bug. The trade-off you noted (a server-side schema edit while the same type stays selected won't reseed) is the lesser evil and acceptable. A comment at the call site already states this intent, and we confirmed lint is not flagging it.

  • Required-field gating passing on raw invalid-JSON text — intentionally matched to desktop SchemaForm: any non-empty value passes the client gate and the server is the source of truth for schema validation. Diverging here would put mobile out of parity and duplicate validation logic the server already owns. Noted as parity-consistent in the review, and we agree it should stay that way.

Note on the "missing changeset / unrelated 0.4.15 entries"

The unrelated agents 0.4.15 CHANGELOG entries (undici/fork/schedule/docker) seen in earlier diffs were a stale-local-main artifact, not something this branch introduces — current main already contains that release-publish commit as this PR's base (the branch is stacked directly on #4533). Verified: vs current main, this branch's only packages/agents/ changes are horton.ts + generate-title.test.ts.

@msfstef

msfstef commented Jun 10, 2026

Contributor Author
Simulator.Screen.Recording.-.iPhone.15.Pro.-.2026-06-10.at.14.49.10.mov
android-photo.webm

@msfstef msfstef marked this pull request as ready for review June 10, 2026 12:37
@msfstef msfstef requested review from kevin-dp and samwillis June 11, 2026 08:13
msfstef and others added 10 commits June 11, 2026 11:14
…ttings helpers

Move the creation_schema classification helpers (inlineSchemaProperties, model/reasoning/speed detection + grouping, string-array/JSON parsing, object-schema guards) out of NewSessionView/SchemaForm into a DOM-free lib/schemaProperties module so the native mobile composer can reuse the exact same rules. Behaviour-preserving for desktop; the two model-settings memos collapse into one groupModelSettings call and SchemaForm imports the moved guards.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…end path

uploadMessageAttachments now takes File | NativeFileDescriptor ({uri,name,type}) and branches on the part shape: the web path keeps FormData.set + File, the native path uses FormData.append (RN's FormData has no .set) and serializes the {uri,name,type} object. The new types thread through sendEntityMessage / createSendMessageAction / createSendComposerInputAction so the mobile composer can reuse the same optimistic send+upload+rollback orchestration. No wire-format change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…eed controls

Render an entity type's synced creation_schema as native spawn-arg controls (parity with the desktop SchemaForm/DefaultAgentComposer): enums become BottomSheet picker pills (the model enum groups by provider and remembers the last pick via AsyncStorage), booleans become switches, string/number become text fields, string-arrays a comma field, and objects a JSON field. Field labels are humanized (reasoningEffort -> Reasoning Effort) and provider names mapped (openai -> OpenAI). spawnArgs derives initial values/defaults and finalizes them for the spawn payload; reuses the shared schemaProperties helpers.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…osers

Add image attachments (photo library / camera via expo-image-picker, iOS uses a native ActionSheet; transcoded to JPEG via expo-image-manipulator so HEIC is accepted by vision models) gated on whether the session's model supports image input. In-session reuses the shared createSendComposerInputAction; at spawn it mirrors desktop doSpawn (spawn without initialMessage, then send immediate with attachments). Also wires the schema-arg controls into the new-session composer and adds the attach button + thumbnail tray + image/camera icons. Display is handled by the existing desktop chat-log WebView embed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…obile WebView

From mobile device testing: (1) wrap the embed router in WorkspaceProvider so the state-inspector view's useWorkspace resolves; (2) publish the composer inset on document.documentElement and anchor the image-preview dialog above the native composer (clears CHAT_COMPOSER_OVERLAP) so it isn't hidden under it; (3) move the timeline thumbnail's aspect-ratio onto an image-only wrapper with an absolutely-positioned img so iOS WebKit stops letting the image's intrinsic height overflow the bubble. All mobile-only / default-0 — desktop unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The low-cost, text-only title model could go conversational on image/attachment messages it can't see (e.g. "I'm sorry but no images were actually shared...") and that became the session title. Firmer system prompt (handle unseen attachments, never apologize, always output a short title) plus a guard that rejects sentence-like (>8 word) responses and falls back to the locally-derived title. Adds a regression test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Address PR review:
- Add @electric-ax/agents to the changeset — horton.ts title-generation
  hardening is a behavioral change in a publishable package and was
  shipping with no version bump / changelog entry.
- buildInitialSpawnArgs now respects omitKeys, the one function in the
  spawn-args pipeline that didn't. Harmless today (horton's
  workingDirectory has no default) but a latent leak; threaded through
  both seeding loops, passed SCHEMA_OMIT_KEYS at the call site, and
  covered with a test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…comment

Self-review cleanup, no behavior change:
- Extract MODEL_PROVIDER_LABELS + modelProviderKey/modelOptionLabel into
  the shared schemaProperties module; desktop NewSessionView and mobile
  SchemaArgsControls now import them instead of each holding a byte-
  identical copy. Finishes the "extract rather than duplicate" approach
  the PR already applies to the classification helpers and removes the
  provider-label drift risk across the RN/DOM boundary.
- Trim the redundant "RN FormData lacks set" rationale in sendMessage so
  it lives once, at the addField branch where the append-vs-set choice is.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
With schema/model/sandbox sections now stacking up, the Start button sat
at the very end of the scroll. Move it (and the spawn-error row) into a
bottom action bar pinned outside the ScrollView so it stays reachable
however many sections are open. The bar reports its height via onLayout
and the scroll content is padded by that height so every section still
scrolls clear; the bar owns its safe-area bottom inset (Screen leaves the
bottom to screen-specific controls, matching the SessionScreen composer).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@msfstef msfstef force-pushed the msfstef/enhance-spawn-args-mobile-desktop-parity branch from 80b6355 to d4bed46 Compare June 11, 2026 08:18
…titles

- finalizeSpawnArgs re-coerces numeric text the editing-time round-trip
  guard left raw (0.50, 1e3), matching desktop's number submission
- seed enum[0] only for required enums; optional enums start unset and
  get a clearable "None" item, mirroring desktop SchemaForm
- gate spawn on required string-arrays that parse to empty (separator-
  only text or []), matching desktop canSubmit
- drop the media-library permission gate before launchImageLibraryAsync;
  the system pickers don't need it and denial dead-ended the flow
- looksLikeNonTitle also rejects conversational punctuation so short
  refusals under the word cap fall back to the local title

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@msfstef msfstef merged commit 683cfae into main Jun 11, 2026
71 checks passed
@msfstef msfstef deleted the msfstef/enhance-spawn-args-mobile-desktop-parity branch June 11, 2026 09:29