[chore] Review and cleanup fern#4239

Open
jp-agenta wants to merge 16 commits into release/v0.96.10 from chore/review-fern

@jp-agenta (Member) commented Apr 29, 2026

Block 1 — SDK / Client reorganization:

Summary

Reorganizes the Python SDK and introduces a standalone Fern-generated Python client.

  • Moves the Python SDK from sdk/ to sdks/python/ to make room for future SDKs (sdks/typescript/ placeholder added)
  • Introduces clients/python/agenta_client/, a Fern-generated standalone package with AgentaApi, AsyncAgentaApi, AgentaApiEnvironment, and 100+ Pydantic DTO types
  • Bridges agenta.client → agenta_client via sys.modules replacement in sdks/python/agenta/client/__init__.py so all existing imports continue to work unchanged
  • Adds clients/scripts/generate.sh to regenerate the client from a live API or OpenAPI URL
  • Excludes clients/ from ruff and pre-commit checks (Fern-generated code)
  • Adds sdks/python/oss/tests/test_fern_client.py — structure tests validating all client imports, instantiation, and sub-module presence (no API calls)
  • Updates all CI workflows (01, 11, 12, 14, 44): path triggers, working directories, pip install paths, cache keys, JUnit result paths
  • Fixes a bug in 42-railway-build.yml where the SDK path verification checked /sdk/* but the actual in-container path is /sdks/python/* — would have caused every CI build to fail that step
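
The `sys.modules` bridge mentioned above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the contents of `sdks/python/agenta/client/__init__.py`: the module names `legacy_pkg.client` and `new_client` and the placeholder `AgentaApi` class are hypothetical stand-ins built in memory so the snippet runs without either package installed.

```python
import importlib
import sys
import types

# Hypothetical stand-in for the generated package (plays the role of agenta_client).
new_client = types.ModuleType("new_client")


class AgentaApi:  # placeholder class, not the real generated client
    pass


new_client.AgentaApi = AgentaApi
sys.modules["new_client"] = new_client

# Hypothetical stand-in for the parent package of the legacy import path.
legacy_pkg = types.ModuleType("legacy_pkg")
legacy_pkg.__path__ = []  # mark it as a package so dotted imports are allowed
sys.modules["legacy_pkg"] = legacy_pkg

# The bridge: register the new module under the legacy dotted name.
sys.modules["legacy_pkg.client"] = sys.modules["new_client"]
legacy_pkg.client = new_client

# Old-style imports now resolve to the very same objects.
legacy_client = importlib.import_module("legacy_pkg.client")
from legacy_pkg.client import AgentaApi as LegacyAgentaApi

same_module = legacy_client is new_client
same_class = LegacyAgentaApi is new_client.AgentaApi
```

Because both names point at one module object, classes imported through either path are identical, which is what keeps existing imports working without copies of the generated code.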

Block 2 — /app → /api / /web / /services rename:

Summary

Enforces correct container working directory conventions across all infrastructure. All containers were previously using /app as WORKDIR regardless of type.

| Container | WORKDIR     |
|-----------|-------------|
| API       | `/api`      |
| Web       | `/web`      |
| Services  | `/services` |

Files updated:

  • Dockerfiles (dev + gh, OSS + EE) for api/, services/, web/: WORKDIR, PYTHONPATH, COPY destinations, crontab path, entrypoint path. Dev Dockerfiles also write .pth files to site-packages for subprocess-safe Python path injection
  • Docker Compose (dev + gh, OSS + EE): volume mount targets, watchmedo reload dirs, cron command, services mount. clients/python added as bind-mount and reload-dir
  • Helm charts: _helpers.tpl (4 Alembic config path defaults), web-deployment.yaml (entrypoint), cron-deployment.yaml (crontab path)
  • Railway: all worker, cron, web, and alembic Dockerfiles; build-and-push-images.sh, deploy-from-images.sh, web/railway.json
  • run.sh: gh.local stage now copies both sdks and clients into API/services build contexts
  • api/oss/src/utils/env.py: AlembicConfig hardcoded fallback paths
  • api/oss/src/crons/queries.sh: crontab path in awk command
  • web/entrypoint.sh: ENTRYPOINT_DIR default
  • Alembic .ini files (4): script_location
  • env .example files (4): ALEMBIC_CFG_PATH_* commented defaults
  • Migration READMEs (4): example commands
  • Public docs (self-host/01-quick-start.mdx, 02-configuration.mdx, 03-upgrading.mdx): all manual migration command examples

@vercel (bot) commented Apr 29, 2026

| Project              | Deployment | Updated (UTC)       |
|----------------------|------------|---------------------|
| agenta-documentation | Ready      | Apr 30, 2026 2:52pm |

@dosubot (bot) added the labels size:XXL (This PR changes 1000+ lines, ignoring generated files.) and refactoring (A code change that neither fixes a bug nor adds a feature) on Apr 29, 2026
@github-actions (bot, Contributor) commented Apr 29, 2026

Railway Preview Environment

| Item        | Value                                            |
|-------------|--------------------------------------------------|
| Preview URL | https://gateway-production-ecbe.up.railway.app/w |
| Image tag   | pr-4239-32aba3d                                  |
| Status      | Failed                                           |
| Updated at  | 2026-04-29T19:25:07.092Z                         |

Create new minimal @agenta/sdk at web/packages/agenta-sdk/ that re-exports
the Fern-generated @agenta/client and adds a Python-style init() helper.

Wire clients/typescript into web/pnpm-workspace.yaml so workspace consumers
can resolve @agenta/client via workspace protocol.

Fix Fern-generated client missing @types/node devDependency (JP flagged
"0 seconds configuring Fern" — this is the first concrete config gap).

Move v2 SDK packages (agenta-sdk, agenta-sdk-tracing, agenta-sdk-ai,
agenta-sdk-mastra) to web/_reference/ for design lookup. _reference/ sits
outside the workspace glob so nothing builds it.

Verifies the v3 plan's Option C workspace structure: TS SDK stays in
web/packages/, Fern client lives at clients/typescript/ and joins the
workspace via the relative '../clients/typescript' entry.

All builds clean: clients/typescript builds, web/packages/agenta-sdk
typecheck/lint pass, broader workspace typecheck green.

First real consumer of the v3 thin SDK wrapper. Migrate the testsets list
fetch in @agenta/entities from raw axios.post('/testsets/query') to the
Fern-typed `client.testsets.queryTestsets(...)` method.

Zod boundary validation stays — Fern's compile-time types under-declare
backend `extra="allow"` fields, so drift detection still has independent
value (per v3 plan).

SDK additions:
- getAgentaSdkClient() lazy singleton accessor mirroring the axios pattern
- init() now uses a fetch wrapper with credentials:'include' so existing
  cookie-session auth keeps working in the browser

Build fixes uncovered by this migration (real signal for Sprint 1):
1. Next.js needs @agenta/sdk + @agenta/client in transpilePackages
2. Fern client is generated as Node-only — uses fs/stream/buffer behind
   guarded dynamic imports. Browser bundle needs webpack fallback config
   in next.config.ts to stub these out (server bundle gets real Node modules)

Verified:
- pnpm install resolves cleanly across the workspace
- @agenta/sdk + @agenta/entities typecheck pass
- @agenta/sdk + @agenta/entities lint pass
- web/oss `next build` compiles successfully (2.8min); 34/34 static pages
  generated; no SDK-related errors (pre-existing testset-type errors and
  jotai deprecation warnings are unrelated baseline)

Fern client uses out-of-source build (`main: ./dist/index.js`); compiled
output should never be committed. The root .gitignore was missing this
path because the v2 workspace packages all use in-source builds
(`main: ./src/index.ts`) so dist/ patterns were never needed.

Turbopack refuses to resolve symlinks pointing outside its `root` setting.
The current root was `web/`, but pnpm symlinks @agenta/client at
web/packages/agenta-sdk/node_modules/@agenta/client → ../../../../../clients/typescript
which resolves outside `web/` (clients/typescript/ is a sibling, not a child).

Result: `next dev --turbopack` failed with "Module not found: Can't resolve
'@agenta/client'" while webpack-based `next build` worked fine (different
resolution rules).

Fix: set both turbopack.root and outputFileTracingRoot to the repo root
(__dirname/../..) so Turbopack can follow the symlink. Next.js requires
both values to match — leaving outputFileTracingRoot at web/ produced a
config-conflict warning that silently overrode turbopack.root back to web/.
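
To make the path arithmetic concrete, here is a small sketch that resolves the quoted symlink target and checks which root contains it. It only manipulates path strings; `/repo` is a hypothetical checkout location, and the helper names are illustrative rather than anything Turbopack exposes.

```python
import posixpath


def resolve_link(link_path: str, target: str) -> str:
    """Resolve a relative symlink target against the directory holding the link."""
    return posixpath.normpath(posixpath.join(posixpath.dirname(link_path), target))


def is_inside(root: str, path: str) -> bool:
    """Return True when `path` is `root` itself or sits below it."""
    return posixpath.commonpath([root, path]) == root


# "/repo" is an assumed checkout location; the rest mirrors the commit message.
link = "/repo/web/packages/agenta-sdk/node_modules/@agenta/client"
resolved = resolve_link(link, "../../../../../clients/typescript")

inside_web = is_inside("/repo/web", resolved)  # root = web/
inside_repo = is_inside("/repo", resolved)     # root = repo root
```

Five `..` segments climb from `@agenta/` up to the repo root, so the target lands in `clients/typescript`, outside `web/` but inside the repo root.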

Verified: `next dev --turbopack` on web/oss starts in ~1.6s with no warnings,
HTTP 308 (auth redirect) returns successfully. Same fix applied to web/ee.

JP shipped the TS Fern client with a default generators.yml ("0 seconds
configuring Fern, maybe we need to re-generate"). Adds a config: block
that addresses the bulk of the PoC's documented build/runtime issues at
the source instead of via per-app workarounds.

clients/scripts/generate.sh
  - omitFernHeaders: true                — drops X-Fern-* headers the
    Agenta API CORS allowlist doesn't accept (PoC finding #5)
  - includeCredentialsOnCrossOriginRequests: true  — withCredentials
    baked into every request so cookie-session auth works without a
    custom fetch wrapper (was a finding in the convenience layer)
  - fetchSupport: native, streamType: web, formDataSupport: Node18,
    fileResponseType: binary-response  — prefer browser/web standards
  - retainOriginalCasing: true           — keep wire snake_case (matches
    backend, OpenAPI spec, v2 entities Zod schemas; camelCase conversion
    would break ~all existing consumer code)
  - defaultTimeoutInSeconds: 30, maxRetries: 3  — explicit network
    defaults aligned with v2 stage-0 client
  - packageJson.browser: { fs/stream/buffer: false } + devDependencies:
    @types/node — bake the browser-stub and node types into the
    generated package.json (PoC findings #1 and #2)

clients/typescript/package.json (bootstrap heredoc): mirror the same
browser field + @types/node devDep so the static bootstrap matches.

generate.sh adds fix_typescript_admin_duplicates: a Python post-gen
patch that renames the second `createAccounts` (and its private __
partner) to `createAccountsAlt`. Two OpenAPI operations resolve to the
same TS function name — backend should disambiguate via explicit
operation_id. Admin endpoints aren't in v0 scope so the rename is safe.
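
A post-generation rename of this kind might look roughly like the sketch below. It is not the actual `fix_typescript_admin_duplicates` code: the function name `rename_second_method`, the regexes, and the sample TypeScript are assumptions made for illustration, and it presumes the duplicate's private `__` partner appears after the second public definition.

```python
import re


def rename_second_method(source: str, name: str, new_name: str) -> str:
    """Rename the second `public <name>(` definition and every reference after it."""
    definition = re.compile(rf"^\s*public\s+{re.escape(name)}\s*\(", re.MULTILINE)
    matches = list(definition.finditer(source))
    if len(matches) < 2:
        return source  # nothing to disambiguate

    cut = matches[1].start()
    # Match the bare name or its `__`-prefixed private partner as whole words.
    reference = re.compile(rf"\b(__)?{re.escape(name)}\b")
    tail = reference.sub(lambda m: f"{m.group(1) or ''}{new_name}", source[cut:])
    return source[:cut] + tail


# Illustrative generated code with two operations resolving to the same name.
SAMPLE = """\
class AdminClient {
    public createAccounts(request: A) {
        return this.__createAccounts(request);
    }
    private async __createAccounts(request: A) {}
    public createAccounts(request: B) {
        return this.__createAccounts(request);
    }
    private async __createAccounts(request: B) {}
}
"""

patched = rename_second_method(SAMPLE, "createAccounts", "createAccountsAlt")
```

Only the text from the second definition onward is rewritten, so the first method and its private partner keep their original names.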

Serde stays OFF: enabling it (noSerdeLayer: false + allowExtraFields)
exposed ~200 codegen errors in fern-typescript-sdk@3.63.7 (broken
Record<string, T | null> handling, recursive type aliases for
FullJson*/LabelJson*, plus the duplicate methods). Documented in
generate.sh inline. The convenience layer's Zod boundary continues to
handle Pydantic extra="allow" at the entity level instead.

web/packages/agenta-sdk/src/index.ts: slim the fetch wrapper now that
omitFernHeaders + withCredentials are upstream. The wrapper is now only
defined when no apiKey was supplied; it strips the empty Authorization
header that HeaderAuthProvider sets in the cookie-auth case.

Verified: pnpm install clean; @agenta/sdk + @agenta/entities typecheck
+ lint pass; @agenta/client builds standalone (tsc with no errors);
next dev --turbopack on web/ee returns HTTP 200 on /w (was 500 before
the CORS fix) with no Module-not-found errors; next build on web/oss
completes successfully.
mmabrouk added a commit that referenced this pull request May 4, 2026
Combined PR (single landable unit) covering the Python SDK split. The
WORKDIR rename (`/app` -> `/api`/`/web`/`/services`) lives in a follow-up
PR stacked on top of this one. The TypeScript Fern client + thin SDK
wrapper is a separate, parallel PR.

## What changes

### Directory layout (multi-language SDK ready)
- `sdk/` -> `sdks/python/` so additional language SDKs can sit alongside
- `sdks/typescript/` placeholder for the upcoming TypeScript SDK

### Python Fern client extraction
- New standalone package at `clients/python/` containing the
  Fern-generated client (`AgentaApi`, `AsyncAgentaApi`,
  `AgentaApiEnvironment`, ~100+ Pydantic DTOs)
- Bridge shim at `sdks/python/agenta/client/__init__.py` rewrites
  `sys.modules` so existing `from agenta.client.* import ...` imports
  continue to work unchanged
- Old `sdk/agenta/client/backend/*` artifacts purged in favor of the
  standalone `clients/python` package
- `clients/scripts/generate.sh` regen entrypoint (`--openapi-url` or
  `--openapi-file`)
- `sdks/python/oss/tests/test_fern_client.py` validates imports,
  instantiation, and sub-module presence

### Containers (this PR keeps WORKDIR=/app; the rename comes next)
Dockerfiles + docker-compose + run.sh updated to:
- Source/install Python SDK from the new `./sdks/python` path
- Add `./clients/python` as a build context input and mount at
  `/app/clients/python`
- Extend `PYTHONPATH` with `/app/sdks/python:/app/clients/python`
- For dev images: write a `.pth` file so subprocess Python invocations
  pick up the new paths
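
As a rough illustration of the `.pth` mechanism, the sketch below writes a `.pth` file into a stand-in site directory and lets the standard `site` module process it. The temporary directories and the file name `agenta_paths.pth` are hypothetical; the real Dockerfiles write into the image's actual site-packages. A plausible reason for this approach (an assumption, not stated in the commit) is that `.pth` files are read at every interpreter start, even when a subprocess does not inherit `PYTHONPATH`.

```python
import site
import sys
import tempfile
from pathlib import Path

# Hypothetical directories standing in for site-packages and the SDK checkout.
workspace = Path(tempfile.mkdtemp())
fake_site_packages = workspace / "site-packages"
sdk_dir = workspace / "sdks" / "python"
fake_site_packages.mkdir(parents=True)
sdk_dir.mkdir(parents=True)

# One path per line; the site module appends each existing directory to sys.path.
(fake_site_packages / "agenta_paths.pth").write_text(f"{sdk_dir}\n")

# The interpreter does this automatically for real site-packages at startup;
# addsitedir() triggers the same .pth processing for the stand-in directory.
site.addsitedir(str(fake_site_packages))

sdk_on_path = str(sdk_dir) in sys.path
```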

### CI
GitHub workflows 01, 11, 12, 14, 42, 44 updated for new paths
(triggers, working dirs, install paths, cache keys, JUnit results).
Also fixes a pre-existing bug in `42-railway-build.yml` where the SDK
path verification checked `/sdk/*` but the in-container path is now
`/sdks/python/*`.

### Lint
- `ruff.toml` excludes `clients/` from ruff (Fern-generated)
- `.pre-commit-config.yaml` adds `^clients/` exclude on ruff hooks
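
A quick way to sanity-check the `^clients/` pattern is to run it over a few repo-relative paths, as sketched below. pre-commit treats `exclude` as a Python regular expression applied with `re.search`; the `__init__.py` file name under `clients/python/agenta_client/` is an illustrative assumption.

```python
import re

# Pattern quoted in the commit message.
EXCLUDE = re.compile(r"^clients/")

# Repo-relative paths; individual file names are illustrative.
paths = [
    "clients/python/agenta_client/__init__.py",
    "sdks/python/agenta/client/__init__.py",
    "sdks/python/oss/tests/test_fern_client.py",
]

# Paths that would still be handed to the ruff hooks.
linted = [p for p in paths if not EXCLUDE.search(p)]
```

The leading `^` matters: without it, `sdks/python/agenta/client/` would not match either, but a path such as `web/clients/` would be excluded by accident.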

### .gitignore
- `sdk/...` patterns -> `sdks/python/...`
- `api/sdk` -> `api/sdks`, `services/sdk` -> `services/sdks` (the
  temporary copies that `run.sh --local` creates)

### Design docs
- `docs/designs/fern-clients/{proposal,plan,research,gap}.md` (new)
- Reference updates in existing `docs/design/*` for the renamed script
  path

## Why one PR

The pieces above are tightly coupled at the dependency level. The
bridge shim only works once `clients/python` exists; the Dockerfile
COPY statements only succeed once both `sdks/python/` and
`clients/python/` are in the build context; the GitHub workflow path
filters only fire on the new directory paths. Splitting them produces
PRs whose CI can't pass on its own — we tried that earlier and the
intermediate stages all failed with `ModuleNotFoundError: agenta.client`
or `failed to compute cache key: "/sdks/python": not found`.

## Risks

- **Bridge shim correctness:** load-bearing for every consumer of
  `agenta.client.*`. Coverage in
  `sdks/python/oss/tests/test_fern_client.py`.
- **Type identity drift:** the new `agenta_client.*` package exposes
  Pydantic types with renamed class identities vs the old
  `agenta.client.backend.types.*`. `isinstance(...)` checks against
  the old types could break silently. Grep for
  `isinstance(.*backend\.types)` confirms no current callers in the
  repo, but worth noting.
- **Generated code excluded from lint** — by design, but ruff/prettier
  won't catch drift between hand-written and generated code at the
  package boundary.
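
The type-identity risk can be reproduced in isolation. In this sketch two in-memory modules each define a class named `App`; both module names and the class are hypothetical stand-ins for the old and new generated types.

```python
import types


def make_module(name: str) -> types.ModuleType:
    """Build an in-memory module that defines its own `App` class."""
    module = types.ModuleType(name)
    module.App = type("App", (), {})
    return module


# Hypothetical stand-ins for the old and new type modules.
old_types = make_module("old_backend_types")
new_types = make_module("new_client_types")

value = new_types.App()

matches_new = isinstance(value, new_types.App)  # True
matches_old = isinstance(value, old_types.App)  # False: same name, different class object
```

No exception is raised in the failing case, which is why such checks would break silently rather than loudly.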

## QA

- [ ] CI green
- [ ] `from agenta.client import AgentaApi` works (resolves through
      bridge to `agenta_client.AgentaApi`)
- [ ] `from agenta_client import AgentaApi` works (direct import)
- [ ] Run `pytest sdks/python/oss/tests/test_fern_client.py`
- [ ] Run the full SDK test suite under `sdks/python/oss/tests/`
- [ ] `docker compose -f hosting/docker-compose/oss/docker-compose.dev.yml up`
      starts cleanly; API responds on `/health`
- [ ] Regenerate the client locally:
      `bash clients/scripts/generate.sh --openapi-file ./openapi.json`
      should produce a no-op diff against committed `clients/python/`

Originally part of #4239 — split out (along with `chore/infra/workdir-rename`
and `feat/sdk/typescript-fern`) for independent review.
mmabrouk added a commit that referenced this pull request May 4, 2026
Containers were uniformly using `/app` as `WORKDIR` regardless of which
service they hosted, which made multi-service compose stacks ambiguous
and produced confusing log paths. This aligns container `WORKDIR` with
the service that owns it:

| Service                       | WORKDIR     |
|-------------------------------|-------------|
| `api/oss`, `api/ee`           | `/api`      |
| `web/oss`, `web/ee`           | `/web`      |
| `services/oss`, `services/ee` | `/services` |

The shared SDK and client install locations also move out from under
the per-container `WORKDIR` and become top-level paths in every image:

- `/app/sdks/python` -> `/sdks/python`
- `/app/clients/python` -> `/clients/python`

This is stacked on `chore/sdk/python-reorg`. That PR introduces
`sdks/python/` and `clients/python/` as build-context paths and adds
them to `PYTHONPATH`. Without that prerequisite, Docker builds would
fail with `failed to compute cache key: "/sdks/python": not found`
because the source directories don't exist on `main` yet.

## What changes

- 12 Dockerfiles (api/{oss,ee}, web/{oss,ee}, services/{oss,ee} -
  dev + gh)
- 7 docker-compose files (mount targets, watchmedo dirs, cron command)
- Helm: _helpers.tpl alembic defaults, web-deployment, cron-deployment
- Railway: all worker/cron/web/alembic Dockerfiles + scripts +
  railway.json
- 4 alembic.ini script_location, 4 migration READMEs
- 4 env.example files (ALEMBIC_CFG_PATH_* defaults)
- 3 self-host docs (docker exec command examples)
- api/oss/src/utils/env.py (alembic fallback paths)
- api/oss/src/crons/queries.sh (crontab path)
- web/entrypoint.sh (ENTRYPOINT_DIR default)

## Risks

- Largest blast radius: every container in the stack has its WORKDIR
  changed. Any custom Dockerfile or compose override held outside this
  repo will break.
- Helm values overrides: if any deployment overrides alembicCfgPath
  in their values.yaml, those overrides reference the old /app/... path
  and need to be updated.
- Cron command in compose gh.yml: pinned older AGENTA_API_IMAGE_TAG
  values still emit /app/crontab and would fail to start. Operators
  must bump the image tag alongside this compose update.
- No application logic changes. Only paths in Docker, compose, helm,
  alembic, env defaults move.
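
For operators auditing out-of-repo overrides, the path mapping above can be expressed as a small helper. This is a sketch only: `migrate_path` is a hypothetical name, and it assumes files keep their relative position under the new `WORKDIR`, which should be verified against the actual Dockerfiles.

```python
# WORKDIR per service, from the table in this commit message.
WORKDIRS = {"api": "/api", "web": "/web", "services": "/services"}

# Shared install locations that moved to top-level paths.
SHARED = {
    "/app/sdks/python": "/sdks/python",
    "/app/clients/python": "/clients/python",
}


def migrate_path(path: str, service: str) -> str:
    """Rewrite an old /app-based container path for the given service."""
    for old, new in SHARED.items():
        if path == old or path.startswith(old + "/"):
            return new + path[len(old):]
    if path == "/app" or path.startswith("/app/"):
        return WORKDIRS[service] + path[len("/app"):]
    return path  # not an /app path; leave untouched


crontab = migrate_path("/app/crontab", "api")
sdk = migrate_path("/app/sdks/python/agenta", "services")
```

Shared locations are checked first so that `/app/sdks/python` is not mistakenly rewritten to a per-service path.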

## QA

- [ ] CI green (now that /sdks/python and /clients/python exist via
      the parent PR)
- [ ] docker compose dev stack (oss + ee) starts cleanly
- [ ] helm template shows the new /api path in env vars
- [ ] Run a test alembic migration in a dev container
- [ ] Cron container: queries.sh runs from /api/
- [ ] Self-host docs render correctly

Originally part of #4239 - split out for independent review.
mmabrouk added a commit that referenced this pull request May 4, 2026
Mirrors the Python split (clients/python + sdks/python) on the
TypeScript side. A standalone Fern-generated TS client lives at
clients/typescript/ and joins the web/ pnpm workspace via relative
path. A thin convenience wrapper at web/packages/agenta-sdk/ re-exports
the client with a Python-style init() helper and a
getAgentaSdkClient() lazy singleton.

This PR is independent of the Python SDK reorg and the WORKDIR rename:
it touches only web/, clients/typescript/, and .gitignore.

## What changes

Adds:
- clients/typescript/ - 1144 files, the standalone Fern-generated TS
  client
- web/packages/agenta-sdk/ - 3 files, thin convenience wrapper
- web/_reference/ - 126 files, legacy v2 SDK packages (agenta-sdk,
  agenta-sdk-tracing, agenta-sdk-ai, agenta-sdk-mastra) MOVED out of
  the workspace glob so they remain on disk for design lookup but
  never build

Workspace wiring:
- web/pnpm-workspace.yaml adds ../clients/typescript
- web/pnpm-lock.yaml regenerated

Next.js config (web/{oss,ee}/next.config.ts):
- transpilePackages: + @agenta/sdk, + @agenta/client
- turbopack.root and outputFileTracingRoot lifted from web/ to repo
  root, since clients/typescript/ is a sibling of web/ (Turbopack
  refuses to follow symlinks pointing outside its root)

Build script (web/{oss,ee}/package.json):
- next build emits standalone bundle under .next/standalone/web/{oss,ee}/
  to match the new tracing root
- version field stays at main's current release version

Fern generator config (in clients/typescript/):
- omitFernHeaders: true - drops X-Fern-* headers the Agenta CORS
  allowlist rejects
- includeCredentialsOnCrossOriginRequests: true - withCredentials
  baked in, so cookie-session auth works without a custom fetch wrapper
- retainOriginalCasing: true - keep wire snake_case (matches backend,
  OpenAPI spec, and v2 entity Zod schemas)
- browser: { fs: false, stream: false, buffer: false } and
  @types/node devDep - browser-stub Node built-ins so the client
  bundles cleanly in Next.js

First consumer migration:
- web/packages/agenta-entities/src/testset/api/api.ts - fetchTestsetsList
  switches from raw axios.post('/testsets/query') to
  client.testsets.queryTestsets(...). Zod boundary stays.
- web/packages/agenta-entities/package.json adds @agenta/sdk dep

.gitignore:
- + clients/typescript/dist/

## Risks

- Largest PR in the SDK family by file count (~1144 generated TS
  files). Reviewers should focus on:
  - web/packages/agenta-sdk/src/index.ts - the wrapper layer
  - web/{oss,ee}/next.config.ts - workspace config changes
  - web/packages/agenta-entities/src/testset/api/api.ts - consumer
    migration
  - clients/typescript/package.json for the generator config
- Turbopack root change: lifting turbopack.root and
  outputFileTracingRoot to the repo root could cause unrelated files
  outside web/ to be picked up in file tracing.
- Prettier on clients/typescript/: the existing pre-commit config
  does not exclude clients/ from prettier. Modifying a TS file in
  clients/typescript/ would trigger prettier to reformat all of it.
  Mitigation: prettier hook is idempotent on Fern output.
- CORS / auth: includeCredentialsOnCrossOriginRequests means every
  Fern-client call sends cookies. Don't reuse the client across user
  sessions on the server side.
- First-consumer scope: only fetchTestsetsList migrated. Rest of the
  testset API and other consumers still use raw axios. Intentional -
  one consumer to prove wiring end-to-end.

## QA

- [ ] CI green
- [ ] pnpm install resolves cleanly across workspace
- [ ] cd web/oss && pnpm typecheck && pnpm lint pass
- [ ] cd web/oss && pnpm build succeeds; output at .next/standalone/web/oss/
- [ ] Same for web/ee
- [ ] pnpm dev --turbopack starts; testsets list page loads
- [ ] Network tab: testsets request goes to POST /testsets/query
      with Cookie: header, no X-Fern-* headers
- [ ] Regenerate the TS client locally; the diff against the committed
      clients/typescript/src/ should be empty

Originally part of #4239 - split out for independent review.
mmabrouk added a commit that referenced this pull request May 4, 2026
Containers were uniformly using `/app` as `WORKDIR` regardless of which
service they hosted, which made multi-service compose stacks ambiguous
and produced confusing log paths. This aligns container `WORKDIR` with
the service that owns it:

| Service                       | WORKDIR     |
|-------------------------------|-------------|
| `api/oss`, `api/ee`           | `/api`      |
| `web/oss`, `web/ee`           | `/web`      |
| `services/oss`, `services/ee` | `/services` |

The shared SDK and client install locations also move out from under
the per-container `WORKDIR` and become top-level paths in every image:

- `/app/sdks/python` -> `/sdks/python`
- `/app/clients/python` -> `/clients/python`

This is stacked on `chore/sdk/python-reorg`. That PR introduces
`sdks/python/` and `clients/python/` as build-context paths and adds
them to `PYTHONPATH`. Without that prerequisite, Docker builds would
fail with `failed to compute cache key: "/sdks/python": not found`
because the source directories don't exist on `main` yet.

## What changes

- 12 Dockerfiles (api/{oss,ee}, web/{oss,ee}, services/{oss,ee} -
  dev + gh)
- 7 docker-compose files (mount targets, watchmedo dirs, cron command)
- Helm: _helpers.tpl alembic defaults, web-deployment, cron-deployment
- Railway: all worker/cron/web/alembic Dockerfiles + scripts +
  railway.json
- 4 alembic.ini script_location, 4 migration READMEs
- 4 env.example files (ALEMBIC_CFG_PATH_* defaults)
- 3 self-host docs (docker exec command examples)
- api/oss/src/utils/env.py (alembic fallback paths)
- api/oss/src/crons/queries.sh (crontab path)
- web/entrypoint.sh (ENTRYPOINT_DIR default)

## Risks

- Largest blast radius: every container in the stack has its WORKDIR
  changed. Any custom Dockerfile or compose override held outside this
  repo will break.
- Helm values overrides: if any deployment overrides alembicCfgPath
  in their values.yaml, those overrides reference the old /app/... path
  and need to be updated.
- Cron command in compose gh.yml: images pinned to an older
  AGENTA_API_IMAGE_TAG still use /app/crontab, so the cron container
  would fail to start under the updated compose command. Operators
  must bump the image tag alongside this compose update.
- No application logic changes. Only paths in Docker, compose, helm,
  alembic, env defaults move.
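
For overrides held outside this repo, a quick scan for leftover `/app`
paths may help before upgrading. The following is a minimal sketch, not
part of this PR: the `find_stale_app_paths` helper, the regex, and the
sample compose snippet are all assumptions made for illustration.

```python
import re

# Matches "/app" used as an absolute path ("/app", "/app/..."), but not
# "/application", "./app/..." or "/web/app". Illustrative pattern only.
STALE_PATH = re.compile(r"(?<![\w/.-])/app(?=/|\s|$|[\"':])")


def find_stale_app_paths(text: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) for lines that mention /app."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if STALE_PATH.search(line)
    ]


# Hypothetical compose override, used only to exercise the helper.
sample = """\
services:
  api:
    working_dir: /app
    volumes:
      - ./api:/app/api
      - ./web:/web
    environment:
      APP_NAME: /application
"""

hits = find_stale_app_paths(sample)
for number, line in hits:
    print(f"line {number}: {line}")
```

The same function could be pointed at a values.yaml or a custom
Dockerfile by reading the file contents into `text`.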

## QA

- [ ] CI green (now that /sdks/python and /clients/python exist via
      the parent PR)
- [ ] docker compose dev stack (oss + ee) starts cleanly
- [ ] helm template shows the new /api path in env vars
- [ ] Run a test alembic migration in a dev container
- [ ] Cron container: queries.sh runs from /api/
- [ ] Self-host docs render correctly

Originally part of #4239 - split out for independent review.
mmabrouk added a commit that referenced this pull request May 4, 2026
Combined PR (single landable unit) covering the Python SDK split. The
WORKDIR rename (`/app` -> `/api`/`/web`/`/services`) lives in a follow-up
PR stacked on top of this one. The TypeScript Fern client + thin SDK
wrapper is a separate, parallel PR.

## What changes

### Directory layout (multi-language SDK ready)
- `sdk/` -> `sdks/python/` so additional language SDKs can sit alongside
- `sdks/typescript/` placeholder for the upcoming TypeScript SDK

### Python Fern client extraction
- New standalone package at `clients/python/` containing the
  Fern-generated client (`AgentaApi`, `AsyncAgentaApi`,
  `AgentaApiEnvironment`, 100+ Pydantic DTOs)
- Bridge shim at `sdks/python/agenta/client/__init__.py` rewrites
  `sys.modules` so existing `from agenta.client.* import ...` imports
  continue to work unchanged
- Old `sdk/agenta/client/backend/*` artifacts purged in favor of the
  standalone `clients/python` package
- `clients/scripts/generate.sh` regen entrypoint (`--openapi-url` or
  `--openapi-file`)
- `sdks/python/oss/tests/test_fern_client.py` validates imports,
  instantiation, and sub-module presence
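
The `sys.modules` aliasing behind the bridge shim can be sketched as
follows. This is a simplified illustration, not the contents of the
actual `__init__.py`: the `demo_sdk` and `demo_generated_client` module
names are hypothetical stand-ins for `agenta` and `agenta_client`, and
the `AgentaApi` class here is an empty placeholder.

```python
import sys
import types

# Stand-in for the Fern-generated package (hypothetical name; plays the
# role of `agenta_client`).
generated = types.ModuleType("demo_generated_client")


class AgentaApi:
    """Empty placeholder for the generated client class."""


generated.AgentaApi = AgentaApi
sys.modules["demo_generated_client"] = generated

# Stand-in for the SDK package (plays the role of `agenta`).
parent = sys.modules.setdefault("demo_sdk", types.ModuleType("demo_sdk"))

# The bridge: register the generated module under the legacy dotted name,
# so the import system returns it without loading anything new.
sys.modules["demo_sdk.client"] = generated
parent.client = generated

# Legacy-style and direct imports now resolve to the same objects.
from demo_sdk.client import AgentaApi as LegacyAgentaApi
import demo_generated_client

print(LegacyAgentaApi is demo_generated_client.AgentaApi)
```

Because both names point at one module object, classes imported through
either path are identical, which is what keeps existing imports working.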

### Containers (this PR keeps WORKDIR=/app; the rename comes next)
Dockerfiles + docker-compose + run.sh updated to:
- Source/install Python SDK from the new `./sdks/python` path
- Add `./clients/python` as a build context input and mount at
  `/app/clients/python`
- Extend `PYTHONPATH` with `/app/sdks/python:/app/clients/python`
- For dev images: write a `.pth` file so subprocess Python invocations
  pick up the new paths
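
A `.pth` file works for subprocesses because the `site` module reads it
at interpreter startup, independent of whether `PYTHONPATH` is inherited.
A minimal sketch of the mechanism, assuming temporary stand-in
directories and a hypothetical file name `agenta-dev.pth` (the Dockerfiles
may use different names and paths):

```python
import site
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Stand-ins for site-packages and the SDK source directory.
    site_dir = root / "site-packages"
    sdk_dir = root / "sdks" / "python"
    site_dir.mkdir()
    sdk_dir.mkdir(parents=True)

    # A .pth file lists one directory per line; each existing directory
    # is appended to sys.path when the site directory is processed.
    (site_dir / "agenta-dev.pth").write_text(f"{sdk_dir}\n")

    # At startup Python does this for the real site-packages; here it is
    # triggered by hand for the temporary directory.
    site.addsitedir(str(site_dir))
    added = str(sdk_dir) in sys.path
    print(added)

    # Undo the demo's changes to this process's sys.path.
    sys.path[:] = [p for p in sys.path if not p.startswith(str(root))]
```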

### CI
GitHub workflows 01, 11, 12, 14, 42, 44 updated for new paths
(triggers, working dirs, install paths, cache keys, JUnit results).
Also fixes a pre-existing bug in `42-railway-build.yml` where the SDK
path verification checked `/sdk/*` but the in-container path is now
`/sdks/python/*`.

### Lint
- `ruff.toml` excludes `clients/` from ruff (Fern-generated)
- `.pre-commit-config.yaml` adds `^clients/` exclude on ruff hooks

### .gitignore
- `sdk/...` patterns -> `sdks/python/...`
- `api/sdk` -> `api/sdks`, `services/sdk` -> `services/sdks` (the
  temporary copies that `run.sh --local` creates)

### Design docs
- `docs/designs/fern-clients/{proposal,plan,research,gap}.md` (new)
- Reference updates in existing `docs/design/*` for the renamed script
  path

## Why one PR

The pieces above are tightly coupled at the dependency level. The
bridge shim only works once `clients/python` exists; the Dockerfile
COPY statements only succeed once both `sdks/python/` and
`clients/python/` are in the build context; the GitHub workflow path
filters only fire on the new directory paths. Splitting them produces
PRs that cannot pass CI on their own: we tried that earlier, and the
intermediate stages all failed with `ModuleNotFoundError: agenta.client`
or `failed to compute cache key: "/sdks/python": not found`.

## Risks

- **Bridge shim correctness:** load-bearing for every consumer of
  `agenta.client.*`. Coverage in
  `sdks/python/oss/tests/test_fern_client.py`.
- **Type identity drift:** the new `agenta_client.*` package exposes
  Pydantic types whose class identities differ from the old
  `agenta.client.backend.types.*` classes, so `isinstance(...)` checks
  against the old types would fail silently. A grep for
  `isinstance(.*backend\.types)` finds no current callers in the
  repo.
- **Generated code excluded from lint:** by design, but ruff/prettier
  won't catch drift between hand-written and generated code at the
  package boundary.
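
The type identity risk can be illustrated with two separately defined
classes that share a name and fields. This sketch uses plain dataclasses
instead of Pydantic to stay dependency-free; `ExampleDto`, `make_dto`,
and the `agenta_client.types` module label are hypothetical, chosen only
to mirror the old and new packages.

```python
from dataclasses import dataclass


def make_dto(module_name: str) -> type:
    """Define a fresh class each call, as two generated packages would."""

    @dataclass
    class ExampleDto:
        id: str

    ExampleDto.__module__ = module_name
    return ExampleDto


# Same name, same fields, but two distinct class objects.
OldDto = make_dto("agenta.client.backend.types")
NewDto = make_dto("agenta_client.types")

obj = NewDto(id="t1")
same_name = OldDto.__name__ == NewDto.__name__
passes_old_check = isinstance(obj, OldDto)
passes_new_check = isinstance(obj, NewDto)

print(same_name, passes_old_check, passes_new_check)
```

No exception is raised by the failed check, which is why this kind of
drift tends to surface as a wrong branch taken rather than as an error.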

## QA

- [ ] CI green
- [ ] `from agenta.client import AgentaApi` works (resolves through
      bridge to `agenta_client.AgentaApi`)
- [ ] `from agenta_client import AgentaApi` works (direct import)
- [ ] Run `pytest sdks/python/oss/tests/test_fern_client.py`
- [ ] Run the full SDK test suite under `sdks/python/oss/tests/`
- [ ] `docker compose -f hosting/docker-compose/oss/docker-compose.dev.yml up`
      starts cleanly; API responds on `/health`
- [ ] Regenerate the client locally:
      `bash clients/scripts/generate.sh --openapi-file ./openapi.json`
      should produce a no-op diff against committed `clients/python/`

Originally part of #4239 — split out (along with `chore/infra/workdir-rename`
and `feat/sdk/typescript-fern`) for independent review.
Labels

- refactoring: A code change that neither fixes a bug nor adds a feature
- size:XXL: This PR changes 1000+ lines, ignoring generated files.
