Skip to content

feat(pb): rewrite serverless handlers#4463

Draft
MasterPtato wants to merge 1 commit into03-23-fix_ups_implement_queue_subsfrom
03-20-feat_pb_actors_v3
Draft

feat(pb): rewrite serverless handlers#4463
MasterPtato wants to merge 1 commit into03-23-fix_ups_implement_queue_subsfrom
03-20-feat_pb_actors_v3

Conversation

@MasterPtato
Copy link
Contributor

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@railway-app
Copy link

railway-app bot commented Mar 21, 2026

🚅 Deployed to the rivet-pr-4463 environment in rivet-frontend

Service Status Web Updated (UTC)
frontend-inspector ❌ Build Failed (View Logs) Web Mar 25, 2026 at 1:58 am
frontend-cloud ❌ Build Failed (View Logs) Web Mar 25, 2026 at 1:58 am
website 😴 Sleeping (View Logs) Web Mar 23, 2026 at 11:35 pm
mcp-hub ✅ Success (View Logs) Web Mar 21, 2026 at 1:56 am
ladle ❌ Build Failed (View Logs) Web Mar 21, 2026 at 1:56 am

Copy link
Contributor Author

MasterPtato commented Mar 21, 2026

@pkg-pr-new
Copy link

pkg-pr-new bot commented Mar 21, 2026

More templates

@rivetkit/cloudflare-workers

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/cloudflare-workers@4463

@rivetkit/framework-base

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/framework-base@4463

@rivetkit/next-js

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/next-js@4463

@rivetkit/react

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/react@4463

rivetkit

pnpm add https://pkg.pr.new/rivet-dev/rivet/rivetkit@4463

@rivetkit/sql-loader

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sql-loader@4463

@rivetkit/sqlite-vfs

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sqlite-vfs@4463

@rivetkit/traces

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/traces@4463

@rivetkit/workflow-engine

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/workflow-engine@4463

@rivetkit/virtual-websocket

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/virtual-websocket@4463

@rivetkit/engine-runner

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner@4463

@rivetkit/engine-runner-protocol

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner-protocol@4463

commit: 9d9049a

@claude
Copy link

claude bot commented Mar 21, 2026

PR Review: feat(pb): actors v3

This is a large, ambitious draft PR introducing actor v3 infrastructure. Since it's a draft, I'm flagging issues to address before it's ready for merge.


Critical Issues (must fix before merge)

1. Package name mismatch in pegboard-envoy/Cargo.toml
The package is named pegboard-runner but lives in the pegboard-envoy directory. routing/envoy.rs references PegboardEnvoyWsCustomServe, but lib.rs defines PegboardRunnerWsCustomServe. Pick one name and be consistent — this will cause build/dependency resolution issues.

2. Literal TODO placeholders that won't compile
Several TODO tokens appear as literal struct field values that are not valid Rust:

  • ctx.msg(Allocate { TODO }) in actor2/mod.rs
  • EnvoyStartFailed { TODO: TODO } in the ActorError enum

These will fail to compile and need to be resolved before this can be tested.

3. Undefined type UpdateStateAndDbOutput
update_state_and_db in actor2/mod.rs declares Result<UpdateStateAndDbOutput> as its return type, but this struct is never defined in the diff. Either define it or change the return type.

4. Missing ActorError::EnvoyConnectionLost variant
is_envoy_failure() and error-building code reference ActorError::EnvoyConnectionLost, but it does not appear in the ActorError enum declaration. This will be a compile error.

5. Wrong struct name in runtime.rs
Envoy::new appears to return a LifecycleRunnerState { ... } literal, which looks like a stale copy-paste from the v1 runner. It should return an Envoy { ... } struct.

6. Mismatched input type: SendMessagesToEnvoyInput vs SendMessagesToRunnerInput
mod.rs calls runtime::SendMessagesToEnvoyInput, but runtime.rs defines SendMessagesToRunnerInput. Code also references input.runner_id which is not a field on that struct (only envoy_key and messages).


Dependency Issues

7. hyper pinned outside workspace in both new Cargo.toml files
Both pegboard-envoy/Cargo.toml and pegboard-gateway2/Cargo.toml pin hyper = "1.6" with a TODO comment about the workspace version mismatch. Fix the workspace dependency rather than bypassing it.


Code Quality

8. Commented-out dead code
shared_state.rs has wrapping_lt commented out at the bottom. Remove it or use it.

9. Metrics label change is a breaking change
pegboard/src/metrics.rs changes ACTOR_ALLOCATE_DURATION labels from ["status"] to ["kind", "result"]. This is a breaking change for existing dashboards/alerts monitoring this metric — worth calling out in the PR description.

10. Duplicated lost_timeout_ts computation
In the GoingAway signal handler in actor2/mod.rs, lost_timeout_ts is computed twice identically. Verify there isn't a scoping error here.


Design / Architecture Questions

11. Naming: envoy vs runner
The PR introduces pegboard-envoy and renames the protocol in v1.bare (ToServer/ToClient to ToRivet/ToEnvoy), but many internal struct and package names still say "runner". Settle on one canonical term before merging.

12. Epoxy key reservation failure modes
The new keys.rs Epoxy-based key reservation is a significant new cross-DC consensus path. A comment explaining what happens if the reservation is committed in Epoxy but the UDB reserve_actor_key activity fails afterward would help reviewers and future maintainers.

13. Unresolved TODO comments in production paths

  • // TODO: Don't return actual error? appears multiple times in ws_to_tunnel_task.rs
  • // TODO: Add queue and bg thread for processing kv ops suggests the KV path may have ordering/concurrency issues under load

These should be resolved or tracked as follow-up issues before merge.


Minor

  • pegboard/src/workflows/actor/metrics.rs: the renamed function pegboard_actor2_metrics lives in the v1 actor module. If intentional, add a comment explaining why. If not, it likely belongs in actor2.
  • The typo fix (an serverless to a serverless) in actor/runtime.rs is unrelated to the feature and could be a separate commit.

Overall this is a substantial architectural improvement with sleep/wake support, the envoy protocol rename, and Epoxy-based key reservation. The main blockers are compile errors from placeholder TODO tokens and naming inconsistencies. Happy to re-review once those are addressed.

@claude
Copy link

claude bot commented Mar 21, 2026

Design / Logic Issues

5. Double-counted eviction metrics
Both tunnel_to_ws_task and ws_to_tunnel_task subscribe to separate eviction subjects and each increments EVICTION_TOTAL. When a runner is evicted, both tasks fire concurrently, doubling the counter per eviction.

6. Silent JSON deserialization errors in conn.rs
Actor metadata falls back to an empty map on parse failure via unwrap_or_default(). This silently swallows invalid data. At minimum, log a warning when falling back.

7. Wrong field mapping in Mk1 KV metadata
In the ws_to_tunnel_task.rs Mk1 handler, create_ts: x.update_ts looks like the wrong field is being mapped. It should likely be x.create_ts.

8. Unverified abort safety in demuxer GC
In actor_event_demuxer.rs, a TODO: Verify aborting is safe here precedes channel.handle.abort(). Aborting mid-dispatch could silently drop in-flight events. Confirm safety or use a graceful shutdown channel like shutdown() uses.

@claude
Copy link

claude bot commented Mar 21, 2026

Code Quality

9. Significant KV code duplication
handle_message_mk1 and handle_message_mk2 in ws_to_tunnel_task.rs share nearly identical KV operation handling (~400 lines), repeated for each operation type. Consider extracting a shared dispatcher that converts protocol versions once and delegates to common logic.

10. Unimplemented module
engine/packages/pegboard/src/workflows/actor2/allocator.rs is an empty file. Either add a placeholder comment or remove it until it has content.

11. TODOs indicating known gaps
The queue/bg-thread TODO means KV ops run inline in the WebSocket message loop, which could block message handling under load. The do-not-return-actual-error TODO means internal error details are exposed directly to runners in KV responses.

Minor / Style

12. Workflow rename is a breaking change
In engine/packages/pegboard/src/workflows/actor/metrics.rs, renaming from pegboard_actor_metrics to pegboard_actor2_metrics affects in-flight executions. Confirm this is intentional and that any migration path has been handled.

@claude
Copy link

claude bot commented Mar 21, 2026

PR Review: feat(pb): rewrite serverless handlers. This PR introduces three new packages (pegboard-envoy, pegboard-gateway2, pegboard-outbound) to implement serverless handler infrastructure. Critical compile errors: (1) Undefined variable eviction_topic2 in pegboard-envoy/src/lib.rs:78 - only topic and eviction_topic are defined, but %eviction_topic2 is referenced in tracing::debug!. (2) Wrong variable name init_packet in pegboard-envoy/src/conn.rs:81 - line 77 binds to init but line 81 matches on init_packet. (3) Unresolved ctx in pegboard-outbound/src/lib.rs:86 - handle() uses ctx but does not receive it as a parameter. (4) term_signal not in scope in outbound/lib.rs:319 - local to start() but used in serverless_outbound_req(). High priority: (5) Uncompilable TODO; stubs in outbound/lib.rs:144,147 - replace with bail! or unimplemented!(). (6) Unresolved task-abort safety TODO in envoy/actor_event_demuxer.rs - resolve before merging. Medium: (7) Timing-vulnerable token comparison in guard/routing/envoy.rs:109 - use constant-time comparison. (8) No length validation on namespace in envoy/utils.rs. (9) Silent metadata loss via .unwrap_or_default() in envoy/conn.rs. (10) Non-workspace hyper/hyper-util deps in pegboard-envoy/Cargo.toml. Minor: (11) Panics not propagated from join_all in outbound/lib.rs:52 - prefer JoinSet. (12) Logging convention - use structured tracing fields per CLAUDE.md.

@MasterPtato MasterPtato changed the base branch from 03-19-feat_cache_add_in_flight_deduping to graphite-base/4463 March 24, 2026 00:29
@MasterPtato MasterPtato force-pushed the 03-20-feat_pb_actors_v3 branch from 3f2996f to 9d9049a Compare March 24, 2026 00:30
@MasterPtato MasterPtato mentioned this pull request Mar 24, 2026
11 tasks
@MasterPtato MasterPtato force-pushed the 03-20-feat_pb_actors_v3 branch from 9d9049a to 7da9651 Compare March 24, 2026 00:36
@NathanFlurry NathanFlurry changed the title feat(pb): actors v3 feat(pb): rewrite serverless handlers Mar 24, 2026
@MasterPtato MasterPtato force-pushed the 03-20-feat_pb_actors_v3 branch from 7da9651 to dc6bd10 Compare March 25, 2026 00:05
@MasterPtato MasterPtato force-pushed the 03-20-feat_pb_actors_v3 branch from dc6bd10 to 8a1d9d6 Compare March 25, 2026 01:57
@MasterPtato MasterPtato changed the base branch from graphite-base/4463 to 03-23-fix_ups_implement_queue_subs March 25, 2026 01:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant